
Monitor: Throttle EnviroDIY values to 2 weeks #2710

Merged · 2 commits into develop · Mar 7, 2018

Conversation

@rajadain (Member) commented Mar 7, 2018

Overview

Fetching 96K values per variable can overload the system. By limiting EnviroDIY to 1 month instead of 1 year, we get a more manageable ~12K values per variable. This was previously done for NWISUV in #2494.

Is this acceptable @ajrobbins @aufdenkampe?

Connects #2709

Demo

[screenshots]

Testing Instructions

  • Check out this branch and bundle

  • Go to :8000/ and select a shape in the Philadelphia area. Proceed to Analyze.

  • Switch to the Monitor tab and search for EnviroDIY. Switch to the CUAHSI tab.

    • If you don't see any results for EnviroDIY under CUAHSI, clear the cache and try again:

      vagrant ssh services -c 'redis-cli -n 1 --raw KEYS ":1:bigcz*" | xargs redis-cli -n 1 DEL'
      
  • Open the Detail view of any result. Ensure it fetches values correctly and you can see them in the chart. Ensure the chart has 1 month of values.

@rajadain (Member, Author) commented Mar 7, 2018

@azavea-bot rebuild

@arottersman left a comment

Tested!

@arottersman assigned rajadain and unassigned arottersman on Mar 7, 2018
@ajrobbins

Is there a way to test/know if this threshold is sufficient to prevent site crashes? That's my main criterion.

@rajadain (Member, Author) commented Mar 7, 2018

When fetching a year of values, it would take forever and freeze the VM locally. After limiting to a month, most variables return values within the timeout (although it still takes almost the entirety of the 60 second limit). Those that don't make the limit simply take too long; they don't freeze the VM. I don't expect to see any crashes like the ones we saw initially with this limit in place.

We can further restrict the values to 1 week or 1 day if necessary, based on the response times seen on staging.

@aufdenkampe (Member)

@rajadain, thanks for testing how the EnviroDIY data loads. Indeed, the WaterOneFlow web service, and its WaterML delivery, are fundamentally slow, and many EnviroDIY sensor stations record every 5 or 10 minutes (versus USGS, which is typically 10 or 15 min). So it all makes sense.

I agree with @ajrobbins that the primary concern is with crashing.

Limiting to 1 month seems reasonable, but perhaps 2 weeks might make sense in order to keep things relatively snappy. @emiliom, what do you think?

Since yesterday, @emiliom and @horsburgh have been discussing a speedier web service for EnviroDIY, as a potential priority for Monitor. Not for this release, however!

@ajrobbins

+1 for two weeks, for max performance and min crashing potential!

Commit messages:

  • Previously we could only specify durations in unit lengths, e.g. 1 week, 1 month, 1 year. This allows the specification of integral lengths, e.g. 2 weeks, 3 months, etc.

  • Fetching 96K values per variable can overload the system. By limiting EnviroDIY to 2 weeks instead of 1 year, we get a more manageable quantity.
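
A minimal sketch of what an integral-duration helper might look like (the function name and signature here are hypothetical, not the actual code in the commits above):

    from datetime import datetime
    from dateutil.relativedelta import relativedelta

    # Hypothetical helper: turn an integral duration such as (2, 'weeks')
    # into a (from_date, to_date) pair ending now.
    def date_range_for(count, unit):
        if unit not in ('days', 'weeks', 'months', 'years'):
            raise ValueError('Unsupported unit: {}'.format(unit))
        to_date = datetime.utcnow()
        from_date = to_date - relativedelta(**{unit: count})
        return from_date, to_date

    # Previously only unit lengths were expressible, e.g. (1, 'months');
    # integral lengths like (2, 'weeks') now work too.
    from_date, to_date = date_range_for(2, 'weeks')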
@rajadain force-pushed the tt/monitor-throttle-envirodiy branch from 24f8851 to c13e7ec on March 7, 2018 18:59
@rajadain (Member, Author) commented Mar 7, 2018

@arottersman could you verify this again please? Just added a refactor that allows specifying integral (rather than unit) durations, and limited EnviroDIY to 2 weeks.

@arottersman left a comment

+2, tested for NWISUV, NWISDV, and EnviroDIY

@rajadain changed the title from "Monitor: Throttle EnviroDIY values to 1 month" to "Monitor: Throttle EnviroDIY values to 2 weeks" on Mar 7, 2018
@rajadain merged commit 6484bad into develop on Mar 7, 2018
@rajadain deleted the tt/monitor-throttle-envirodiy branch on March 7, 2018 19:43
@rajadain (Member, Author) commented Mar 7, 2018

Thanks for taking a look!

@emiliom (Contributor) commented Mar 7, 2018

Great to see this progress. I had noticed the timeouts yesterday. BTW, Hi @rajadain! Nice to interact with you again 😃

I'm intrigued by this: "When fetching a year of values, it would take forever and freeze the VM locally."

Not that I'm suggesting we insist on fetching a year, but if the VM is freezing, it sounds like other mitigation / error-catching steps are in order, beyond simple service timeouts.

2 weeks seems overly short, but so be it. Longer term, it'd be good to explore whether there are strategies for staging the time series data requests to happen in the background, sequentially or on demand (I'm assuming they currently happen "in parallel" for all variables, all at once?).
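
One way such staging might look (a minimal sketch; fetch_variable is a hypothetical stand-in for the app's per-variable request):

    from concurrent.futures import ThreadPoolExecutor
    import time

    def fetch_variable(site, variable):
        # Stand-in for the real per-variable WOF request.
        time.sleep(1)
        return (site, variable)

    def fetch_all(site, variables, max_workers=2):
        # A small worker pool staggers the expensive requests
        # instead of issuing them all simultaneously.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            return list(pool.map(lambda v: fetch_variable(site, v), variables))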

Regarding @aufdenkampe's comment:

Since yesterday, @emiliom and @horsburgh have been discussing a speedier web service for EnviroDIY, as a potential priority for Monitor. Not for this release, however!

I suspect this is not just not for this release, but also not for the next month or two either!

@rajadain (Member, Author) commented Mar 8, 2018

Hi @emiliom,

We spent some time looking into the performance bottlenecks, and found that the largest resource utilization comes from ulmo itself. To demonstrate, consider this simple script, which fetches one year of EnviroDIY data for one variable:

Sample Script
from ulmo.cuahsi import wof

@profile  # decorator injected by kernprof / memory_profiler at runtime
def get_values():
    wsdl = 'http://data.envirodiy.org/wofpy/soap/cuahsi_1_1/.wsdl'
    site = 'EnviroDIY:JRains1'
    variable = 'envirodiy:MaxBotix_MB7386_Distance'
    from_date = '03/18/2017'
    to_date = '02/15/2018'

    wof.get_values(wsdl, site, variable, from_date, to_date)  # ~96K values

if __name__ == '__main__':
    get_values()

Using line_profiler, memory_profiler, and psutil, the CPU and RAM profiles are as follows:

CPU Profile
$ kernprof -l -v get_values.py
Wrote profile results to get_values.py.lprof
Timer unit: 1e-06 s

Total time: 84.4948 s
File: get_values.py
Function: get_values at line 4

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     4                                           @profile
     5                                           def get_values():
     6         1          5.0      5.0      0.0      wsdl = 'http://data.envirodiy.org/wofpy/soap/cuahsi_1_1/.wsdl'
     7         1          1.0      1.0      0.0      site = 'EnviroDIY:JRains1'
     8         1          0.0      0.0      0.0      variable = 'envirodiy:MaxBotix_MB7386_Distance'
     9         1          0.0      0.0      0.0      from_date = '03/18/2017'
    10         1          1.0      1.0      0.0      to_date = '02/15/2018'
    11
    12         1   84494802.0 84494802.0    100.0      wof.get_values(wsdl, site, variable, from_date, to_date)
RAM Profile
$ python -m memory_profiler get_values.py
Line #    Mem usage    Increment   Line Contents
================================================
     4   77.039 MiB   77.039 MiB   @profile
     5                             def get_values():
     6   77.039 MiB    0.000 MiB       wsdl = 'http://data.envirodiy.org/wofpy/soap/cuahsi_1_1/.wsdl'
     7   77.039 MiB    0.000 MiB       site = 'EnviroDIY:JRains1'
     8   77.039 MiB    0.000 MiB       variable = 'envirodiy:MaxBotix_MB7386_Distance'
     9   77.039 MiB    0.000 MiB       from_date = '03/18/2017'
    10   77.039 MiB    0.000 MiB       to_date = '02/15/2018'
    11
    12  241.746 MiB  164.707 MiB       wof.get_values(wsdl, site, variable, from_date, to_date)
Additional Profiling
$ python get_values.py & while sleep 1; do ps -p $! -o pcpu= -o pmem= ; done;

[plots of %CPU and %MEM sampled once per second over the run]

As can be seen, fetching data for just 1 variable taxes the CPU and RAM considerably. In the app, we fetch data for 4-6 variables simultaneously, which can max out these resources, possibly even denying requests from other users.

The current search implementation is designed for a simple request / response cycle, which is ill-suited to these kinds of long-running processes. This design works well for CINERGI and HydroShare, but not as well for CUAHSI, which involves expensive interpretation of search results via ulmo.

It would be great if CUAHSI WDC could develop a paginated, REST-based API in the future, or if ulmo could be tweaked to be more performant. Accommodating these performance characteristics in MMW would require considerable thought and rearchitecting.
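
For illustration only, here is the shape a paginated REST values API could take on the client side (the endpoint and parameters are entirely made up; no such CUAHSI API exists today):

    import requests

    # Hypothetical endpoint, purely to illustrate a paginated values API.
    BASE_URL = 'https://example.org/api/values'

    def fetch_paginated(site, variable, page_size=1000):
        values, page = [], 1
        while True:
            resp = requests.get(BASE_URL, params={
                'site': site,
                'variable': variable,
                'page': page,
                'page_size': page_size,
            })
            resp.raise_for_status()
            batch = resp.json()
            if not batch:
                break
            # Each page is small, keeping per-request CPU and RAM bounded.
            values.extend(batch)
            page += 1
        return values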

@aufdenkampe (Member)

@rajadain, thanks for providing this information. It is very helpful to see such results.

@emiliom (Contributor) commented Mar 8, 2018

Thank you @rajadain ! That'll be very useful.

@emiliom (Contributor) commented Mar 9, 2018

Pinging @lsetiawan just to point him to @rajadain's profiling work from yesterday (Mar 8). Don, please take a close look. We'll talk about this and follow-up profiling later today.
