NSv3 : Strange NS web page behaviour if upload on charge set on phone (AAPS) / API v3 #7838

Closed
robertrub opened this issue Jan 17, 2023 · 36 comments


@robertrub

robertrub commented Jan 17, 2023

Several of us are seeing the same issue. AAPS Dev started using API v3, and this bug started with v3. If AAPS is set to use API v1, the problem doesn't happen.

Setup: latest AAPS Dev, using NS v3, with NS tab > settings > "do not upload if on battery" enabled. NS and Mongo db are both running in containers on my QNAP NAS.

While AAPS is not uploading to NS, my NS web page is missing data. That is normal behaviour. When the phone is charging, the data is uploaded and is no longer "missing" in the Mongo db, but there are holes in the NS graph.

On NSv3, once the data is uploaded, refreshing my NS web page does NOT fill the holes in the graph. I need to restart the containers to have the graph filled (which proves that the data is in the Mongo db). Even then, the last connection data is still missing in the upper part.

Till now, when using v1, I didn't have the "holes" in the graph. But, I had not tested as thoroughly.

Graph after data update and page refresh.

Screenshot_20230117_084425_DuckDuckGo.jpg

Graph after restart of containers.

Screenshot_20230117_084842_DuckDuckGo.jpg

@sulkaharo
Member

I have a hunch I know what's causing this. Will take a look.

@sulkaharo
Member

Right - can you deploy Nightscout from this branch and send me a snapshot of the logs that includes lines with "Mongo collection requesting cache update event" https://github.com/nightscout/cgm-remote-monitor/tree/debug_log_cache_updates

@vanelsberg

Interesting. Actually I can not confirm this issue?
Running my own deployment for NS v14.2.6 (prebuild Docker image release).

So I'd say it could have something to do with the way NS is deployed?
My site is served through an Nginx portal server in my network; I think you are using a local Traefik instance?
And how about the access token: does it have "admin" rights?

@robertrub
Author

@vanelsberg I'm using the certificate of QNAP, the containers are really basic. AAPS's token has admin rights. The data is in the Mongo db, thus it is not a data upload problem.

@sulkaharo Will try to test. I'm not very fluent in this kind of manipulation. Will see what the commands are in the yaml file.

@robertrub
Author

@sulkaharo I tried to change the image in the yaml, but it looks like restarting the container used the last image I had (14.2.6). Any ideas how I can change the image used in the container? I'm not sure how to get the logs either (but I have an idea...). Sorry, really a noob on these subjects.

ns_debug

@sulkaharo
Member

I have no idea how the installation process in QNAP works, sorry

@robertrub
Author

Is there an update to 14.2.6 ? With your link, I get version 14.2.6 shown but if I ask, I get the message that an update to NS is available... I'm not sure if I have your debug version running or not. I'll try to find the logs if I can...

@sulkaharo
Member

Ah right, if you're deploying docker images, no there's no new image available for branches, just dev and master.

@vanelsberg

Correct. Apart from image pull_policy (https://docs.docker.com/compose/compose-file/#pull_policy) version 14.2.6 is "latest".
If you want to deploy a newer version based on dev, you'd need to build your own docker image.

@vanelsberg

Actually can not think of anything in NSv3 that might cause your problems.
I'd look for the problem on the frontend: are you all using Traefik?

@vanelsberg

@psonnera Ok? Yes, you are right. Was assuming (falsely) that latest was "latest release" :-|

@sulkaharo
Member

Architecturally the only thing that can cause the gaps is either the Nightscout instance not getting the REST uploads directly or there being a bug in the REST API implementation. The gaps are an expected feature if you have multiple Nightscout instances running using the same database, where only the instance that gets the REST uploads will update the data correctly.
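
The multi-instance case described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Nightscout's code: `SharedDatabase` and `ToyInstance` are hypothetical names, and the model simply assumes that each instance keeps an in-memory cache that is filled once at startup and afterwards updated only by the uploads that instance itself receives.

```python
from dataclasses import dataclass, field


@dataclass
class SharedDatabase:
    """Stands in for the single database that all instances read and write."""
    entries: dict = field(default_factory=dict)  # keyed by entry timestamp


@dataclass
class ToyInstance:
    """Hypothetical server instance with an in-memory cache of entries."""
    db: SharedDatabase
    cache: dict = field(default_factory=dict)

    def start(self) -> None:
        # A (re)start does one full load from the database.
        self.cache = dict(self.db.entries)

    def rest_upload(self, timestamp: int, sgv: int) -> None:
        # An upload is stored in the database and also updates this instance's cache.
        self.db.entries[timestamp] = sgv
        self.cache[timestamp] = sgv


db = SharedDatabase()
a, b = ToyInstance(db), ToyInstance(db)
a.start()
b.start()

a.rest_upload(1000, 120)      # only instance "a" receives the upload

seen_by_a = sorted(a.cache)
seen_by_b = sorted(b.cache)   # "b" was never told about the write

b.start()                     # restarting "b" reloads from the database
seen_by_b_after_restart = sorted(b.cache)
```

In this model the second instance shows a gap until it is restarted, even though the record is already in the database, which matches the symptom reported in this issue.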

@vanelsberg

@robertrub You can try force-reloading the image on startup by adding "imagePullPolicy: Always" (under containers:)

@robertrub
Author

@vanelsberg Nice try but it didn't like it ;)
image pull
No, I'm NOT using Traefik; I connect using the certificate and access of the NAS. No errors on the connection part.

@sulkaharo I have only one db that is serving only one NS "client". I posted the issue as a bug in AAPS and Milos closed it saying that it was not AAPS and to see on the NS side.

If I do not stop the upload, all is working fine. If I set "upload only when charging", the upload stops when not charging, starts again once charging and catches up on the unsent data. The data does arrive in the db, since a restart of the container fills in the chart at the bottom (but not the BGs and last AAPS data in the upper part). Refreshing the page (without restarting the container) does NOT fill in the gaps.

Who is supposed to send the "reread db data" command ?

@robertrub
Author

Ok, started testing with latest_dev v15.0.0... Will keep you updated.

@vanelsberg

@robertrub

...Nice try but it didn't like it ;)
My bad. Working mostly with Kubernetes, which is much alike - but not the same ;-)
I think, for docker compose it is "pull_policy"

See https://docs.docker.com/compose/compose-file/#pull_policy
Example:

nightscout:
    image: 'nightscout/cgm-remote-monitor:14.2.6'
    pull_policy: always

@robertrub
Author

Thanks. That one was accepted👍 Will test v15.0.0 beta for the moment.

@robertrub
Author

Victory (but I didn't do anything 🤣 )
The problem seems to be solved in v15.0.0 beta.

I'm closing this issue. Thanks for your help 🙂

@vanelsberg

vanelsberg commented Jan 18, 2023

Who is supposed to send the "reread db data" command ?

Don't know the details but I'd expect a database change triggers NS to refresh.

@vanelsberg

vanelsberg commented Jan 18, 2023

The problem seems to be solved in v15.0.0 beta.
Nice!

@robertrub robertrub reopened this Jan 18, 2023
@robertrub
Author

Sorry folks. I tested again and the problem is not solved in latest_dev v15.0.0.

3h without charging, then started charging. The hole is there, unfilled 😭

Screenshot_20230118_215438_DuckDuckGo.jpg

@MilosKozak
Contributor

MilosKozak commented Jan 21, 2023

image
It's enough to keep NSClient off and start it again after some time. It then starts uploading all missed data
nightscout.log

data is uploaded sequentially from oldest to newest
after server restart NS displays all data

@sulkaharo
Member

@MilosKozak damn, the logger you're using doesn't store timestamps, so it's very hard to deduce how the uploads are working. Based on the log, you restarted the server just before generating this? What's curious is that the data loader that loads the full dump of the last 50 hours from the database finds 315 SGV entries, which should mean there are no gaps in the timeline (as per 288 entries / day).
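
For anyone who wants to check a set of entries for holes directly, here is a minimal sketch. It assumes one CGM reading every 5 minutes (which is what 288 entries / day works out to) and timestamps in milliseconds; `find_gaps` and its `tolerance` parameter are hypothetical names for illustration, not part of Nightscout.

```python
FIVE_MINUTES_MS = 5 * 60 * 1000  # assumed CGM interval: 288 readings per day


def find_gaps(timestamps_ms, expected_interval_ms=FIVE_MINUTES_MS, tolerance=1.5):
    """Return (start, end) pairs where consecutive readings are too far apart."""
    ordered = sorted(timestamps_ms)
    limit = expected_interval_ms * tolerance
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier > limit
    ]


# Six readings, with three consecutive ones missing after the third.
readings = [0, 300_000, 600_000, 1_800_000, 2_100_000, 2_400_000]
gaps = find_gaps(readings)
```

Comparing consecutive timestamps locates where a hole starts and ends, which an entry count alone cannot show.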

@robertrub
Author

@MilosKozak Your comment "after server restart NS displays all data" is correct. The data has been uploaded to the server.

The problem is that the NS page is "not informed" that it needs to reread the data from the db. As you said, only a server restart will populate the missing data.

The best option would be to tell NS that data has been uploaded and that it needs to reread the db.

The second best option would be to reread the data whenever the page is refreshed.

@sulkaharo
Member

Ah, I'm pretty sure I know what the bug is. Will probably take until tomorrow to implement a fix.

(Note the app explicitly doesn't do full database reads repeatedly as this doesn't work for anyone using the free cloud databases that have strict usage limits, so we can't change the app to just reload all data.)

@robertrub
Author

robertrub commented Jan 21, 2023

@sulkaharo You said the "app explicitly doesn't do full database reads repeatedly".

You only need to read the amount of data shown on the screen.

If you keep track of the last and second-to-last timestamps, it will be easy to solve: if the second-to-last is "stale", read the data from the second-to-last up to the last ;)
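
A minimal sketch of this proposal, for illustration only. The function name `backfill_range` and the 10-minute staleness threshold are assumptions made up for the example; they are not taken from Nightscout or AAPS.

```python
TEN_MINUTES_MS = 10 * 60 * 1000  # arbitrary threshold, chosen only for this example


def backfill_range(previous_ts, latest_ts, stale_after_ms=TEN_MINUTES_MS):
    """Return the (start, end) range to re-read, or None if nothing looks stale."""
    if previous_ts is None or latest_ts is None:
        return None
    if latest_ts - previous_ts > stale_after_ms:
        # The previous record is much older than the latest one: re-read in between.
        return (previous_ts, latest_ts)
    return None


three_hours_ms = 3 * 60 * 60 * 1000
after_pause = backfill_range(0, three_hours_ms)   # a 3 h pause triggers a re-read
no_pause = backfill_range(0, 5 * 60 * 1000)       # a normal 5 min step does not
```

Note that this only covers a hole between the two most recent records; it does not detect changes made to older records.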

@sulkaharo
Member

Sorry unclear wording - meant to say even reading the last 48 hours is too much for free Atlas. We already have the mechanisms in place to minimize these reads, the bug is in the data uploaded through the REST api not being recognized correctly.

@sulkaharo
Member

sulkaharo commented Jan 21, 2023

Also maybe worth noting: loading from the one before last is not enough, as clients can upload changes to any record in the database, so changes can occur to records with an arbitrary timestamp. EDIT: And if you look at the interface, many visualisation features rely on records that might be very old, such as the last sensor/cannula/battery edit dates, so the actual query optimisation here is not trivial.
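
To illustrate why a timestamp-range re-read misses such changes, here is a toy cache that applies each upload by record identifier instead. This is a sketch under assumptions, not the actual Nightscout cache: the function names and the `identifier` / `date` field names are chosen for the example.

```python
def apply_upload(cache, record):
    """Insert or replace a record in the cache, keyed by its identifier."""
    cache[record["identifier"]] = record
    return cache


def visible_window(cache, start_ms, end_ms):
    """Records whose date falls in [start_ms, end_ms], oldest first."""
    hits = [r for r in cache.values() if start_ms <= r["date"] <= end_ms]
    return sorted(hits, key=lambda r: r["date"])


cache = {}
apply_upload(cache, {"identifier": "a", "date": 1000, "sgv": 110})
apply_upload(cache, {"identifier": "b", "date": 2000, "sgv": 120})
# A late edit to the older record "a": its date lies before the newest record.
apply_upload(cache, {"identifier": "a", "date": 1000, "sgv": 115})

window = visible_window(cache, 0, 3000)
```

Because the update is applied per record, the edit to the older entry is picked up even though its date is earlier than the most recent record in the cache.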

@MilosKozak
Contributor

@MilosKozak damn, the logger you're using doesn't store timestamps, so it's very hard to deduce how the uploads are working. Based on the log, you restarted the server just before generating this? What's curious is that the data loader that loads the full dump of the last 50 hours from the database finds 315 SGV entries, which should mean there are no gaps in the timeline (as per 288 entries / day).

I stopped the server, removed the log, started the server, then uploaded historic data, which resulted in the issue. So the log covers everything from server start to the final state with this issue.

@sulkaharo
Member

#7853 should fix the gaps, can someone test? :)

@sulkaharo
Member

Fixed in dev

@robertrub
Author

Thanks @sulkaharo. I updated my container tonight with latest_dev. Will test tomorrow morning after a couple of hours of paused connection.

@sulkaharo
Member

@robertrub you can repro the test with a short pause, anything longer than 15 minutes pause will cause CGM entries to go missing with v3 uploads without this fix.

@robertrub
Author

@robertrub you can repro the test with a short pause, anything longer than 15 minutes pause will cause CGM entries to go missing with v3 uploads without this fix.

Great job :) Tested 2 times. Worked both times, with only 2 data points "missing" or late.

Screenshot_20230125_054349_DuckDuckGo.jpg

Screenshot_20230125_082341_DuckDuckGo.jpg

@sulkaharo
Member

The bug causing data to go missing applies to device statuses as well. Based on the log Milos uploaded, almost no device status data is uploaded after the uploads have been paused. @MilosKozak how do the uploads work after a pause with regard to device status records? Are those retained and uploaded similarly to treatments and CGM entries?
