Detailed data from BAG available #492
The links don't work any more. I think they were tied dynamically to a particular viewing instance (on my computer), and so are now gone. There is probably a way to get this data without a session, or to create a session programmatically. There is some information on an HTTP REST API client in Python here: https://github.com/tableau/server-client-python and a tutorial here: https://help.tableau.com/current/api/rest_api/en-us/REST/rest_api_get_started_tutorial_part_1.htm The HTTP REST API itself is also documented; these are probably the methods we are most interested in: https://help.tableau.com/current/api/rest_api/en-us/REST/rest_api_ref_datasources.htm#download_data_source and https://help.tableau.com/current/api/rest_api/en-us/REST/rest_api_ref_workbooksviews.htm#get_view This needs more digging to figure out. There is also a JavaScript client, the same one used for the visualisations themselves, and it might be possible to execute some of it in node.js, but it may have too many features and not actually work in node.js. So far, by inspecting the HTML and JavaScript, the "path" parameter is
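For reference, a minimal sketch of what the Python client approach might look like, assuming the REST API is reachable for this workbook and that we have credentials for the hosting site (public.tableau.com may not allow REST sign-in for anonymous viewers at all); the server URL, username, password, site and view name below are placeholders, not verified values:

```python
# Hedged sketch using the tableauserverclient package linked above.
# All credentials and names are placeholders; whether public.tableau.com
# accepts REST API sign-in for this workbook is an open question.
import tableauserverclient as TSC

tableau_auth = TSC.TableauAuth("USERNAME", "PASSWORD", site_id="SITE")
server = TSC.Server("https://public.tableau.com", use_server_version=True)

with server.auth.sign_in(tableau_auth):
    all_views, pagination = server.views.get()
    # Guess at the relevant view by name; adjust once the real name is known.
    view = next(v for v in all_views if "Dashboard" in v.name)
    server.views.populate_csv(view)            # request the CSV export of the view
    with open("bag_view.csv", "wb") as f:
        f.write(b"".join(view.csv))            # view.csv yields byte chunks
```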
Hi @baryluk, thank you for working on this. I have found the datasource as well some time ago and have started to feed it to Elasticsearch. So far I am downloading it manually via https://covid-19-schweiz.bagapps.ch/de-1.html. Have you found a way already to automate the download? If it is not possible by using the API, I could help by building a scraper based on synthetic-monitoring.
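As a side note, a rough sketch of the Elasticsearch feed described above could look like the following; the host, index name and input file name are assumptions, and the CSV is taken to be the manually downloaded detailed export:

```python
# Hedged sketch: bulk-index rows of the downloaded BAG CSV into Elasticsearch.
# Host, index name and file name are assumptions.
import csv
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")

def actions(path, index="bag-covid19-cases"):
    # Yield one bulk action per CSV row, keeping the original column names.
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            yield {"_index": index, "_source": row}

helpers.bulk(es, actions("bag_detailed.csv"))
```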
Hi Bernhard,
Any input, sanitization, and cross-referencing to validate the data is welcome.
I didn't work on automating it yet, as I had a so-so week, but I do hope to
work on it later.
I suggest starting to collect the data in your own repo on GitHub, if you
have the means to do it.
Cheers,
Witold
…On Tue, 14 Apr 2020, 22:58 Bernhard Fluehmann, ***@***.***> wrote:
Hi @baryluk <https://github.com/baryluk>, thank you for working on this.
I have found datasource as well some time ago and have started to feed it
to Elasticsearch. So far I am downloading it manually via
https://covid-19-schweiz.bagapps.ch/de-1.html. Have you found a way
already to automate the download? If it is not possible by using the API, I
could help by building a scraper based on synthetic-monitoring.
In terms of the dataset, I am using the version with all columns and all
lines. As you mentioned already, the number of lines corresponds to the
number of confirmed cases, which is as detailed as it can get. There even
exists a column called f1, which seems to contain the case number and could
simplify updating the data. The problem is that these numbers seem to
somehow change over time. On each of the updates I made, more than 1000
numbers no longer existed in the new data, yet the number of lines of the
new dataset was still correct.
In addition, a lot of data is redundant, i.e. it exists in German, in French,
or as abbreviations. Right now I plan to implement some homologation on the
import, although creating a sanitized English version would be the better
solution. It would also be good if we could store the data publicly, ideally
in this repo, since it is used by a lot of people and solutions. If
collaboration is welcome, please let me know.
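A rough sketch of the homologation and f1-based tracking described in the quoted comment, assuming a pandas workflow; the column names, label mapping and file names here are assumptions, not the real BAG schema:

```python
# Hedged sketch: map redundant German/French labels onto one English value and
# track rows by the "f1" case number mentioned above. Column names, labels and
# file names are assumptions.
import pandas as pd

LABELS = {"männlich": "male", "weiblich": "female",     # hypothetical examples
          "homme": "male", "femme": "female"}

df = pd.read_csv("bag_detailed.csv")
if "sex" in df.columns:
    df["sex"] = df["sex"].replace(LABELS)

# Compare two snapshots by case number to see which ids disappeared.
old = pd.read_csv("bag_detailed_previous.csv")
missing = set(old["f1"]) - set(df["f1"])
print(f"{len(missing)} case numbers from the previous snapshot are gone")
```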
Hi Witold, Regards |
Hi @baryluk , Cheers |
I just compared the number of deceased reported by the cantons and by the BAG:
It looks to me that in many cantons the difference is small and most likely depends on reporting time. But in NE, VS, TI and VD I guess the difference has some other reason. Maybe NE is an indication: the number of persons under "Décès hospitalier" so far, based on the data from the canton, is 22 and the BAG reports 25, whereas the total number in the canton is 65.
It is possible that the difference between the cantons' numbers and the BAG's is due to testing criteria. Some cantons also declare a COVID-19-positive case based on a CT scan.
@BFLB A Selenium scraper sounds like an interesting idea. Have you had a continuous stream of archived data for the last 2 weeks with it?
@baryluk What I will do next is provide a lean version of the converted CSV file with only non-zero data sets. This should drastically reduce the size, and for most use cases it should work. I adapted the new model last week and refactored the code today. If everything works smoothly, I will start running the scraper as a scheduled task by the end of the week to fully automate the process. Finally, I would like to scrape the number of tests done as soon as I have time, since these numbers have also been published for a week or two.
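A minimal sketch of the "lean" export idea, keeping only rows with at least one non-zero value; the file name and the assumption that all numeric columns are value columns are mine:

```python
# Hedged sketch: drop rows whose numeric columns are all zero to shrink the CSV.
import pandas as pd

df = pd.read_csv("bag_converted.csv")              # assumed file name
value_cols = df.select_dtypes("number").columns    # treat numeric columns as values
lean = df[(df[value_cols] != 0).any(axis=1)]       # keep rows with any non-zero value
lean.to_csv("bag_converted_lean.csv", index=False)
```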
It seems that no one is working on this issue currently (i.e. cross-referencing the data from here and from the BAG). I'm closing this for now, but feel free to re-open it if needed.
I tried to get confirmation from SH and the BAG for the number of deceased, but no one wanted to confirm this issue. It seems that the BAG is now also working on a fully digital version that should be ready for the second wave. So maybe soon we will see more details.
I just found that the BAG is now providing a very detailed data dump, with breakdowns by canton, age group and sex, fully historicized:
https://public.tableau.com/vizql/w/Covid19_15852360559170/v/Dashboard2d/vud/sessions/CA00DDC7BA8C45ED9B68A32E75D9D45E-0:0/views/8353695473959107859_14843029294595766421?csv=true (1.5MB in size)
https://public.tableau.com/vizql/w/Covid19_15852360559170/v/Dashboard2d/vud/sessions/CA00DDC7BA8C45ED9B68A32E75D9D45E-0:0/views/8353695473959107859_14843029294595766421?csv=true&summary=true (37kB in size)
Also, as far as I can see, there is no data for the Principality of Liechtenstein there.
I got these links from https://covid-19-schweiz.bagapps.ch/de-2.html and https://covid-19-schweiz.bagapps.ch/de-1.html, but the interface requires me to select the end date, so they will most likely break and/or not have all the data by tomorrow. So for completeness I am attaching an archive with the files:
BAG_tableau_csv_2020-04-09.tar.gz
I guess it might be useful to develop a tool to cross-reference this data and compare it with what we store in the repo? Or maybe even publish it in a separate directory in this repo too?
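A hedged sketch of what such a cross-referencing tool could look like: download the detailed CSV and compare per-canton case counts against the repo's totals file. The BAG URL is session-bound (see above) and will have expired, and the repo file name and all column names are assumptions:

```python
# Hedged sketch: compare per-canton counts in the BAG CSV with the repo data.
# The URL is the session-bound link from above and will have expired;
# the repo file name and the column names are assumptions.
import io
import requests
import pandas as pd

BAG_CSV_URL = ("https://public.tableau.com/vizql/w/Covid19_15852360559170/v/Dashboard2d/vud/sessions/"
               "CA00DDC7BA8C45ED9B68A32E75D9D45E-0:0/views/8353695473959107859_14843029294595766421?csv=true")
# The Tableau export may need encoding/delimiter tweaks (e.g. encoding="utf-16", sep=";").
bag = pd.read_csv(io.StringIO(requests.get(BAG_CSV_URL).text))

bag_totals = bag.groupby("Kanton").size()                        # assumed canton column

repo = pd.read_csv("COVID19_Fallzahlen_CH_total.csv")            # assumed repo file
repo_totals = repo.groupby("abbreviation_canton_and_fl")["ncumul_conf"].max()

# NOTE: the two sources may label cantons differently (full names vs. "ZH"-style
# abbreviations), so a mapping step will likely be needed before comparing.
comparison = pd.DataFrame({"bag": bag_totals, "repo": repo_totals}).sort_index()
print(comparison)
```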