Architecture and workflow
We need:
- a script to download fresh data dumps from Wikipedia
- server-side database maintenance and pre-processing
- visualization built in the browser
###A script to download fresh data dumps from Wikipedia
The hourly pageview dumps weigh in at about 100 MB each, so the script downloads roughly 100 MB an hour, around the clock. This means watching your ISP contract and making sure they do not throttle or cap your bandwidth.
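A minimal sketch of such a script, assuming the hourly `pagecounts-raw` files published at dumps.wikimedia.org (the URL scheme below is the one used for that dataset; adjust if we end up pulling from a different dump):

```python
import datetime
import urllib.request

BASE = "https://dumps.wikimedia.org/other/pagecounts-raw"

def download_hourly_dump(ts, dest_dir="."):
    """Download the hourly pagecounts file for the given UTC timestamp."""
    url = "{base}/{y}/{y}-{m:02d}/pagecounts-{y}{m:02d}{d:02d}-{h:02d}0000.gz".format(
        base=BASE, y=ts.year, m=ts.month, d=ts.day, h=ts.hour)
    dest = "{}/pagecounts-{:%Y%m%d-%H}0000.gz".format(dest_dir, ts)
    urllib.request.urlretrieve(url, dest)
    return dest

if __name__ == "__main__":
    # Fetch the dump for the previous full hour (dumps lag real time a little).
    now = datetime.datetime.utcnow().replace(minute=0, second=0, microsecond=0)
    print(download_hourly_dump(now - datetime.timedelta(hours=1)))
```

Run from cron (or a similar scheduler) once an hour; each call fetches the single dump for the hour that just closed.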
###Server-side database maintenance and pre-processing
To a first approximation, the database has 30,765 records, one for each article in WikiProject Medicine. Each hour, we extract from the dump the number of pageviews each article received in the preceding hour, and append those counts to the database as a new column. The dump is then deleted; there is no need to keep it once the counts are in the database.
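A rough sketch of the hourly pre-processing step. The line format `project page_title count bytes` is the one used by the raw pagecounts dumps; the `append_column` callback and `watched_titles` are placeholders for the project's own storage code, which the wiki does not yet specify:

```python
import gzip
import os

def extract_counts(dump_path, titles):
    """Return {title: hourly pageviews} for the watched articles.

    Each line of a pagecounts dump reads: project page_title count bytes
    e.g. "en Asthma 523 9041762". Articles absent from the dump got 0 views.
    """
    counts = dict.fromkeys(titles, 0)
    with gzip.open(dump_path, "rt", encoding="utf-8", errors="replace") as f:
        for line in f:
            fields = line.split(" ")
            if len(fields) == 4 and fields[0] == "en" and fields[1] in counts:
                counts[fields[1]] = int(fields[2])
    return counts

def process_hourly_dump(dump_path, titles, append_column):
    """Extract counts, hand them to the storage callback, delete the dump."""
    counts = extract_counts(dump_path, titles)
    append_column(counts)  # project-specific: add the counts as a new column
    os.remove(dump_path)   # the raw dump is no longer needed
```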
###Visualization
TODO