This strategic document describes how the Open Tree of Life (OTOL) currently uses Git, the current pain points, and a transition plan toward a future in which OTOL leverages as many of Git's features as possible for code, data, server configuration files, and other uses.
GitHub, Bitbucket, NCBI, GBIF, S3, EC2, Linode, etc.
Input taxonomies come in from NCBI, GBIF, and the "patch system". These are synthesized by X, then processed by Y, and are then accessible via the Argus browser and the Treemachine API.
The NCBI FTP server seems to be updated daily with the latest dumps.
GBIF has a new-ish API and a list of all their species datasets here.
NCBI currently has [387 thousand](http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/index.cgi?chapter=STATISTICS&uncultured=hide&unspecified=hide) nodes in their taxonomy tree.
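As a rough way to check a node count like the one above against a downloaded dump, the sketch below counts rows in the `nodes.dmp` file that ships inside the NCBI taxonomy dump archive. It assumes the usual dump layout, where fields are separated by `\t|\t` and each row ends with `\t|`. The function names and the truncated sample rows are hypothetical and only for illustration; this is not part of the OTOL tooling.

```python
import io


def parse_nodes(lines):
    """Yield (tax_id, parent_tax_id, rank) from lines in NCBI nodes.dmp format.

    Assumes fields are separated by "\t|\t" and each row ends with "\t|".
    Only the first three fields are used; real rows carry more.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.endswith("\t|"):
            line = line[:-2]  # drop the row terminator
        fields = line.split("\t|\t")
        yield int(fields[0]), int(fields[1]), fields[2]


def count_nodes(lines):
    """Count taxonomy nodes, one per row."""
    return sum(1 for _ in parse_nodes(lines))


# Hypothetical, truncated sample rows (real rows have additional fields).
SAMPLE = (
    "1\t|\t1\t|\tno rank\t|\n"
    "2\t|\t131567\t|\tsuperkingdom\t|\n"
    "131567\t|\t1\t|\tno rank\t|\n"
)

if __name__ == "__main__":
    print(count_nodes(io.StringIO(SAMPLE)))
```

Against a real dump, the same function could be pointed at an open file handle, for example `count_nodes(open("nodes.dmp"))`, where the path is an assumption about where the archive was unpacked.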
Jonathan "Duke" Leto, Leto Labs LLC
GPLv3. See LICENSE file for details. If you would like to use this content under a different license, please feel free to contact [email protected]. I don't bite.