Welcome to the Encyclopedia of Life project. This project contains the PHP code used to harvest and manage the data made available through the www.eol.org website. The is the right place to be if you want to understand or expand this data. If you are interested in the frontend website code you should focus on github.com/EOL/eol. All code for both projects is made available to anyone for re-use, repurposing or for improvement. This is both an ambitious project and an ambitious codebase and we are excited to share it with the opensource community. The code has been under development since approximately September 2007, but has undergone many revisions and updates. There is much work to be done, both in adding new features, and in the ongoing process of code refactoring and performance improvements. If you see something you like, share it with your colleagues and friends and reuse it in your own projects. If you see something you don’t like, help us fix it or join the discussion on GitHub.
The full code base is released under the MIT License. Details are available in the “MIT-LICENSE.txt” file at the root of the code folder.
To get things up and running, these are the steps you need to take. If you actually run through this process, please update this list with any changes you find necessary!
Note that many of these steps require root access on your machine. You have been warned and may need to run them as “sudo” on a Mac/Linux or as an administrator on Windows.
Things you need to do:
-
Install the EOL Ruby on Rails Code Base
-
Install Correct Version of PHP
-
Install ImageMagick
-
Get EOL PHP Code Base
-
Get Backend Specific Ruby gems
-
Create database.yml
-
Tweak MySQL Options
-
Configure Apache
-
Add Webserver
-
Check It’s Alive
-
Tweak Your Local Environment
-
Get EOL Private Config
-
Run Tests
The backend currently depends on much of the infrastructure needed to run our frontend, so you should start with the installation process described here:
brew unlink virtuoso # Avoids conflict over isql in /usr/local/bin brew install homebrew/php/php53
You should now tweak you shell environment so it runs the php in /usr/local/bin. For bash you can do this by adding this line to your .bash_profile:
export PATH="/usr/local/bin:$PATH"
brew install imagemagick --build-from-source
cd $ROOT # From EOL/eol install git clone https://github.com/EOL/eol_php_code.git
cd $ROOT/eol_php_code bundle install
To create the database.yml file in the config copy the handy template to start with:
cp $ROOT/eol_php_code/config/database.sample.yml $ROOT/eol_php_code/config/database.yml
Edit the database.yml and use the root user and password from your MySQL install for all blocks.
vim /usr/local/Cellar/mysql/*/my.cnf
Remove the “STRICT_TRANS_TABLES” from “sql_mode”.
Restart your MySQL server. We recommend doing a reboot just to be sure.
cd /etc/apache2/ sudo cp httpd.conf httpd.conf.bak
Edit httpd.conf by uncommenting (or adding) the php load module line and setting it to the appropriate value. If you used brew to install PHP 5.3 the line should be:
LoadModule php5_module /usr/local/opt/php53/libexec/apache2/libphp5.so
The simplest approach is to provide a symbolic link in your Apache DocumentRoot, e.g.:
sudo ln -s $ROOT/eol_php_code /Library/WebServer/Documents/eol_php_code
This will allow you to test the local install using:
Other approaches include confirguing Apache VirtualHosts in the http.conf to put the EOL PHP code on another port.
Whenever you change the http.conf remember to restart Apache with:
sudo apachectl stop sudo apachectl start
You should now be able to go to:
localhost/eol_php_code/README.rdoc
and see this file.
cd $ROOT/eol_php_code chmod -R a+w log temp tmp applications/content_server/
If you are using the EOL/eol code base as well, you should run:
cp $ROOT/eol/config/environments/local.rb.sample $ROOT/eol/config/environments/local.rb
Currently this requires you have access to the private mbl-cli GitHub community. You must then create an SSH key and register it on GitHub as described here: help.github.com/articles/generating-ssh-keys
We plan to at least make this step unnecessary to get the tests passing.
cd $ROOT/eol_php_code/config/environments git clone [email protected]:mbl-cli/eol-private.git cp eol-private/php_config/environments/test.php test.php cp test.php development.php rm -rf eol-private
cd $ROOT/eol_php_code php tests/run_tests.php
PHP version 5.3.X (there are known issues with 5.4.0 and above) ImageMagick 'biodiversity' Ruby gem, as defined in the Gemfile An installation of the EOL Ruby on Rails codebase (https://github.com/EOL/eol), with all specs passing. This will ensure you have such things as: A working installation of MySQL A working installation of Apache Solr A working instance of Virtuoso
There are a few things you must do before using this code:
Ensure you have a working web server with PHP support. Ensure there is an available php.ini configuration file with desired settings. Update in /config/environment.php the constants for: WEB_ROOT - eg: 'http://localhost/eol_php_code/' In /config/environment.php check the values of: CONTENT_PARTNER_LOCAL_PATH CONTENT_LOCAL_PATH CONTENT_RESOURCE_LOCAL_PATH These are the locations where media will be downloaded to for viewing on the website Give write permission for the following directories to your web server user: /log /temp /applications/content_server/content (or changed value of CONTENT_LOCAL_PATH) /applications/content_server/content_partners (or changed value of CONTENT_PARTNER_LOCAL_PATH) /applications/content_server/resources (or changed value of CONTENT_RESOURCE_LOCAL_PATH) /applications/content_server/tmp /vendor/eol_content_schema_v2/extension_cache Install the biodiversity Ruby gem: you must fist have Ruby and Rubygems installed, then see https://github.com/GlobalNamesArchitecture/biodiversity for installation Create other files in /config/environments/ENV_NAME.php: these environment files will be loaded when boot.php is included, which is towards the TOP of environment.php Run the tests and make sure they all pass see the Test section for more information
You need to include /config/environment.php for any application that you want to be connected to the databases configured in database.yml and in the current environment
The default environment is ‘development’ unless you change the default in environment.php
The default environment can be overridden by:
including this line BEFORE including environment.php: $GLOBALS['ENV_NAME'] = $ENVIRONMENT; calling a script and including the GET parameter: http://localhost/eol_php_code/.../script.php?ENV_NAME=$ENVIRONMENT calling a command line script and including the argument: > php script.php ENV_NAME=$ENVIRONMENT
Tests are best initiated from the command line by running:
> php tests/run_tests.php
Or running a group with:
> php tests/run_tests.php web > php tests/run_tests.php unit
Or running an individual test with:
> php tests/run_tests.php unit/test_name.php
Fixture *.yml files can be added to /tests/fixtures. Any fields that don’t match the fields in your test database will be ignored. Test will only use fixtures if they have a public class attribute defined:
public $load_fixtures = true;
Fixture data is turned into mock objects which can be accessed within tests as such:
$this->fixtures->fixture_name->row_identifier->field e.g. $this->fixtures->agents->me->id
This PHP codebase was designed to compliment the EOL Ruby on Rails codebase (github.com/EncyclopediaOfLife/eol). The PHP code is used almost entirely for harvesting content, inserting harvested content and associated metadata into the database, and working out differences among taxonomies so we can present all content for a single species on a single EOL page. The Rails code is used almost entirely for presenting the content to the world, and providing interfaces for curators to cast judgement on the validity of the content EOL is presenting.
In order to have the website and harvesting code working together there are a few configuration options that need to be set:
The config/database.yml in this codebase must be configured to connect to the same MASTER database that the Rails codebase is. This codebase will connect to the eol_data_$ENVIRONMENT database which is one of three databases that the Rails codebase connects to. In the Rails codebase there are several /config/environment/$ENVIRONMENT.rb config files. For any enviornment that you want to be connected with the PHP code base you must change a few variables: # the domain of the server running the PHP code $CONTENT_SERVERS = ['http://localhost'] # corresponds to CONTENT_LOCAL_PATH - the path on the PHP server where media will be downloaded $CONTENT_SERVER_CONTENT_PATH = "/eol_php_code/applications/content_server/content/" # corresponds to CONTENT_RESOURCE_LOCAL_PATH - the path on the PHP server where resources will be downloaded $CONTENT_SERVER_RESOURCES_PATH = "/eol_php_code/applications/content_server/resources/" # the full URL to the PHP server to /applications/content_server/service.php which is used by the website to send uploads of content partner logos and resource XML files $WEB_SERVICE_BASE_URL="http://localhost/eol_php_code/applications/content_server/service.php?"
Once connected with the Rails codebase, there are a few scheduled tasks we run to ensure that harvesting is happening every day and the website has the data it need to present species pages efficiently:
# every hour on the hour reset permissions on important files 0 * * * * /data/www/eol_php_code/rake_tasks/permissions # 1pm download resource files that have connectors 00 13 * * * /usr/local/bin/php /data/www/eol_php_code/update_resources/update_connector_resources.php > /dev/null # 10.45pm download resource files that dont have connectors 45 22 * * * /usr/local/bin/php /data/www/eol_php_code/update_resources/update_downloadable_resources.php > /dev/null # 11.20pm do harvesting 20 23 * * * /usr/local/bin/php /data/www/eol_php_code/rake_tasks/harvest_resources_cron_task.php > /dev/null