v3_lustre_tuto
This section briefly lists the steps for installing and configuring robinhood on a Lustre filesystem. It does not cover policy configuration, which is described in other tutorials.
For more details about installation, software and hardware requirements, tunings, etc. refer to: Admin guide: Installation.
The Lustre changelog feature makes it possible for robinhood to update its database incrementally without rescanning the filesystem. To activate this feature:
- Enable Lustre MDT changelogs (remember the returned client id):
lctl --device <fsname>-MDT0000 changelog_register
- Make sure the changelog mask contains the following records: HSM CREAT UNLNK TRUNC SATTR CTIME MTIME CLOSE RENME RNMTO RMDIR HLINK LYOUT
lctl get_param mdd.<fsname>-MDT*.changelog_mask
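The mask check above can be scripted. The following is a minimal sketch, assuming the mask is reported as a whitespace-separated list of record names; the function name `missing_changelog_records` and the sample mask are hypothetical and only serve the illustration.

```shell
#!/bin/sh
# Records the tutorial says the changelog mask must contain.
REQUIRED="HSM CREAT UNLNK TRUNC SATTR CTIME MTIME CLOSE RENME RNMTO RMDIR HLINK LYOUT"

# missing_changelog_records MASK: print the required records absent from MASK.
missing_changelog_records() {
    # Normalize any whitespace to single spaces and pad both ends,
    # so that each record can be matched as a whole word.
    mask=" $(printf '%s' "$1" | tr -s '[:space:]' ' ') "
    missing=""
    for rec in $REQUIRED; do
        case "$mask" in
            *" $rec "*) ;;                    # record is present
            *) missing="$missing $rec" ;;     # record is absent
        esac
    done
    echo "${missing# }"
}

# Hypothetical sample mask. On a real system the value would come from
# something like: lctl get_param -n mdd.<fsname>-MDT0000.changelog_mask
sample_mask="MARK CREAT MKDIR HLINK SLINK MKNOD UNLNK RMDIR RENME RNMTO CLOSE TRUNC SATTR XATTR MTIME CTIME"
missing_changelog_records "$sample_mask"
```

An empty result means that all the records listed above are present in the mask.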
- Robinhood must run on a Lustre client. For maximum compatibility, it is recommended to run the same major version of Lustre as the servers.
- Robinhood uses a MySQL or MariaDB database storage backend. It is recommended to install the DB server on the same host as robinhood to ensure a minimum latency for DB operations.
Install and start MySQL on RHEL 6:
yum install mysql-server
service mysqld start
Install and start MariaDB on RHEL 7:
yum install mariadb-server
systemctl start mariadb.service
/!\ The default database configuration is not suitable for production and will result in very poor performance. See Admin guide: database tunings for the recommended database configuration.
- Download 'robinhood-lustre' and 'robinhood-adm' packages from http://sourceforge.net/projects/robinhood/files/robinhood and install them on the robinhood host.
- Make sure to get the 'robinhood-lustre' package built for the version of Lustre you run, for example robinhood-lustre-3.0-1.lustre2.5.el6.x86_64.rpm for Lustre 2.5.
- Create the robinhood database using the rbh-config helper (provided by the 'robinhood-adm' package).
rbh-config create_db <db_name> 'localhost' 'rbh_password'
- A common name for the robinhood database is 'rbh_<fsname>'.
- Write the selected password to a file readable only by 'root' (mode 600), for example /etc/robinhood.d/.dbpassword.
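The password file step can be sketched as follows. This is an illustration only: the helper name `write_db_password` is hypothetical, and the demonstration writes to a scratch directory rather than to /etc/robinhood.d.

```shell
#!/bin/sh
# write_db_password FILE PASSWORD: store PASSWORD in FILE, readable only by its owner.
write_db_password() {
    # Use a restrictive umask so the file is never created with wider permissions.
    ( umask 077 && printf '%s\n' "$2" > "$1" ) || return 1
    # Force mode 600 in case the file already existed with other permissions.
    chmod 600 "$1"
}

# Demonstration in a scratch directory. On the robinhood host, the target
# would be /etc/robinhood.d/.dbpassword and the script would run as root.
demo_dir=$(mktemp -d)
write_db_password "$demo_dir/.dbpassword" 'rbh_password'
ls -l "$demo_dir/.dbpassword"
```

Setting the umask before creating the file avoids a short window during which the password would be readable by other users.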
- Create a robinhood configuration file, starting with a simple robinhood template:
cp /etc/robinhood.d/templates/basic.conf /etc/robinhood.d/<fsname>.conf
- Edit the configuration file:
- In 'General' block, set Lustre filesystem root path, and 'lustre' filesystem type:
fs_path = "/fs/root";
fs_type = lustre;
- In 'ListManager' block, set database connection parameters:
# database name passed to 'rbh-config create_db'
db = <db_name>;
password_file = "/etc/robinhood.d/.dbpassword" ;
- In 'ChangeLog' block, check that the specified 'reader_id' matches the id returned by 'lctl changelog_register':
reader_id = "cl1";
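The 'reader_id' check can be automated with a small script. The sketch below is only an illustration: `conf_value` is a hypothetical helper, the database name `rbh_lustre` is made up, and the sample file is a simplified stand-in built from the parameters shown above, not the actual basic.conf template.

```shell
#!/bin/sh
# conf_value KEY FILE: print the value of the first "KEY = value;" line in FILE,
# with surrounding double quotes removed.
conf_value() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*\"\{0,1\}\([^\";]*\)\"\{0,1\}[[:space:]]*;.*/\1/p" "$2" | head -n 1
}

# Simplified sample configuration (hypothetical, for demonstration only).
sample_conf=$(mktemp)
cat > "$sample_conf" <<'EOF'
General {
    fs_path = "/fs/root";
    fs_type = lustre;
}
ListManager {
    db = rbh_lustre;
    password_file = "/etc/robinhood.d/.dbpassword" ;
}
ChangeLog {
    reader_id = "cl1";
}
EOF

# Hypothetical value: the id returned by 'lctl changelog_register'.
registered_id="cl1"

if [ "$(conf_value reader_id "$sample_conf")" = "$registered_id" ]; then
    echo "reader_id matches"
else
    echo "reader_id mismatch"
fi
```

On a real host, the second argument of `conf_value` would be /etc/robinhood.d/<fsname>.conf.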
It is recommended to define your fileclasses before running the initial filesystem scan:
- This way, you will get relevant information in the 'rbh-report --class-info' report after the initial scan is completed.
- This will make some optimizations possible for running policies (e.g. skip processing of 'ignored' classes).
fileclass empty_file {
    definition { type == file and size == 0 }
}
fileclass small_file {
    definition { type == file and size > 0 and size <= 32MB }
}
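To make the size boundaries of these two definitions concrete, here is a small sketch that mimics them in shell for a regular file of a given size. It is not how robinhood evaluates fileclasses. The function name `classify_size` is hypothetical, and it assumes that "32MB" means 32 * 1024 * 1024 bytes.

```shell
#!/bin/sh
# Assumption: "32MB" in the fileclass definition means 32 * 1024 * 1024 bytes.
SMALL_MAX=$((32 * 1024 * 1024))

# classify_size BYTES: print the fileclass that a regular file of BYTES bytes
# would match under the two definitions above, or "none" if it matches neither.
classify_size() {
    if [ "$1" -eq 0 ]; then
        echo empty_file
    elif [ "$1" -le "$SMALL_MAX" ]; then
        echo small_file
    else
        echo none
    fi
}

classify_size 0                    # empty file
classify_size 4096                 # 4 KB file
classify_size $((SMALL_MAX + 1))   # just above the small_file limit
```

The two classes do not overlap: a file of exactly 32MB is still a small_file, and anything larger matches neither class.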
To populate robinhood DB, follow these steps:
- 1) Enable changelogs (this should have been done in installation steps above).
- 2) Run the initial scan
- 3) Run robinhood daemon to continuously read changelogs
- If you want to run the initial scan in a terminal and see the log messages in this terminal, run:
robinhood --scan --once -L stderr
- If you prefer to run it in the background (and write messages to the robinhood log):
robinhood --scan --once -d
- You can run a changelog reader test by reading pending changelog records, then exit:
robinhood --readlog --once -L stderr
- To start a robinhood daemon that reads changelogs continuously:
- Edit /etc/sysconfig/robinhood so that the robinhood daemon only reads changelogs and does not run policies yet:
RBH_OPT="--readlog"
- Start robinhood service:
# on RHEL 6:
service robinhood start
# on RHEL 7:
systemctl start robinhood.service
On RHEL 7, if you want to manage several filesystems on the same robinhood host, use the 'robinhood@' service instead.
- Per-filesystem service is managed by systemctl [start|stop|status|restart|...] robinhood@<fsname>
- Per-filesystem service configuration is /etc/sysconfig/robinhood.<fsname>
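For several filesystems, the per-filesystem setup can be scripted as in the sketch below. The filesystem names are hypothetical, and the sketch assumes that the per-filesystem files accept the same RBH_OPT variable as /etc/sysconfig/robinhood. It writes to a scratch directory and only prints the systemctl commands, so it is safe to try.

```shell
#!/bin/sh
# Hypothetical filesystem names; replace with your own.
FS_LIST="lustre1 lustre2"

# Scratch directory for the demonstration; on a real host this is /etc/sysconfig.
SYSCONFIG_DIR=$(mktemp -d)

for fs in $FS_LIST; do
    # Assumption: per-filesystem files use the same RBH_OPT variable
    # as /etc/sysconfig/robinhood.
    printf 'RBH_OPT="--readlog"\n' > "$SYSCONFIG_DIR/robinhood.$fs"
    # Print the command instead of running it.
    echo "systemctl start robinhood@$fs"
done
```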
You can monitor scan progress, or changelog reader activity by looking at robinhood statistics (dumped every 15min by default):
grep STATS /var/log/robinhood.log
Once you have run the initial scan and started a changelog reader, the robinhood database reflects the filesystem state and is updated in near real time. Robinhood comes with several reporting and querying commands:
- rbh-report provides overall reports about filesystem contents (users and groups usage, file size profile, fileclasses...)
- rbh-find implements classic 'find' command, except that it queries robinhood database instead of the filesystem, which makes it faster. Moreover, it provides specific options to query entries per policy status and other Lustre-specific attributes.
- rbh-du is an enhanced version of the classic 'du' command. It queries the robinhood database instead of the filesystem, which makes it faster. It can also report details about entry types, count, etc.
rbh-report examples:
# filesystem entries:
# rbh-report --fs-info
type   ,    count,   volume, avg_size
dir    ,  1780074,  8.02 GB,  4.72 KB
file   , 21366275, 91.15 TB,  4.47 MB
symlink,   496142, 24.92 MB,       53

# user info, split by group
# rbh-report -u bar -S
user, group, type,  count,  spc_used,  avg_size
bar , proj1, file,      4,  40.00 MB,  10.00 MB
bar , proj2, file,   3296, 947.80 MB, 273.30 KB
bar , proj3, file, 259781, 781.21 GB,   3.08 MB

# file size profile for a given user
# rbh-report -u foo --szprof
user, type, count,    volume, avg_size, 0, 1~31, 32~1K-, 1K~31K, 32K~1M-, 1M~31M, 32M~1G-, 1G~31G, 32G~1T-, +1T
foo , dir ,    48,   1.48 MB, 31.67 KB, 0,    0,      0,     26,      22,      0,       0,      0,       0,   0
foo , file, 11055, 308.16 GB, 28.54 MB, 2,    0,     14,     23,    5276,   5712,       9,     17,       2,   0

# top disk space consumers
# rbh-report --top-users
rank, user   , spc_used,  count,  avg_size
   1, usr0021, 11.14 TB, 116396, 100.34 MB
   2, usr3562,  5.54 TB,    575,   9.86 GB
   3, usr2189,  5.52 TB,   9888, 585.50 MB
   4, usr2672,  3.21 TB, 238016,  14.49 MB
   5, usr7267,  2.09 TB,   8230, 266.17 MB
...
rbh-report also provides:
- --top-size: report the largest files in the filesystem.
- --entry-info: report all information about a given entry.
- Run rbh-report --help to get the full list of available reports.
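Because the reports are comma-separated, they are easy to post-process with standard tools. The sketch below sums the 'count' column of the '--fs-info' output shown earlier to get the total number of entries. The helper name `total_entries` is hypothetical, and the script assumes the column layout of that example.

```shell
#!/bin/sh
# total_entries: read '--fs-info' style output on stdin and print the sum
# of the second column (count), skipping the header line.
total_entries() {
    awk -F',' 'NR > 1 && NF >= 2 {
        gsub(/[[:space:]]/, "", $2)   # strip the padding around the number
        total += $2
    }
    END { print total + 0 }'
}

# Sample taken from the --fs-info example in this tutorial. On a real system:
#   rbh-report --fs-info | total_entries
fs_info_sample='type, count, volume, avg_size
dir, 1780074, 8.02 GB, 4.72 KB
file, 21366275, 91.15 TB, 4.47 MB
symlink, 496142, 24.92 MB, 53'

printf '%s\n' "$fs_info_sample" | total_entries
```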
rbh-find example:
# rbh-find /mnt/lustre/dir -u root -size +32M -mtime +1h -ost 2 -ls
rbh-du examples:
# rbh-du -H -u foo /mnt/lustre/dir.3
45.0G /mnt/lustre/dir.3