pipeline_tuning
When processing incoming information (from a scan or from changelogs), robinhood periodically dumps internal statistics to its log file (the dump frequency is driven by the config parameter 'logs::stats_interval'). These log lines start with a 'STATS' tag, which makes them easy to 'grep':

```
grep STATS /var/log/robinhood.log
```
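The dump frequency mentioned above can be adjusted in the log configuration block. A sketch, assuming the block and parameter naming implied by the 'logs::stats_interval' notation (the 5min value is illustrative; check the configuration template shipped with your robinhood version for exact syntax):

```
Log {
    # How often internal statistics are dumped to the log file
    # (illustrative value)
    stats_interval = 5min;
}
```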
The most useful part for analyzing ingest performance is the 'EntryProcessor Pipeline Stats' section.
Example of pipeline statistics:
```
==== EntryProcessor Pipeline Stats ===
Idle threads: 0
Id constraints count:   9561 (hash min=0/max=3/avg=0.6)
Name constraints count: 9561 (hash min=0/max=4/avg=0.6)
Stage              | Wait  | Curr | Done | Total | ms/op |
 0: GET_FID        |     0 |    0 |    0 |     0 |  0.00 |
 1: GET_INFO_DB    |  9497 |    5 |    0 | 41881 |  0.17 |
 2: GET_INFO_FS    |    11 |    5 |    0 | 41870 |  0.00 |
 3: REPORTING      |     1 |    0 |    0 | 41869 |  0.00 |
 4: PRE_APPLY      |     1 |    0 |    0 | 41868 |  0.00 |
 5: DB_APPLY       |    51 |    6 |    0 | 41817 |  0.14 | 99.60% batched (avg batch size: 36.3)
 6: RM_OLD_ENTRIES |     0 |    0 |    0 |     0 |  0.00 |
```
Pipeline stats show the following indicators for each pipeline stage:
- Wait: the number of entries waiting to be processed at this stage.
- Curr: the number of entries currently being processed at this stage.
- Done: the number of entries that have been processed at this stage but cannot move on, because an entry with the same 'id' or 'path' is currently being processed at a later stage.
- Total: the total number of entries that were processed at this step since the program started.
- ms/op: the average duration of operations at this stage.
Pipeline stages fall into two categories:
- Filesystem operations: GET_FID, GET_INFO_FS, CHGLOG_CLR.
- Database operations: GET_INFO_DB, DB_APPLY, RM_OLD_ENTRIES.
Some tuning can be done in the EntryProcessor config block.
The main tuning knob is the nb_threads parameter: it specifies how many worker threads process pipeline operations.
If you notice that some pipeline stages are starving, you can also fine-tune the number of threads working at each pipeline stage by setting 'stage_stagename_threads_max' parameters (see examples in the case studies below).
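A sketch of such an EntryProcessor block (parameter names follow the 'stage_stagename_threads_max' pattern described above; the casing and values here are illustrative, so check the configuration template shipped with your robinhood version):

```
EntryProcessor {
    # Total number of worker threads for the pipeline
    nb_threads = 16;

    # Optional per-stage caps, to keep a single stage from
    # monopolizing all worker threads (illustrative values)
    STAGE_GET_INFO_FS_threads_max = 8;
    STAGE_GET_INFO_DB_threads_max = 8;
}
```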
Robinhood is running a filesystem scan. 32 threads have been configured to process incoming information (in the EntryProcessor config block: nb_threads = 32).
Let's analyze the workflow stats and try to improve the scan speed:
```
==== EntryProcessor Pipeline Stats ===
Idle threads: 0
Id constraints count:   10000 (hash min=0/max=5/avg=0.6)
Name constraints count: 10000 (hash min=0/max=5/avg=0.6)
Stage              | Wait  | Curr | Done | Total | ms/op |
 0: GET_FID        |     0 |    0 |    0 |   441 | 13.12 |
 1: GET_INFO_DB    |  9958 |    2 |    0 | 55892 |  0.61 |
 2: GET_INFO_FS    |     0 |   28 |   10 | 55864 | 17.27 |
 3: REPORTING      |     0 |    0 |    0 | 55854 |  0.00 |
 4: PRE_APPLY      |     0 |    0 |    0 | 55854 |  0.00 |
 5: DB_APPLY       |     0 |    2 |    0 | 55852 |  1.43 |
 6: CHGLOG_CLR     |     0 |    0 |    0 |     0 |  0.00 |
 7: RM_OLD_ENTRIES |     0 |    0 |    0 |     0 |  0.00 |
```
Observations:
- The pipeline is full (constraints count = 10000), which may slow down the scan: scan threads cannot push entries into the pipeline when it is full. This indicates the current bottleneck is in the pipeline processing steps.
- The longest operations are filesystem operations: GET_FID (13.12 ms/op) and GET_INFO_FS (17.27 ms/op).
- According to the 'Curr' column, most threads (28/32) are currently processing entries at the 'GET_INFO_FS' stage, which may starve other pipeline stages.
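A reasonable response is to cap the number of threads allowed at that stage, using the 'stage_stagename_threads_max' mechanism described earlier. A sketch (the cap value is illustrative, not a recommendation from the robinhood documentation):

```
EntryProcessor {
    nb_threads = 32;
    # Cap GET_INFO_FS so it cannot monopolize all 32 workers
    # (illustrative value)
    STAGE_GET_INFO_FS_threads_max = 8;
}
```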
This is what we get after limiting the number of threads allowed at the 'GET_INFO_FS' stage:
```
==== EntryProcessor Pipeline Stats ===
Idle threads: 26
Id constraints count:   7 (hash min=0/max=1/avg=0.0)
Name constraints count: 8 (hash min=0/max=1/avg=0.0)
Stage              | Wait  | Curr | Done | Total  | ms/op |
 0: GET_FID        |     0 |    0 |    0 |   2011 |  4.25 |
 1: GET_INFO_DB    |     0 |    2 |    0 | 381537 |  0.47 |
 2: GET_INFO_FS    |     0 |    3 |    0 | 381534 |  6.90 |
 3: REPORTING      |     0 |    0 |    0 | 381534 |  0.00 |
 4: PRE_APPLY      |     0 |    0 |    0 | 381534 |  0.00 |
 5: DB_APPLY       |     0 |    2 |    0 | 381532 |  1.36 |
 6: CHGLOG_CLR     |     0 |    0 |    0 |      0 |  0.00 |
 7: RM_OLD_ENTRIES |     0 |    0 |    0 |      0 |  0.00 |
```
Observations:
- The pipeline workflow looks better: it is no longer full, so scan threads are no longer limited by robinhood's processing rate when pushing entries into the pipeline.
- The latency of filesystem operations also decreased (4.25 ms and 6.90 ms), likely because there is less concurrency on filesystem operations.
- Pipeline is almost empty. Entry processing is no longer a limiting point.
As previously, robinhood is scanning a filesystem. This time it is initially configured with 8 pipeline threads.
Let's see how it behaves:
```
==== EntryProcessor Pipeline Stats ===
Idle threads: 0
Id constraints count:   10000 (hash min=0/max=3/avg=0.6)
Name constraints count: 10000 (hash min=0/max=5/avg=0.6)
Stage              | Wait  | Curr | Done | Total  | ms/op |
 0: GET_FID        |     0 |    0 |    0 |      0 |  0.00 |
 1: GET_INFO_DB    |  1418 |    4 |    0 | 105996 |  0.10 |
 2: GET_INFO_FS    |     2 |    0 |    0 | 105994 |  0.00 |
 3: REPORTING      |     0 |    0 |    0 | 105994 |  0.00 |
 4: PRE_APPLY      |     0 |    0 |    0 | 105994 |  0.00 |
 5: DB_APPLY       |  5576 | 3000 |    0 |  97418 |  0.09 | 99.99% batched (avg batch size: 608.8)
 6: RM_OLD_ENTRIES |     0 |    0 |    0 |      0 |  0.00 |
```
Scan runs at 30682 entries/sec.
Observations:
- The pipeline is full (10000 entries, as shown by the constraints counts).
- This time, most of the load is on database operations (GET_INFO_DB: 0.10 ms/op and DB_APPLY: 0.09 ms/op, with thousands of entries waiting or batched at DB_APPLY).
Running 'sar -d' on the database device also indicates a significant load on it (the 'tps' column shows ~12k IOPS). However, the current load is only 25% of the actual disk capacity ('%util' column), so there may be room for improving disk throughput.
```
02:35:53 PM  DEV       tps       rd_sec/s  wr_sec/s   avgrq-sz  avgqu-sz  await  svctm  %util
02:35:53 PM  dev253-1  12405.17  0.00      525926.25  42.40     1.30      0.10   0.02   24.69
```
After a few database tunings in '/etc/my.cnf' (setting innodb_flush_log_at_trx_commit = 2 and increasing innodb_log_file_size), we reach the following disk usage:
```
02:43:38 PM  DEV       tps       rd_sec/s  wr_sec/s   avgrq-sz  avgqu-sz  await  svctm  %util
02:43:38 PM  dev253-1  12461.11  0.00      741561.11  59.51     1.27      0.10   0.02   24.78
```
The number of disk IOPS is almost the same (~12k), but the overall disk throughput is much better: 741561.11 sectors written/sec (wr_sec/s) vs. 525926.25 previously.
Scan now runs at 38636 entries/sec.
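The '/etc/my.cnf' changes above can be sketched as follows (the innodb_log_file_size value is illustrative; appropriate sizing depends on your MySQL version and available disk space):

```
# /etc/my.cnf (fragment)
[mysqld]
# Flush the InnoDB log to disk once per second instead of
# at every transaction commit (trades some durability for speed)
innodb_flush_log_at_trx_commit = 2
# A larger redo log reduces checkpoint-induced write bursts
# (illustrative value)
innodb_log_file_size = 512M
```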
After that, we can improve it a little by increasing the number of robinhood pipeline threads from 8 to 16:
```
02:53:42 PM  DEV       tps       rd_sec/s  wr_sec/s   avgrq-sz  avgqu-sz  await  svctm  %util
02:53:42 PM  dev253-1  12916.00  0.00      770389.60  59.65     2.07      0.16   0.02   25.51
```
Now the scan runs at 40989 entries/sec.