Very high Disk read and write #3429
Is this disk write from the DT application, or from the database? Are they running on different hosts, such that you can pinpoint which of them is the problem?
They are running on the same machine. Currently I am collecting some sample data regarding per-process I/O on the machine.
After 2 hours, this is what iotop gives me:

The systemd logs are caused by the health check from Docker, which refers to this issue: docker/for-linux#679
@nscuro, we are seeing consistent disk writes and reads for 83 days now: ~8.5 TB written, ~4.6 TB read.
We're also seeing very high disk usage from Dependency-Track. We have deployed one instance each of the front-end and API Docker containers into a Kubernetes cluster in Azure, using a file storage account for the /data drive, and Azure SQL for the database, so the database is not using the storage account in this case. We've not changed any of the default Task Scheduler settings. We're currently running version 4.10.1, but from the logs it looks like the high disk usage started when we updated from 4.4.0 to 4.8.2 last year, and has continued after we updated to 4.10.1.

I've recreated the same behaviour in another test deployment (same setup as above, but where I have direct access to the storage account). Looking at the storage stats for that, Dependency-Track has performed 6.9 million file storage transactions in the last 12 hours, with no BOM files being uploaded during that period and no other users.

From what I can see from the file storage, based on file properties, there is a set of files in the /data/index/vulnerability folder that all begin with the prefix "_3" which are being deleted and created constantly. It looks like they might be related to the Lucene search index.
Thanks for narrowing it down, @symology! This particular search index should be updated under the following conditions:
Looking at your logs, are you able to spot frequent
Hi @nscuro. I don't actually see the specific text. There is an entry for

There has been some NVD mirroring, and there are brief spikes in the actual volume of data being written/read to the file storage which correspond to those downloads, but the actual number of file storage transactions has stayed consistently high all the way through, so I don't think it's being triggered by the updates. The indexing seems to be running constantly, regardless of changes to the vulnerabilities.
It could be that Lucene's background merge threads are causing that. Lucene segments are append-only, so with lots of writes (many index commits) it presumably has to do lots of housekeeping.

Doing some surface-level research, it also seems like the writing patterns that DT uses (many individual index operations) are sub-optimal, and Lucene prefers batch operations instead: https://j.blaszyk.me/tech-blog/exploring-apache-lucene-index/

I think that is a reasonable explanation, but it needs to be verified.
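To illustrate the difference (a generic sketch, not Dependency-Track's actual indexing code; the index path, field name, and document contents are made up): committing once per operation forces Lucene to flush and fsync a new segment every time, which the merge scheduler later has to rewrite, whereas batching the operations and committing once produces far fewer segments and far less follow-up merge I/O.

```java
import java.nio.file.Paths;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.FSDirectory;

public class LuceneBatchingSketch {

    public static void main(String[] args) throws Exception {
        try (FSDirectory dir = FSDirectory.open(Paths.get("/tmp/demo-index"));
             IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {

            // Write-heavy pattern: one commit per operation. Every commit flushes
            // and fsyncs a new segment, leaving many small segments to merge later.
            for (int i = 0; i < 1_000; i++) {
                writer.updateDocument(new Term("uuid", "vuln-" + i), newDoc(i));
                writer.commit();
            }

            // Batch pattern: apply all operations, then commit once. Far fewer
            // segments are created, so far less merge housekeeping afterwards.
            for (int i = 0; i < 1_000; i++) {
                writer.updateDocument(new Term("uuid", "vuln-" + i), newDoc(i));
            }
            writer.commit();
        }
    }

    private static Document newDoc(int i) {
        Document doc = new Document();
        doc.add(new StringField("uuid", "vuln-" + i, Field.Store.YES));
        return doc;
    }
}
```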
I don't think there's a way to monitor Lucene right now, but I'll have a look at whether there is something we can add or enable to make it more observable. It is a bit of a blind spot at the moment, indeed.
Collect basic metrics:

* Total number of index operations (`add`, `update`, `delete`, `commit`), grouped by index
* Number of index documents in RAM
* Number of bytes used by the index
* Total number of documents in the index

Also, integrate Lucene's `InfoStream` with Dependency-Track's logging system. Lucene output will now be included when configuring `LOGGING_LEVEL=DEBUG`, or when the respective logger is explicitly configured in `logback.xml`.

Relates to DependencyTrack#3429

Signed-off-by: nscuro <[email protected]>
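For context on what such an integration could look like: Lucene exposes its internal diagnostics (merge, flush, and commit activity) through the `InfoStream` abstraction, and a bridge to SLF4J could be sketched roughly as below. This is an illustrative reimplementation, not the code from the linked commit; the class name and logger name are assumptions.

```java
import java.io.IOException;

import org.apache.lucene.util.InfoStream;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Forwards Lucene's internal diagnostics (merge, flush, and commit activity)
 * to an SLF4J logger, so they appear in the application log at DEBUG level.
 */
public class Slf4jInfoStream extends InfoStream {

    private static final Logger LOGGER = LoggerFactory.getLogger("org.apache.lucene.InfoStream");

    @Override
    public void message(String component, String message) {
        LOGGER.debug("[{}] {}", component, message);
    }

    @Override
    public boolean isEnabled(String component) {
        // Skip message formatting entirely unless DEBUG logging is enabled.
        return LOGGER.isDebugEnabled();
    }

    @Override
    public void close() throws IOException {
        // Nothing to release.
    }
}

// Wiring (illustrative): new IndexWriterConfig(analyzer).setInfoStream(new Slf4jInfoStream());
```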
Hi, we are trying to set up Dependency-Track and are facing the same issue. This is on a completely fresh installation of the latest version, without ever logging in or adding any components. After Dependency-Track downloads and processes all the NIST archives, the disk I/O remains at 100% while the CPU is at around 50%. Using fatrace, I see about 5600 reads of

System
CPU
Memory
Logs
I can reproduce the same on a completely clean AWS Lightsail instance with 8 GB RAM, 2 vCPUs, and a 160 GB SSD running Debian 12.5. I simply install Docker, then install Dependency-Track using the manual quick start instructions, and I get the same problem. Installing 4.4.0, as suggested above, did not help.
@DrummerB Please do not use the embedded H2 database (which writes everything to the local /data volume); switch to an external database such as PostgreSQL instead.
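For anyone else arriving here via the quick start: the API server is pointed at an external PostgreSQL database through environment variables. A minimal docker-compose fragment could look roughly like this (host name, database name, and credentials are placeholders; consult the Dependency-Track configuration documentation for the authoritative settings):

```yaml
services:
  dtrack-apiserver:
    image: dependencytrack/apiserver
    environment:
      # Use an external PostgreSQL instance instead of the embedded H2 database.
      ALPINE_DATABASE_MODE: "external"
      ALPINE_DATABASE_URL: "jdbc:postgresql://postgres.example.internal:5432/dtrack"
      ALPINE_DATABASE_DRIVER: "org.postgresql.Driver"
      ALPINE_DATABASE_USERNAME: "dtrack"
      ALPINE_DATABASE_PASSWORD: "changeme"
    volumes:
      - "dependency-track:/data"
volumes:
  dependency-track: {}
```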
@nscuro Changing to PostgreSQL seems to have fixed this issue indeed, thanks a lot!
@nscuro We have had a spike in disk I/O with 4.11.7, to the point that I'm going to see if reverting to 4.11.6 fixes it. We're using EFS with Fargate containers for the frontend/API and RDS for the PostgreSQL database. From the logs I do see a lot of index rewrites back to back.
Sorry, the above was me (I forgot to change accounts). After reverting, the logs complained about a file lock for a while; this may have been while two containers were running at the same time during the rollover.

But following that, the continual re-indexing started again. :(

Given that I'm already using PostgreSQL, can anyone suggest a way to troubleshoot this further? I'm about to try disabling the NVD, GitHub, and Google OSV mirrors to see if that helps isolate the problem.

EDIT:
Current Behavior
We see heavy disk usage from a default installation of Dependency-Track:
The totals in this screenshot cover only the timeframe of the current uptime!
We also see the `postgres: checkpointer` process producing constant load.
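As a side note on quantifying that checkpointer load (a generic PostgreSQL 13 diagnostic, not something Dependency-Track provides): the `pg_stat_bgwriter` view shows how many checkpoints have run and how many buffers they wrote since the last statistics reset.

```sql
-- Checkpoint activity since the last statistics reset (PostgreSQL 13).
SELECT checkpoints_timed,        -- checkpoints triggered by checkpoint_timeout
       checkpoints_req,          -- checkpoints requested (e.g. by WAL volume)
       checkpoint_write_time,    -- total ms spent writing checkpoint buffers
       buffers_checkpoint,       -- buffers written by the checkpointer
       buffers_backend,          -- buffers written directly by backends
       stats_reset
FROM pg_stat_bgwriter;
```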
Steps to Reproduce
Projects: 974
We upload BOMs via API, no integrations enabled.
Expected Behavior
less disk write
Dependency-Track Version
4.10.1
Dependency-Track Distribution
Container Image
Database Server
PostgreSQL
Database Server Version
13.11
Browser
N/A
Checklist