spacesavers finds far less disk usage than df and du #102

Closed
kelly-sovacool opened this issue Jul 3, 2024 · 1 comment · Fixed by #106

@kelly-sovacool
Member

kelly-sovacool commented Jul 3, 2024

For /data/CCBR as of July 2024, df reports a disk usage of 197.2 TiB, while spacesavers2 reports a disk usage of 161 TiB.

I checked three project directories with spacesavers2 and compared the results to du -s.
Note: df and du are not interchangeable. df cannot measure individual project directories (it only reports the overall usage of the data mount that contains them, such as /data/CCBR), but du can. Also, du -s /data/CCBR returns zero.
See code here: https://github.com/CCBR/spacesavers2/blob/98b1527c795fe97e9adb848d1a2109414fc4aec7/tests/debug_102/bin/main.sh

FolderPath                     usage_TiB_spacesavers  usage_TiB_du
/data/CCBR/projects/ccbr1332   10.865996              10.869083
/data/CCBR/projects/ccbr783    4.829858               5.012015
/data/CCBR/projects/ccbr984    5.751336               5.945362
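
To put the per-project gaps in perspective, the figures in the table above can be turned into absolute and relative differences. The following is a minimal Python sketch that uses only the numbers quoted in this issue; the helper name `discrepancy` is hypothetical and not part of spacesavers2.

```python
# Sizes in TiB, copied from the table above: (spacesavers2, du -s).
usage_tib = {
    "/data/CCBR/projects/ccbr1332": (10.865996, 10.869083),
    "/data/CCBR/projects/ccbr783": (4.829858, 5.012015),
    "/data/CCBR/projects/ccbr984": (5.751336, 5.945362),
}


def discrepancy(spacesavers_tib, du_tib):
    """Return (gap in TiB, gap as a percentage of the du figure)."""
    gap = du_tib - spacesavers_tib
    return gap, 100.0 * gap / du_tib


for path, (ss, du) in usage_tib.items():
    gap, pct = discrepancy(ss, du)
    print(f"{path}\t{gap:.6f} TiB\t{pct:.2f}%")
```

On these numbers the gap is roughly 0.03% of the du figure for ccbr1332 and a little over 3% for the other two projects.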

I ran find to audit group permissions in these projects, and found one directory that was not readable:

find: ‘/data/CCBR/projects/ccbr1332/exome/tumor_only/.singularity/cache’: Permission denied

So it is possible that Singularity cache directories are taking up disk space but are not visible to spacesavers2. However, this was only the case for ccbr1332, and it doesn't explain the discrepancy for the other projects.
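
The same audit can be scripted. The sketch below is one possible way to list directories that cannot be read, using `os.walk` with its `onerror` callback; the function name `unreadable_dirs` is hypothetical, and this is not how spacesavers2 itself traverses folders.

```python
import os


def unreadable_dirs(root):
    """Return directories under `root` that could not be listed."""
    denied = []

    def on_error(err):
        # os.walk passes the OSError raised while listing a directory;
        # err.filename holds the path that failed.
        if isinstance(err, PermissionError):
            denied.append(err.filename)

    # Exhaust the walk purely for its side effect of calling on_error.
    for _ in os.walk(root, onerror=on_error):
        pass
    return denied


if __name__ == "__main__":
    for path in unreadable_dirs("/data/CCBR/projects/ccbr1332"):
        print(path)
```

Any directory reported here is invisible to a tool running as the same user, so its contents would be missing from that tool's totals.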

@kopardev
Member

kopardev commented Jul 4, 2024

Update: found a possible bug. When spacesavers2 finds hardlinks, it is supposed to designate one file as the original, count its size as non-duplicate bytes, and also count it as a non-duplicate file. But as of v0.13.0 it counts such files as non-duplicate files without adding their non-duplicate bytes to the folder size. This may be the reason it is underreporting folder sizes. This will be fixed in v0.13.1.
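
For illustration, the intended accounting can be sketched as follows: each inode is counted once, and the first path seen for it contributes both to the file count and to the byte total. This is a minimal sketch under stated assumptions, not the spacesavers2 implementation; the function name `folder_usage` is hypothetical, and it sums apparent file sizes (`st_size`) rather than allocated blocks, so its totals will not match du exactly.

```python
import os


def folder_usage(root):
    """Return (total_bytes, unique_files, duplicate_links) for `root`.

    Hardlinked paths share a (device, inode) pair. The first path seen
    is treated as the original and its size is added to the total;
    later paths to the same inode are counted as duplicates only.
    """
    seen = set()  # (st_dev, st_ino) pairs already counted
    total_bytes = 0
    unique_files = 0
    duplicate_links = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if key in seen:
                duplicate_links += 1
                continue
            seen.add(key)
            unique_files += 1
            # The step described as missing in v0.13.0: the original's
            # bytes must be added along with the file count.
            total_bytes += st.st_size
    return total_bytes, unique_files, duplicate_links
```

Omitting the `total_bytes += st.st_size` line for hardlinked originals would leave the file count correct while the folder size comes out too small, which matches the symptom described above.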
