Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the CLI summary data use a human readable format instead of raw bytes #2696

Merged

Conversation

scottchiefbaker
Copy link
Contributor

@scottchiefbaker scottchiefbaker commented Jun 5, 2021

This PR changes the CLI output summary from bytes to a more human readable string format. This should address issue #2694.

Before:

:zstd -f /tmp/foo.bin 
/tmp/foo.bin         :  2.94%   (280708256 => 8264707 bytes, /tmp/foo.bin.zst) 

After:

:./zstd -f /tmp/foo.bin 
/tmp/foo.bin         :  2.94%   (267.7M => 7.9M, /tmp/foo.bin.zst)      

This PR adds a humanSize() function to util.c. I wasn't sure where best to put it, but that seemed like a good place. Then it modifies the sprintf() calls for outputting the summary after a file is compressed. It also modifies the sprintf() benchmark code.

Things I checked before landing this code:

  1. Code is space indented not tabs
  2. Functions and variables are camelCase
  3. Variable declarations are grouped with other variables
  4. Code should be C89 compatible
  5. Make test runs without errors on x86_64
  6. Code compiles and outputs the correct format for summary, benchmark, and multiple files

I will probably need some help modifying this to work on ARM or other platforms.

Copy link
Contributor

@felixhandte felixhandte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @scottchiefbaker,

Thanks for this PR.

This makes sense to me as a nicer way to present this information. I guess there's a question of whether changing this format will break anyone trying to parse this output, but I can't really imagine why anyone would be doing that in the first place.

I included a few suggestions. Let me know what you think.

programs/util.c Outdated Show resolved Hide resolved
@@ -121,6 +121,27 @@ int UTIL_requireUserConfirmation(const char* prompt, const char* abortMsg,
* Functions
***************************************/

char* humanSize(unsigned long long size, char* str) {
if (size > 1125899906842624L) {
snprintf(str, 7, "%.1fP", (float)size / 1125899906842624L);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

snprintf was standardized in C99, so it's not guaranteed to be available in all the environments we want Zstd to build.

You could consider rewriting this to not require an intermediate buffer if you just output a rescaled float value and a (static) suffix string that are then passed to the DISPLAYLEVEL() call with "%.1f%s". But maybe that would lead to poor results for small inputs ("you compressed 123.0B" feels weird).

A worst case alternative is that you just wrap this in an #if POSIX_PLATFORM_VERSION >= 200112L and provide a fallback #else implementation.

Copy link
Contributor Author

@scottchiefbaker scottchiefbaker Jun 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see references to fprintf()all over the code base so I just assumed snprintf() was valid. util.c. includes <stdio.h> for fprintf(), can we rely on that, but not snprintf()?

C is not my forte, so I may need some guidance on this one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As C has evolved, they've introduced new functions into the same header. C90 has fprintf() but not necessarily snprintf(). snprintf() was only standardized (and guaranteed to be present) in C99. Zstd restricts itself to the features guaranteed to be present in C90 (or at least has fallbacks to handle that strict case), so that we can be sure it will compile on all C90-conforming platforms.

programs/util.c Outdated Show resolved Hide resolved
@@ -122,6 +122,8 @@ int UTIL_requireUserConfirmation(const char* prompt, const char* abortMsg, const
#define STRDUP(s) strdup(s)
#endif

char* humanSize(unsigned long long size, char* str);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switch to a more descriptive name? Maybe add a little comment here describing what it does?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open to naming suggestions. I will definitely add a description though, that's easy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe UTIL_renderHumanReadableByteSize()?

@felixhandte felixhandte self-assigned this Jun 7, 2021
@scottchiefbaker
Copy link
Contributor Author

I'm not a C expert, so I will need some help resolving the one failing test.

programs/util.c Outdated Show resolved Hide resolved
Copy link
Contributor

@felixhandte felixhandte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Followed up on a couple notes.

If at any point you want to hand this off, I'm happy to merge what you have into a working branch and I can take over and polish this up. I don't want you to feel like you have to put in any more work, especially if, by your admission, you are not all that comfortable working in C. Let me know what you wanna do!

@@ -121,6 +121,27 @@ int UTIL_requireUserConfirmation(const char* prompt, const char* abortMsg,
* Functions
***************************************/

char* humanSize(unsigned long long size, char* str) {
if (size > 1125899906842624L) {
snprintf(str, 7, "%.1fP", (float)size / 1125899906842624L);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As C has evolved, they've introduced new functions into the same header. C90 has fprintf() but not necessarily snprintf(). snprintf() was only standardized (and guaranteed to be present) in C99. Zstd restricts itself to the features guaranteed to be present in C90 (or at least has fallbacks to handle that strict case), so that we can be sure it will compile on all C90-conforming platforms.

@@ -122,6 +122,8 @@ int UTIL_requireUserConfirmation(const char* prompt, const char* abortMsg, const
#define STRDUP(s) strdup(s)
#endif

char* humanSize(unsigned long long size, char* str);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe UTIL_renderHumanReadableByteSize()?

@scottchiefbaker
Copy link
Contributor Author

@felixhandte if you want to take this and run with it that would be totally fine with me. I just wanted to get the ball rolling. I think most of the hard work is done?

@felixhandte felixhandte changed the base branch from dev to human_size_output June 8, 2021 19:48
@felixhandte felixhandte merged commit fa57d25 into facebook:human_size_output Jun 8, 2021
@scottchiefbaker
Copy link
Contributor Author

@felixhandte when you get things massaged let me know. I'd like to see what it looks like when it's all done. I'm sure I'll have questions for you as I'd like to learn more C.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants