Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: Use HashStreamValidator for Data Validation #1951

Merged
merged 9 commits into from
May 19, 2022

Conversation

danielbankhead
Copy link
Member

Fixes #1910 🦕

@danielbankhead danielbankhead requested review from a team as code owners May 19, 2022 04:20
@product-auto-label product-auto-label bot added size: l Pull request size is large. api: storage Issues related to the googleapis/nodejs-storage API. labels May 19, 2022
@danielbankhead danielbankhead changed the title feat! Use HashStreamValidator for Data Validation feat!: Use HashStreamValidator for Data Validation May 19, 2022
@@ -874,6 +876,9 @@ class File extends ServiceObject<File> {
pathPrefix: '/acl',
});

this.crc32cGenerator =
options.crc32cGenerator || this.bucket.crc32cGenerator;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to look to the bucket if one is not passed as an option? Should there be a default within the File class?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears all Files must have a bucket (looking at the construction) and all buckets have a generator (whether custom or the default) - this way customers can set things at a storage, bucket, or per-object level

// http://stackoverflow.com/questions/25096737/
// base64-encoding-of-crc32c-long-value
failed = !validateStream.test('crc32c', metadata.crc32c.substr(4));
failed = !validateStream.test('crc32c', metadata.crc32c);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason you know of that this was previously removing the first 4 bytes? I think you mentioned it was only checking those bytes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea… unfortunately we weren’t checking the complete hash before - just the last byte

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any idea why? Also any idea on the performance impact of now checking the entire range of bytes?

Copy link
Member Author

@danielbankhead danielbankhead May 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In short, invalid implementation - see the link from the removed comment:

This would result in inaccurate validation. E.g.:
Before:

crc32c: {crc32c: '####wA=='},

After:

crc32c: {crc32c: 'hqBywA=='},

The perf for validation is the same - the update method is the sensitive portion for performance

}
}

_flush(callback: () => void) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to also just mark these functions with private.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wanted to, but TS doesn’t like it as the parent class has it public

@danielbankhead danielbankhead merged commit 6765c04 into dropNode10 May 19, 2022
@danielbankhead danielbankhead deleted the utilize-crc32c-utility branch May 19, 2022 16:36
ddelgrosso1 pushed a commit to ddelgrosso1/nodejs-storage that referenced this pull request May 23, 2022
* feat!: Use `HashStreamValidator` for stream validation

- Removes `fast-crc32c` support

* fix: remove logs

* test: `crc32cGenerator` option

* test: improve `compression` test reliability

* chore: add temp debug logs

* chore: more debug logs

* fix: use static gzip content to remove test variance

* chore: remove debug logs

* chore: lint
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: storage Issues related to the googleapis/nodejs-storage API. size: l Pull request size is large.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants