Potential conflict for non alphabetic character leading schema #294
Comments
hi, @chris920820
durango pushed a commit to edms/parquet-go that referenced this issue on Apr 14, 2021
zolstein pushed a commit to zolstein/parquet-go that referenced this issue on Jun 23, 2023
…itongsys#289) * refactor packages to use encoding.Values container * refactor page and dictionary creation to use encoding.Values * go vet fix * reduce memory footprint of encoding.Values * refactor encoding.Encoding to use simple Go types * port parquet-go package to use pair of values+offsets to represent byte arrays * add fuzz tests back * optimize DELTA_LENGTH_BYTE_ARRAY decoding (xitongsys#291) * optimize DELTA_LENGTH_BYTE_ARRAY decoding * add link to online documentation * fix * add a unit test for decodeByteArrayLengths * Update encoding/delta/length_byte_array_amd64.s Co-authored-by: Kevin Burke <[email protected]> * optimize DELTA_LENGTH_BYTE_ARRAY encoding (xitongsys#292) Co-authored-by: Kevin Burke <[email protected]> * account for size of offsets buffer when benchmarking throughput * optimize DELTA_BYTE_ARRAY decoding (xitongsys#294) * PR feedback Co-authored-by: Kevin Burke <[email protected]>
Hey @xitongsys!
Per a7314a1, it seems we are adding a prefix `P_` to schema names that start with a non-alphabetic character.
However, if a parquet file has both `P__x` and `_x` in its schema, this results in a conflict, since we can no longer tell whether the data came from `P__x` or `_x`. It may also be a problem for consumers who do not know about this convention. For example, if a consumer expects the column `_x` to exist and tries to read data using the name `_x`, the read will fail because the name has been internally converted to `P__x`.
Do we have places that enforce this naming convention (no leading non-alphabetic character)? Does the Go compiler enforce it anywhere?
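To make the conflict concrete, here is a minimal Go sketch. `prefixIfNeeded` and `findCollisions` are hypothetical helpers written for this issue, assuming the rule is "add `P_` when the first character is not a letter". They are not parquet-go's actual code, which may differ in detail.

```go
package main

import (
	"fmt"
	"unicode"
)

// prefixIfNeeded is a hypothetical stand-in for the renaming described above:
// a name whose first character is not a letter gets a "P_" prefix.
func prefixIfNeeded(name string) string {
	if name == "" {
		return name
	}
	first := []rune(name)[0]
	if unicode.IsLetter(first) {
		return name
	}
	return "P_" + name
}

// findCollisions maps every column name through prefixIfNeeded and returns
// each internal name that more than one original column maps to.
func findCollisions(columns []string) map[string][]string {
	seen := map[string][]string{}
	for _, c := range columns {
		internal := prefixIfNeeded(c)
		seen[internal] = append(seen[internal], c)
	}
	collisions := map[string][]string{}
	for internal, originals := range seen {
		if len(originals) > 1 {
			collisions[internal] = originals
		}
	}
	return collisions
}

func main() {
	// "P__x" is kept as-is, "_x" is rewritten to "P__x": both end up under the same name.
	fmt.Println(findCollisions([]string{"P__x", "_x", "id"}))
}
```

Under that assumed rule, both `P__x` and `_x` end up under the internal name `P__x`, so a reader can no longer tell them apart.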
Is there a better way to handle this more gracefully? To avoid variable names with non-alphabetic leading characters, could we add a global prefix to every column instead of prefixing only certain columns?
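A rough sketch of the global-prefix idea, in Go. The names `globalPrefix`, `toInternal`, and `toExternal` are hypothetical and the prefix value is only an example. Because every name is prefixed unconditionally, the mapping is one-to-one and can be reversed. Leading with an uppercase letter also keeps the resulting struct field exported, which Go requires for access from other packages.

```go
package main

import (
	"fmt"
	"strings"
)

// globalPrefix is a hypothetical marker chosen for illustration only.
const globalPrefix = "P_"

// toInternal prefixes every column name unconditionally, so two distinct
// column names can never map to the same internal name.
func toInternal(name string) string {
	return globalPrefix + name
}

// toExternal reverses toInternal. ok is false when the input does not carry
// the prefix and therefore was not produced by toInternal.
func toExternal(internal string) (name string, ok bool) {
	if !strings.HasPrefix(internal, globalPrefix) {
		return "", false
	}
	return strings.TrimPrefix(internal, globalPrefix), true
}

func main() {
	for _, column := range []string{"P__x", "_x"} {
		internal := toInternal(column)
		external, _ := toExternal(internal)
		fmt.Printf("%q -> %q -> %q\n", column, internal, external)
	}
}
```

With this scheme `P__x` becomes `P_P__x` and `_x` becomes `P__x`, so the two stay distinct and the original name can be recovered when reading. A prefix alone would not help with names that contain characters invalid inside a Go identifier, so those would still need separate handling.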