Question about "Column ID" in Parquet Spec #9099

Closed
mapleFU opened this issue Nov 17, 2023 · 5 comments · Fixed by #9162

Comments

@mapleFU
Member

mapleFU commented Nov 17, 2023

Query engine

Not engine specific. It's about the spec.

Question

The Parquet section of the spec [1] mentions a "Column ID". It says:

Column IDs are required.

However, in the spec I can only find a "field ID". Are they the same thing?

And from the implementation, I think the meaning here is also "field ID". Am I right?

  1. https://iceberg.apache.org/spec/#parquet
@mapleFU
Member Author

mapleFU commented Nov 17, 2023

Also, in the ORC section, I found:

column id = iceberg.id

So it seems they are all equal to the field ID?

@emkornfield
Contributor

Yes, column IDs are meant to be field IDs.
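
To illustrate what treating the column ID as the Parquet field ID means in practice, here is a minimal sketch in plain Python. The classes `ParquetColumn` and `IcebergField` and the helper `resolve_by_id` are hypothetical stand-ins invented for this example, not part of any Iceberg or Parquet library. The only assumptions taken from the formats themselves are that a Parquet schema element carries an optional integer `field_id` and that Iceberg matches columns by ID rather than by name.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ParquetColumn:
    """Hypothetical stand-in for a Parquet schema element."""
    name: str
    field_id: Optional[int]  # field_id is optional in the Parquet schema


@dataclass(frozen=True)
class IcebergField:
    """Hypothetical stand-in for a field in an Iceberg table schema."""
    field_id: int
    name: str


def resolve_by_id(
    table_fields: List[IcebergField],
    file_columns: List[ParquetColumn],
) -> Dict[str, Optional[ParquetColumn]]:
    """Map each table field name to the file column with the same ID, if any."""
    by_id = {c.field_id: c for c in file_columns if c.field_id is not None}
    return {f.name: by_id.get(f.field_id) for f in table_fields}


# The file was written while the column was still named "user".
file_columns = [ParquetColumn("user", 1), ParquetColumn("ts", 2)]

# The table later renamed field 1 and added field 3.
table_fields = [
    IcebergField(1, "user_name"),
    IcebergField(2, "ts"),
    IcebergField(3, "country"),
]

resolved = resolve_by_id(table_fields, file_columns)
for name, column in resolved.items():
    print(name, "->", column.name if column else None)
```

In this sketch the renamed field still resolves to the old file column because the match is made on the ID, and the newly added field has no matching column in the older file.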

@mapleFU
Member Author

mapleFU commented Nov 24, 2023

Thanks!

@wgtmac
Member

wgtmac commented Nov 24, 2023

@bitsondatadev @Fokko It seems that we need to clear up the confusion here.

@emkornfield
Contributor

emkornfield commented Nov 27, 2023

I can make a PR to clarify this for Parquet.
