
util/parquet: support tuples #103589

Merged
merged 1 commit into cockroachdb:master from parquet-tuple on May 22, 2023

Conversation

@jayshrivastava (Contributor) commented May 18, 2023

This change adds support for writing tuples. Implementation details below.

The standard way to write a tuple in parquet is to use a group:

```
message schema {                 -- toplevel schema
   optional group a {
       optional T1 element;       -- physical column for the first field
       ...
       optional Tn element;       -- physical column for the nth field
   }
}
```

Parquet is a strict columnar format, so it does not write such a group
as one column with all the fields adjacent to each other. Instead, it
writes each field in the tuple as its own physical column. This 1:N
mapping from CRDB datum to physical column in parquet violates the
assumption, used throughout this library, that the mapping is 1:1.
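
For illustration, here is a minimal, self-contained Go sketch of the bookkeeping this implies (the `datumColumn` field names mirror this PR's `schema.go`; everything else is illustrative, not the library's actual code):

```go
package main

import "fmt"

// datumColumn records where a datum column's physical columns begin and
// how many there are (names mirror pkg/util/parquet/schema.go in this PR;
// the rest of this sketch is illustrative only).
type datumColumn struct {
	name                 string
	numPhysicalCols      int
	physicalColsStartIdx int
}

func main() {
	// Schema: a INT, t TUPLE(INT, STRING), b BOOL. The tuple fans out into
	// two physical columns; the scalars map 1:1.
	cols := []datumColumn{
		{name: "a", numPhysicalCols: 1},
		{name: "t", numPhysicalCols: 2},
		{name: "b", numPhysicalCols: 1},
	}
	idx := 0
	for i := range cols {
		cols[i].physicalColsStartIdx = idx
		idx += cols[i].numPhysicalCols
	}
	for _, c := range cols {
		fmt.Printf("%s -> physical columns [%d, %d)\n",
			c.name, c.physicalColsStartIdx, c.physicalColsStartIdx+c.numPhysicalCols)
	}
	// Output:
	// a -> physical columns [0, 1)
	// t -> physical columns [1, 3)
	// b -> physical columns [3, 4)
}
```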

This change updates the library to remove that assumption. First, there
is now a clear distinction between a "datum column" and a "physical
column". The Writer is updated to write to multiple physical columns
for a given datum, and the reader is updated to "squash" physical
columns into single tuple datums where needed. Finally, randomized
testing and benchmarking are extended to cover tuples.
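
As a rough sketch of the reader-side "squash" (`squashTuple` is a hypothetical helper; `tree.NewDTuple` is CRDB's real tuple constructor):

```go
import (
	"github.com/cockroachdb/cockroach/pkg/sql/sem/tree"
	"github.com/cockroachdb/cockroach/pkg/sql/types"
)

// squashTuple rebuilds a single tuple datum from the datums read out of
// the tuple's physical columns.
func squashTuple(typ *types.T, fields []tree.Datum) tree.Datum {
	return tree.NewDTuple(typ, fields...)
}
```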

Informs: #99028
Epic: https://cockroachlabs.atlassian.net/browse/CRDB-15071
Release note: None

@cockroach-teamcity (Member) commented:

This change is Reviewable

@jayshrivastava changed the title from "Parquet tuple" to "util/parquet: support tuples" on May 18, 2023
@jayshrivastava force-pushed the parquet-tuple branch 6 times, most recently from e96a939 to 71f2e7c on May 19, 2023 18:28
@jayshrivastava marked this pull request as ready for review on May 19, 2023 18:28
@jayshrivastava requested a review from a team as a code owner on May 19, 2023 18:28
@jayshrivastava mentioned this pull request on May 19, 2023
@miretskiy (Contributor) left a comment


Reviewed 7 of 8 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jayshrivastava)


pkg/util/parquet/schema.go line 54 at r1 (raw file):

	// Typically, there is a 1:1 mapping between columns and physical columns in
	// a file. However, if there is at least one tuple in the schema, then the
	// mapping is not 1:1 since each field in a tuple in which case each field

reads a bit strange... maybe drop "in which case each field"?


pkg/util/parquet/schema.go line 62 at r1 (raw file):

	// start index of datumColumn k is 3.
	numPhysicalCols      uint32
	physicalColsStartIdx uint32

just use integers? no need to cast to uint?
In general, use uints to represent bits; indexes -- use regular int.


pkg/util/parquet/schema.go line 115 at r1 (raw file):

// makeColumn constructs a datumColumn. It does not populate
// datumColumn.physicalColsStartIdx.

thank you for mentioning the physicalColsStartIdx bit.


pkg/util/parquet/schema.go line 129 at r1 (raw file):

			defaultTypeLength, defaultSchemaFieldID)
		if err != nil {
			return datumColumn{}, err

thanks for making this change too.


pkg/util/parquet/schema.go line 397 at r1 (raw file):

		scalarColWriter, ok := elementCol.colWriter.(scalarWriter)
		if !ok {
			return datumColumn{}, errors.AssertionFailedf("expected scalar datumColumn writer")

nit: I think error message should keep "column" instead of datumColumn.


pkg/util/parquet/schema.go line 416 at r1 (raw file):

		if len(contents) == 0 {
			return result, pgerror.Newf(pgcode.FeatureNotSupported,
				"parquet writer does not support tuples with zero fields")

nit: does not support empty tuple types?


pkg/util/parquet/schema.go line 424 at r1 (raw file):

			if innerTyp.Family() == types.ArrayFamily {
				return result, pgerror.Newf(pgcode.FeatureNotSupported,
					"parquet writer does not support arrays in tuples")

is it parquet that doesn't support nested arrays/tuples or is it... us?


pkg/util/parquet/schema.go line 432 at r1 (raw file):

			var label string
			if labels == nil {
				label = fmt.Sprintf("%d", i)

maybe use "f#" (field #) or "field#"?
Do you think perhaps you should use the name of the top-level tuple and suffix it with "f#"?
(my_tuple_col_f1, my_tuple_col_f2; or, if labels are set, my_tuple_col_first_name, my_tuple_col_last_name)
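
A sketch of that naming scheme (hypothetical helper, not code from this PR):

```go
import "fmt"

// physicalColNames prefixes each tuple field's name with the tuple
// column's own name, falling back to "f#" when there are no labels.
func physicalColNames(tupleName string, labels []string, n int) []string {
	names := make([]string, n)
	for i := 0; i < n; i++ {
		if labels == nil {
			names[i] = fmt.Sprintf("%s_f%d", tupleName, i+1)
		} else {
			names[i] = fmt.Sprintf("%s_%s", tupleName, labels[i])
		}
	}
	return names
}

// physicalColNames("my_tuple_col", nil, 2)
//   -> [my_tuple_col_f1 my_tuple_col_f2]
// physicalColNames("my_tuple_col", []string{"first_name", "last_name"}, 2)
//   -> [my_tuple_col_first_name my_tuple_col_last_name]
```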


pkg/util/parquet/testutils.go line 166 at r1 (raw file):

type ReadDatumsMetadata struct {
	// KVMeta is the kv metadata read from the file.
	KVMeta map[string]string

can we not use "KV" -- has overloaded meaning. Anything other than that. FileMetadata; FieldMeta, etc.


pkg/util/parquet/testutils.go line 553 at r1 (raw file):

//
// Behavior is undefined if the intervals are not sorted, not disjoint,
// not ascending, or out of bounds.

neat; I think behavior is defined: boom! :)


pkg/util/parquet/write_functions.go line 144 at r1 (raw file):

// A colWriter is responsible for encoding a datum and writing it to
// a file.ColumnChunkWriter. In some cases, the datum will be divided
// and written separately to different file.ColumnChunkWriters.

just say that the array is usually a single-element array for scalar types, and tuple types have more...
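
For example, the tuple case might fan out roughly like this (sketch only; `writeScalar` is an assumed stand-in for the scalar encoding path, not a real function in this PR):

```go
// writeTuple writes each tuple field to its own physical column writer:
// fields[i] goes to writers[i]. Scalar datums get a one-element writers
// slice instead.
func writeTuple(fields []tree.Datum, writers []file.ColumnChunkWriter) error {
	for i, f := range fields {
		if err := writeScalar(f, writers[i]); err != nil {
			return err
		}
	}
	return nil
}
```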


pkg/util/parquet/write_functions.go line 770 at r1 (raw file):

}

// batchWriter is an interface representing parquet datumColumn chunk writers such as

I hate GoLand rename refactorings... I think you should keep "column" in the comment because that's what you meant.


pkg/util/parquet/writer.go line 141 at r1 (raw file):

	currentRowGroupWriter file.BufferedRowGroupWriter
	// Caches the file.ColumnChunkWriters for each datumColumn in the
	// schema definition.

neat... but requires an expanded comment :)
Add examples/explanations for why an array is not okay and you need a map[int] -> []writer.


pkg/util/parquet/writer.go line 189 at r1 (raw file):

	}
	colChunkWriters := make([]file.ColumnChunkWriter, 0, w.sch.cols[colIdx].numPhysicalCols)
	for i := w.sch.cols[colIdx].physicalColsStartIdx; i < w.sch.cols[colIdx].physicalColsStartIdx+w.sch.cols[colIdx].numPhysicalCols; i += 1 {

nit: perhaps do

startIdx := w.sch....
numFields := w.....numPhysicalCols

and use those in this loop to shorten/improve readability.
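
Spelled out with the names from the quoted code above (loop body elided, as in the original):

```go
col := w.sch.cols[colIdx]
startIdx := col.physicalColsStartIdx
numFields := col.numPhysicalCols
colChunkWriters := make([]file.ColumnChunkWriter, 0, numFields)
for i := startIdx; i < startIdx+numFields; i++ {
	// ... append the chunk writer for physical column i ...
}
```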


pkg/util/parquet/writer.go line 237 at r1 (raw file):

		w.currentRowGroupSize = 0
		// Evict all entries from the cache.
		w.columnChunkWriterCache = make(map[int][]file.ColumnChunkWriter, len(w.sch.cols))

we have to recreate the map every time we reach the row group length?
Why? (And that makes me somewhat sad.)


pkg/util/parquet/writer_test.go line 478 at r1 (raw file):

					types.MakeTuple([]*types.T{types.Int, types.Int}),
				},
				columnNames: []string{"a", "b", "c", "d", "e", "f", "g", "h"},

let's add the test that doesn't have tuple labels.

@jayshrivastava (Contributor, Author) commented:
bors r=miretskiy

@craig (bot) commented May 22, 2023

Build succeeded:

@craig (bot) merged commit 7ed8248 into cockroachdb:master on May 22, 2023