[GLUTEN-1434] Serialize and deserialize RowVector #250
Can you help review this one? It will replace the current Netty-based serialization of Arrow ColumnarBatch to bytes: https://github.com/oap-project/gluten/blob/main/gluten-data/src/main/scala/io/glutenproject/backendsapi/glutendata/GlutenSparkPlanExecApi.scala#L227
@@ -0,0 +1,761 @@
/*
Is there a more suitable name than SparkXXX?
How about SingleVectorSerializer?
Would RowVectorSerializer make sense?
The superclass is shown below; it is effectively already a RowVectorSerializer. Other serializers can append several RowVectorPtr values, but this serializer can only append one non-empty vector, so I named it SingleVectorSerializer.
class VectorSerializer {
 public:
  virtual ~VectorSerializer() = default;

  virtual void append(
      const RowVectorPtr& vector,
      const folly::Range<const IndexRange*>& ranges) = 0;

  // Writes the contents to 'stream' in wire format
  virtual void flush(OutputStream* stream) = 0;
};
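To make the "only one non-empty vector" constraint concrete, here is a minimal, self-contained sketch. It is not the code from this PR and does not use Velox: `ToyVector`, `ToySerializer`, `deserialize`, and the wire layout (a row count followed by raw values in host byte order) are all hypothetical stand-ins chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a RowVector: a single column of int64 values.
struct ToyVector {
  std::vector<int64_t> values;
};

// Hypothetical, simplified analogue of the VectorSerializer interface.
class ToySerializer {
 public:
  virtual ~ToySerializer() = default;
  virtual void append(const ToyVector& vector) = 0;
  virtual void flush(std::vector<uint8_t>* out) = 0;
};

// Accepts exactly one non-empty vector; any further append() is rejected.
class SingleVectorSerializer : public ToySerializer {
 public:
  void append(const ToyVector& vector) override {
    if (vector.values.empty()) {
      throw std::invalid_argument("vector must not be empty");
    }
    if (hasVector_) {
      throw std::logic_error("only one vector may be appended");
    }
    vector_ = vector;
    hasVector_ = true;
  }

  // Assumed toy wire format: uint64 row count, then the raw int64 values.
  void flush(std::vector<uint8_t>* out) override {
    assert(out != nullptr);
    if (!hasVector_) {
      throw std::logic_error("nothing to flush");
    }
    const uint64_t numRows = vector_.values.size();
    writeBytes(out, &numRows, sizeof(numRows));
    writeBytes(out, vector_.values.data(), numRows * sizeof(int64_t));
  }

 private:
  static void writeBytes(
      std::vector<uint8_t>* out, const void* data, std::size_t size) {
    const auto* bytes = static_cast<const uint8_t*>(data);
    out->insert(out->end(), bytes, bytes + size);
  }

  ToyVector vector_;
  bool hasVector_ = false;
};

// Reads back the toy wire format written by flush().
ToyVector deserialize(const std::vector<uint8_t>& bytes) {
  uint64_t numRows = 0;
  if (bytes.size() < sizeof(numRows)) {
    throw std::invalid_argument("buffer too small for header");
  }
  std::memcpy(&numRows, bytes.data(), sizeof(numRows));
  if (numRows > (bytes.size() - sizeof(numRows)) / sizeof(int64_t)) {
    throw std::invalid_argument("buffer too small for payload");
  }
  ToyVector result;
  result.values.resize(numRows);
  if (numRows > 0) {
    std::memcpy(
        result.values.data(),
        bytes.data() + sizeof(numRows),
        numRows * sizeof(int64_t));
  }
  return result;
}
```

In this sketch, calling `append` once and then `flush` produces a buffer that `deserialize` turns back into the same values, while a second `append` throws. A multi-vector serializer would instead accumulate rows across calls.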
#include "velox/common/memory/ByteStream.h"
// #include "velox/functions/prestosql/types/TimestampWithTimeZoneType.h"
// #include "velox/type/Date.h"
// #include "velox/vector/BiasVector.h"
Remove?
template <TypeKind kind>
void serializeFlatVector(const BaseVector* vector, SparkVectorStream* stream) {
  // using T = typename TypeTraits<kind>::NativeType;
Remove?
// time, call appendNull or appendNonNull first. Then call
// appendLength if the type has a length. A null value has a length of
// 0. Then call appendValue if the value was not null.
class SparkVectorStream {
Consider renaming.
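The calling order described in the code comment above (null flag first, then length for variable-width types, then the value only when it is non-null) can be sketched as follows. This is an illustration under assumptions, not the stream class from the PR: `ToyVectorStream`, its accessors, and the `appendString` helper are hypothetical names, and the real class writes into Velox byte streams rather than standard containers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical toy stream that records null flags, lengths, and value bytes
// in separate buffers, mirroring the append protocol in the quoted comment.
class ToyVectorStream {
 public:
  void appendNull() { nulls_.push_back(true); }
  void appendNonNull() { nulls_.push_back(false); }
  void appendLength(int32_t length) { lengths_.push_back(length); }
  void appendValue(const std::string& value) { data_.append(value); }

  std::size_t numRows() const { return nulls_.size(); }
  bool isNullAt(std::size_t row) const { return nulls_[row]; }
  const std::vector<int32_t>& lengths() const { return lengths_; }
  const std::string& data() const { return data_; }

 private:
  std::vector<bool> nulls_;
  std::vector<int32_t> lengths_;
  std::string data_;
};

// Appends one string row; a nullptr argument represents a null value.
// Hypothetical helper showing the required order of the three calls.
void appendString(ToyVectorStream& stream, const std::string* value) {
  if (value == nullptr) {
    stream.appendNull();
    stream.appendLength(0);  // A null value has a length of 0.
    return;
  }
  stream.appendNonNull();
  stream.appendLength(static_cast<int32_t>(value->size()));
  stream.appendValue(*value);
}
```

Keeping the null flag and length in lockstep for every row, including nulls, means a reader can recover row boundaries from the lengths alone without consulting the null flags.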
Addressed all the comments. Can you help review again? @rui-mo
@jinchengchenghh LGTM. Is this PR verified on CI and Jenkins?
The API is not used in Gluten yet; it is only tested by the Velox unit tests.
Support serialize and deserialize RowVector.
relative pr: Serialize and deserialize RowVector oap-project#250
relative pr: Support more data types for read filter oap-project#139 Fix cast double to decimal oap-project#179 Fix casting from string to decimal oap-project#281 Support cast decimal to int oap-project#177 Fix null on overflow and multiply as spark precision and support cast varchar to decimal oap-project#169 Disable tokenizing the path by dot oap-project#109 Serialize and deserialize RowVector oap-project#250
relative pr: Support more data types for read filter oap-project#139 Fix cast double to decimal oap-project#179 Fix casting from string to decimal oap-project#281 Support cast decimal to int oap-project#177 Fix null on overflow and multiply as spark precision and support cast varchar to decimal oap-project#169 Disable tokenizing the path by dot oap-project#109 Serialize and deserialize RowVector oap-project#250 Support datetime pattern in spark oap-project#94
…ct#250) * fix backend velox codestyle * change remark codestyle to compact mode and add intellij codestyle xml