[GLUTEN-1434] Serialize and deserialize RowVector #250

Merged: 2 commits, May 17, 2023

jinchengchenghh
Collaborator

No description provided.

@jinchengchenghh
Collaborator Author

Could you help review this one? It will replace the current Netty-based serialization of Arrow ColumnarBatch to bytes: https://github.com/oap-project/gluten/blob/main/gluten-data/src/main/scala/io/glutenproject/backendsapi/glutendata/GlutenSparkPlanExecApi.scala#L227

@rui-mo

@@ -0,0 +1,761 @@
/*
Collaborator

Is there a more appropriate name than SparkXXX?

Collaborator Author

How about SingleVectorSerializer?

Collaborator

Would RowVectorSerializer make sense?

Collaborator Author

The superclass, shown below, really is a row vector serializer. Other serializers can append several RowVectorPtr values, but this one can append only a single non-empty vector, so I named it SingleVectorSerializer.

class VectorSerializer {
 public:
  virtual ~VectorSerializer() = default;

  virtual void append(
      const RowVectorPtr& vector,
      const folly::Range<const IndexRange*>& ranges) = 0;

  // Writes the contents to 'stream' in wire format
  virtual void flush(OutputStream* stream) = 0;
};

#include "velox/common/memory/ByteStream.h"
// #include "velox/functions/prestosql/types/TimestampWithTimeZoneType.h"
// #include "velox/type/Date.h"
// #include "velox/vector/BiasVector.h"
Collaborator

Remove?


template <TypeKind kind>
void serializeFlatVector(const BaseVector* vector, SparkVectorStream* stream) {
// using T = typename TypeTraits<kind>::NativeType;
Collaborator

Remove?

// time, call appendNull or appendNonNull first. Then call
// appendLength if the type has a length. A null value has a length of
// 0. Then call appendValue if the value was not null.
class SparkVectorStream {
Collaborator

Consider renaming.
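
For readers unfamiliar with the call order described in the code comment quoted above, the following is a rough sketch of appending one nullable string at a time. `SketchValueStream` and `appendString` are hypothetical names, and the storage layout (separate null flags, lengths, and bytes) is an assumption made for illustration; the class in this PR may buffer data differently.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in; not the stream class from this PR.
class SketchValueStream {
 public:
  void appendNull() { nulls_.push_back(true); }
  void appendNonNull() { nulls_.push_back(false); }
  void appendLength(int32_t length) { lengths_.push_back(length); }
  void appendValue(const std::string& value) { bytes_ += value; }

  const std::vector<bool>& nulls() const { return nulls_; }
  const std::vector<int32_t>& lengths() const { return lengths_; }
  const std::string& bytes() const { return bytes_; }

 private:
  std::vector<bool> nulls_;
  std::vector<int32_t> lengths_;
  std::string bytes_;
};

// Follows the documented order: null flag first, then the length
// (0 for a null), then the value only when it is not null.
inline void appendString(
    SketchValueStream& stream,
    const std::optional<std::string>& value) {
  if (!value.has_value()) {
    stream.appendNull();
    stream.appendLength(0);
    return;
  }
  stream.appendNonNull();
  stream.appendLength(static_cast<int32_t>(value->size()));
  stream.appendValue(*value);
}
```

Recording a length of 0 for nulls keeps the length list aligned with the null flags, so a reader can walk both in step without special-casing null rows.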

@jinchengchenghh
Collaborator Author

Fixed all the comments. Can you help review again? @rui-mo

@rui-mo
Collaborator

rui-mo commented May 17, 2023

Fixed all the comments. Can you help review again? @rui-mo

@jinchengchenghh LGTM. Has this PR been verified on CI and Jenkins?

@jinchengchenghh
Collaborator Author

The API is not used in Gluten yet; it is only tested by the Velox unit tests.

rui-mo merged commit bde7b6a into oap-project:main on May 17, 2023
zhejiangxiaomai pushed a commit to zhejiangxiaomai/velox that referenced this pull request May 18, 2023
zhejiangxiaomai pushed a commit to zhejiangxiaomai/velox that referenced this pull request May 31, 2023
zhejiangxiaomai added a commit to zhejiangxiaomai/velox that referenced this pull request May 31, 2023
relative pr:

Serialize and deserialize RowVector oap-project#250
zhejiangxiaomai added a commit to zhejiangxiaomai/velox that referenced this pull request May 31, 2023
relative pr:

Support more data types for read filter oap-project#139
Fix cast double to decimal oap-project#179
Fix casting from string to decimal oap-project#281
Support cast decimal to int oap-project#177
Fix null on overflow and multiply as spark precision and support cast varchar to decimal oap-project#169
Disable tokenizing the path by dot oap-project#109
Serialize and deserialize RowVector oap-project#250
zhejiangxiaomai added a commit to zhejiangxiaomai/velox that referenced this pull request May 31, 2023
relative pr:

Support more data types for read filter oap-project#139
Fix cast double to decimal oap-project#179
Fix casting from string to decimal oap-project#281
Support cast decimal to int oap-project#177
Fix null on overflow and multiply as spark precision and support cast varchar to decimal oap-project#169
Disable tokenizing the path by dot oap-project#109
Serialize and deserialize RowVector oap-project#250
Support datetime pattern in spark oap-project#94
zhejiangxiaomai pushed a commit to zhejiangxiaomai/velox that referenced this pull request Jun 25, 2023
zhejiangxiaomai pushed a commit to zhejiangxiaomai/velox that referenced this pull request Jun 26, 2023
zhejiangxiaomai pushed a commit to zhejiangxiaomai/velox that referenced this pull request Jun 27, 2023
marin-ma pushed a commit to marin-ma/velox-oap that referenced this pull request Dec 15, 2023
…ct#250)

* fix backend velox codestyle

* change remark codestyle to compact mode and add intellij codestyle xml