[AIR - Datasets] Encode number of dimensions in variable-shaped tensor extension type. #29281
Looks good to me, although not sure if I have enough familiarity with the tensor extension code to approve
LGTM
if ndim is not None and a.ndim != ndim:
    raise ValueError(
        "ArrowVariableShapedTensorArray only supports tensor elements that "
        "all have the same number of dimensinos, but got tensor elements "
nit: "dimensinos" seems not fixed yet
Oh weird, I thought I fixed that, good catch!
Signed-off-by: Clark Zinzow <[email protected]>
…r extension type. (ray-project#29281) Knowing the number of dimensions in a variable-shaped tensor column is useful for e.g. inferring a ragged tensor spec when constructing a tf.data Dataset; by encoding this ndim data in the extension type, we can do this type inference based on Dataset metadata, which is required. Note that this change will disallow variable-shaped tensor columns containing tensor elements that have a variable number of dimensions. This isn't supported by TensorFlow and Torch ragged tensors, so sacrificing this feature seems tenable. Signed-off-by: Weichen Xu <[email protected]>
Knowing the number of dimensions in a variable-shaped tensor column is useful for e.g. inferring a ragged tensor spec when constructing a tf.data Dataset; by encoding this ndim data in the extension type, we can do this type inference based on Dataset metadata, which is required.

Note that this change will disallow variable-shaped tensor columns containing tensor elements that have a variable number of dimensions. This isn't supported by TensorFlow and Torch ragged tensors, so sacrificing this feature seems tenable.
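
To make the constraint concrete, here is a minimal pure-Python sketch of the idea, not the code in this PR. The helpers `common_ndim` and `ragged_shape` are hypothetical names used only for illustration. Given the shapes of the tensor elements in a column, the first rejects a mixed number of dimensions, and the second derives a per-dimension shape in which `None` marks a ragged dimension, roughly the kind of information a ragged tensor spec would need.

```python
from typing import Optional, Sequence, Tuple

Shape = Tuple[int, ...]


def common_ndim(shapes: Sequence[Shape]) -> Optional[int]:
    """Return the number of dimensions shared by all elements.

    Returns None for an empty column. Raises ValueError if the elements
    disagree on the number of dimensions (hypothetical helper).
    """
    ndim: Optional[int] = None
    for shape in shapes:
        if ndim is not None and len(shape) != ndim:
            raise ValueError(
                "all tensor elements must have the same number of "
                f"dimensions, but got elements with {ndim} and "
                f"{len(shape)} dimensions"
            )
        ndim = len(shape)
    return ndim


def ragged_shape(shapes: Sequence[Shape]) -> Tuple[Optional[int], ...]:
    """Return one entry per dimension: the size if every element agrees
    on it, otherwise None to mark that dimension as ragged."""
    if common_ndim(shapes) is None:
        return ()
    # zip(*shapes) groups the sizes of each dimension across all elements.
    return tuple(
        sizes[0] if len(set(sizes)) == 1 else None
        for sizes in zip(*shapes)
    )
```

For example, `ragged_shape([(2, 3), (4, 3)])` yields `(None, 3)`, while mixing a 2-D and a 3-D element raises `ValueError`. With ndim stored in the extension type, the length of this tuple is known from metadata alone, without inspecting the data.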
Related issue number
Closes #29135
Checks
- I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.