[AIR][Serve] Add windows check for pd.DataFrame comparison #26889

Merged: 1 commit, Jul 22, 2022

jiaodong (Member)

Why are these changes needed?

The previous implementation in #26821 fails on Windows, which suggests that datatype conversion behaves differently there.

In our test at https://sourcegraph.com/github.com/ray-project/ray/-/blob/python/ray/data/tests/test_dataset.py?L577, which uses TensorArray, we seem to rely on pandas' assert_frame_equal rather than comparing frames manually.

This PR adds a quick Windows-only conditional that ignores dtype in the comparison for now.

    unpacked_list = BatchingManager.split_dataframe(batched_df, 1)
    assert len(unpacked_list) == 1
>       assert unpacked_list[0]["a"].equals(split_df["a"])
E       assert False
E        +  where False = <bound method NDFrame.equals of 0    1\n1    2\n2    3\n3    4\nName: a, dtype: int32>(0    1\n1    2\n2    3\n3    4\nName: a, dtype: int64)
E        +    where <bound method NDFrame.equals of 0    1\n1    2\n2    3\n3    4\nName: a, dtype: int32> = 0    1\n1    2\n2    3\n3    4\nName: a, dtype: int32.equals
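
The failure above can be reproduced outside Ray. The sketch below is a minimal illustration, not the test from this PR. It builds two Series with the same values but int32 and int64 dtypes, matching the traceback, and shows that `Series.equals` rejects them while an element-wise comparison does not. Forcing the dtypes by hand is an assumption made for the demo. Why Windows ends up with int32 is not confirmed in this PR; one plausible cause is that NumPy 1.x uses a 32-bit default integer on Windows.

```python
import numpy as np
import pandas as pd

# Same values, different integer widths, mirroring the traceback above.
left = pd.Series([1, 2, 3, 4], name="a", dtype=np.int32)
right = pd.Series([1, 2, 3, 4], name="a", dtype=np.int64)

# Series.equals requires matching dtypes, so this comparison fails.
strict_equal = left.equals(right)

# An element-wise comparison only looks at the values.
values_equal = bool((left == right).all())

print(strict_equal, values_equal)  # prints: False True
```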

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@@ -90,8 +91,13 @@ def test_dataframe_with_tensorarray(self):

unpacked_list = BatchingManager.split_dataframe(batched_df, 1)
assert len(unpacked_list) == 1
assert unpacked_list[0]["a"].equals(split_df["a"])
assert unpacked_list[0]["b"].equals(split_df["b"])
# On windows, conversion dtype is not preserved.
Contributor

sad.... nice find!

Member Author

cc: @clarkzinzow for awareness. Not sure if it's related to TensorArray, but in the worst case this means AIR could mess up a user's tensor dtype across full / half / mixed precision training =.=
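
For reference, the Windows-only relaxation described in this PR might look roughly like the sketch below. This is not the merged code: `assert_column_equal` and its `strict_dtype` parameter are hypothetical names, and detecting Windows through `sys.platform` is an assumption. It relies on `pd.testing.assert_series_equal` with its `check_dtype` flag, so values are always compared and only the dtype check is skipped on Windows.

```python
import sys

import numpy as np
import pandas as pd


def assert_column_equal(actual, expected, strict_dtype=None):
    """Hypothetical helper: compare two Series, relaxing dtype on Windows.

    strict_dtype=None means "decide from the platform"; pass True or False
    to override that decision explicitly.
    """
    if strict_dtype is None:
        # Assumption: sys.platform starts with "win" on Windows.
        strict_dtype = not sys.platform.startswith("win")
    # Values, index, and name are still checked; only dtype is optional.
    pd.testing.assert_series_equal(actual, expected, check_dtype=strict_dtype)


# Usage: int32 vs int64 passes only when the dtype check is relaxed.
a = pd.Series([1, 2, 3, 4], name="a", dtype=np.int32)
b = pd.Series([1, 2, 3, 4], name="a", dtype=np.int64)
assert_column_equal(a, b, strict_dtype=False)
```

Keeping the strict check on other platforms means a real dtype regression, such as the precision concern raised above, would still be caught there.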

@scv119 scv119 merged commit a03716e into ray-project:master Jul 22, 2022
Rohan138 pushed a commit to Rohan138/ray that referenced this pull request Jul 28, 2022
Stefan-1313 pushed a commit to Stefan-1313/ray_mod that referenced this pull request Aug 18, 2022