Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[P0][AIR] Fix train to serve notebooks #26821

Merged
merged 12 commits into from
Jul 22, 2022

Conversation

jiaodong
Copy link
Member

@jiaodong jiaodong commented Jul 21, 2022

Why are these changes needed?

This is the last PR needed before we have ALL AIR notebooks fixed and enabled on CI for Ray 2.0

Notable changes

Use TensorArray for incremental learning's images

def preprocess_images(df: pd.DataFrame) -> pd.DataFrame:
    """Preprocess images by scaling each channel in the image."""

    torchvision_transforms = transforms.Compose(
      [transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))]
    )

    df["image"] = TensorArray([torchvision_transforms(image) for image in df["image"]])
    return df

mnist_normalize_preprocessor = BatchMapper(fn=preprocess_images)

Conditional on dropped column for tabular train to serve

    def concat_for_tensor(dataframe):
        from ray.data.extensions import TensorArray
        result = {}
        feature_cols = [col for col in dataframe.columns if col != LABEL]
        result["input"] = TensorArray(dataframe[feature_cols].to_numpy(dtype=np.float32))
        if LABEL in dataframe.columns:
            result[LABEL] = dataframe[LABEL]
        return  pd.DataFrame(result)

Since AIR's BatchPredictor returns at most pd.DataFrame({col: TensorArray}) with no nesting, simply iterate all columns and call to_numpy for TensorArray and TensorArrayElement.

Related issue number

Closes #26410

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Copy link
Contributor

@simon-mo simon-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Serve part LGTM

python/ray/serve/air_integrations.py Outdated Show resolved Hide resolved
Copy link
Contributor

@amogkam amogkam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm. We should use the new Concatenator preprocessor instead of implementing a custom one, but we can do this in a follow up PR!

python/ray/serve/air_integrations.py Show resolved Hide resolved
@jiaodong
Copy link
Member Author

jiaodong commented Jul 21, 2022

There's a failed serve test that i'm retrying, but repro-ci worked on the aws instance.

//python/ray/serve:test_standalone2                                      PASSED in 451.1s

Executed 1 out of 1 test: 1 test passes.
(13:21:07) INFO: Build completed successfully, 6 total actions

The test is also flaky on master
Screen Shot 2022-07-21 at 1 36 25 PM

Copy link
Contributor

@richardliaw richardliaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stamp

@richardliaw richardliaw merged commit db027d8 into ray-project:master Jul 22, 2022
@scv119
Copy link
Contributor

scv119 commented Jul 22, 2022

image

seems this breaks windows://python/ray/serve:test_air_integrations not sure how important that one is. @jiaodong to triage.

@jiaodong
Copy link
Member Author

image

seems this breaks windows://python/ray/serve:test_air_integrations not sure how important that one is. @jiaodong to triage.

Ah my bad. I will fix it asap.

@jiaodong
Copy link
Member Author

jiaodong commented Jul 22, 2022

Error is windows only that suggests on windows, unpacking TensorArray changes underlying data type from int64 to int32

________ TestBatchingFunctionFunctions.test_dataframe_with_tensorarray ________
--
  |  
  | self = <com_github_ray_project_ray.python.ray.serve.tests.test_air_integrations.TestBatchingFunctionFunctions object at 0x00000175A3B57460>
  |  
  | def test_dataframe_with_tensorarray(self):
  | batched_df = pd.DataFrame(
  | {
  | "a": TensorArray([1, 2, 3, 4]),
  | "b": TensorArray([5, 6, 7, 8]),
  | }
  | )
  | split_df = pd.DataFrame(
  | {
  | "a": [1, 2, 3, 4],
  | "b": [5, 6, 7, 8],
  | }
  | )
  |  
  | unpacked_list = BatchingManager.split_dataframe(batched_df, 1)
  | assert len(unpacked_list) == 1
  | >       assert unpacked_list[0]["a"].equals(split_df["a"])
  | E       assert False
  | E        +  where False = <bound method NDFrame.equals of 0    1\n1    2\n2    3\n3    4\nName: a, dtype: int32>(0    1\n1    2\n2    3\n3    4\nName: a, dtype: int64)
  | E        +    where <bound method NDFrame.equals of 0    1\n1    2\n2    3\n3    4\nName: a, dtype: int32> = 0    1\n1    2\n2    3\n3    4\nName: a, dtype: int32.equals

scv119 pushed a commit that referenced this pull request Jul 22, 2022
n previous implementation #26821 we have windows failure suggesting we behave differently on windows regarding datatype conversion.

In our https://sourcegraph.com/github.com/ray-project/ray/-/blob/python/ray/data/tests/test_dataset.py?L577 regarding use of TensorArray we seem to rely on pd'sassert_frame_equal rather than manually comparing frames.

This PR adds a quick conditional on windows only to ignore dtype for now.
Rohan138 pushed a commit to Rohan138/ray that referenced this pull request Jul 28, 2022
Rohan138 pushed a commit to Rohan138/ray that referenced this pull request Jul 28, 2022
…t#26889

n previous implementation ray-project#26821 we have windows failure suggesting we behave differently on windows regarding datatype conversion.

In our https://sourcegraph.com/github.com/ray-project/ray/-/blob/python/ray/data/tests/test_dataset.py?L577 regarding use of TensorArray we seem to rely on pd'sassert_frame_equal rather than manually comparing frames.

This PR adds a quick conditional on windows only to ignore dtype for now.

Signed-off-by: Rohan138 <[email protected]>
Stefan-1313 pushed a commit to Stefan-1313/ray_mod that referenced this pull request Aug 18, 2022
Co-authored-by: Simon Mo <[email protected]>
Signed-off-by: Stefan van der Kleij <[email protected]>
Stefan-1313 pushed a commit to Stefan-1313/ray_mod that referenced this pull request Aug 18, 2022
…t#26889

n previous implementation ray-project#26821 we have windows failure suggesting we behave differently on windows regarding datatype conversion.

In our https://sourcegraph.com/github.com/ray-project/ray/-/blob/python/ray/data/tests/test_dataset.py?L577 regarding use of TensorArray we seem to rely on pd'sassert_frame_equal rather than manually comparing frames.

This PR adds a quick conditional on windows only to ignore dtype for now.

Signed-off-by: Stefan van der Kleij <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[AIR/Tune/Docs] Mass documentation testing failure
6 participants