[AIR] Build model in TensorflowPredictor.predict #25136

Merged

bveeramani
Member

@bveeramani bveeramani commented May 24, 2022

Important: This PR is stacked on:

Why are these changes needed?

TensorflowPredictor.predict doesn't work right now. For more information, see #25125.

Related issue number

Fixes #25125

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@bveeramani bveeramani assigned amogkam and xwjiang2010 and unassigned amogkam May 24, 2022
# the shape along the axis to 0.
input_shape[0] = 0

model.build(input_shape=input_shape)
Contributor

Just for my understanding: is this needed for all kinds of models, or just for Sequential models?

Member Author

My understanding is that it's for all kinds of models, although I'm not super familiar with TensorFlow so I'm not certain.

The build method is bound to the Model class: https://github.com/keras-team/keras/blob/v2.9.0/keras/engine/training.py#L354

Contributor

Hmm, in that case our current test_tensorflow_predictor test should not be passing, right?

Contributor

This is the model definition used in the test:

    model = tf.keras.Sequential(
        [
            tf.keras.layers.InputLayer(input_shape=(1,)),
            tf.keras.layers.Dense(1),
        ]
    )

    return model

Contributor

Ah, I see. It's only needed if we pass in model_weights, is that right?

Contributor

@amogkam amogkam May 24, 2022

Got it, thanks @bveeramani.

Can we add a test for this before merging this in?

Also, just a note: it seems like this is not necessary if the Keras model already has an input layer defined, or if the first layer in the model already accepts an input shape. Do you know if there are any side effects to calling build when it's not needed?

Member Author

You're right. The error doesn't occur if the input layer is defined.

model = models.Sequential()
# input_shape is an argument of Conv2D, so the layer defines the model's input.
model.add(layers.Conv2D(10, (3, 3), input_shape=(32, 32, 3)))

weights = [np.zeros((3, 3, 3, 10)), np.zeros((10,))]
model.set_weights(weights)  # Okay!

I don't think there are any side effects from calling build twice.

import numpy as np
from tensorflow.keras import layers, models


model = models.Sequential()
model.add(layers.InputLayer(input_shape=(32, 32, 3)))
model.add(layers.Conv2D(10, (3, 3), input_shape=(32, 32, 3)))

weights = [np.zeros((3, 3, 3, 10)), np.zeros((10,))]

model.build(input_shape=(32, 32, 3))
model.build(input_shape=(32, 32, 3))  # Okay!
model.set_weights(weights)

The error occurred for me because I had introduced a Lambda layer to handle the extra axis created by to_tf. Once #25133 is merged, I'll be able to remove the Lambda layer.

def build_model():
    model = models.Sequential()

    def squeeze(input):
        return tf.squeeze(input, axis=1)

    model.add(layers.Lambda(squeeze))
    ...

Given that this PR is no longer necessary for the image examples, do you think we should still add this? I don't think it'll cause any problems, but I'm okay closing this PR to avoid adding complexity to TensorflowPredictor.predict.

Contributor

Naive question, are we expecting a user to do the following?

def build_model():
  model = tf.keras.Sequential()
  model.add(tf.keras.layers.Dense(8))
  model.add(tf.keras.layers.Dense(4))
  return model

predictor = BatchPredictor.from_checkpoint(cp, TensorflowPredictor, model_definition=build_model)

If the above is legit, then we should keep this change?
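
To make the failure mode concrete, here is a minimal sketch of what happens with a model like the one above. It is an illustration under stated assumptions, not code from this PR: the 3-feature input and the all-zero weights are made up for the example.

```python
import numpy as np
import tensorflow as tf


def build_model():
    # No InputLayer and no input_shape: Keras cannot create weights yet.
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Dense(8))
    model.add(tf.keras.layers.Dense(4))
    return model


# Hypothetical weights for an assumed 3-feature input:
# one (kernel, bias) pair per Dense layer.
weights = [
    np.zeros((3, 8)), np.zeros((8,)),
    np.zeros((8, 4)), np.zeros((4,)),
]

model = build_model()
try:
    model.set_weights(weights)
    failed_before_build = False
except ValueError:
    # The unbuilt model has no variables to assign the weights to.
    failed_before_build = True

# Leave the batch axis as None; 3 is the assumed feature count.
model.build(input_shape=(None, 3))
model.set_weights(weights)
restored_shapes = [w.shape for w in model.get_weights()]
```

Once build has been called with an input shape, the variables exist and set_weights can assign to them.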

Member Author

> Naive question, are we expecting a user to do the following?

I'm not entirely sure, but it seems like the input shape isn't specified in at least one TensorFlow example. This PR shouldn't directly hurt us, so maybe we should keep the change.
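
If the extra build call is a concern, one option is to build only when Keras reports the model as unbuilt. The sketch below is hypothetical and is not the code in this PR: the helper name ensure_built is invented for illustration, and leaving the batch axis as None is an assumption that follows the usual Keras convention.

```python
import numpy as np
import tensorflow as tf


def ensure_built(model, batch):
    """Build `model` from `batch` only if Keras has not built it yet.

    Hypothetical helper; returns True if it called build, else False.
    """
    if model.built:
        return False
    # Keep the per-example dimensions, but leave the batch axis unspecified.
    input_shape = (None, *batch.shape[1:])
    model.build(input_shape=input_shape)
    return True


# A model with no declared input shape starts out unbuilt.
unbuilt = tf.keras.Sequential([tf.keras.layers.Dense(4)])
batch = np.zeros((16, 3), dtype=np.float32)

first_call = ensure_built(unbuilt, batch)   # builds the model
second_call = ensure_built(unbuilt, batch)  # no-op: already built
num_weights = len(unbuilt.get_weights())    # kernel and bias of the Dense layer
```

The guard keeps the behavior unchanged for models that already declare an input layer, at the cost of one extra branch in predict.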

Member Author

> Can we add a test for this before merging this in?

@amogkam fixed

@bveeramani bveeramani added this to the Ray AIR milestone May 25, 2022
Contributor

@amogkam amogkam left a comment

LGTM, thanks! Will merge after #25208 is merged.

@amogkam amogkam merged commit f623c60 into ray-project:master May 26, 2022
Successfully merging this pull request may close these issues.

[AIR] TensorflowPredictor doesn't create model weights
3 participants