[AIR] Build model in TensorflowPredictor.predict #25136

Merged
@@ -144,7 +144,14 @@ def build_model(self):
# a callable that returns the model and initialize it here,
# instead of having an initialized model object as an attribute.
model = self.model_definition()

if self.model_weights:
    input_shape = list(tensor.shape)
    # The batch axis can contain varying number of elements, so we set
    # the shape along the axis to 0.
    input_shape[0] = 0

    model.build(input_shape=input_shape)
Contributor

Just for my understanding: is this needed for all kinds of models, or just for Sequential models?

Member Author

My understanding is that it's for all kinds of models, although I'm not super familiar with TensorFlow so I'm not certain.

The build method is bound to the Model class: https://github.com/keras-team/keras/blob/v2.9.0/keras/engine/training.py#L354
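For reference, the shape handling in the diff can be sketched in plain Python. This is only an illustration: `batch_agnostic_shape` is a hypothetical helper, not part of Ray or Keras. The diff writes 0 into the batch axis; this sketch uses `None` instead, which is the Keras convention for an axis of unknown size. That substitution is an assumption of the sketch, not something the PR states.

```python
from typing import List, Optional, Sequence


def batch_agnostic_shape(shape: Sequence[int]) -> List[Optional[int]]:
    """Return `shape` as a list with the leading (batch) axis marked unknown.

    Hypothetical helper for illustration only.
    """
    if len(shape) == 0:
        raise ValueError("expected a shape with at least a batch axis")
    result: List[Optional[int]] = list(shape)
    # The batch axis can hold any number of elements, so it is not fixed.
    result[0] = None
    return result


# A batch of 64 RGB images of size 32x32.
print(batch_agnostic_shape((64, 32, 32, 3)))
```

The resulting list is the kind of value that would be passed as `input_shape` to `model.build`.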

Contributor

Hmm, in that case our current test_tensorflow_predictor test should not be passing, right?

Contributor

This is the model definition used in the test:

def build_model() -> tf.keras.Model:
    model = tf.keras.Sequential(
        [
            tf.keras.layers.InputLayer(input_shape=(1,)),
            tf.keras.layers.Dense(1),
        ]
    )

    return model

Contributor

Ah, I see. It's only needed if we pass in model_weights, is that right?

Contributor (@amogkam, May 24, 2022)

Got it, thanks @bveeramani.

Can we add a test for this before merging this in?

Also, just a note: this does not seem to be necessary if the Keras model already has an input layer defined, or if the first layer in the model already accepts an input shape. Do you know if there are any side effects to calling build when it's not needed?
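To illustrate the point about input layers, here is a minimal sketch, assuming a TensorFlow 2.x installation. A Sequential model that starts with an input specification creates its variables at construction time, so `set_weights` can be called without an explicit `build`. The layer sizes and weight values are made up for the example.

```python
import numpy as np
import tensorflow as tf

# The input shape is declared up front, so the Dense layer's kernel and bias
# are created as soon as the model is constructed.
model = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(1,)),
        tf.keras.layers.Dense(1),
    ]
)

num_weights = len(model.weights)  # kernel and bias

# Illustrative weights: y = 2 * x + 1.
model.set_weights([np.array([[2.0]]), np.array([1.0])])

prediction = model(np.array([[3.0]], dtype="float32")).numpy().ravel()
print(num_weights, prediction)
```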

Member Author

You're right. The error doesn't occur if the input layer is defined.

import numpy as np
from tensorflow.keras import layers, models

model = models.Sequential()
model.add(layers.Conv2D(10, (3, 3), input_shape=(32, 32, 3)))

weights = [np.zeros((3, 3, 3, 10)), np.zeros((10,))]
model.set_weights(weights)  # Okay!

I don't think there are any side effects from calling build twice.

import numpy as np
from tensorflow.keras import layers, models


model = models.Sequential()
model.add(layers.InputLayer(input_shape=(32, 32, 3)))
model.add(layers.Conv2D(10, (3, 3), input_shape=(32, 32, 3)))

weights = [np.zeros((3, 3, 3, 10)), np.zeros((10,))]

model.build(input_shape=(32, 32, 3))
model.build(input_shape=(32, 32, 3))  # Okay!
model.set_weights(weights)
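A small check of the "no side effects" claim, as a sketch assuming TensorFlow 2.x: it builds an already-built model twice and compares the weights before and after. Unlike the snippet above, the shape passed here includes the batch axis and matches the declared input shape exactly. That choice is an assumption of this sketch, since behavior for a mismatched shape may differ across Keras versions.

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(2),
    ]
)

# Give the model recognizable weights, then snapshot them.
model.set_weights([np.ones((4, 2)), np.zeros(2)])
before = [w.copy() for w in model.get_weights()]

# The model is already built; call build again, twice.
model.build(input_shape=(None, 4))
model.build(input_shape=(None, 4))

after = model.get_weights()
unchanged = all(np.array_equal(b, a) for b, a in zip(before, after))
print("weights unchanged:", unchanged)
```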

The error occurred for me because I introduced a Lambda layer to handle the extra axis created by to_tf. Once we merge #25133, I'll be able to remove the Lambda layer.

def build_model():
    model = models.Sequential()

    def squeeze(input):
        return tf.squeeze(input, axis=1)

    model.add(layers.Lambda(squeeze))
    ...

Given that this PR is no longer necessary for the image examples, do you think we should still add this? I don't think it'll cause any problems, but I'm okay closing this PR to avoid adding complexity to TensorflowPredictor.predict.

Contributor

Naive question, are we expecting a user to do the following?

def build_model():
  model = tf.keras.Sequential()
  model.add(tf.keras.layers.Dense(8))
  model.add(tf.keras.layers.Dense(4))
  return model

predictor = BatchPredictor.from_checkpoint(cp, TensorflowPredictor, model_definition=build_model)

If the above is a legitimate use case, shouldn't we keep this change?

Member Author

> Naive question, are we expecting a user to do the following?

I'm not entirely sure, but it seems like the input shape isn't specified in at least one TensorFlow example. This PR shouldn't directly hurt us, so maybe we should keep the change.
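To make the case concrete, here is a minimal sketch, assuming TensorFlow 2.x, of what happens with a model definition that does not specify an input shape. Without `build`, the layers have no variables yet, so `set_weights` is expected to fail with a `ValueError`. After `build`, the same call succeeds. The weight arrays and the feature count of 3 are hypothetical stand-ins for weights restored from a checkpoint.

```python
import numpy as np
import tensorflow as tf


def build_model() -> tf.keras.Model:
    # Same shape-less definition as in the question above.
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Dense(8))
    model.add(tf.keras.layers.Dense(4))
    return model


# Hypothetical checkpoint weights for 3 input features.
weights = [np.zeros((3, 8)), np.zeros(8), np.zeros((8, 4)), np.zeros(4)]

# Without build: the model has no variables to assign to.
unbuilt = build_model()
try:
    unbuilt.set_weights(weights)
    failed_without_build = False
except ValueError:
    failed_without_build = True

# With build: variables are created first, so the assignment works.
built = build_model()
built.build(input_shape=(None, 3))
built.set_weights(weights)
output = built(np.ones((5, 3), dtype="float32")).numpy()

print(failed_without_build, output.shape)
```

This mirrors the order used in the diff: build with a batch-agnostic shape, then set the weights, then call the model.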

Member Author

> Can we add a test for this before merging this in?

@amogkam fixed

    model.set_weights(self.model_weights)

prediction = model(tensor).numpy().ravel()