[AIR] `TensorflowPredictor` doesn't create model weights #25125

bveeramani · 2022-05-24T05:02:46Z

What happened + What you expected to happen

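I tried to classify an image with `TensorflowPredictor` through `BatchPredictor`, but `batch_predictor.predict` raised a `ValueError` saying the model's weights had not been created yet.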

❯ python module.py
2022-05-23 21:59:07.729578: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-05-23 21:59:10,010 INFO services.py:1478 -- View the Ray dashboard at http://127.0.0.1:8265
(raylet) E0523 21:59:11.602878000 4562591232 fork_posix.cc:76]                  Other threads are currently calling into gRPC, skipping fork() handlers
2022-05-23 21:59:12,088 WARNING read_api.py:252 -- The number of blocks in this dataset (4) limits its parallelism to 4 concurrent tasks. This is much less than the number of available CPU slots in the cluster. Use `.repartition(n)` to increase the number of dataset blocks.
[dataset]: Run `pip install tqdm` to enable progress reporting.
(BlockWorker pid=7430) 2022-05-23 21:59:16.252786: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
(BlockWorker pid=7430) To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
  File "module.py", line 40, in <module>
    batch_predictor.predict(dataset)
  File "/private/tmp/.venv/lib/python3.8/site-packages/ray/ml/batch_predictor.py", line 93, in predict
    return data.map_batches(
  File "/private/tmp/.venv/lib/python3.8/site-packages/ray/data/dataset.py", line 332, in map_batches
    return Dataset(plan, self._epoch, self._lazy)
  File "/private/tmp/.venv/lib/python3.8/site-packages/ray/data/dataset.py", line 140, in __init__
    self._plan.execute(allow_clear_input_blocks=False)
  File "/private/tmp/.venv/lib/python3.8/site-packages/ray/data/impl/plan.py", line 257, in execute
    blocks, stage_info = stage(blocks, clear_input_blocks)
  File "/private/tmp/.venv/lib/python3.8/site-packages/ray/data/impl/plan.py", line 436, in __call__
    blocks = compute._apply(
  File "/private/tmp/.venv/lib/python3.8/site-packages/ray/data/impl/compute.py", line 266, in _apply
    new_metadata = ray.get(new_metadata)
  File "/private/tmp/.venv/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
    return func(*args, **kwargs)
  File "/private/tmp/.venv/lib/python3.8/site-packages/ray/worker.py", line 1843, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ValueError): ray::BlockWorker.map_block_nosplit() (pid=7430, ip=127.0.0.1, repr=<ray.data.impl.compute.BlockWorker object at 0x194a80490>)
  File "/private/tmp/.venv/lib/python3.8/site-packages/ray/data/impl/compute.py", line 185, in map_block_nosplit
    return _map_block_nosplit(block, fn, input_files)
  File "/private/tmp/.venv/lib/python3.8/site-packages/ray/data/impl/compute.py", line 341, in _map_block_nosplit
    for new_block in fn(block):
  File "/private/tmp/.venv/lib/python3.8/site-packages/ray/data/dataset.py", line 308, in transform
    applied = fn(view)
  File "/private/tmp/.venv/lib/python3.8/site-packages/ray/data/impl/compute.py", line 300, in _fn
    return ray.data._cached_fn(item)
  File "/private/tmp/.venv/lib/python3.8/site-packages/ray/ml/batch_predictor.py", line 83, in __call__
    return self.predictor.predict(batch, **predict_kwargs)
  File "/private/tmp/.venv/lib/python3.8/site-packages/ray/ml/predictors/integrations/tensorflow/tensorflow_predictor.py", line 170, in predict
    model.set_weights(self.model_weights)
  File "/private/tmp/.venv/lib/python3.8/site-packages/keras/engine/base_layer.py", line 1614, in set_weights
    params = self.weights
  File "/private/tmp/.venv/lib/python3.8/site-packages/keras/engine/training.py", line 2829, in weights
    return self._dedup_weights(self._undeduplicated_weights)
  File "/private/tmp/.venv/lib/python3.8/site-packages/keras/engine/training.py", line 2834, in _undeduplicated_weights
    self._assert_weights_created()
  File "/private/tmp/.venv/lib/python3.8/site-packages/keras/engine/sequential.py", line 472, in _assert_weights_created
    super(functional.Functional, self)._assert_weights_created()  # pylint: disable=bad-super-call
  File "/private/tmp/.venv/lib/python3.8/site-packages/keras/engine/training.py", line 3027, in _assert_weights_created
    raise ValueError(f'Weights for model {self.name} have not yet been '
ValueError: Weights for model sequential_3 have not yet been created. Weights are created when the Model is first called on inputs or `build()` is called with an `input_shape`.

This error only occurs if you patch #25124 first!
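
For readers unfamiliar with the Keras message above: weights for a model with no known input shape are created lazily, so calling `set_weights` on a freshly constructed model fails until the model has been built or called once. The following is a minimal pure-Python sketch of that behavior. `LazyDense` and `restore` are hypothetical names invented for illustration; they are a toy stand-in, not Keras or Ray code, and the "build before restoring weights" step is only an assumption about the general shape of a fix, not a description of what #25136 actually does.

```python
class LazyDense:
    """Toy stand-in for a layer that defers weight creation (not Keras itself)."""

    def __init__(self, units):
        self.units = units
        self.kernel = None  # created lazily, once the input width is known

    def build(self, input_dim):
        # Allocate a zero-filled kernel of shape (input_dim, units).
        self.kernel = [[0.0] * self.units for _ in range(input_dim)]

    def __call__(self, row):
        if self.kernel is None:
            self.build(len(row))  # the first call infers the input width
        return [
            sum(x * self.kernel[i][j] for i, x in enumerate(row))
            for j in range(self.units)
        ]

    def set_weights(self, kernel):
        # Mirrors the failure in the traceback: nothing to overwrite yet.
        if self.kernel is None:
            raise ValueError(
                "Weights have not yet been created; call build() or the layer first."
            )
        if len(kernel) != len(self.kernel) or any(len(r) != self.units for r in kernel):
            raise ValueError("Saved kernel shape does not match the built layer.")
        self.kernel = [list(r) for r in kernel]


def restore(make_layer, saved_kernel, build_first):
    """Recreate a layer from a factory, then load saved weights into it."""
    layer = make_layer()
    if build_first:
        layer.build(len(saved_kernel))  # create the weights before restoring them
    layer.set_weights(saved_kernel)
    return layer
```

With `saved = [[1.0, 2.0], [3.0, 4.0]]`, calling `restore(lambda: LazyDense(2), saved, build_first=False)` raises `ValueError`, which is analogous to the error in the traceback, while `build_first=True` succeeds and the restored layer can be called on `[1.0, 1.0]`.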

Versions / Dependencies

Ray: 4444150
Python: 3.8.12
OS: macOS

Reproduction script

import ray
from ray.ml.predictors.integrations.tensorflow import TensorflowPredictor
import tensorflow as tf
from tensorflow.keras import models, layers
from ray.ml.batch_predictor import BatchPredictor
from ray.ml.checkpoint import Checkpoint

batch_size = 4
height = 32
width = 32
num_channels = 3
num_classes = 10

def build_model():
    model = models.Sequential()

    model.add(layers.Lambda(lambda tensor: tf.squeeze(tensor, axis=1)))

    model.add(layers.Conv2D(6, (5, 5), activation='relu', input_shape=(height, width, num_channels)))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(16, (5, 5), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dense(120, activation='relu'))
    model.add(layers.Dense(84, activation='relu'))
    model.add(layers.Dense(num_classes))
    return model

model = build_model()
model.build(input_shape=(0, 1, 32, 32, 3))
checkpoint = Checkpoint.from_dict({"model": model.get_weights()})

batch_predictor = BatchPredictor.from_checkpoint(
    checkpoint=checkpoint,
    predictor_cls=TensorflowPredictor,
    model_definition=build_model,
)

dataset = ray.data.range_tensor(batch_size, shape=(1, height, width, num_channels))
batch_predictor.predict(dataset)

Issue Severity

High: It blocks me from completing my task.

bveeramani added the bug, triage, and air labels May 24, 2022

bveeramani changed the title ~~[AIR] TorchPredictor doesn't create model weights~~ [AIR] TensorflowPredicotr doesn't create model weights May 24, 2022

bveeramani changed the title ~~[AIR] TensorflowPredicotr doesn't create model weights~~ [AIR] TensorflowPredictor doesn't create model weights May 24, 2022

bveeramani mentioned this issue May 24, 2022

[AIR] Build model in TensorflowPredictor.predict #25136

Merged

7 tasks

bveeramani self-assigned this May 24, 2022

bveeramani mentioned this issue May 24, 2022

[AIR] TensorflowPredictor incorrectly ravels predictions #25137

Closed

bveeramani added this to the Ray AIR milestone May 24, 2022

bveeramani mentioned this issue May 24, 2022

[AIR] Add TensorFlow image example #24633

Closed

10 tasks

amogkam closed this as completed in #25136 May 26, 2022

amogkam added a commit that referenced this issue May 26, 2022

[AIR] Build model in TensorflowPredictor.predict (#25136)

f623c60

`TensorflowPredictor.predict` doesn't work right now. For more information, see #25125.

Co-authored-by: Amog Kamsetty

hora-anyscale added the P1 label and removed the triage label Jun 15, 2022
