Unexpected breaking change: Optimizer.get_weights() removed #442

adriangb · 2022-09-04T17:17:48Z

It seems like Optimizer.get_weights() is being removed. SciKeras was using it to serialize optimizer weights since SavedModel silently fails to do so (see tensorflow/tensorflow#44670 and other linked issues, this is a longstanding bug that hasn't been fixed). Could someone fill me in on what the plans are going forward? Pickling models is an essential part how Scikit-Learn operates and hence SciKeras gets completely broken if TensorFlow models can't be serialized.

The text was updated successfully, but these errors were encountered:

tilakrayal · 2022-09-05T10:06:01Z

@gowthamkpr,
I was able to reproduce the issue on tensorflow v2.8, v2.9 and nightly. Kindly find the gist of it here.

mattdangerw · 2022-09-08T17:11:49Z

@chenmoneygithub can you take a look here?

chenmoneygithub · 2022-09-08T18:16:47Z

@adriangb Thanks for reporting the issue!

There has not been any change on get_weights() for months. For loading optimizer weights, please make sure you call load_weights() if you want to get the optimizer weights. For example:

def roundtrip(model: keras.Model) -> keras.Model:
    save_dir = "/tmp/mymodel"
    model.save(save_dir)
    restored = keras.models.load_model(save_dir)
    restored.load_weights(save_dir)
    return restored

adriangb · 2022-09-08T18:46:17Z

On tensorflow==2.8.0:

>> import tensorflow as tf
>> tf.keras.optimizers.get("rmsprop").get_weights
<bound method OptimizerV2.get_weights of <keras.optimizer_v2.rmsprop.RMSprop object at 0x7f9f580e9360>>

On tf-nightly (9/8/2022):

>> import tensorflow as tf
>> tf.keras.optimizers.get("rmsprop").get_weights
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'RMSprop' object has no attribute 'get_weights'

This is because the new keras.optimizers.optimizer_experimental.Optimizer does not have a get_weights() method while the old keras.optimizers.optimizer_v2.optimizer_v2.OptimizerV2 does.

adriangb · 2022-10-14T16:40:48Z

@chenmoneygithub any updates on this? This is breaking SciKeras (and presumably other downstream things)

Here's notebooks proving this is broken:

Until https://github.com/keras-team/keras/issues/16983 gets resolved

adriangb · 2022-11-18T06:26:38Z

@mattdangerw would you mind giving an update now that 2.11.0 was released? Please let me know if I am doing something wrong or if there are alternatives, but as far as I can tell this was an unannounced breaking change with no alternative API.

chenmoneygithub · 2022-11-18T07:07:28Z

get_weights method is removed in the new optimizer, if you need to access the weights, please call variables() method.

But I don't think the issue is caused by this deprecation, at the time then issue got created the new optimizer was still in experimental namespace, so it should be an issue with serializing the old optimizer.

At the current version, I would encourage reworking the optimizer serialization strategy to not rely on the get_weights method. If you can point me the exact code you are using, we can check what could be an alternative solution. thanks!

adriangb · 2022-11-18T07:09:22Z

Digging a bit I see that there at least is a public .variables API. I can use that to get the weights, but setting them for Adam does not work as expected:

import tensorflow as tf

model = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(1,)),
        tf.keras.layers.Dense(1, activation="softmax"),
    ]
)
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.fit([[1]], [0])

optimizer = model.optimizer
variables = optimizer.variables()
print(len(variables))  # 5

new = tf.keras.optimizers.Adam()
print(len(new.variables()))  # 1
new.build(variables)
print(len(new.variables()))  # 11!
new.set_weights(variables)  # fails
# ValueError: Optimizer variable m/iteration_568 has shape () not compatible with provided weight shape (1, 1)

Maybe that's an unrelated bug? I would expect this roundtrip to work

adriangb · 2022-11-18T07:11:34Z

Here's the current code: https://github.com/adriangb/scikeras/blob/master/scikeras/_saving_utils.py

It's the only reliable way that I know of to serialize optimizers with state like Adam.

adriangb · 2022-11-18T07:12:27Z

at the time then issue got created the new optimizer was still in experimental namespace

I reported the issue because I test with tf-nightly precisely so I can catch these sorts of breaking changes before they are released

chenmoneygithub · 2022-11-18T07:13:21Z

There is not set_weights method. The current workaround is to loop over the variable list and set each variable individually.

Some more context - we no longer keep the set_weights and get_weights because optimizer variables are now stored as common TF variables, so the checkpoint saving/restoring is the same as Keras layers. Can you try using the same code for serializing Keras layer on optimizer? Theoretically it should work.

adriangb · 2022-11-18T07:14:41Z

set_weights() is definitely there: https://github.com/keras-team/keras/blob/e6784e4302c7b8cd116b74a784f4b78d60e83c26/keras/optimizers/optimizer_experimental/optimizer.py#L777

(from the 2.11.0 tag)

adriangb · 2022-11-18T07:16:43Z

the checkpoint saving/restoring is the same as Keras layers. Can you try using the same code for serializing Keras layer on optimizer? Theoretically it should work.

That's good news! It may "just work" then and I can remove these workarounds. Let me try.

adriangb · 2022-11-18T07:23:18Z

Exciting news! I think all of the bugs with optimizer serialization are fixed. So I think all of the following tickets are resolved:
keras-team/keras#15661
tensorflow/tensorflow#44670
#70

I'll update SciKeras and run tests in CI to confirm

adriangb · 2022-11-18T07:37:16Z

Nope, I got my hopes up too soon. The bug reported in tensorflow/tensorflow#44670 is still there.

So yeah @chenmoneygithub can you think of a way to serialize and deserialize stateful optimizers in 2.11.0?

chenmoneygithub · 2022-11-18T19:34:01Z

You need to call load_weights() to restore the weights, otherwise it will be lazily loaded to my knowledge.

For your code snippet, you want to do new.load_weights("model") after the load_model() call.

adriangb · 2022-11-18T19:44:05Z

I think Model.save_weights() and Model.load_weights() deal with the model weights, not the optimizer variables/state/weights. Also Model.save() and load_model() do save/load the model weights (but not the optimizer state/variables/weights).

If I'm missing something, maybe you can give me a self-contained example where a model is saved and re-loaded preserving the optimizer state?

chenmoneygithub · 2022-11-18T20:14:07Z

import tensorflow as tf

model = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(1,)),
        tf.keras.layers.Dense(1, activation="softmax"),
    ]
)
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.fit([[1]], [0])

model.save("model")
new = tf.keras.models.load_model("model")
new.load_weights("model")

It loads the optimizer state. You cannot do the length equal assertion because iteration somehow is no longer a variable right after restoring, but you can still access it with the right value by new.optimizer.iterations.

adriangb · 2022-11-18T20:23:34Z

Here's what I'm getting:

import tensorflow as tf

model = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(1,)),
        tf.keras.layers.Dense(1, activation="softmax"),
    ]
)
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.fit([[1]], [0])

model.save("model")
new = tf.keras.models.load_model("model")
new.load_weights("model")

print([v.name for v in model.optimizer.variables()])  # ['iteration:0', 'Adam/m/dense/kernel:0', 'Adam/v/dense/kernel:0', 'Adam/m/dense/bias:0', 'Adam/v/dense/bias:0']
print([v.name for v in new.optimizer.variables()])  # ['iteration:0']

My understanding is that iteration corresponds to the first variable (of shape ()).
The issue here is that the rest of the variables are not restored at all. Notice how model.optimizer.variables() has 5 variables while new.optimizer.variables() has just 1 variable (iteration). This means that the optimizer state/weights/variables were not restored and attempting to resume training would result in no errors but probably really bad or at least unexpected results.

So I think that snipped you posted does not work.

adriangb · 2022-11-29T01:29:33Z

@chenmoneygithub looping back here. Am I missing something or does your suggestion indeed not work? Thanks

chenmoneygithub · 2022-11-29T01:34:38Z

@adriangb What's your TF and Keras version? The snippet works as expected on my testing.

adriangb · 2022-11-29T01:41:08Z

https://colab.research.google.com/drive/1p9XOAE9SwU3ZATKVHmzxWIDHK1BGF7S3?usp=sharing

This notebook confirms my results. The TF and Keras versions are printed out as well (2.11.0 for both).

Is there something I'm missing or wrong with this notebook?

chenmoneygithub · 2022-11-29T04:33:43Z

In 2.11 the optimizer does lazy loading, if you want to explicitly restore the variable values, you need to call optimizer.build(model.trainable_variables), which is automatically called at the first time of updating variable value.

A little more context - Keras team made a new-version optimizer, and is available via tf.keras.optimizers.experimental.XXX in 2.9/2.10 release, and we have made that default in 2.11. The legacy optimizer is moved under legacy namesapce. Please see more details here: https://github.com/tensorflow/tensorflow/releases

For serialization/deserialization purpose, I don't know what the current approach is. One potential solution I am thinking about is to explicitly call optimizer.variables to get all variables during serialization, and set variables one by one during deserialization. Could you point me to the code in SciKeras that does the work? I can try taking a closer look, thanks!

chenmoneygithub · 2022-11-29T04:57:55Z

import tensorflow as tf

print(tf.__version__)
print(tf.keras.__version__)

model = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(1,)),
        tf.keras.layers.Dense(1, activation="softmax"),
    ]
)
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.fit([[1]], [0], verbose=0)

model.save("model")
new = tf.keras.models.load_model("model")
new.load_weights("model")

new.optimizer.build(model.trainable_variables)

print([v.name for v in model.optimizer.variables()])  # ['iteration:0', 'Adam/m/dense/kernel:0', 'Adam/v/dense/kernel:0', 'Adam/m/dense/bias:0', 'Adam/v/dense/bias:0']
print([v.name for v in new.optimizer.variables()])  # ['iteration:0', 'm/dense_2/kernel:0', 'v/dense_2/kernel:0', 'm/dense_2/bias:0', 'v/dense_2/bias:0']

Please check if this works for you, thx!

adriangb · 2022-11-29T08:41:13Z

Yes, I think that works! I’ll give it some more in depth testing and confirm. Thank you for your help.

… Keras team has temporarily broken cross-validation in their current release's optimizers (https://github.com/keras-team/keras/issues/16983). Open question: why are the CV results for the neural network model consistently 1.5-2x better than the single-random-test results?

ValerioSpenn · 2024-09-02T16:19:12Z

Hello everyone, i made some tries. (keras and tf ==2.15)

import tensorflow as tf
import pickle
import numpy as np
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

# https://github.com/keras-team/tf-keras/issues/442

# Prepare MNIST DATASET
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(-1, 28*28).astype('float32') / 255.0
x_test = x_test.reshape(-1, 28*28).astype('float32') / 255.0
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)


def create_model():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28*28,)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax'),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    return model

def save_model_and_optimizer(model, save_path):
    model.save(save_path)
    optimizer_weights = [v.numpy() for v in model.optimizer.variables()]
    with open(f"{save_path}_optimizer.pkl", "wb") as f:
        pickle.dump(optimizer_weights, f)
    print(f"Model and optimizer state saved to {save_path} and {save_path}_optimizer.pkl")

def restore_model_and_optimizer(model, restore_path):
    model = tf.keras.models.load_model(restore_path)
    with open(f"{restore_path}_optimizer.pkl", "rb") as f:
        optimizer_weights = pickle.load(f)
    model.optimizer.build(model.trainable_variables)
    for var, weight in zip(model.optimizer.variables(), optimizer_weights):
        var.assign(weight)
    print(f"Model and optimizer state restored from {restore_path} and {restore_path}_optimizer.pkl")
    return model

import numpy as np

def compare_optimizer_variables(model, restored_model):
    # Get optimizer variables from both models
    original_vars = model.optimizer.variables()
    restored_vars = restored_model.optimizer.variables()
    
    # Define how many variables to compare
    num_vars_to_compare = min(3, len(original_vars), len(restored_vars))
    
    for i in range(num_vars_to_compare):
        orig = original_vars[i]
        rest = restored_vars[i]
        
        # Print variable details
        print(f"Original variable: {orig.name}, shape: {orig.shape}, dtype: {orig.dtype}, numpy: {orig.numpy()}")
        print(f"Restored variable: {rest.name}, shape: {rest.shape}, dtype: {rest.dtype}, numpy: {rest.numpy()}")
        
        # Check for mismatches
        if not np.allclose(orig.numpy(), rest.numpy()):
            print(f"Mismatch found in {orig.name}")
        else:
            print(f"{orig.name} matches.")
    
    print("Optimizer variables comparison done.")

model = create_model()
history = model.fit(x_train, y_train, epochs=5, validation_split=0.2, verbose=1)

save_model_and_optimizer(model, 'model_checkpoint')

final_loss_original = history.history['loss'][-1]
final_accuracy_original = history.history['accuracy'][-1]
print(f"Final loss after initial training: {final_loss_original}")
print(f"Final accuracy after initial training: {final_accuracy_original}")

new_model = create_model()
new_model = restore_model_and_optimizer(new_model, 'model_checkpoint')

compare_optimizer_variables(model, new_model)

new_history = new_model.fit(x_train, y_train, epochs=5, validation_split=0.2, verbose=1, initial_epoch=0)

initial_loss_restored_model = new_history.history['loss'][0]
initial_accuracy_restored_model = new_history.history['accuracy'][0]
print(f"Initial loss after resuming training: {initial_loss_restored_model}")
print(f"Initial accuracy after resuming training: {initial_accuracy_restored_model}")

Actually i think that i made a comparison between LAST LOSS of the first model and the loss at the end of the first epoch of the second model. They are similar. It's seems that tha training is continuing :)

Epoch 1/5
1500/1500 [==============================] - 13s 8ms/step - loss: 0.2866 - accuracy: 0.9175 - val_loss: 0.1617 - val_accuracy: 0.9527
Epoch 2/5
1500/1500 [==============================] - 11s 8ms/step - loss: 0.1294 - accuracy: 0.9619 - val_loss: 0.1105 - val_accuracy: 0.9681
Epoch 3/5
1500/1500 [==============================] - 11s 7ms/step - loss: 0.0885 - accuracy: 0.9739 - val_loss: 0.1074 - val_accuracy: 0.9669
Epoch 4/5
1500/1500 [==============================] - 11s 7ms/step - loss: 0.0644 - accuracy: 0.9804 - val_loss: 0.0947 - val_accuracy: 0.9730
Epoch 5/5
1500/1500 [==============================] - 11s 7ms/step - loss: 0.0489 - accuracy: 0.9850 - val_loss: 0.0924 - val_accuracy: 0.9730
INFO:tensorflow:Assets written to: model_checkpoint/assets
INFO:tensorflow:Assets written to: model_checkpoint/assets
Model and optimizer state saved to model_checkpoint and model_checkpoint_optimizer.pkl
Final loss after initial training: 0.048920888453722
Final accuracy after initial training: 0.9850000143051147
Model and optimizer state restored from model_checkpoint and model_checkpoint_optimizer.pkl
Original variable: iteration:0, shape: (), dtype: <dtype: 'int64'>, numpy: 7500
Restored variable: iteration:0, shape: (), dtype: <dtype: 'int64'>, numpy: 7500
iteration:0 matches.
Original variable: Adam/m/dense_10/kernel:0, shape: (784, 128), dtype: <dtype: 'float32'>, numpy: [[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]
Restored variable: m/dense_10/kernel:0, shape: (784, 128), dtype: <dtype: 'float32'>, numpy: [[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]
Adam/m/dense_10/kernel:0 matches.
Original variable: Adam/v/dense_10/kernel:0, shape: (784, 128), dtype: <dtype: 'float32'>, numpy: [[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]
Restored variable: v/dense_10/kernel:0, shape: (784, 128), dtype: <dtype: 'float32'>, numpy: [[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]
Adam/v/dense_10/kernel:0 matches.
Optimizer variables comparison done.
Epoch 1/5
1500/1500 [==============================] - 12s 8ms/step - loss: 0.0386 - accuracy: 0.9883 - val_loss: 0.0894 - val_accuracy: 0.9749
Epoch 2/5
1500/1500 [==============================] - 12s 8ms/step - loss: 0.0288 - accuracy: 0.9916 - val_loss: 0.0910 - val_accuracy: 0.9758
Epoch 3/5
1500/1500 [==============================] - 11s 7ms/step - loss: 0.0242 - accuracy: 0.9924 - val_loss: 0.0984 - val_accuracy: 0.9731
Epoch 4/5
1500/1500 [==============================] - 10s 7ms/step - loss: 0.0195 - accuracy: 0.9940 - val_loss: 0.0918 - val_accuracy: 0.9735
Epoch 5/5
1500/1500 [==============================] - 11s 7ms/step - loss: 0.0161 - accuracy: 0.9951 - val_loss: 0.0973 - val_accuracy: 0.9756
Initial loss after resuming training: 0.03859472647309303
Initial accuracy after resuming training: 0.9883124828338623

federicocaroli · 2024-09-13T08:41:07Z

@ValerioSpenn Hi Valerio, is it expected that the compared variables are only full of zeros? It seems to me that the comparison mechanism is missing some variables.

google-ml-butler bot assigned tilakrayal Sep 4, 2022

adriangb changed the title ~~Optimizer.get_weights() removed~~ Unexpected breaking change: Optimizer.get_weights() removed Sep 4, 2022

tilakrayal assigned gowthamkpr and unassigned tilakrayal Sep 5, 2022

adriangb mentioned this issue Sep 6, 2022

BUG: Keras SaveModel does not properly save optimizer state tensorflow/tensorflow#44670

Open

gowthamkpr added the keras-team-review-pending label Sep 8, 2022

mattdangerw assigned chenmoneygithub and unassigned gowthamkpr Sep 8, 2022

mattdangerw removed the keras-team-review-pending label Sep 8, 2022

adriangb referenced this issue in adriangb/scikeras Oct 14, 2022

Disable testing on TF nightly

e950571

Until https://github.com/keras-team/keras/issues/16983 gets resolved

adriangb mentioned this issue Oct 14, 2022

Disable testing on TF nightly adriangb/scikeras#286

Merged

adriangb referenced this issue in adriangb/scikeras Oct 14, 2022

Disable testing on TF nightly (#286)

eb88be8

Until https://github.com/keras-team/keras/issues/16983 gets resolved

owenvallis mentioned this issue Dec 2, 2022

Not benefiting from checkpointing tensorflow/similarity#297

Open

chrislrobert mentioned this issue Dec 6, 2022

For the neural network model, shifted to a legacy optimizer since the… higherbar-ai/ml4qc#24

Merged

hjalmarheld mentioned this issue Jan 12, 2023

Tensorflow 2.11 update breaks use of Keras optimizers, quick fix included. hansbuehler/deephedging#2

Open

sgwhat mentioned this issue Feb 3, 2023

Update usage of deprecated API (get_weights) in TensorFlow 2 Optimizer. intel-analytics/ipex-llm#7319

Open

mansishr mentioned this issue Mar 9, 2023

Edits to TF 3D Unet BraTS workspace securefederatedai/openfl#757

Open

mrdbourke mentioned this issue May 12, 2023

Notebook 05: load_weights results in "Incompatible tensor with shape (1280, 10)..." mrdbourke/tensorflow-deep-learning#544

Open

fchollet transferred this issue from keras-team/keras Sep 22, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unexpected breaking change: Optimizer.get_weights() removed #442

Unexpected breaking change: Optimizer.get_weights() removed #442

adriangb commented Sep 4, 2022

tilakrayal commented Sep 5, 2022

mattdangerw commented Sep 8, 2022

chenmoneygithub commented Sep 8, 2022

adriangb commented Sep 8, 2022

adriangb commented Oct 14, 2022

adriangb commented Nov 18, 2022

chenmoneygithub commented Nov 18, 2022

adriangb commented Nov 18, 2022

adriangb commented Nov 18, 2022

adriangb commented Nov 18, 2022

chenmoneygithub commented Nov 18, 2022

adriangb commented Nov 18, 2022

adriangb commented Nov 18, 2022

adriangb commented Nov 18, 2022

adriangb commented Nov 18, 2022

chenmoneygithub commented Nov 18, 2022

adriangb commented Nov 18, 2022

chenmoneygithub commented Nov 18, 2022

adriangb commented Nov 18, 2022 •

edited

Loading

adriangb commented Nov 29, 2022

chenmoneygithub commented Nov 29, 2022

adriangb commented Nov 29, 2022 •

edited

Loading

chenmoneygithub commented Nov 29, 2022 •

edited

Loading

chenmoneygithub commented Nov 29, 2022

adriangb commented Nov 29, 2022

ValerioSpenn commented Sep 2, 2024 •

edited

Loading

federicocaroli commented Sep 13, 2024

Unexpected breaking change: Optimizer.get_weights() removed #442

Unexpected breaking change: Optimizer.get_weights() removed #442

Comments

adriangb commented Sep 4, 2022

tilakrayal commented Sep 5, 2022

mattdangerw commented Sep 8, 2022

chenmoneygithub commented Sep 8, 2022

adriangb commented Sep 8, 2022

adriangb commented Oct 14, 2022

adriangb commented Nov 18, 2022

chenmoneygithub commented Nov 18, 2022

adriangb commented Nov 18, 2022

adriangb commented Nov 18, 2022

adriangb commented Nov 18, 2022

chenmoneygithub commented Nov 18, 2022

adriangb commented Nov 18, 2022

adriangb commented Nov 18, 2022

adriangb commented Nov 18, 2022

adriangb commented Nov 18, 2022

chenmoneygithub commented Nov 18, 2022

adriangb commented Nov 18, 2022

chenmoneygithub commented Nov 18, 2022

adriangb commented Nov 18, 2022 • edited Loading

adriangb commented Nov 29, 2022

chenmoneygithub commented Nov 29, 2022

adriangb commented Nov 29, 2022 • edited Loading

chenmoneygithub commented Nov 29, 2022 • edited Loading

chenmoneygithub commented Nov 29, 2022

adriangb commented Nov 29, 2022

ValerioSpenn commented Sep 2, 2024 • edited Loading

federicocaroli commented Sep 13, 2024

adriangb commented Nov 18, 2022 •

edited

Loading

adriangb commented Nov 29, 2022 •

edited

Loading

chenmoneygithub commented Nov 29, 2022 •

edited

Loading

ValerioSpenn commented Sep 2, 2024 •

edited

Loading