Unable to classify with trained model #11

JoejynWan · 2020-05-18T09:46:39Z

Thank you for your previous help with training the model on a GPU. After training a new model (ResNet-34) on my images, I tried to classify another set of images with that model and got a long list of errors. The main error that stood out was:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint.

What may be causing this? I have tried keeping only the most recent snapshot in the training folder, but the error persists. I would also like to note that after the new model was trained, the predictions.csv file was not saved anywhere on the system, even after I tried defining an absolute path for save_predictions. I am not sure whether this is a result of the do_evaluate function not running properly during train(), or whether it's a Windows directory path problem.

For extra information that might be helpful, these are the inputs/args of my classify() run:

Namespace(LR_details='19, 30, 44, 53, 0.01, 0.005, 0.001, 0.0005, 0.0001', LR_policy='piecewise_linear', WD_details='30, 0.0005, 0.0', WD_policy='piecewise_linear', architecture='resnet', batch_size=128, chunked_batch_size=128, command='eval', delimiter=',', depth=34, log_debug_info=False, log_device_placement=False, log_dir='C:/Users/jmwan/Documents/MLWIC2/MLWIC2_helper_files/resnet_Run-18-05-2020_14-42-19', max_to_keep=5, num_batches=-1, num_classes=1000, num_epochs=55, num_gpus=1, num_prefetch=2000, num_threads=1, optimizer='momentum', path_prefix='C:/Users/jmwan/Documents/MLWIC2/Extracted-Images/n_train_test=8000-train_prop=0.5/testingset-n_test=4000', processed_size=[224, 224, 3], raw_size=[256, 256, 3], retrain_from=None, run_metadata=None, run_name='Run-18-05-2020_17-26-54', run_options=None, save_predictions='C:/Users/jmwan/Documents/MLWIC2/MLWIC2_helper_files//model_predictions.txt', shuffle=True, snapshot_prefix='C:/Users/jmwan/Documents/MLWIC2/MLWIC2_helper_files/resnet_Run-18-05-2020_14-42-19', top_n=2, train_info=None, transfer_mode=[0], val_info='C:/Users/jmwan/Documents/MLWIC2/Extracted-Images/n_train_test=8000-train_prop=0.5/MLWIC2_testing_datasheet-n_test=4000.csv')

Thank you!
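One way to investigate a "mismatch between the current graph and the graph from the checkpoint" is to compare the shapes stored in the checkpoint with the num_classes passed at classification time (1000 in the Namespace above). The sketch below is only an illustration under assumptions: the `find_class_mismatches` helper and the `"output"` layer-name hint are hypothetical, the example variable names and the 58-class figure are made up, and the real variable names in an MLWIC2 checkpoint may differ. The `(name, shape)` pairs are the format returned by `tf.train.list_variables`.

```python
def find_class_mismatches(variables, num_classes, layer_hint="output"):
    """Return (name, shape) pairs whose last dimension differs from num_classes.

    variables  : iterable of (name, shape) pairs, e.g. the result of
                 tf.train.list_variables(checkpoint_dir)
    layer_hint : substring assumed to identify the final layer (hypothetical;
                 inspect the real variable names first)
    """
    mismatches = []
    for name, shape in variables:
        shape = list(shape)
        if layer_hint in name and shape and shape[-1] != num_classes:
            mismatches.append((name, shape))
    return mismatches


# Made-up entries standing in for a real checkpoint listing.
example_variables = [
    ("resnet/conv1/weights", [7, 7, 3, 64]),
    ("resnet/output/weights", [512, 58]),
    ("resnet/output/biases", [58]),
]

# With a real checkpoint, the list could instead be obtained with:
#   import tensorflow as tf
#   example_variables = tf.train.list_variables(log_dir)
print(find_class_mismatches(example_variables, num_classes=1000))
```

If the final layer in the checkpoint has fewer outputs than the num_classes used for classification, the restore would be expected to fail in the way described above, although other causes (a different depth or architecture) are also possible.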


mikeyEcology · 2020-05-18T09:59:47Z

I changed something in the classify function that might help. Can you try re-installing the package and re-running classify with your trained model?

JoejynWan · 2020-05-18T10:36:32Z

Hi @mikeyEcology

Thank you so much, that fixed classify(). However, I don't think this would solve the missing predictions.csv during train(), since num_classes was defined during train().
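To rule out a path problem for the missing predictions file, a quick check of the output location may help. The sketch below is an assumption-laden illustration, not part of MLWIC2: the `check_output_path` helper is hypothetical. It collapses repeated slashes (the save_predictions value in the Namespace above contains `MLWIC2_helper_files//model_predictions.txt`, which Windows usually tolerates) and reports whether the parent directory exists and is writable.

```python
import os
from pathlib import Path, PureWindowsPath


def check_output_path(raw_path):
    """Return (normalized_path, parent_exists, parent_writable).

    Hypothetical helper: normalizes the path with Windows rules, then checks
    the parent directory on the machine where the script is run.
    """
    # PureWindowsPath collapses repeated separators on any platform;
    # as_posix() keeps forward slashes, matching the style used in the args.
    normalized = PureWindowsPath(raw_path).as_posix()
    parent = Path(normalized).parent
    parent_exists = parent.is_dir()
    parent_writable = parent_exists and os.access(parent, os.W_OK)
    return normalized, parent_exists, parent_writable


path_from_issue = (
    "C:/Users/jmwan/Documents/MLWIC2/MLWIC2_helper_files//model_predictions.txt"
)
print(check_output_path(path_from_issue))
```

If the parent directory exists and is writable but no file appears, the path itself is probably not the cause, and the evaluation step during train() would be the next thing to examine.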

mikeyEcology closed this as completed May 18, 2020

mikeyEcology reopened this May 18, 2020
