[BUG] Issue with ImageNet training set. #2428

Open
g12bftd opened this issue Jun 11, 2023 · 2 comments
Labels
bug Something isn't working

Comments

g12bftd commented Jun 11, 2023

🐛🐛 Bug Report

⚗️ Current Behavior

I tried using ImageNet-1k directly from Active Loop. Evaluating PyTorch's pre-trained ResNet-18 on the validation set gives 82% accuracy, which is far too high.

Input Code

  • REPL or Repo link if applicable:
import deeplake
import torch
import torchvision
from torchvision import transforms
from tqdm import tqdm

# Connect to the training and validation datasets
ds_train = deeplake.load("hub://activeloop/imagenet-train", token="my token")
ds_test = deeplake.load("hub://activeloop/imagenet-val", token="my token")

def convert_to_rgb(image):
    if image.mode != 'RGB':
        image = image.convert('RGB')
    return image


mean = (0.485, 0.456, 0.406)
std = (0.229, 0.224, 0.225)
tform = transforms.Compose(
    [
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.Lambda(convert_to_rgb),
        transforms.ToTensor(),
        transforms.Normalize(mean, std),
    ]
)

batch_size = 128

# Since torchvision transforms expect PIL images, we use the 'pil' decode_method for the 'images' tensor. This is much faster than running ToPILImage inside the transform
train_loader = ds_train.pytorch(
    num_workers=0,
    shuffle=True,
    transform={'images': tform, 'labels': None},
    batch_size=batch_size,
    decode_method={'images': 'pil'},
)
test_loader = ds_test.pytorch(
    num_workers=0,
    transform={'images': tform, 'labels': None},
    batch_size=batch_size,
    decode_method={'images': 'pil'},
)

model = torchvision.models.resnet18(weights="DEFAULT")
device = torch.device("cuda")  # Needs CUDA, don't bother on CPUs
model.to(device)
model.eval()
correct = 0
total = 0
with torch.no_grad():
    for x, y in tqdm(test_loader):
        y_pred = model(x.cuda())
        correct += (y_pred.argmax(axis=1) == y.cuda()).sum().item()
        total += len(y)
print(correct / total)
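
One thing that may be worth checking in the loop above is the shape of `y`. The sketch below is not part of the original report. It assumes that the loader yields labels with shape `(batch, 1)` rather than `(batch,)`, which nothing in this issue confirms. Under that assumption, `y_pred.argmax(axis=1) == y` would broadcast to a `(batch, batch)` matrix and count matches between different samples as well. The tensors are made-up values for illustration only.

```python
import torch

# Made-up predictions and labels for a batch of 4 (illustrative values only).
pred = torch.tensor([0, 1, 2, 2])            # shape (4,), like y_pred.argmax(axis=1)
labels = torch.tensor([[0], [1], [1], [2]])  # shape (4, 1), the ASSUMED loader output

# (4,) == (4, 1) broadcasts to (4, 4): every prediction is compared with every label.
broadcast_matches = (pred == labels).sum().item()

# Flattening the labels to shape (4,) restores the per-sample comparison.
per_sample_matches = (pred == labels.flatten()).sum().item()

print(broadcast_matches, per_sample_matches)  # 5 3
```

If printing `y.shape` inside the evaluation loop shows a trailing dimension of 1, flattening `y` before the comparison would be the first thing to try.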

Expected behavior/code
The ResNet-18 pre-trained model is taken directly from the PyTorch hub. The expected validation accuracy is 69.76% (and I verified this using the Kaggle version of ImageNet). Check this PyTorch link for evidence: https://pytorch.org/vision/main/models/generated/torchvision.models.resnet18.html.
Note: In my transforms, I include a "convert_to_rgb" transform because some of the images from the training and testing sets from the Active Loop hub are grayscale.
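
As a rough sanity check on the two accuracy figures, the back-of-the-envelope sketch below estimates what `correct / total` would report if each prediction were compared against every label in its batch, as would happen if `y` had shape `(batch, 1)` and the `==` in the loop broadcast to `(batch, batch)`. This rests on assumptions the issue does not confirm: the label shape itself, equally frequent classes in the validation set, and batches that are not grouped by class. `inflated_accuracy` is a hypothetical helper, not part of any library.

```python
def inflated_accuracy(true_acc: float, batch_size: int, num_classes: int) -> float:
    """Expected value of correct / total when each prediction is compared
    against every label in its batch instead of only its own label.

    Assumes uniformly distributed labels that are independent of the
    predictions made for the other samples in the batch.
    """
    # Each prediction meets (batch_size - 1) labels of other samples, and each
    # of those matches by chance with probability 1 / num_classes.
    spurious_per_sample = (batch_size - 1) / num_classes
    return true_acc + spurious_per_sample


# Values from the report: 69.76% reference accuracy, batch_size = 128,
# and the 1000 classes of ImageNet-1k.
estimate = inflated_accuracy(0.6976, 128, 1000)
print(f"{estimate:.4f}")  # 0.8246
```

The estimate lands close to the 82% reported here. That would be consistent with a label-shape mismatch in the evaluation loop rather than a problem with the dataset, though it is not proof of one.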

@g12bftd g12bftd added the bug Something isn't working label Jun 11, 2023
@pranith7

Hey @g12bftd, I want to work on this issue.

g12bftd commented Aug 11, 2023

> Hey @g12bftd, I want to work on this issue.

Hey @pranith7, thank you! Please do try to replicate my code and results. Let me know if you find a solution, or whether there was a mistake on my end.
