Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Issue #1903 path variable #2016

Merged
merged 2 commits into from
Aug 23, 2022
Merged

Conversation

victor1cea
Copy link
Contributor

What changes are proposed in this pull request?

Adds support for path variable in dataset.yaml for YOLOv5 datasets, as mentioned in Issue #1903.
The docs have been updated too.

How is this patch tested? If it is not, please explain why.

Created a dataset and tested the new possible file structures e.g. dataset.yaml located outside of the actual dataset directory.

Release Notes

Is this a user-facing change that should be mentioned in the release notes?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release
    notes for FiftyOne users.

(Details in 1-2 sentences. You can just refer to another PR with a description
if this PR is part of a larger change.)

What areas of FiftyOne does this PR affect?

  • App: FiftyOne application changes
  • Build: Build and test infrastructure changes
  • Core: Core fiftyone Python library changes
  • Documentation: FiftyOne documentation changes
  • Other

Copy link
Contributor

@brimoor brimoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks! 🚀

@brimoor brimoor merged commit 1885f27 into voxel51:develop Aug 23, 2022
@brimoor brimoor mentioned this pull request Aug 23, 2022
6 tasks
@alexk-ede
Copy link

I'm not sure if I should open a separate issue or not.

I updated to fiftyone==0.16.6
and
I'm basically exporting it via
split_view.export(
export_dir=export_dir,
dataset_type=dataset_type,
label_field=label_field,
split=split_out,
classes=final_labels_list)

(dataset_type is fo.types.COCODetectionDataset ofc)

But it's still not writing the path to the dataset.yaml file.

@victor1cea
Copy link
Contributor Author

@alexk-ede the changes I made are only for YOLOv5 datasets. Also, how are you using dataset.yaml with COCO datasets? because COCO uses .json files afaik.

@alexk-ede
Copy link

alexk-ede commented Aug 31, 2022

@victor1cea Oh, sry for the confusion, I copy pasted the wrong dataset_type.
I'm just getting some labels out of coco and oi6 and exporting them to yolov5.

So in my custom converter function it's currently set as a default argument dataset_type=fo.types.YOLOv5Dataset.
And yolov5 accepts the dataset only when I manually add the path variable. Even after the 0.16.6 update.
I'll create a minimal example to test this.

@alexk-ede
Copy link

alexk-ede commented Aug 31, 2022

So here is a minimal example (based on the yolov5 export example),
which highlights the problem (actually 2).

import fiftyone as fo
import fiftyone.zoo as foz
from fiftyone import ViewField as F

coco_classes = ["car"]
coco2017_dataset1 = foz.load_zoo_dataset(
    "coco-2017",
    classes=coco_classes,
)

classes_filter = coco_classes # using all

## adapted from the example https://voxel51.com/docs/fiftyone/user_guide/export_datasets.html#yolov5dataset
export_dir = "yolov5-dataset"
label_field = "ground_truth"  # for example

# The splits to export
splits = ["train", "val"]

# All splits must use the same classes list
classes = classes_filter

matching_view = coco2017_dataset1.filter_labels(label_field, F("label").is_in(classes_filter))

# The dataset or view to export
# We assume the dataset uses sample tags to encode the splits to export  
dataset_or_view = matching_view


# Export the splits
for split in splits:
    split_view = dataset_or_view.match_tags(split)
    split_view.export(
        export_dir=export_dir,
        dataset_type=fo.types.YOLOv5Dataset,
        label_field=label_field,
        split=split,
        classes=classes,
        export_media="symlink", # speedup
    )

This yields

 100% |█████████████| 12251/12251 [27.0s elapsed, 0s remaining, 455.3 samples/s]      
Directory 'yolov5-dataset' already exists; export will be merged with existing files
 100% |█████████████████████| 0/0 [86.6ms elapsed, ? remaining, ? samples/s] 

and a dataset.yaml with only

names:
- car
nc: 1
train: .\images\train\
val: .\images\val\

So 1. this still doesn't export the path, so this dataset just doesn't work with yolov5 unless it's placed in the correct folder where yolov5 expects it to be.
(and the second problem is that the example code doesn't properly export the val part of the dataset, bc it's usually called validation in the datasets, but it's unrelated to this issue)

@victor1cea
Copy link
Contributor Author

@alexk-ede Oh, I see now. The path can be read, but will never be exported. I could fix it this week.
Regarding the second problem, you could change the tags of the samples from the validation split like this:

for sample in dataset:
    if 'validation' in sample.tags:
        sample.tags = ['val']
        sample.save()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants