Adding new_ids=True option to Dataset.add_collection() #1927

Merged: 4 commits merged into develop from feature/add-collection on Aug 18, 2022

@brimoor commented Jul 6, 2022

Adds an optional new_ids parameter to Dataset.add_collection(). Passing new_ids=True generates new sample/frame IDs for the samples being added to the dataset.

This is useful, for example, to quickly grow a dataset exponentially for testing by repeatedly adding it to itself:

# Make dataset 2 ** 4 times larger
for _ in range(4):
    dataset.add_collection(dataset, new_ids=True)
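
To make the doubling arithmetic concrete, here is a minimal pure-Python sketch. `ToyDataset` is a hypothetical stand-in written for illustration, not FiftyOne's implementation: it assumes samples are plain dicts with an `id` key and uses `uuid4` in place of whatever ID scheme the real backend uses. Returning the added IDs mirrors how the examples in this PR assign the return value of `add_collection()`.

```python
import uuid


class ToyDataset:
    """Hypothetical stand-in for a dataset; NOT FiftyOne's implementation."""

    def __init__(self, filepaths):
        # Assumption: each sample is a dict carrying a unique "id"
        self.samples = [
            {"id": uuid.uuid4().hex, "filepath": path} for path in filepaths
        ]

    def __len__(self):
        return len(self.samples)

    def add_collection(self, other, new_ids=False):
        # Snapshot the incoming samples first, so that adding a dataset
        # to itself does not iterate over a list that is still growing
        incoming = list(other.samples)

        added_ids = []
        for sample in incoming:
            copy = dict(sample)
            if new_ids:
                # Give the copy a fresh ID so it cannot collide with the original
                copy["id"] = uuid.uuid4().hex

            self.samples.append(copy)
            added_ids.append(copy["id"])

        return added_ids


toy = ToyDataset(["a.jpg", "b.jpg", "c.jpg"])

# Each pass doubles the dataset, so 4 passes make it 2 ** 4 times larger
for _ in range(4):
    toy.add_collection(toy, new_ids=True)

num_samples = len(toy)
num_unique_ids = len({sample["id"] for sample in toy.samples})
```

In this toy model, three samples grow to 3 * 2 ** 4 = 48, and every one of them has a distinct ID because each copy was assigned a fresh one.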

Image example

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")
new_ids = dataset.add_collection(dataset, new_ids=True)

assert len(dataset) == 400
assert len(set(dataset.values("id"))) == 400

Video example

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart-video")
new_ids = dataset.add_collection(dataset, new_ids=True)

assert len(dataset) == 20
assert len(set(dataset.values("frames.id", unwind=True))) == 2558

@brimoor added the enhancement (Code enhancement) label Jul 6, 2022
@brimoor requested a review from a team July 6, 2022 02:00
@brimoor self-assigned this Jul 6, 2022

@manivoxel51 left a comment

nice work Brian! 💯

it mostly makes sense, especially after reading the tests

self.assertIsNone(dataset.last()["foo"])

@drop_datasets
def test_add_collection_new_ids(self):

nice tests 💥

@ehofesmann left a comment

LGTM, this would have definitely come in handy in the past when benchmarking datasets. Also love the tests!

@brimoor merged commit 75afffe into develop Aug 18, 2022
@brimoor deleted the feature/add-collection branch August 18, 2022 21:53