CVAT optimizations #1944

Merged
merged 14 commits into from
Aug 18, 2022

Conversation

ehofesmann
Member

In a few places, the CVAT integration scaled poorly with the number of tasks on the CVAT server and the number of annotation runs on a dataset. This PR implements optimizations to address these issues.
Specifically:

  1. When loading annotations, all existing tasks were listed just to determine whether a given task ID exists. This can take a few minutes if there are thousands of tasks on the CVAT server, which is much longer than any other operation for a small annotation run on only a few samples.
  2. Operations on the CVATAnnotationResults such as get_status() previously connected to the API and closed the connection on every call. Now you can pass in an API object that will be used directly, so you no longer need to authenticate your CVAT credentials every time you call get_status(), which adds up if there are hundreds of annotation runs on a dataset.
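
For illustration, here is a minimal, self-contained sketch of both patterns. It does not use the FiftyOne or CVAT APIs: `FakeServer`, `FakeAPI`, `FakeResults`, and all of their methods are hypothetical stand-ins invented for this example, and only the optional `api` argument to `get_status()` mirrors the idea described above.

```python
class FakeServer:
    """Hypothetical stand-in for a CVAT server that counts expensive calls."""

    def __init__(self, task_ids):
        self.task_ids = set(task_ids)
        self.logins = 0
        self.list_calls = 0


class FakeAPI:
    """Hypothetical API client that authenticates once when created."""

    def __init__(self, server):
        self.server = server
        self.server.logins += 1  # one authentication per connection
        self.closed = False

    def list_task_ids(self):
        # Expensive pattern: enumerate every task on the server
        self.server.list_calls += 1
        return sorted(self.server.task_ids)

    def task_exists(self, task_id):
        # Cheap pattern: look up a single task directly
        return task_id in self.server.task_ids

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()


class FakeResults:
    """Hypothetical results object for one annotation run."""

    def __init__(self, server, task_id):
        self.server = server
        self.task_id = task_id

    def get_status(self, api=None):
        # Only open (and later close) a connection if none was provided
        owns_api = api is None
        if owns_api:
            api = FakeAPI(self.server)
        try:
            return "exists" if api.task_exists(self.task_id) else "missing"
        finally:
            if owns_api:
                api.close()


server = FakeServer(range(1000))
runs = [FakeResults(server, task_id) for task_id in (3, 5, 2000)]

# Without reuse: every call opens its own connection
statuses_without_reuse = [r.get_status() for r in runs]
logins_without_reuse = server.logins

# With reuse: one connection is shared by every call and closed at the end
with FakeAPI(server) as api:
    statuses_with_reuse = [r.get_status(api=api) for r in runs]
    found_by_listing = 5 in api.list_task_ids()  # old-style existence check
    found_directly = api.task_exists(5)  # direct existence check
logins_with_reuse = server.logins - logins_without_reuse
```

In this toy setup both approaches return the same statuses, but the shared connection authenticates once rather than once per annotation run, and the direct lookup avoids enumerating every task.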

@ehofesmann ehofesmann added enhancement Code enhancement annotation Issues related to FiftyOne's annotation API labels Jul 14, 2022
@ehofesmann ehofesmann requested a review from a team July 14, 2022 20:20
@ehofesmann ehofesmann self-assigned this Jul 14, 2022
Two review comments on docs/source/integrations/cvat.rst (outdated, resolved)
@brimoor
Contributor

brimoor commented Jul 21, 2022

@ehofesmann can we get this merged this week?

@ehofesmann ehofesmann requested a review from brimoor August 5, 2022 21:04
@brimoor
Contributor

brimoor commented Aug 11, 2022

@ehofesmann LMK what you think about adding #1997 to this PR before merging it 📈

Contributor

@brimoor brimoor left a comment

LGTM

@ehofesmann I recall you mentioning that tests might need to be updated? Just a reminder in case you're not ready to merge yet

@ehofesmann
Member Author

LGTM

@ehofesmann I recall you mentioning that tests might need to be updated? Just a reminder in case you're not ready to merge yet

I looked into it more and the existing tests were sufficient, thanks for checking on that though!

@ehofesmann ehofesmann merged commit 44fcf0d into develop Aug 18, 2022
@ehofesmann ehofesmann deleted the feature/cvat-optimize branch August 18, 2022 17:53
kaixi-wang pushed a commit that referenced this pull request May 11, 2023
* bson fix

* dataset 404s

* fix selection css and setting

* bumps

* release notes

* bump teams app

* 404 fix

* bump os compat

* Fix custom_parser implementation (get_label should return an instance of the label_cls)

* bump helm charts for 0.1.7 (#79)

* more get_field() fixes

* adding support for serializing aggregations

* adding serialization tests for aggregations

* linting

* adding __eq__ methods

* linting

* more unit testy

* bug fixes

* adding set_values() and exclude_fields() tests

* adding get_field() tests

* adding label ID tests

* reverting to v0.15.1 implementation

* fixing #1893

* id updates

* adding id back to repr

* fixing more ID bugs

* adding rel_dir option

* finishing implementation

* Fix bug when loading group ids in CVAT video tasks (#1917)

* catch tags in app aggregation data flow (#1924)

* adding weighted_sample() and balanced_sample() utils

* moving random_split() to random module

* updating docstrings

* bug

* adding new_ids option to add_collection()

* adding tests for add_collection()

* Resolve bug when uploading to projects in CVAT (#1926)

* proper handling of db fields in mixin methods

* proper handling of db fields in view stages/saving

* fixing clips media type bug

* removing duplicate method definition

* fixing edge case and adding tests

* fixing typo

* view refresh fix

* fix result alignment

* adding quantile aggregation

* adding unit tests for quantile aggregation

* bug fix

* handling string vs numeric

* adding bad values test

* adding quantiles to docs

* implementing kwargs

* relaxing numpy version requirement

* updating nan tests

* supporting serializing and deserializing of arbitrary ObjectId data

* handling SON serialization

* adding stats() method to sample collections

* move all auth0 Teams App configuration to env config

* add targets conversion to graphql datasets (#1943)

* use correct dict default

* fixing bug

* db field is no longer necessary

* fixing one part of #1945

* fixing #1945

* adding object ID tests

* removing default_classes usage

* removing outdated syntax

* adding missing docs

* raising informative error when trying to combine splits

* fixing frames bug

* testing ID fields/db fields

* make ObjectIdField handle str/ObjectId conversions

* validating sample_id field

* documenting stats() method

* tweaking langugage

* more robust implementation

* optimizing convert datasets method

* adding test to cover expected ObjectId behavior

* fixing ObjectId bugs in to_dict()/from_dict()

* unnecessary

* fixing set indexing bug

* unique

* only use patches selection mode for spatial label fields

* fixing bool bugs

* adding bool unit test

* adding numpy bool support

* added field name validation

* finishing implementation of ObjectIdField handling for DatasetViews

* fixing bug

* handling edge case

* proper handling of private fields when serializing collections

* adding dataset serialization tests

* updating unit test

* adding dynamic/BSON field tests

* tweaking todos

* fixing tests

* adding persistent option to clone()

* clarifying that add field methods are idempotent

* production defaults

* be more explicit

* removes helm chart from fiftyone-teams repo(#81)

* helm charts moved to fiftyone-teams-app-deploy repo

* removes deployment/helm from .gitignore

* removes helm chart from fiftyone-teams repo(#81)

* helm charts moved to fiftyone-teams-app-deploy repo

* removes deployment/helm from .gitignore

* adding missing brain call

* fix session views

* gracefully casting to numpy array

* updating documentation

* lifting plotly<5 requirement

* adding missing brain call

* update tabulate to 0.8.10 for source install

* Added clarification to tie attribute argument to label_schema (#1973)

* trying looser requirements

* making tags optional

* fixing bug

* documenting explanation of #1748 in all relevant places

* passing attributes through during label coercions

* linting

* fixing test

* adding field_names, iter_fields(), and merge() to SerializableDocument

* CVAT optimizations (#1944)

* avoid listing tasks at download time

* pass api into results methods

* document api optimization

* lint

* lint pro tip

* add api close context

* refactoring connect_to_api() into an interface concept

* refactoring connect_to_api() into an interface concept

* updating documentation

* cleanup

* moving context manager to AnnotationAPI class

Co-authored-by: brimoor <[email protected]>
Co-authored-by: Brian Moore <[email protected]>

* noting that dataset-level metadata is excluded

* Label studio integration (#1848)

* add label studio integration for classification, detection and polyline

* handle export and import of other label types

* upload predictions to label studio for classification

* upload predictions for all label types

* update labelstudio docstrings and formatting

* store existing labels in id_map

* read api key from env or prompt, small fixes

* add supports attrs property

* add label studio integration for classification, detection and polyline

* handle export and import of other label types

* upload predictions to label studio for classification

* upload predictions for all label types

* update labelstudio docstrings and formatting

* store existing labels in id_map

* read api key from env or prompt, small fixes

* raise error if label studio version is below 1.5,
prompt for a specific label studio sdk version if not installed

* add warning about ignoring attributes

* add labelstudio docs

* fix tests

Co-authored-by: Eric Hofesmann <[email protected]>

* adding support for writing transformed images to a new location

* adding support for writing transformed videos to a new location

* adding unique filename utils

* updating CLI

* Label studio updates (#2006)

* add label studio integration for classification, detection and polyline

* handle export and import of other label types

* upload predictions to label studio for classification

* upload predictions for all label types

* update labelstudio docstrings and formatting

* store existing labels in id_map

* read api key from env or prompt, small fixes

* add supports attrs property

* add label studio integration for classification, detection and polyline

* handle export and import of other label types

* upload predictions to label studio for classification

* upload predictions for all label types

* update labelstudio docstrings and formatting

* store existing labels in id_map

* read api key from env or prompt, small fixes

* raise error if label studio version is below 1.5,
prompt for a specific label studio sdk version if not installed

* add warning about ignoring attributes

* add labelstudio docs

* fix tests

* update docstrings

* add supports video flag to annotation backend

* update connect_to_api

* parse config parameters for labelstudio tests

* track label ids to merge properly

* support unsubmitted tasks

* note supported scalar types

* currently doesnt support classifications plural

* linting

* fixing typos

* linting docs

* linting

* one more linting pass

* final pass

* fix labelstudio tests

Co-authored-by: rusteam <[email protected]>
Co-authored-by: brimoor <[email protected]>

* bumps, release notes

* add to release notes

* linting

* converting to public SaveContext class

* adding missing _save_replacements implementation

* adding public save_context() method

* adding logging configuration

* documenting

* finishing work

* generated views don't support save contexts

* setting default behavior

* compute ops in real-time

* adding ipywidgets<8 requirement

* documenting save contexts

* refactoring into a deferred=True option

* fixing docs warnings

* adding unit tests

* unnecessary

* adding more examples, standardizing default batch_size logic

* adding compatible DB versions

* using client version terminology

* making DatasetDocument dynamic

* optimizing get dataset version

* read path variable from dataset.yaml

* update docs for path in dataset.yaml

* tweaking docs

* tweaks

* avoid deserializing extra fields

* updating unit test

* using strict=False in more places

* more non-strict

* showing available logging levels

* adding normpath

* updating compatibility version

* documenting compatible versions

* fixing LegacyFiftyOneDataset import bug

* removes artifacts I should not have added (#88)

* clarifying

* oops

* updating pkg versions

* removing unnecessary pull

* removes helm chart from fiftyone-teams repo(#81)

* helm charts moved to fiftyone-teams-app-deploy repo

* removes deployment/helm from .gitignore

* removes artifacts I should not have added (#88)

* remove editable flag

* move App build to end

* updating release notes

* always migrate when user is admin

* moving legacy troubleshooting to another page

* adding docs on coordinating migrations

* adding helpful error when a dataset fails to load

* pinning max requirement

* using more cloud-friendly layout

* cloud-friendly updates

* adding upgrade note

* reverting premature version changes

* packages bumps

Co-authored-by: Benjamin Kane <[email protected]>
Co-authored-by: brimoor <[email protected]>
Co-authored-by: Brian Moore <[email protected]>
Co-authored-by: idow09 <[email protected]>
Co-authored-by: Eric Hofesmann <[email protected]>
Co-authored-by: imanjra <[email protected]>
Co-authored-by: Geoffrey Keating <[email protected]>
Co-authored-by: Rustem Galiullin <[email protected]>
Co-authored-by: Victor Oancea <[email protected]>
Labels
annotation Issues related to FiftyOne's annotation API enhancement Code enhancement
2 participants