filter fields by description or info #2898

Merged
nebulae merged 22 commits into develop from feature/2804-filter-fields-by-description on May 8, 2023

@nebulae nebulae commented Apr 13, 2023

What changes are proposed in this pull request?

Add a meta_filter parameter to select_fields.

How is this patch tested? If it is not, please explain why.

Added unit tests and tested manually.

Release Notes

Is this a user-facing change that should be mentioned in the release notes?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release
    notes for FiftyOne users.

An optional meta_filter parameter has been added to dataset.select_fields. It accepts a string or a dict and restricts the selection to fields whose description or info matches the filter.

To select only fields whose description contains the string "my description", use meta_filter="my description". To select only fields that have a specific key/value pair in their info, use meta_filter=dict(key="value").
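The matching behavior described above can be sketched in plain Python. This is only an illustration of the release note, not FiftyOne's implementation: the `matches_meta_filter` and `select` helpers and the `fields` dictionary are hypothetical stand-ins, and the exact matching rules (substring match for strings, exact key/value match against info for dicts) are assumptions.

```python
from typing import Any, Dict, List, Union

MetaFilter = Union[str, Dict[str, Any]]


def matches_meta_filter(
    description: str, info: Dict[str, Any], meta_filter: MetaFilter
) -> bool:
    """Hypothetical matcher; an assumption, not FiftyOne's actual logic."""
    if isinstance(meta_filter, str):
        # String filter: substring match against the description and info values
        haystack = [description or ""] + [str(v) for v in (info or {}).values()]
        return any(meta_filter in text for text in haystack)

    # Dict filter: every key/value pair must be present in the field's info
    return all((info or {}).get(k) == v for k, v in meta_filter.items())


# Stand-in for field metadata; real metadata lives on the dataset's fields
fields = {
    "ground_truth": {
        "description": "my description of the labels",
        "info": {"owner": "alice"},
    },
    "predictions": {
        "description": "model output",
        "info": {"key": "value"},
    },
}


def select(meta_filter: MetaFilter) -> List[str]:
    """Return the names of the stand-in fields that match the filter."""
    return [
        name
        for name, meta in fields.items()
        if matches_meta_filter(meta["description"], meta["info"], meta_filter)
    ]
```

With the change in this PR, the corresponding SDK calls would be dataset.select_fields(meta_filter="my description") and dataset.select_fields(meta_filter=dict(key="value")).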

(Details in 1-2 sentences. You can just refer to another PR with a description
if this PR is part of a larger change.)

What areas of FiftyOne does this PR affect?

  • App: FiftyOne application changes
  • Build: Build and test infrastructure changes
  • Core: Core fiftyone Python library changes
  • Documentation: FiftyOne documentation changes
  • Other

codecov bot commented Apr 13, 2023

Codecov Report

Patch coverage: 64.84% and project coverage change: +0.15 🎉

Comparison is base (36eb527) 62.10% compared to head (4889c2c) 62.25%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #2898      +/-   ##
===========================================
+ Coverage    62.10%   62.25%   +0.15%     
===========================================
  Files          260      264       +4     
  Lines        44032    45165    +1133     
  Branches       355      356       +1     
===========================================
+ Hits         27344    28119     +775     
- Misses       16688    17046     +358     
Flag Coverage Δ
app 48.90% <55.40%> (+0.21%) ⬆️
python 99.44% <100.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...ackages/components/src/components/Popout/index.tsx 33.33% <0.00%> (ø)
app/packages/core/src/components/Actions/utils.tsx 40.57% <0.00%> (ø)
app/packages/looker/src/overlays/heatmap.ts 16.06% <0.00%> (ø)
app/packages/state/src/recoil/looker.ts 29.70% <0.00%> (-0.61%) ⬇️
app/packages/looker/src/overlays/base.ts 39.15% <2.32%> (-10.09%) ⬇️
app/packages/state/src/hooks/useUpdateSamples.ts 19.51% <19.51%> (ø)
.../packages/state/src/hooks/useSessionColorScheme.ts 20.48% <20.48%> (ø)
app/packages/state/src/recoil/color.ts 43.06% <23.91%> (-9.02%) ⬇️
.../components/src/components/TabOption/TabOption.tsx 47.05% <25.00%> (-2.42%) ⬇️
app/packages/looker/src/elements/common/tags.ts 15.62% <25.00%> (+3.24%) ⬆️
... and 31 more

@nebulae nebulae changed the title [DRAFT/WIP] - filter fields by description or info filter fields by description or info Apr 18, 2023

@brimoor brimoor left a comment

This implementation is looking on the right track to me 💪 I left some stylistic comments for ya

@manivoxel51 manivoxel51 left a comment

nice work! 🍨

your test cases were very helpful. I'll have to hook this up to the UI next

@nebulae nebulae requested a review from brimoor April 28, 2023 20:27

@brimoor brimoor left a comment

@nebulae FYI I implemented one necessary update to this implementation here: #2939. I recommend merging that before making any further changes to avoid merge conflicts.

I have two other enhancement requests:

  1. Can you support meta_filter={"info.key": "value"} with dot notation as shorthand for meta_filter=dict(info=dict(key="value"))? There's an existing convention of using embedded.field.name notation throughout the SDK when referring to nested fields. In fact, it would be fine with me to require the dot notation and not support nested dicts at all (for user-facing syntax), but if you want to keep nested dicts as an undocumented syntax, that would be fine with me
  2. Currently meta_filter is only applied to top-level fields, but we could expand to nested fields as well
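The dot-notation shorthand requested in item 1 could be normalized into the nested-dict form along these lines. This is a minimal sketch under the assumption that both syntaxes should end up as the same nested structure; `expand_dotted_keys` is a hypothetical helper, not a function from the FiftyOne codebase.

```python
from typing import Any, Dict


def expand_dotted_keys(meta_filter: Dict[str, Any]) -> Dict[str, Any]:
    """Expand keys like "info.key" into nested dicts (hypothetical helper)."""
    nested: Dict[str, Any] = {}
    for dotted_key, value in meta_filter.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            # Create intermediate dicts on demand, reusing existing ones
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# The shorthand and the nested form describe the same filter
shorthand = expand_dotted_keys({"info.key": "value"})
explicit = dict(info=dict(key="value"))
```

The sketch does not handle conflicting keys (for example, both "info" and "info.key" in the same filter); a real implementation would need to decide how to treat that case.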

Perhaps we should discuss 2 offline with @manivoxel51. In the UI he's building, in manual field selection mode, there's a toggle to control whether nested fields are included or not. When using the search UI (meta_filter), there could be a similar toggle to control whether nested fields are included. For example, multiple users may have populated custom attributes on the Detection instances within a Detections field. Including nested fields in meta_filter would allow for only selecting the attributes that I added while excluding attributes that someone else added. But it might be the case that we need to expose this toggle on the view stages themselves to avoid undesirable behavior when one only intends to process top-level fields.

nebulae commented May 2, 2023

  1. Currently meta_filter is only applied to top-level fields, but we could expand to nested fields as well

Perhaps we should discuss 2 offline with @manivoxel51.

@brimoor for this one, is it safe to say that if we have a toggle to include nested fields, then those fields are filtered and returned as follows:

  • if none of the nested fields match, return them all (provided the parent matched the filter and is being returned)
  • if one of the nested fields matches, return only that one, regardless of whether the parent is being returned

or would the toggle to include nested fields lead the user to believe that the nested fields will be included whenever the parent is returned, regardless of whether they match?

brimoor commented May 2, 2023

@brimoor for this one, is it safe to say that if we have a toggle to include nested fields, then those fields are filtered and returned as follows:

  • if none of the nested fields match, return them all (provided the parent matched the filter and is being returned)
  • if one of the nested fields matches, return only that one, regardless of whether the parent is being returned

or would the toggle to include nested fields lead the user to believe that the nested fields will be included whenever the parent is returned, regardless of whether they match?

discussed offline, we believe we can wire up an include_nested_fields=True/False param to get_field_schema(flat=include_nested_fields) and it will... just work! 🤞
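To make the idea concrete, here is a rough sketch of how a flat schema widens the set of paths a filter can see. It uses plain dicts as a stand-in for a field schema; `flatten_schema`, `candidate_paths`, and the example `schema` are hypothetical and only approximate what get_field_schema(flat=...) is expected to provide.

```python
from typing import Any, Dict, List


def flatten_schema(schema: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Map every field, including nested ones, to its dotted path."""
    flat: Dict[str, Any] = {}
    for name, field in schema.items():
        path = prefix + name
        flat[path] = field
        # Recurse into embedded fields, if any
        flat.update(flatten_schema(field.get("fields", {}), prefix=path + "."))
    return flat


def candidate_paths(
    schema: Dict[str, Any], include_nested_fields: bool = False
) -> List[str]:
    """Paths that a meta filter would be evaluated against."""
    source = flatten_schema(schema) if include_nested_fields else schema
    return list(source)


# Stand-in schema: a label field with embedded attributes plus a flat field
schema = {
    "ground_truth": {
        "fields": {
            "detections": {"fields": {"label": {}, "my_attr": {}}},
        },
    },
    "filepath": {},
}
```

With include_nested_fields=False only the two top-level paths are candidates; with True, paths such as "ground_truth.detections.my_attr" become candidates as well, so the same filter logic can run unchanged over either set.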

@nebulae nebulae requested a review from brimoor May 3, 2023 01:16

@brimoor brimoor left a comment

LGTM. Thanks for all the tests! 💪

``meta_filter={"any": "2023"}`` to exclude fields that have
the string "2023" anywhere in their name, type,
description, or info
- Use ``meta_filter={"type": "StringField"}`` or

FYI I removed "type": fo.StringField from the docstring because ViewStage kwargs need to be JSON serializable in order to be saved, loaded in the App, etc.

I didn't touch the actual logic in the type_filter though, so the syntax is still available as an undocumented option.

Note: other ViewStage classes accept non-JSON values, but they carefully convert the relevant data into JSON-serializable values internally so that it can be serialized and deserialized.
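The serialization concern can be reproduced with the standard library alone. In this sketch, `StringField` is a local stand-in class rather than the FiftyOne field type, and `is_json_serializable` is a hypothetical helper; the point is only that a class object cannot be dumped to JSON while its name can.

```python
import json
from typing import Any


class StringField:
    """Local stand-in for a field class; not imported from FiftyOne."""


def is_json_serializable(kwargs: Any) -> bool:
    """Return True if json.dumps() accepts the value."""
    try:
        json.dumps(kwargs)
        return True
    except TypeError:
        # json.dumps raises TypeError for objects it cannot encode
        return False


class_filter = {"type": StringField}           # class object: not serializable
name_filter = {"type": StringField.__name__}   # plain string: serializable
```

Converting the class to its name before storing it is one way a stage could accept the richer syntax while still keeping its saved kwargs serializable.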


if self._meta_filter is not None:
    paths = _get_meta_filtered_fields(
        sample_collection, self._meta_filter, frames=True

Moved the parsing of include_nested_fields into _get_meta_filtered_fields() for better encapsulation.

@nebulae nebulae merged commit 5529fbe into develop May 8, 2023
@nebulae nebulae deleted the feature/2804-filter-fields-by-description branch May 8, 2023 16:26