-
Notifications
You must be signed in to change notification settings - Fork 560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
filter fields by description or info #2898
Conversation
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## develop #2898 +/- ##
===========================================
+ Coverage 62.10% 62.25% +0.15%
===========================================
Files 260 264 +4
Lines 44032 45165 +1133
Branches 355 356 +1
===========================================
+ Hits 27344 28119 +775
- Misses 16688 17046 +358
Flags with carried forward coverage won't be shown. Click here to find out more.
☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation is looking on the right track to me 💪 I left some stylistic comments for ya
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice work! 🍨
your test cases were very helpful. I'll have to hook this up to the UI next
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nebulae FYI I implemented one necessary update to this implementation here: #2939. I recommend merging that before making any further changes to avoid merge conflicts.
I have two other enhancement requests:
- Can you support
meta_filter={"info.key": "value"}
with dot notation as shorthand formeta_filter=dict(info=dict(key="value"))
? There's an existing convention of usingembedded.field.name
notation throughout the SDK when referring to nested fields. In fact, it would be fine with me to require the dot notation and not support nested dicts at all (for user-facing syntax), but if you want to keep nested dicts as an undocumented syntax, that would be fine with me - Currently
meta_filter
is only applied to top-level fields, but we could expand to nested fields as well
Perhaps we should discuss 2 offline with @manivoxel51. In the UI he's building, in manual field selection mode, there's a toggle to control whether nested fields are included or not. When using the search UI (meta_filter
), there could be a similar toggle to control whether nested fields are included. For example, multiple users may have populated custom attributes on the Detection
instances within a Detections
field. Including nested fields in meta_filter
would allow for only selecting the attributes that I added while excluding attributes that someone else added. But it might be the case that we need to expose this toggle on the view stages themselves to avoid undesirable behavior when one only intends to process top-level fields.
@brimoor for this one, is it safe to say that if we have a toggle to include nested fields, then those fields are filtered and returned in this way:
or, would the toggle to include nested fields lead the user to believe that the nested fields will be returned / included regardless of if there is a match, if the parent is being returned? |
discussed offline, we believe we can wire up an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for all the tests! 💪
``meta_filter={"any": "2023"}`` to exclude fields that have | ||
the string "2023" anywhere in their name, type, | ||
description, or info | ||
- Use ``meta_filter={"type": "StringField"}`` or |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI I removed "type": fo.StringField
from the docstring because ViewStage
kwargs need to be JSON serializable in order to be saved, loaded in the App, etc.
I didn't touch the actual logic in the type_filter
though, so the syntax is still available as an undocumented option.
Note: other ViewStage
classes accept non JSON values, but they are careful to convert the relevant data into JSON serializable values internally in such a way that they can be serialized/deserialized.
|
||
if self._meta_filter is not None: | ||
paths = _get_meta_filtered_fields( | ||
sample_collection, self._meta_filter, frames=True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved the parsing of include_nested_fields
into _get_meta_filtered_fields()
for better encapsulation.
What changes are proposed in this pull request?
add meta_filter to select_fields
How is this patch tested? If it is not, please explain why.
added additional unit tests, tested manually.
Release Notes
Is this a user-facing change that should be mentioned in the release notes?
notes for FiftyOne users.
an optional parameter
meta_filter
added todataset.select_fields
which allows for a string or dict to be passed along, which will be used to filter to the fields that have a match to this filter in their description or info.to select only fields that have a description that contains the string "my description", you can use
meta_filter="my description"
. To select only fields that have a specific key in their info, you can usemeta_filter=dict(key="value")
(Details in 1-2 sentences. You can just refer to another PR with a description
if this PR is part of a larger change.)
What areas of FiftyOne does this PR affect?
fiftyone
Python library changes