-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] fix HNSW param defaults in new configuration logic & require batch_size < sync_threshold #2526
Conversation
Reviewer ChecklistPlease leverage this checklist to ensure your code review is thorough before approving Testing, Bugs, Errors, Logs, Documentation
System Compatibility
Quality
|
81cb3cf
to
17e3cc4
Compare
chromadb/db/mixins/sysdb.py
Outdated
|
||
return configuration | ||
|
||
return CollectionConfigurationInternal.from_json_str(json_str) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered making migrations a generic concept for CollectionConfigurations, but after starting to implement that it seemed like a lot of abstraction overhead for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah let's punt on that
chromadb/api/configuration.py
Outdated
): | ||
return (False, "batch_size must be less than or equal to sync_threshold") | ||
|
||
return super().validator() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can just return nothing if we don't except, if we make this something which raises exceptions but otherwise does nothing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can probably take a parameter list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can probably take a parameter list?
sorry, what do you mean by this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right now this is an instance method that reads self.parameter_map
but what we actually want is to see if a given parameter_list
on creation corresponds to a valid configuration, similar to how ParameterValidator
works for individual value
s.
This is mostly a stylistic choice though and I think it's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, it does kinda make sense for it to be a static method but finding the params you want from the list would be a little annoying
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved pending nits
It doesn't really make sense to have a config where
hnsw:batch_size > hnsw:sync_threshold
, and although (I think) we don't depend on this invariant today we could accidentally depend on it in the future without enforcing it.