⚡️ Speed up _hamming_distance() by 50% in libs/langchain/langchain/evaluation/embedding_distance/base.py #7
📄 _hamming_distance() in libs/langchain/langchain/evaluation/embedding_distance/base.py
📈 Performance went up by 50% (0.50x faster)
⏱️ Runtime went down from 749.61μs to 500.81μs
Explanation and details
One way to optimize this function is to avoid calling NumPy's np.mean() and compute the mean directly. The '!=' operator returns a boolean array; summing it counts the True values (each treated as 1), and dividing that count by the array's size gives the same result as the mean. The rewrite replaces the np.mean() call with this explicit sum-and-divide.
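The rewritten code itself is not reproduced in this description. As a minimal sketch of the idea, assuming the original body is simply `np.mean(a != b)`, the two variants might look like the following. The function names `hamming_distance_mean` and `hamming_distance_sum` are hypothetical and exist only to put the two versions side by side.

```python
import numpy as np


def hamming_distance_mean(a: np.ndarray, b: np.ndarray) -> float:
    """Assumed original form: mean of the element-wise inequality mask."""
    return float(np.mean(a != b))


def hamming_distance_sum(a: np.ndarray, b: np.ndarray) -> float:
    """Sketch of the rewrite: count mismatches, then divide by the element count."""
    # a != b yields a boolean array; np.sum counts the True entries.
    return float(np.sum(a != b) / a.size)


# Two of the four positions differ, so both variants report 0.5.
a = np.array([1, 0, 1, 1])
b = np.array([1, 1, 0, 1])
print(hamming_distance_mean(a, b), hamming_distance_sum(a, b))
```

Both functions assume `a` and `b` have the same shape and are non-empty; any measured speedup will depend on array size and NumPy version.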
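This new version of the function avoids the np.mean() call. Both versions are O(n) in the array length, so the asymptotic complexity is unchanged; the speedup is attributed to doing fewer passes over the data (roughly three reduced to two), which lowers the constant factor.
Correctness verification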
The new optimized code was tested for correctness. The results are listed below.
✅ 0 Passed − ⚙️ Existing Unit Tests
✅ 0 Passed − 🎨 Inspired Regression Tests
✅ 11 Passed − 🌀 Generated Regression Tests
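The generated regression tests are not shown here. As a hypothetical illustration of what such a check could look like, the sketch below compares the assumed original (`np.mean(a != b)`) against the sum-and-divide form on random binary vectors. The helper name `check_equivalence` and its parameters are assumptions made for this example, not part of the PR.

```python
import numpy as np


def hamming_distance_mean(a: np.ndarray, b: np.ndarray) -> float:
    """Assumed original form."""
    return float(np.mean(a != b))


def hamming_distance_sum(a: np.ndarray, b: np.ndarray) -> float:
    """Assumed rewritten form."""
    return float(np.sum(a != b) / a.size)


def check_equivalence(n_trials: int = 100, size: int = 64, seed: int = 0) -> bool:
    """Return True if both variants agree on every random binary pair."""
    rng = np.random.default_rng(seed)
    for _ in range(n_trials):
        a = rng.integers(0, 2, size=size)
        b = rng.integers(0, 2, size=size)
        if not np.isclose(hamming_distance_mean(a, b), hamming_distance_sum(a, b)):
            return False
    return True


print(check_equivalence())
```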