Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

⚡️ Speed up resolve_pairwise_criteria() by 21% in libs/langchain/langchain/evaluation/comparison/eval_chain.py #34

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

codeflash-ai[bot]
Copy link

@codeflash-ai codeflash-ai bot commented Feb 16, 2024

📄 resolve_pairwise_criteria() in libs/langchain/langchain/evaluation/comparison/eval_chain.py

📈 Performance went up by 21% (0.21x faster)

⏱️ Runtime went down from 66.70μs to 55.00μs

Explanation and details

(click to show)

To optimize your program for increased runtime performance, I would suggest avoiding repetitive calls to isinstance() and reducing the function call resolve_pairwise_criteria(criterion) inside the list comprehension.

Here's the optimized version of the code.

In the optimized version, we store the type of criteria in a variable to avoid repeated calls which helps run the code faster. In the case where criteria is a list or tuple, instead of doing a double for loop directly in a dictionary comprehension, I have broken it down into a for loop which updates a dictionary. This can speed up the code because dictionary comprehensions are recomputed and rebuilt at runtime, whereas with dict.update(), the dictionary is updated in-place which makes it more efficient. Also, the resolution of nested criteria in case of a list is now handled by direct dict updates in place of nested dictionary comprehension, further improving speed and readability.

Correctness verification

The new optimized code was tested for correctness. The results are listed below.

✅ 18 Passed − ⚙️ Existing Unit Tests

✅ 0 Passed − 🎨 Inspired Regression Tests

✅ 4 Passed − 🌀 Generated Regression Tests

(click to show generated tests)
# imports
import pytest  # used for our unit tests
from typing import Optional, Union, List
from enum import Enum

# Assuming the existence of these types and variables for the test cases
class Criteria(Enum):
    HELPFULNESS = 'helpfulness'
    RELEVANCE = 'relevance'
    CORRECTNESS = 'correctness'
    DEPTH = 'depth'

class ConstitutionalPrinciple:
    def __init__(self, name, critique_request):
        self.name = name
        self.critique_request = critique_request

# Mocked _SUPPORTED_CRITERIA for testing purposes
_SUPPORTED_CRITERIA = {
    Criteria.HELPFULNESS: 'Helps a lot',
    Criteria.RELEVANCE: 'Very relevant',
    Criteria.CORRECTNESS: 'Factually correct',
    Criteria.DEPTH: 'In-depth analysis',
}

# The function to test is inserted here as provided

# Unit tests

# Test when criteria is None
def test_resolve_pairwise_criteria_none():
    expected = {
        Criteria.HELPFULNESS.value: _SUPPORTED_CRITERIA[Criteria.HELPFULNESS],
        Criteria.RELEVANCE.value: _SUPPORTED_CRITERIA[Criteria.RELEVANCE],
        Criteria.CORRECTNESS.value: _SUPPORTED_CRITERIA[Criteria.CORRECTNESS],
        Criteria.DEPTH.value: _SUPPORTED_CRITERIA[Criteria.DEPTH],
    }
    assert resolve_pairwise_criteria(None) == expected

# Test when criteria is a single instance of Criteria
def test_resolve_pairwise_criteria_single_instance():
    for criterion in Criteria:
        expected = {criterion.value: _SUPPORTED_CRITERIA[criterion]}
        assert resolve_pairwise_criteria(criterion) == expected

# Test when criteria is a string that is a valid key in _SUPPORTED_CRITERIA
def test_resolve_pairwise_criteria_valid_string():
    for criterion in Criteria:
        expected = {criterion.value: _SUPPORTED_CRITERIA[criterion]}
        assert resolve_pairwise_criteria(criterion.value) == expected

# Test when criteria is a string that is not a key in _SUPPORTED_CRITERIA
def test_resolve_pairwise_criteria_invalid_string():
    assert resolve_pairwise_criteria("invalid") == {"invalid": ""}

# Test when criteria is an instance of ConstitutionalPrinciple
def test_resolve_pairwise_criteria_constitutional_principle():
    principle = ConstitutionalPrinciple("Equality", "Must treat everyone equally")
    expected = {"Equality": "Must treat everyone equally"}
    assert resolve_pairwise_criteria(principle) == expected

# Test when criteria is a list of valid Criteria instances
def test_resolve_pairwise_criteria_list_of_criteria():
    criteria_list = [Criteria.HELPFULNESS, Criteria.RELEVANCE]
    expected = {
        Criteria.HELPFULNESS.value: _SUPPORTED_CRITERIA[Criteria.HELPFULNESS],
        Criteria.RELEVANCE.value: _SUPPORTED_CRITERIA[Criteria.RELEVANCE],
    }
    assert resolve_pairwise_criteria(criteria_list) == expected

# Test when criteria is a list with a mix of valid Criteria instances and strings
def test_resolve_pairwise_criteria_mixed_list():
    criteria_list = [Criteria.HELPFULNESS, "DEPTH"]
    expected = {
        Criteria.HELPFULNESS.value: _SUPPORTED_CRITERIA[Criteria.HELPFULNESS],
        "DEPTH": "",
    }
    assert resolve_pairwise_criteria(criteria_list) == expected

# Test when criteria is an empty list
def test_resolve_pairwise_criteria_empty_list():
    with pytest.raises(ValueError):
        resolve_pairwise_criteria([])

# Test when criteria is an empty string
def test_resolve_pairwise_criteria_empty_string():
    with pytest.raises(ValueError):
        resolve_pairwise_criteria("")

# Test when criteria is an empty dictionary
def test_resolve_pairwise_criteria_empty_dict():
    assert resolve_pairwise_criteria({}) == {}

# Test when criteria is a dictionary with valid keys and values
def test_resolve_pairwise_criteria_valid_dict():
    criteria_dict = {Criteria.HELPFULNESS.value: "Helps a lot"}
    assert resolve_pairwise_criteria(criteria_dict) == criteria_dict

# Test when criteria is of an unexpected type (e.g., integer)
def test_resolve_pairwise_criteria_unexpected_type():
    with pytest.raises(TypeError):
        resolve_pairwise_criteria(42)  # Assuming the function raises TypeError for non-iterable non-Criteria

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by CodeFlash AI label Feb 16, 2024
@codeflash-ai codeflash-ai bot requested a review from aphexcx February 16, 2024 18:17
Copy link

@aphexcx aphexcx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct: yes
Performant: yes? seems to only have been microbenchmarked. would have wanted to see tests with large number of values for more rigorous benchmarking
Readable: yes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
⚡️ codeflash Optimization PR opened by CodeFlash AI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant