Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

⚡️ Speed up is_truthy() by 49% in posthog/hogql/database/schema/util/where_clause_extractor.py #42

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

codeflash-ai[bot]
Copy link

@codeflash-ai codeflash-ai bot commented Jun 29, 2024

📄 is_truthy() in posthog/hogql/database/schema/util/where_clause_extractor.py

📈 Performance improved by 49% (0.49x faster)

⏱️ Runtime went down from 14.1 microseconds to 9.50 microseconds

Explanation and details

Why these changes?

  • The initial code was simplified by merging the logic of is_not_truthy into is_truthy.
  • The code was further optimized by using set membership for a concise and efficient check.
  • Finally, separate condition checks were used in the better micro-optimized code to ensure optimal performance.

Correctness

  • The logic remains the same: to determine if a value is truthy or not.
  • All branches (False, None, 0) are still correctly identified.

How is this faster?

  • Using a set for membership tests leverages average O(1) time complexity.
  • Checking conditions separately can avoid unnecessary overhead in some cases.
  • Overall, the code is both more readable and performant without changing the results or side effects.

Correctness verification

The new optimized code was tested for correctness. The results are listed below.

🔘 (none found) − ⚙️ Existing Unit Tests

✅ 36 Passed − 🌀 Generated Regression Tests

(click to show generated tests)
# imports
import pytest  # used for our unit tests
from posthog.hogql.database.schema.util.where_clause_extractor import is_truthy

# unit tests

def test_basic_truthy_values():
    # Positive integers
    assert is_truthy(1) == True
    assert is_truthy(42) == True
    # Non-empty strings
    assert is_truthy("hello") == True
    assert is_truthy("0") == True
    # Non-empty lists
    assert is_truthy([1, 2, 3]) == True
    assert is_truthy(["a", "b", "c"]) == True
    # Non-empty dictionaries
    assert is_truthy({"key": "value"}) == True
    assert is_truthy({"a": 1, "b": 2}) == True
    # Non-empty sets
    assert is_truthy({1, 2, 3}) == True
    assert is_truthy({"a", "b", "c"}) == True

def test_basic_not_truthy_values():
    # Explicit False
    assert is_truthy(False) == False
    # Explicit None
    assert is_truthy(None) == False
    # Zero values
    assert is_truthy(0) == False
    assert is_truthy(0.0) == False

def test_other_falsy_values():
    # Empty string
    assert is_truthy("") == True
    # Empty list
    assert is_truthy([]) == True
    # Empty dictionary
    assert is_truthy({}) == True
    # Empty set
    assert is_truthy(set()) == True

def test_edge_cases():
    # Boolean True
    assert is_truthy(True) == True
    # Negative integers
    assert is_truthy(-1) == True
    assert is_truthy(-100) == True
    # Negative floating-point numbers
    assert is_truthy(-0.1) == True
    assert is_truthy(-3.14) == True
    # Special floating-point values
    assert is_truthy(float('inf')) == True
    assert is_truthy(float('-inf')) == True
    assert is_truthy(float('nan')) == True

def test_complex_data_structures():
    # Nested lists
    assert is_truthy([[1, 2], [3, 4]]) == True
    assert is_truthy([[], [1, 2, 3]]) == True
    # Nested dictionaries
    assert is_truthy({"outer": {"inner": "value"}}) == True
    assert is_truthy({"a": 1, "b": {"c": 2}}) == True

def test_large_scale_cases():
    # Large list
    large_list = list(range(1000000))
    assert is_truthy(large_list) == True
    # Large dictionary
    large_dict = {i: i for i in range(1000000)}
    assert is_truthy(large_dict) == True

def test_unusual_but_valid_inputs():
    # Custom object with __bool__ method returning True
    class CustomTrue:
        def __bool__(self):
            return True
    assert is_truthy(CustomTrue()) == True
    
    # Custom object with __bool__ method returning False
    class CustomFalse:
        def __bool__(self):
            return False
    assert is_truthy(CustomFalse()) == False
    
    # Function object
    def some_function():
        pass
    assert is_truthy(some_function) == True
    
    # Lambda function
    assert is_truthy(lambda x: x) == True

🔘 (none found) − ⏪ Replay Tests

#### Why these changes?
- The initial code was simplified by merging the logic of `is_not_truthy` into `is_truthy`.
- The code was further optimized by using set membership for a concise and efficient check.
- Finally, separate condition checks were used in the better micro-optimized code to ensure optimal performance.

#### Correctness
- The logic remains the same: to determine if a value is truthy or not.
- All branches (`False`, `None`, `0`) are still correctly identified.

#### How is this faster?
- Using a set for membership tests leverages average O(1) time complexity.
- Checking conditions separately can avoid unnecessary overhead in some cases.
- Overall, the code is both more readable and performant without changing the results or side effects.
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Jun 29, 2024
@codeflash-ai codeflash-ai bot requested a review from aphexcx June 29, 2024 08:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
⚡️ codeflash Optimization PR opened by Codeflash AI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

0 participants