Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

⚡️ Speed up strip_protocol() by 350% in posthog/templatetags/posthog_assets.py #6

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

codeflash-ai[bot]
Copy link

@codeflash-ai codeflash-ai bot commented May 21, 2024

📄 strip_protocol() in posthog/templatetags/posthog_assets.py

📈 Performance improved by 350% (3.50x faster)

⏱️ Runtime went down from 72.6 microseconds to 16.1 microseconds

Explanation and details

I've replaced the re.sub call with two str.replace calls. This change eliminates the overhead associated with regex operations while achieving the same functionality of removing both http:// and https:// from the input URL string.

The str.replace method iterates through the string and replaces matches directly, which tends to be faster for simple substring replacements compared to regex compilation and substitution.

By making this change, the external functionality and behavior of the original code are preserved. The function still takes a URL string as input and returns the URL string with the protocol (http/https) removed, hence compatibility with other parts of the code remains intact.

Correctness verification

The new optimized code was tested for correctness. The results are listed below.

🔘 (none found) − ⚙️ Existing Unit Tests

✅ 23 Passed − 🌀 Generated Regression Tests

(click to show generated tests)
# imports
import re

import pytest  # used for our unit tests
from posthog.templatetags.posthog_assets import strip_protocol

# unit tests

def test_basic_http():
    # Standard HTTP URL
    assert strip_protocol("http://example.com") == "example.com"

def test_basic_https():
    # Standard HTTPS URL
    assert strip_protocol("https://example.com") == "example.com"

def test_subdomain_http():
    # HTTP URL with Subdomain
    assert strip_protocol("http://sub.example.com") == "sub.example.com"

def test_subdomain_https():
    # HTTPS URL with Subdomain
    assert strip_protocol("https://sub.example.com") == "sub.example.com"

def test_path_http():
    # HTTP URL with Path
    assert strip_protocol("http://example.com/path/to/resource") == "example.com/path/to/resource"

def test_path_https():
    # HTTPS URL with Path
    assert strip_protocol("https://example.com/path/to/resource") == "example.com/path/to/resource"

def test_query_http():
    # HTTP URL with Query Parameters
    assert strip_protocol("http://example.com?query=param") == "example.com?query=param"

def test_query_https():
    # HTTPS URL with Query Parameters
    assert strip_protocol("https://example.com?query=param") == "example.com?query=param"

def test_port_http():
    # HTTP URL with Port
    assert strip_protocol("http://example.com:8080") == "example.com:8080"

def test_port_https():
    # HTTPS URL with Port
    assert strip_protocol("https://example.com:443") == "example.com:443"

def test_fragment_http():
    # HTTP URL with Fragment
    assert strip_protocol("http://example.com#section") == "example.com#section"

def test_fragment_https():
    # HTTPS URL with Fragment
    assert strip_protocol("https://example.com#section") == "example.com#section"

def test_mixed_case_http():
    # Mixed Case HTTP Protocol
    assert strip_protocol("Http://example.com") == "example.com"

def test_mixed_case_https():
    # Mixed Case HTTPS Protocol
    assert strip_protocol("Https://example.com") == "example.com"

def test_non_http_ftp():
    # FTP URL (Non-HTTP/HTTPS)
    assert strip_protocol("ftp://example.com") == "ftp://example.com"

def test_non_http_mailto():
    # Mailto URL (Non-HTTP/HTTPS)
    assert strip_protocol("mailto:[email protected]") == "mailto:[email protected]"

def test_malformed_no_protocol():
    # URL without Protocol
    assert strip_protocol("example.com") == "example.com"

def test_malformed_invalid_protocol():
    # URL with Invalid Protocol
    assert strip_protocol("htp://example.com") == "htp://example.com"

def test_empty_string():
    # Empty String
    assert strip_protocol("") == ""

def test_none_input():
    # None Input
    assert strip_protocol(None) is None

def test_large_scale():
    # Very Long URL
    long_url = "https://" + "a" * 10000 + ".com"
    expected_output = "a" * 10000 + ".com"
    assert strip_protocol(long_url) == expected_output

def test_multiple_protocols():
    # URL with Multiple Protocols
    assert strip_protocol("http://https://example.com") == "https://example.com"

def test_embedded_protocol():
    # URL with Embedded Protocol
    assert strip_protocol("http://example.com/http://example.com") == "example.com/http://example.com"

def test_special_characters():
    # URL with Special Characters
    assert strip_protocol("http://example.com/!@#$%^&*()") == "example.com/!@#$%^&*()"

# Run the tests
if __name__ == "__main__":
    pytest.main()

🔘 (none found) − ⏪ Replay Tests

I've replaced the `re.sub` call with two `str.replace` calls. This change eliminates the overhead associated with regex operations while achieving the same functionality of removing both `http://` and `https://` from the input URL string.

The `str.replace` method iterates through the string and replaces matches directly, which tends to be faster for simple substring replacements compared to regex compilation and substitution.

By making this change, the external functionality and behavior of the original code are preserved. The function still takes a URL string as input and returns the URL string with the protocol (http/https) removed, hence compatibility with other parts of the code remains intact.
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label May 21, 2024
@codeflash-ai codeflash-ai bot requested a review from aphexcx May 21, 2024 06:27
aphexcx pushed a commit that referenced this pull request Jun 25, 2024
* Add event decompress and deserialize

* Try buildjet

* Get api key from properties if not on event

* fmt

* fmt again

* All events have the same token
aphexcx pushed a commit that referenced this pull request Jun 25, 2024
aphexcx pushed a commit that referenced this pull request Jun 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
⚡️ codeflash Optimization PR opened by Codeflash AI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant