Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enh: improve read perf of http accessor by using requests.Session #33

Merged
merged 3 commits into from
Jul 21, 2023

Conversation

xgui3783
Copy link
Collaborator

@xgui3783 xgui3783 commented Jul 13, 2023

According to https://requests.readthedocs.io/en/latest/user/advanced/#session-objects , it appears that by using requests.Session, the underlying TCP connection will be reused, which should drastically improve repeated get requests to the same host.

From the (admittedly) rough test script[1], we see promising improvement (execution time cut by ~50% in sync reads and ~30% in threaded reads) [2]

[1] rough script used for testing both sync and threaded operations, we used time python <script_name>.py and manually recorded the result

from neuroglancer_scripts.http_accessor import HttpAccessor
from concurrent.futures import ThreadPoolExecutor
from itertools import repeat

multiple = 2

def get_args():
    return [
        (x, x+64, y, y+64, z, z+64)
        for x in range(0, 64*multiple, 64)
        for y in range(0, 64*multiple, 64)
        for z in range(0, 64*multiple, 64)
    ]

def main():
    accessor = HttpAccessor("https://neuroglancer.humanbrainproject.eu/precomputed/BigBrainRelease.2015/8bit")
    for arg in get_args:
        accessor.fetch_chunk("20um",arg)

def threaded():
    accessor = HttpAccessor("https://neuroglancer.humanbrainproject.eu/precomputed/BigBrainRelease.2015/8bit")
    with ThreadPoolExecutor(max_workers=4) as ex:
        for val in ex.map(
            accessor.fetch_chunk,
            repeat("20um"),
            get_args()
        ): pass

[2] truncated result

multiple sync or threaded previous current
2 sync 0m0.766s 0m0.315s
10 sync 1m24.788s 0m24.446s
2 threaded 0m0.336s 0m0.207s
10 threaded 0m24.379s 0m6.099s

@xgui3783
Copy link
Collaborator Author

failed tests does not seem to be caused by this PR, but rather a more strict parsing of gifti file. see trace https://github.com/HumanBrainProject/neuroglancer-scripts/actions/runs/5541279128/jobs/10114475201#step:6:54

I can fix the issue in this PR too, or would you prefer to fix it in a separate PR, and I can rebase/merge this PR?

@ylep ylep self-requested a review July 21, 2023 13:38
Copy link
Collaborator

@ylep ylep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, thanks a lot @xgui3783 ! :-)

@codecov
Copy link

codecov bot commented Jul 21, 2023

Codecov Report

Merging #33 (47fd3b1) into master (b7b5047) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master      #33   +/-   ##
=======================================
  Coverage   92.61%   92.61%           
=======================================
  Files          25       25           
  Lines        1489     1490    +1     
  Branches      219      219           
=======================================
+ Hits         1379     1380    +1     
  Misses         63       63           
  Partials       47       47           
Impacted Files Coverage Δ
src/neuroglancer_scripts/http_accessor.py 100.00% <100.00%> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@ylep ylep merged commit a6bce3f into HumanBrainProject:master Jul 21, 2023
9 checks passed
@xgui3783 xgui3783 deleted the feat_improveHttpAccessor branch August 28, 2023 10:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants