use socketfile to improve perf #441
The benchmark I used is below. I switched between the new `read_bytes_using_file` and `read_bytes`.
In this example I see a roughly 50% reduction in run time (960 ms vs. 1.95 s per loop).

In [1]: # Use auto-reload to switch between `read_bytes` and `read_bytes_using_file`
   ...: %load_ext autoreload
   ...: %autoreload 2

In [2]: import vertica_python
   ...:
   ...: conn_info = {
   ...:     "host": "127.0.0.1",
   ...:     "port": 5433,
   ...:     "user": "dbadmin",
   ...:     #'password': 'some_password',
   ...:     "database": "VMart",
   ...:     "session_label": "some_label",
   ...:     "unicode_error": "strict",
   ...:     "ssl": False,
   ...:     "autocommit": True,
   ...:     "use_prepared_statements": False,
   ...:     "connection_timeout": 5,
   ...: }
   ...:
   ...: query = """
   ...: SELECT sales_quantity, sales_dollar_amount, transaction_type
   ...: FROM online_sales.online_sales_fact
   ...: LIMIT 100000
   ...: """
   ...:
   ...: def run_query():
   ...:     with vertica_python.connect(**conn_info) as conn:
   ...:         cur = conn.cursor()
   ...:         cur.execute(query)
   ...:         cur.fetchall()
   ...:

In [3]: # using read_bytes_using_file

In [4]: %timeit run_query()
960 ms ± 6.09 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [5]: # Using read_bytes

In [6]: %timeit run_query()
1.95 s ± 7.33 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
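For readers who have not opened the diff, the two read paths being compared might look roughly like the sketch below. The function bodies are assumptions written for illustration, not the code from this PR; only the names `read_bytes` and `read_bytes_using_file` come from the benchmark. A local `socket.socketpair()` stands in for the Vertica connection.

```python
import socket


def read_bytes(sock, n):
    """Hypothetical recv-based reader: loop until n bytes have arrived."""
    buf = b""
    while len(buf) < n:
        data = sock.recv(n - len(buf))
        if not data:
            raise ConnectionError("socket closed before all bytes arrived")
        buf += data
    return buf


def read_bytes_using_file(sock_file, n):
    """Hypothetical file-based reader: a single buffered read call."""
    data = sock_file.read(n)
    if len(data) < n:
        raise ConnectionError("socket closed before all bytes arrived")
    return data


# A connected pair of local sockets replaces the database server here.
left, right = socket.socketpair()
payload = bytes(range(256)) * 2  # 512 bytes, small enough for any socket buffer
left.sendall(payload * 2)

# Use recv first, then switch to the file object. Mixing the two in the
# other order is unsafe because the file object may buffer extra bytes.
via_recv = read_bytes(right, len(payload))
sock_file = right.makefile("rb")
via_file = read_bytes_using_file(sock_file, len(payload))

sock_file.close()
left.close()
right.close()
```

Both helpers return the same bytes; the difference is that the file object hands the buffering and short-read handling to `io.BufferedReader` rather than a Python-level loop.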
@sitingren - Hey, do you need any additional information regarding this change?
Wow! Impressive work 🥇
@sitingren - Thanks for the review,
Thanks @sitingren! When can we expect a release?
Very exciting news guys :)
This goes into release v1.0.5.
`buf += data` might not be needed at all. The read will return exactly the amount needed. As far as I remember, this interface differs from the original read in that it returns full results only.
Edit: both CPython and PyPy return an `io.BufferedReader`, which will handle this for you.
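A small local check of that behaviour, as a sketch rather than a test from this repository: a sender thread writes 1000 bytes in 100-byte pieces over a `socket.socketpair()`, and the receiver asks the file object for 600 bytes at a time. The helper name `send_in_pieces` and the sizes are made up for the example.

```python
import io
import socket
import threading


def send_in_pieces(sock, payload, piece_size):
    """Send payload in small pieces, then close to signal end of stream."""
    for start in range(0, len(payload), piece_size):
        sock.sendall(payload[start:start + piece_size])
    sock.close()


sender, receiver = socket.socketpair()
payload = b"x" * 1000

worker = threading.Thread(target=send_in_pieces, args=(sender, payload, 100))
worker.start()

# makefile("rb") with default buffering wraps the socket in a buffered reader.
reader = receiver.makefile("rb")
reader_type = type(reader)

# read(600) keeps reading from the blocking socket until 600 bytes are
# available, even though they arrive in 100-byte pieces.
first = reader.read(600)

# Only 400 bytes remain before the sender closes, so this read comes back
# short. A short result is how end of stream shows up.
rest = reader.read(600)

worker.join()
reader.close()
receiver.close()
```

On a blocking socket, a result shorter than requested only happens at end of stream, so a length check can replace the accumulation loop.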
#397