Use get_multi instead of get for datastore reads #1814

Merged: 3 commits merged into feast-dev:master from achal/batch-datastore on Aug 31, 2021

Member

@achals achals commented Aug 30, 2021

Signed-off-by: Achal Shah [email protected]

What this PR does / why we need it:

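In the Datastore client, `get` is a wrapper around `get_multi`. Calling `get_multi` once with all entity keys, instead of calling `get` once per entity, reduces the network cost involved.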
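
A minimal sketch of the difference, using a hypothetical in-memory stand-in (`FakeDatastoreClient`) rather than the real `google.cloud.datastore.Client`, so that it runs without credentials. The stub only mirrors the relationship described above: `get` delegates to `get_multi`, and every `get_multi` call counts as one round trip. The key strings and stored rows are made up for illustration.

```python
from typing import Dict, List, Optional


class FakeDatastoreClient:
    """Hypothetical in-memory stand-in for a Datastore client (not the real API)."""

    def __init__(self, rows: Dict[str, dict]) -> None:
        self._rows = rows
        self.round_trips = 0  # number of simulated network calls

    def get_multi(self, keys: List[str]) -> List[dict]:
        # One simulated round trip, however many keys are requested.
        self.round_trips += 1
        # Keys with no stored row are omitted from the result.
        return [self._rows[k] for k in keys if k in self._rows]

    def get(self, key: str) -> Optional[dict]:
        # `get` is a thin wrapper: a one-element `get_multi`.
        found = self.get_multi([key])
        return found[0] if found else None


# Ten made-up rows, keyed by made-up entity key strings.
rows = {f"driver_{i}": {"driver_id": i, "conv_rate": i / 10} for i in range(10)}
keys = list(rows)

# One `get` per entity: ten round trips.
single_client = FakeDatastoreClient(rows)
single_result = [single_client.get(k) for k in keys]

# One `get_multi` for all entities: a single round trip.
batch_client = FakeDatastoreClient(rows)
batch_result = batch_client.get_multi(keys)

print(single_client.round_trips, batch_client.round_trips)
```

Both paths return the same rows here. With the real client, missing entities are left out of the `get_multi` result and the returned list may not line up positionally with the requested keys, so code that maps results back to entities should probably match on each returned entity's key rather than on list position.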

Which issue(s) this PR fixes:

Fixes #1759

Does this PR introduce a user-facing change?:

Use a batch lookup API for Google Datastore. Users should see lower latencies for `get_online_features` when using Datastore as the online store.

@codecov-commenter

codecov-commenter commented Aug 30, 2021

Codecov Report

Merging #1814 (3a53cfb) into master (5857a55) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #1814      +/-   ##
==========================================
- Coverage   84.85%   84.84%   -0.01%     
==========================================
  Files          92       92              
  Lines        6828     6844      +16     
==========================================
+ Hits         5794     5807      +13     
- Misses       1034     1037       +3     
| Flag | Coverage Δ |
|---|---|
| integrationtests | 84.77% <100.00%> (-0.01%) ⬇️ |
| unittests | 63.90% <10.00%> (-0.04%) ⬇️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
|---|---|
| sdk/python/feast/infra/online_stores/datastore.py | 93.49% <100.00%> (+0.39%) ⬆️ |
| sdk/python/tests/doctest/test_all.py | 92.72% <0.00%> (-3.20%) ⬇️ |
| sdk/python/feast/infra/provider.py | 88.49% <0.00%> (-0.60%) ⬇️ |

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5857a55...3a53cfb.

@woop
Member

woop commented Aug 30, 2021

What is the performance difference between these two approaches? Have you run a benchmark?

@achals
Member Author

achals commented Aug 30, 2021

> What is the performance difference between these two approaches? Have you run a benchmark?

Retrieving 10 records, measured with pytest-benchmark. Single retrieval is slower across the board (mean 4.22 s vs. 2.99 s, about 1.41x), which is not surprising since `get` is implemented as a wrapper over `get_multi`.

----------------------------------------------------------------------------------------- benchmark: 2 tests ----------------------------------------------------------------------------------------
Name (time in s)                              Min               Max              Mean            StdDev            Median               IQR            Outliers     OPS            Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_online_retrieval_datastore_batch      2.8716 (1.0)      3.2226 (1.0)      2.9943 (1.0)      0.0986 (1.0)      2.9681 (1.0)      0.1220 (1.0)           8;0  0.3340 (1.0)          30           5
test_online_retrieval_datastore_single     4.0147 (1.40)     4.6170 (1.43)     4.2247 (1.41)     0.1446 (1.47)     4.2366 (1.43)     0.1492 (1.22)          9;3  0.2367 (0.71)         30           5
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
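
The direction of this result can be reproduced without Datastore by modeling each call as a fixed network delay. This is a rough sketch, not the benchmark that produced the table: `fetch_batch`, `fetch_single`, `timed`, and the 5 ms delay are all assumptions chosen for illustration, and timing uses `time.perf_counter` rather than pytest-benchmark.

```python
import time
from typing import Callable, List, Tuple

SIMULATED_LATENCY_S = 0.005  # assumed per-call delay, for illustration only


def fetch_batch(keys: List[str]) -> List[str]:
    # One simulated network delay for the whole batch of keys.
    time.sleep(SIMULATED_LATENCY_S)
    return [f"value-for-{k}" for k in keys]


def fetch_single(keys: List[str]) -> List[str]:
    # One simulated network delay per key, via a one-element batch each time.
    out: List[str] = []
    for k in keys:
        out.extend(fetch_batch([k]))
    return out


def timed(fn: Callable[[List[str]], List[str]], keys: List[str]) -> Tuple[List[str], float]:
    # Return the function's result together with its wall-clock duration.
    start = time.perf_counter()
    result = fn(keys)
    return result, time.perf_counter() - start


keys = [f"entity_{i}" for i in range(10)]
batch_values, batch_s = timed(fetch_batch, keys)
single_values, single_s = timed(fetch_single, keys)

print(f"batch:  {batch_s:.4f}s")
print(f"single: {single_s:.4f}s")
```

This toy model overstates the gap compared with the roughly 1.4x measured above, because it accounts only for a fixed per-call latency and nothing else in the request path.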

Collaborator

@felixwang9817 felixwang9817 left a comment

/lgtm

@feast-ci-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: achals, felixwang9817

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Signed-off-by: Achal Shah <[email protected]>
@feast-ci-bot
Collaborator

New changes are detected. LGTM label has been removed.

Signed-off-by: Achal Shah <[email protected]>
@achals achals added the lgtm label Aug 31, 2021
@feast-ci-bot feast-ci-bot merged commit f9b13b0 into feast-dev:master Aug 31, 2021
@achals achals deleted the achal/batch-datastore branch August 31, 2021 18:19
Development

Successfully merging this pull request may close these issues.

Datastore online request makes a call once for each entity
5 participants