Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

About nextQuery #99

Closed
stephenplusplus opened this issue Aug 7, 2014 · 6 comments
Closed

About nextQuery #99

stephenplusplus opened this issue Aug 7, 2014 · 6 comments
Assignees
Labels
api: datastore Issues related to the Datastore API. 🚨 This issue needs some love. triage me I really want to be triaged.

Comments

@stephenplusplus
Copy link
Contributor

Can we talk about this again? 😊

To use an example from datastore (but equally applies to the other service calls as well):

var query = ds.createQuery('Company').limit(5);
ds.runQuery(q, function(err, entities, nextQuery) {
    // nextQuery is not null if there are more results.
    if (nextQuery) {
        ds.runQuery(nextQuery, callback);
    }
});

Can you show me an example of how this is meant to be used? I'll take a guess, which I'll try to use to demonstrate the trouble I'm having with it:

var results = [];
function querySuccess(err, entities, nextQuery) {
  results.push(entities);

  if (nextQuery) {
    ds.runQuery(nextQuery, querySuccess);
  } else {
    nextThingIWantToDo();
  }
}

var q = ds.createQuery('Company').limit(5);
ds.runQuery(q, querySuccess);

There won't be anything I want to do in my first callback that I won't want to do in my second callback, or I would have run a different query, right? How would I manage doing something different on the second callback than the first, and something even different on the third, etc? That would be very unmanageable, and the reason someone would write 3 different queries if they needed 3 types of results.

When I run a query, I'm only interested in getting the entities that match my query back. Even if my query is loose enough to send me back 100,000 results, it would be my fault that I didn't restrict the limit in my query or specify strict enough matching rules in the query.

I know I asked about this before, but an "explain it like I'm 5" explanation would be welcome for why we can't just only call the callback once all results are in. :)

@rakyll
Copy link
Contributor

rakyll commented Aug 7, 2014

When I run a query, I'm only interested in getting the entities that match my query back.

I can add a special case for limit queries. If there is a limit set and I haven't retrieved all the pages, keep doing it until the end and call callback. Does it sound good?

@stephenplusplus
Copy link
Contributor Author

Sure, but what happens if I don't set a limit and do a query? Do I have to deal with the recursive query running still?

I'm just trying to understand why we want to give the developer control beyond the customization of the query. For example, if they don't set a limit, and their query yields a huge amount of results, what are the reasons they wouldn't want to keep running nextQuery to get the full set of data they queried for?

@rakyll
Copy link
Contributor

rakyll commented Aug 7, 2014

I'm just trying to understand why we want to give the developer control beyond the customization of the query

Typical batch jobs.

  • Retrieve some records
  • Process them
  • Retrieve more or schedule to retrieve more (User usually requires some level of control to fetch more data)

@pcostell, is it OK to continuously keep retrieving new pages in the background if limit is not met yet? We do pagination already to meet some performance constraints of the the backend, would it be harsh on the API if we implement auto pagination for the limit queries at the client level?

@pcostell
Copy link
Contributor

pcostell commented Aug 7, 2014

You definitely can, although you might cause unintended hurt if people start setting the limit to an arbitrarily high number but don't necessarily want that limit (it happens).

One of the key things to remember, especially if you look at the App Engine client libraries, is that Cloud Datastore is much better about giving you all the data that you want. In App Engine, the datastore has two explicit calls for queries, RunQuery and Next. Next allows the query to be continued, and is used pretty extensively for prefetching throughout the App Engine client libraries. However, the size of the result set returned from RunQuery is much smaller than that returned by Cloud Datastore. In particular, if you are writing a latency-sensitive application (i.e. user-facing stuff) and Cloud Datastore doesn't return the number of results that you wanted, the user should probably only display those results since getting a result means you're hitting a large size or time limit.

In the case of a user wanting to do a query and get all the results then process them all together, it may be best to let them hit the point where Cloud Datastore returns and force them to continue manually if that's really what they want.

Typical batch jobs.

  • Retrieve some records
  • Process them
  • Retrieve more or schedule to retrieve more (User usually requires some level of control to fetch more data)

I think here is really where this would be great. It definitely seems like a useful feature for users trying to do large workloads. In particular, doing something where the next batch is being retrieved as the user processes the current batch.

Note that this also might be a good place for the user to use a data processing framework because they can run queries in parallel (although the supported feature set of those queries is limited). We use a special scattered property to split their entire data into chunks which can be processed in parallel.

To answer your specific question, the Datastore API shouldn't have problems if you implement auto-pagination, however specifically for slow queries it may be a poor experience for users if they don't regain control until all results have been fetched.

@stephenplusplus
Copy link
Contributor Author

Thanks for the detailed reply, @pcostell! It sounds like we should keep the API the same with regards to always having to manually invoke nextQuery. It would likely result in confusion and/or misuse if we had a work-around that changed the behavior of the callback. Feel free to re-open if anyone feels differently 👍

@rakyll
Copy link
Contributor

rakyll commented Aug 7, 2014

I think we should keep providing runQuery as it is, but can provide helper iterators on top of that. Let's make a release, as people start to use the client, we will see patterns of replication. We might prefer to address them on a different layer and keep having runQuery as the most granular option for those who need performance.

@jgeewax jgeewax added api: datastore Issues related to the Datastore API. enhancement labels Feb 2, 2015
@jgeewax jgeewax added this to the Datastore Stable milestone Feb 2, 2015
@yoshi-automation yoshi-automation added triage me I really want to be triaged. 🚨 This issue needs some love. labels Apr 6, 2020
sofisl pushed a commit that referenced this issue Sep 27, 2022
sofisl pushed a commit that referenced this issue Oct 12, 2022
sofisl pushed a commit that referenced this issue Nov 11, 2022
[![WhiteSource Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [ts-loader](https://togithub.com/TypeStrong/ts-loader) | [`^8.0.2` -> `^9.0.0`](https://renovatebot.com/diffs/npm/ts-loader/8.1.0/9.0.0) | [![age](https://badges.renovateapi.com/packages/npm/ts-loader/9.0.0/age-slim)](https://docs.renovatebot.com/merge-confidence/) | [![adoption](https://badges.renovateapi.com/packages/npm/ts-loader/9.0.0/adoption-slim)](https://docs.renovatebot.com/merge-confidence/) | [![passing](https://badges.renovateapi.com/packages/npm/ts-loader/9.0.0/compatibility-slim/8.1.0)](https://docs.renovatebot.com/merge-confidence/) | [![confidence](https://badges.renovateapi.com/packages/npm/ts-loader/9.0.0/confidence-slim/8.1.0)](https://docs.renovatebot.com/merge-confidence/) |

---

### Release Notes

<details>
<summary>TypeStrong/ts-loader</summary>

### [`v9.0.0`](https://togithub.com/TypeStrong/ts-loader/blob/master/CHANGELOG.md#v900)

[Compare Source](https://togithub.com/TypeStrong/ts-loader/compare/v8.1.0...v9.0.0)

Breaking changes:

-   minimum webpack version: 5
-   minimum node version: 12

Changes:

-   [webpack 5 migration](https://togithub.com/TypeStrong/ts-loader/pull/1251) - thanks [@&#8203;johnnyreilly](https://togithub.com/johnnyreilly), [@&#8203;jonwallsten](https://togithub.com/jonwallsten), [@&#8203;sokra](https://togithub.com/sokra), [@&#8203;appzuka](https://togithub.com/appzuka), [@&#8203;alexander-akait](https://togithub.com/alexander-akait)

</details>

---

### Configuration

:date: **Schedule**: "after 9am and before 3pm" (UTC).

:vertical_traffic_light: **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

:recycle: **Rebasing**: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

:no_bell: **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box.

---

This PR has been generated by [WhiteSource Renovate](https://renovate.whitesourcesoftware.com). View repository job log [here](https://app.renovatebot.com/dashboard#github/googleapis/nodejs-dialogflow-cx).
sofisl pushed a commit that referenced this issue Nov 11, 2022
* chore(deps): upgrade gapic-generator-java to 2.4.1

PiperOrigin-RevId: 422607515

Source-Link: googleapis/googleapis@ba2ffd6

Source-Link: googleapis/googleapis-gen@73ba4ad
Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiNzNiYTRhZGQyMzlhNjE5ZGE1NjdmZmJkNGU1NzMwZmRkNmRlMDRkMyJ9

* 🦉 Updates from OwlBot

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
sofisl pushed a commit that referenced this issue Nov 11, 2022
- [ ] Regenerate this pull request now.

PiperOrigin-RevId: 408439482

Source-Link: googleapis/googleapis@b9f6184

Source-Link: googleapis/googleapis-gen@eb888bc
Copy-Tag: eyJwIjoiLmdpdGh1Yi8uT3dsQm90LnlhbWwiLCJoIjoiZWI4ODhiYzIxNGVmYzdiZjQzYmY0NjM0YjQ3MDI1NDU2NWE2NTlhNSJ9
sofisl pushed a commit that referenced this issue Nov 11, 2022
Source-Link: googleapis/synthtool@6981da4
Post-Processor: gcr.io/cloud-devrel-public-resources/owlbot-nodejs:latest@sha256:3563b6b264989c4f5aa31a3682e4df36c95756cfef275d3201508947cbfc511e

Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
sofisl pushed a commit that referenced this issue Nov 11, 2022
🤖 I have created a release *beep* *boop*
---


## [2.1.0](googleapis/nodejs-service-usage@v2.0.0...v2.1.0) (2022-06-30)


### Features

* support regapic LRO ([#98](googleapis/nodejs-service-usage#98)) ([b433238](googleapis/nodejs-service-usage@b433238))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
sofisl pushed a commit that referenced this issue Nov 11, 2022
* chore(main): release 2.0.2

* 🦉 Updates from OwlBot post-processor

See https://github.com/googleapis/repo-automation-bots/blob/main/packages/owl-bot/README.md

Co-authored-by: release-please[bot] <55107282+release-please[bot]@users.noreply.github.com>
Co-authored-by: Owl Bot <gcf-owl-bot[bot]@users.noreply.github.com>
sofisl pushed a commit that referenced this issue Nov 11, 2022
🤖 I have created a release *beep* *boop*
---


## [2.0.1](googleapis/nodejs-essential-contacts@v2.0.0...v2.0.1) (2022-06-30)


### Bug Fixes

* **docs:** describe fallback rest option ([#98](googleapis/nodejs-essential-contacts#98)) ([1ed45c6](googleapis/nodejs-essential-contacts@1ed45c6))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
sofisl pushed a commit that referenced this issue Nov 16, 2022
Co-authored-by: release-please[bot] <55107282+release-please[bot]@users.noreply.github.com>
sofisl pushed a commit that referenced this issue Nov 17, 2022
sofisl pushed a commit that referenced this issue Nov 30, 2022
…ept strings (#99)

* [CHANGE ME] Re-generated  to pick up changes in the API or client library generator.

* fix: resource name helper functions

Co-authored-by: Alexander Fenster <[email protected]>
sofisl pushed a commit that referenced this issue Jan 26, 2023
* chore: migrate to owl bot

* chore: copy files from googleapis-gen 298d075b25f90307b1e3b56eb256b66304dd329c

* chore: run the post processor

* chore: migrate to owl bot

* chore: copy files from googleapis-gen 674a41e0de2869f44f45eb7b1a605852a5394bba

* chore: run the post processor

* chore: retrigger github checks

Co-authored-by: Luke Davis <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: datastore Issues related to the Datastore API. 🚨 This issue needs some love. triage me I really want to be triaged.
Projects
None yet
Development

No branches or pull requests

5 participants