createQueryStream is not truly streaming; seems to be pooling #120

Closed
jimkang opened this issue Jun 13, 2018 · 6 comments
Labels
api: bigquery Issues related to the googleapis/nodejs-bigquery API. priority: p2 Moderately-important priority. Fix may not be included in next release. 🚨 This issue needs some love. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

jimkang commented Jun 13, 2018

Environment details

  • OS: Ubuntu, OS X
  • Node.js version: 8
  • npm version: 6.0.1
  • @google-cloud/bigquery version: 1.3.0

Steps to reproduce

  1. Use createQueryStream to run a query that returns a very large result set, in the GB range.
  2. Put a log in the data event handler for the stream.
  3. Watch the memory usage of that Node process swell before even a single data event. If you have a big enough query, you'll get an OOM error, e.g.:
OOM in node_modules/@google-cloud/common/src/util.js:185
 183   if (is.string(body)) {
 184     try {
>185       parsedHttpRespBody.body = JSON.parse(body);
 186     } catch (err) {
 187       parsedHttpRespBody.err = new util.ApiError('Cannot parse JSON response');
debug>
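
The reproduction steps above can be sketched roughly as follows. This is a minimal sketch, not the reporter's actual code: `watchQueryStream`, `log`, and `logEvery` are hypothetical names, and the client is passed in as a parameter so the snippet does not depend on how a particular library version is constructed. It logs heap usage once before any row arrives, then on the first row and every `logEvery` rows after that.

```javascript
// Hypothetical helper: `bigquery` is any object exposing createQueryStream(query),
// for example a client from @google-cloud/bigquery.
function watchQueryStream(bigquery, query, { log = console.log, logEvery = 10000 } = {}) {
  const heapMb = () => Math.round(process.memoryUsage().heapUsed / 1024 / 1024);

  return new Promise((resolve, reject) => {
    let rows = 0;
    log(`before first row: heap ${heapMb()} MB`);

    bigquery
      .createQueryStream(query)
      .on('error', reject)
      .on('data', () => {
        rows += 1;
        // Log on the first row and then periodically, to see when rows start arriving.
        if (rows === 1 || rows % logEvery === 0) {
          log(`row ${rows}: heap ${heapMb()} MB`);
        }
      })
      .on('end', () => resolve(rows)); // resolves with the total row count
  });
}
```

If results were truly streamed, the "row 1" line would be expected to appear while heap usage is still small; with the behavior described in this issue, memory grows well before that line is logged.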

Can we get it to stream data as it gets it and NOT pool it up in memory?
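
To make the request concrete, the difference between pooling and streaming can be sketched with Node's built-in `stream` module. This is an illustration only, not the library's implementation: `fetchPage` is a hypothetical function that takes a page token and resolves to `{ rows, nextToken }`, and `collectAllRows` and `streamRows` are hypothetical names. The pooled version holds every row in memory before the caller sees one; the streamed version emits rows page by page.

```javascript
const { Readable } = require('stream');

// Pooled: every page is accumulated before anything is returned,
// so peak memory grows with the full result set.
async function collectAllRows(fetchPage) {
  const all = [];
  let token;
  do {
    const page = await fetchPage(token);
    for (const row of page.rows) all.push(row);
    token = page.nextToken;
  } while (token);
  return all;
}

// Streamed: rows are emitted as each page arrives. Pages are fetched lazily
// as the stream is read, so a slow consumer applies backpressure.
function streamRows(fetchPage) {
  async function* generate() {
    let token;
    do {
      const page = await fetchPage(token);
      yield* page.rows;
      token = page.nextToken;
    } while (token);
  }
  return Readable.from(generate()); // object-mode Readable
}
```

Both functions return the same rows in the same order; only the peak memory profile differs.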

JustinBeckwith added the triage me, priority: p2, and type: bug labels Jun 13, 2018
JustinBeckwith removed the triage me label Jun 15, 2018

klon commented Sep 12, 2018

Do you have any kind of workaround for 1.3.0 to make streaming work for createQueryStream? We are having memory issues due to this bug.
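
No workaround is given in this thread. One approach that might bound memory is manual pagination: requesting results one bounded page at a time instead of relying on automatic pagination. The sketch below assumes that `job.getQueryResults()` accepts `autoPaginate: false` and `maxResults`, and that it resolves to `[rows, nextQuery]`, where `nextQuery` carries the token for the next page; this should be checked against the installed library version. `eachPage` and `pageSize` are hypothetical names. Whether this avoids the OOM reported above is not confirmed here.

```javascript
// Hypothetical helper: yields one page of rows at a time.
// `job` is any object whose getQueryResults(options) resolves to [rows, nextQuery].
async function* eachPage(job, pageSize = 1000) {
  let options = { autoPaginate: false, maxResults: pageSize };
  while (options) {
    const [rows, nextQuery] = await job.getQueryResults(options);
    yield rows;
    // Assumption: nextQuery is null/undefined after the last page.
    options = nextQuery
      ? { ...nextQuery, autoPaginate: false, maxResults: pageSize }
      : null;
  }
}
```

A caller would consume it with `for await (const rows of eachPage(job, 500)) { ... }`, processing and releasing each page before the next one is requested.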

Author

jimkang commented Oct 17, 2018

Hi, any update on when this might be published?

Portur commented Aug 27, 2020

I want to keep this issue alive as I recently had to upgrade my GAE instance size (+$40 p/m) to 2.3 GB of memory to cope with this bug. It's been 2 years. If you could find the time to fix this, I'd greatly appreciate it.

Contributor

stephenplusplus commented Aug 27, 2020

@Portur could you create a new issue? As far as we knew, this was resolved. A new issue would give a chance for you to share more details and reproduction steps using our issue template. Sorry for the trouble.

Portur commented Aug 28, 2020

Sorry, miscommunication on my side. I had an instance die on me a few times when querying data from BigQuery. I opened a support ticket with GCP and they directed me to this bug.

In summary, they explained that the Node.js version is not efficient/performant when querying data. I'd like to explain, but without the context and code snippets from the support ticket I can't fully explain what went wrong and where. All I can say right now is that createQueryStream and job.getQueryResults() have both caused memory issues, which caused the instances to die in GAE.

This issue is largely unresolved from GCP Support's standpoint and may not be directly related to createQueryStream.

Is there any way for me to share the content of the ticket without hitting you with a wall of text? I'd be happy to open a new ticket with a public Google Docs link.

@stephenplusplus
Contributor

Thanks for the information. Creating a new ticket with a Docs link sounds like a good start. Sorry for the inconvenience this has caused you. Hopefully we can figure out what's going on!
