BigQuery: support /upload method for loading data from local files. #1392

Closed
tswast opened this issue Nov 11, 2016 · 16 comments · Fixed by #1395

@tswast
Contributor

tswast commented Nov 11, 2016

See documentation: https://cloud.google.com/bigquery/loading-data-post-request

Python, PHP, Ruby, and Node.js provide an upload-from-file method on the Table object.

This is blocking documentation samples.

@tswast tswast added the api: bigquery Issues related to the BigQuery API. label Nov 11, 2016
@tswast
Contributor Author

tswast commented Nov 11, 2016

Actually, I see we have a LoadJobConfiguration: http://googlecloudplatform.github.io/google-cloud-java/0.5.1/apidocs/com/google/cloud/bigquery/LoadJobConfiguration.html. That might be sufficient.

@mziccard
Contributor

Tim, have a look at BigQuery.writer(). It lets you upload a file in chunks (generally a good idea, as these files tend to be quite big). It should be rather simple to write a whole file using it.
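A minimal sketch of the chunked approach described above, using only java.nio. The class name ChunkedCopy and its copy method are hypothetical helpers, not part of google-cloud-java. The sketch assumes that the channel returned by BigQuery.writer() can be used as a standard WritableByteChannel sink; here an in-memory or file channel would work the same way.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

// Hypothetical helper (not a library class): streams a source into a sink chunk by chunk.
final class ChunkedCopy {
  private ChunkedCopy() {}

  /** Copies all bytes from source to sink using a buffer of chunkSize bytes; returns the byte count. */
  static long copy(ReadableByteChannel source, WritableByteChannel sink, int chunkSize)
      throws IOException {
    if (chunkSize <= 0) {
      throw new IllegalArgumentException("chunkSize must be positive");
    }
    ByteBuffer buffer = ByteBuffer.allocate(chunkSize);
    long total = 0;
    while (source.read(buffer) != -1) {
      buffer.flip(); // switch from filling the buffer to draining it
      while (buffer.hasRemaining()) {
        total += sink.write(buffer); // a single write may not drain the buffer
      }
      buffer.clear(); // make the whole buffer available for the next read
    }
    return total;
  }
}
```

With a real table writer, the sink argument would be the channel obtained from bigquery.writer(...), and the source could be a FileChannel opened on the local file. The sink still has to be closed by the caller to finalize the upload.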

@tswast
Contributor Author

tswast commented Nov 11, 2016

Thanks for the pointer.

@mziccard
Contributor

Can we close this?

@tswast tswast closed this as completed Nov 11, 2016
@tswast tswast reopened this Nov 11, 2016
@tswast
Contributor Author

tswast commented Nov 11, 2016

Is there a way to get the Job the writer must create? The Python version is able to get the number of rows written in its sample.

https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/bigquery/cloud-client/load_data_from_file.py#L51

I am trying to improve the snippets for this (there are no integration tests) and having the returned data from the Job would be really useful.

@tswast
Contributor Author

tswast commented Nov 11, 2016

Or even getting the Job ID would be useful. Then I can fetch the job results after finishing my writes.

@mziccard
Contributor

> Is there a way to get the Job the writer must create? The Python version is able to get the number of rows written in its sample.

I need to look into this. I don't know of any way to get the job metadata/ID when using a resumable upload: the open request just returns the upload ID, and all subsequent POST/PUT requests seem to have an empty response body.

@tswast
Contributor Author

tswast commented Nov 11, 2016

Thanks. I'll dig into it, too.

@tswast
Contributor Author

tswast commented Nov 11, 2016

My current thought is making BigQueryRpc.open() return a Job instead of a String. I will attempt to get a PR out for this.

@mziccard
Contributor

Tim, I was wrong. The last call of a chunked upload returns all the metadata of the newly created job. We can put #1394 on hold until I fix this. I will probably make TableDataWriteChannel public and add a getJob() method that returns the job associated with the upload, available only once the channel is closed (i.e. the upload was finalized).
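The proposed behavior can be illustrated with a small stand-in. SketchUploadChannel, getJobId(), and the "job-for-..." ID format below are all hypothetical and exist only for this sketch; they are not the library's TableDataWriteChannel. The sketch assumes the job identity only becomes known when the upload is finalized on close, so the accessor returns null while the channel is still open.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.WritableByteChannel;

// Hypothetical stand-in for an upload channel whose job is only known after close().
final class SketchUploadChannel implements WritableByteChannel {
  private final ByteArrayOutputStream uploaded = new ByteArrayOutputStream();
  private boolean open = true;
  private String jobId; // stays null until the upload is finalized

  @Override
  public int write(ByteBuffer src) throws IOException {
    if (!open) {
      throw new ClosedChannelException();
    }
    int n = src.remaining();
    byte[] chunk = new byte[n];
    src.get(chunk);
    uploaded.write(chunk, 0, n); // stands in for sending one chunk to the server
    return n;
  }

  @Override
  public boolean isOpen() {
    return open;
  }

  @Override
  public void close() {
    if (!open) {
      return; // closing twice is a no-op
    }
    open = false;
    // Stands in for the final upload request, whose response carries the job metadata.
    jobId = "job-for-" + uploaded.size() + "-bytes";
  }

  /** Returns the job ID once the channel is closed, or null while it is still open. */
  String getJobId() {
    return jobId;
  }
}
```

Returning null before close keeps the accessor safe to call at any time, at the cost of requiring callers to close the channel before asking for the job.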

@tswast
Contributor Author

tswast commented Nov 11, 2016

@mziccard Thanks. That sounds like a great solution.

@mziccard
Contributor

Thanks for catching this. I had always assumed that the job metadata wasn't available, and I couldn't have been more wrong :)

@tswast
Contributor Author

tswast commented Nov 16, 2016

I'm getting an error:

java.lang.ClassCastException: com.google.cloud.bigquery.JobStatistics$CopyStatistics cannot be cast to com.google.cloud.bigquery.JobStatistics$LoadStatistics

when trying to use this new functionality in the snippets like:

TableDataWriteChannel writer = bigquery.writer(writeChannelConfiguration);
// Write data to writer
try {
  writer.write(ByteBuffer.wrap(csvData.getBytes(Charsets.UTF_8)));
} finally {
  writer.close();
}
// Get load job
Job job = writer.getJob();
job.waitFor();
LoadStatistics stats = job.getStatistics();
return stats.getOutputRows();

@tswast
Contributor Author

tswast commented Nov 16, 2016

I think it should be a Load Job, not a Copy Job.

@tswast
Contributor Author

tswast commented Nov 16, 2016

Ah, setting job = job.waitFor(); fixes it.

Is this desired behavior? @mziccard
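One plausible reading of why the reassignment helps is that a job object is a snapshot taken at one point in time, and waitFor() hands back a fresh snapshot instead of updating the one it was called on. The sketch below models only that pattern. SnapshotJob and its fields are hypothetical, not the library's Job class, and the sketch does not reproduce the ClassCastException itself.

```java
// Hypothetical immutable snapshot of a job; not the library's Job class.
final class SnapshotJob {
  private final String state;
  private final Long outputRows;    // null while the job has not finished
  private final long rowsOnceDone;  // what the "server" will report at completion

  SnapshotJob(String state, Long outputRows, long rowsOnceDone) {
    this.state = state;
    this.outputRows = outputRows;
    this.rowsOnceDone = rowsOnceDone;
  }

  /** Returns a NEW snapshot describing the finished job; this object is left unchanged. */
  SnapshotJob waitFor() {
    return new SnapshotJob("DONE", rowsOnceDone, rowsOnceDone);
  }

  String getState() {
    return state;
  }

  Long getOutputRows() {
    return outputRows;
  }
}
```

Under this model, calling job.waitFor() and discarding the result leaves job holding the pre-completion snapshot, while job = job.waitFor() replaces it with the completed one, which matches the fix reported above.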

@michael-hll

I tried the example, and it seems the issue is still there. It happens when my CSV data is not well prepared. I'm using the version below:

groupId: com.google.cloud
artifactId: google-cloud
version: 0.8.0
