Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wait for job result before emitting complete. #85

Merged
merged 3 commits into from
Aug 13, 2018

Conversation

stephenplusplus
Copy link
Contributor

@stephenplusplus stephenplusplus commented Apr 2, 2018

Fixes #8

This would be a breaking change

Currently, if you call table.createWriteStream(), the stream will emit "complete" with a Job object. You can then use that Job object to assign event handlers, as usual with an LRO. Judging from #8, I'm not sure this pattern is clear to the users. And aside from that, I'm not sure what we're doing is ideal, either.

Ideally, we would not emit "complete" until the Job finished, instead of making the user do it manually. This PR shows how easy it would be for us to implement. However, I believe this would be a breaking change, although it's hard to say if it's actually a bugfix.

I'm leaving this open for discussion on how we should handle this. Please share your thoughts!


Before

table.createWriteStream(...)
  .on('complete', job => {
    job
      .on('error', err => {
        // Job failed.
      })
      .on('complete', () => {
        // Job completed successfully.
      });
  });

After

table.createWriteStream(...)
  .on('error', err => {
    // Job failed.
  })
  .on('job', job => {
    // `job` is a Job object.
  })
  .on('finish', () => {
    // Job completed successfully.
  });

And, complete still works to combine those. I added the "finish" event, because that's the natural writable stream closing event.

table.createWriteStream(...)
  .on('error', err => {
    // Job failed.
  })
  .on('complete', job => {
    // `job` is a Job object.
    // Job completed successfully.
  });

@stephenplusplus stephenplusplus added the do not merge Indicates a pull request not ready for merge, due to either quality or timing. label Apr 2, 2018
@callmehiphop
Copy link
Contributor

I'm not sure what the official rules are for making breaking changes are now that we're 1.x.x. But we're definitely going to want a lower level equivalent that returns the job. If making a breaking change is too cumbersome, you could just create a new method with a different name that abstracts the job away.

@stephenplusplus
Copy link
Contributor Author

We could add an option, e.g. waitForJobCompletion, which would be false by default.

@callmehiphop
Copy link
Contributor

We could add an option, e.g. waitForJobCompletion, which would be false by default.

The BigQuery team has previously requested that we provide both a high and low level method for all methods that create a job. Aside from that using an option to change the signature of a callback has generally been frowned upon in the past. Not to mention this behavior might be found confusing since no other job related methods follow this pattern.

@stephenplusplus
Copy link
Contributor Author

API team feedback is welcome. We wouldn't change the signature-- we can still emit "complete" with the job. IMO, we'd be making the behavior of the method more logical.

Not to mention this behavior might be found confusing since no other job related methods follow this pattern.

Do we have stream methods that follow that pattern?

@callmehiphop
Copy link
Contributor

We wouldn't change the signature-- we can still emit "complete" with the job.

But the Job itself would be in a different state, hence why this is a breaking change. IMO that is a signature change.

I'll go with whatever the group decides, but my vote would be for a new higher level method instead of using an option.

@stephenplusplus
Copy link
Contributor Author

stephenplusplus commented Apr 2, 2018

I think there's room for multiple interpretations of breaking-- to me, the contestants for considering this a breaking change are:

  1. Waiting for a job to complete takes longer than before (code doesn't change, though)
  2. We could emit an error where we didn't before (code doesn't change, but where there wasn't an error before, there now could be)

The candidates for this to be a bug fix to me are:

  1. As discovered in the issue, it's not clear to users they have to assign further event handlers to the Job object (frankly, I didn't know, either)
  2. Users could be getting the "complete" event and proceeding as if all operations succeeded, when in fact, there were errors

If we decide a breaking change, that's fine. When considering how to solve, I'll just mention that createWriteStream is a bit different than a regular method named after an upstream one, in that this uses the de facto writable-stream language with "createWriteStream" (e.g. fs.createWriteStream). I would prefer to solve with an option, so we can retain the logical name.

@JustinBeckwith
Copy link
Contributor

@stephenplusplus what's the status on this one?

@codecov
Copy link

codecov bot commented Jul 25, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@f330015). Click here to learn what that means.
The diff coverage is 33.33%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master      #85   +/-   ##
=========================================
  Coverage          ?   99.68%           
=========================================
  Files             ?        4           
  Lines             ?      640           
  Branches          ?        0           
=========================================
  Hits              ?      638           
  Misses            ?        2           
  Partials          ?        0
Impacted Files Coverage Δ
src/table.js 99.29% <33.33%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f330015...d33ac44. Read the comment docs.

@stephenplusplus
Copy link
Contributor Author

I've updated the PR, it still needs tests. Before I get to that point, I suppose a quick roundup of opinions on this change would help decide if we want this change. My thoughts are stated above and throughout the discussion so far. @JustinBeckwith hearing your thoughts would be great!

@ghost ghost assigned JustinBeckwith Jul 31, 2018
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Jul 31, 2018
@JustinBeckwith
Copy link
Contributor

I'm going to be honest - I'm not deep enough with this module to make an informed decision on what we should do :) We have a semver major bump coming up very soon, so if we're going to make a breaking change, now's the time.

I would trust the opinion @tswast here.

@tswast
Copy link
Contributor

tswast commented Aug 10, 2018

I've definitely found it confusing that complete is emitted before the job actually finishes. I'm in favor of this change.

The BigQuery team has previously requested that we provide both a high and low level method for all methods that create a job. Aside from that using an option to change the signature of a callback has generally been frowned upon in the past. Not to mention this behavior might be found confusing since no other job related methods follow this pattern.

We actually somewhat ignored this advice in the Python client because it made a whole lot more sense to make a flexible method for creating a job (well, technically we have a method for each job type) and provide multiple ways to wait for it to finish (blocking + asynchronously). I think the same logic applies to Node.js: we should use the the built-in ways to wait for things.

I'm happy so long as the following are satisfied:

  • The initial function call to create the job is non-blocking,
  • You can pass in a full job configuration to customize the job.
  • The initial function call returns the initial job state (so that the job ID can be serialized / shared with the user before waiting),
  • It's easy to wait for the job to complete.

@stephenplusplus stephenplusplus removed the do not merge Indicates a pull request not ready for merge, due to either quality or timing. label Aug 10, 2018
@stephenplusplus
Copy link
Contributor Author

👍

I made the change. Here's the old and new:

Before

table.createWriteStream(...)
  .on('complete', job => {
    job
      .on('error', err => {
        // Job failed.
      })
      .on('complete', () => {
        // Job completed successfully.
      });
  });

After

table.createWriteStream(...)
  .on('error', err => {
    // Job failed.
  })
  .on('job', job => {
    // `job` is a Job object.
  })
  .on('finish', () => {
    // Job completed successfully.
  });

And, complete still works to combine those. I added the "finish" event, because that's the natural writable stream closing event.

table.createWriteStream(...)
  .on('error', err => {
    // Job failed.
  })
  .on('complete', job => {
    // `job` is a Job object.
    // Job completed successfully.
  });

@stephenplusplus stephenplusplus merged commit 1130ee5 into googleapis:master Aug 13, 2018
@stephenplusplus stephenplusplus deleted the spp--8 branch August 13, 2018 22:59
@callmehiphop callmehiphop mentioned this pull request Aug 24, 2018
3 tasks
stephenplusplus pushed a commit that referenced this pull request Aug 27, 2018
Fixes #174
Closes #173 

- [x] Tests and linter pass
- [x] Code coverage does not decrease (if any source code was changed)
- [x] Appropriate docs were updated (if necessary)

The original issue being addressed (#174) was remedied by simply upgrading the common package. However additional bugs were identified after that were introduced via #85. A second commit to fix said bug was created.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bigquery:createWriteStream() should propogate insertErrors
5 participants