Coalesce rows #89
@pradithya @woop @budi This is the PR I mentioned to you in chat. Take a look if you're interested in the implementation details. It still needs a few changes.
/wip
public class JobOptions implements Options {

  private long sampleLimit;
How will all these configuration options affect ingestion?
public class CoalesceFeatureRows extends
    PTransform<PCollection<FeatureRow>, PCollection<FeatureRow>> {

  private static final Comparator<Timestamp> TIMESTAMP_COMPARATOR = Comparator
We can use Timestamps.compare(t1, t2) for this.
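To make the suggestion concrete, here is a minimal sketch of what a hand-rolled timestamp comparator does. The body of TIMESTAMP_COMPARATOR is truncated in the diff above, so the ordering shown here (seconds first, then nanos) is an assumption, and `Ts` is a hypothetical stand-in for the protobuf `Timestamp` message so that the example runs with the standard library alone. With protobuf-java-util on the classpath, `Timestamps.compare(t1, t2)` should give the same ordering without the custom field.

```java
import java.util.Comparator;

final class TimestampCompareSketch {

  // Hypothetical stand-in for com.google.protobuf.Timestamp (seconds + nanos).
  static final class Ts {
    final long seconds;
    final int nanos;

    Ts(long seconds, int nanos) {
      this.seconds = seconds;
      this.nanos = nanos;
    }
  }

  // Assumed shape of a hand-rolled comparator: order by seconds, then by nanos.
  static final Comparator<Ts> TIMESTAMP_COMPARATOR =
      Comparator.comparingLong((Ts t) -> t.seconds).thenComparingInt(t -> t.nanos);

  // Returns the later of the two timestamps; on a tie the first argument wins.
  static Ts latest(Ts a, Ts b) {
    return TIMESTAMP_COMPARATOR.compare(a, b) >= 0 ? a : b;
  }

  public static void main(String[] args) {
    Ts t1 = new Ts(100, 500);
    Ts t2 = new Ts(100, 900);
    // Same seconds, so the nanos field decides the order.
    System.out.println(TIMESTAMP_COMPARATOR.compare(t1, t2) < 0);
    System.out.println(latest(t1, t2) == t2);
  }
}
```

Swapping the custom comparator for the library call removes one static field from the transform and keeps the comparison logic in one well-tested place.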
@@ -170,6 +175,12 @@ public void expand() {
        ParDo.of(new RoundEventTimestampsDoFn())),
        pFeatureRows.getErrors());

    if (jobOptions.isCoalesceRowsEnabled()) {
      pFeatureRows = pFeatureRows.apply("foo", new CoalescePFeatureRows(
"foo" can be replaced with "Coalesce Feature Row"
    return sampleLimit;
  }

  public void setSampleimit(long sampleLimit) {
Typo in the method name: setSampleimit should be setSampleLimit.
… key in the global window
…ing history from serving stores
/hold cancel /assign pradithya please review
I've included not writing to the feature stores as requested.
Related to #87
The biggest external change is to the ImportSpec. From:
To:
To turn on coalesceRows you need to pass … You can also set coalesceRows.delaySeconds and coalesceRows.timeoutSeconds, but these settings are only relevant for streaming. The delay indicates how many seconds the watermark must advance before the rows are flushed; the default is 10 seconds.
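The delay setting can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the Beam transform in this PR: rows are buffered per key, a later row's values overwrite an earlier row's values for the same feature (an assumption about the merge rule), and a key is flushed once the watermark has advanced at least delaySeconds past the point where the key was first buffered. The timeout setting is not modeled, and `CoalesceSketch`, `add`, `advanceWatermark`, and the example key and feature names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CoalesceSketch {
  // Default taken from the description above: 10 seconds.
  static final long DEFAULT_DELAY_SECONDS = 10;

  private final long delaySeconds;
  // Merged feature values per key, kept in first-seen key order.
  private final Map<String, Map<String, Object>> pending = new LinkedHashMap<>();
  // Watermark (in seconds) at which each key was first buffered.
  private final Map<String, Long> bufferedAt = new HashMap<>();

  CoalesceSketch(long delaySeconds) {
    this.delaySeconds = delaySeconds;
  }

  // Buffer a row. Assumption: later values overwrite earlier ones per feature.
  void add(String key, Map<String, Object> features, long watermarkSeconds) {
    pending.computeIfAbsent(key, k -> new LinkedHashMap<>()).putAll(features);
    bufferedAt.putIfAbsent(key, watermarkSeconds);
  }

  // Emit and drop every key buffered at least delaySeconds before this watermark.
  List<Map<String, Object>> advanceWatermark(long watermarkSeconds) {
    List<Map<String, Object>> flushed = new ArrayList<>();
    Iterator<Map.Entry<String, Map<String, Object>>> it = pending.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, Map<String, Object>> entry = it.next();
      if (watermarkSeconds - bufferedAt.get(entry.getKey()) >= delaySeconds) {
        flushed.add(entry.getValue());
        bufferedAt.remove(entry.getKey());
        it.remove();
      }
    }
    return flushed;
  }

  public static void main(String[] args) {
    CoalesceSketch sketch = new CoalesceSketch(DEFAULT_DELAY_SECONDS);
    sketch.add("driver:1", Map.of("lat", 1.0), 0);
    sketch.add("driver:1", Map.of("lon", 2.0), 3);
    // Watermark has only advanced 5 seconds: nothing is flushed yet.
    System.out.println(sketch.advanceWatermark(5).isEmpty());
    // Watermark has advanced 10 seconds: one merged row with both features.
    System.out.println(sketch.advanceWatermark(10));
  }
}
```

In batch mode there is no moving watermark to wait for, which is consistent with the note that the delay and timeout only matter for streaming.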
should this be on or off by default?
I think it should be on by default. Is this in-scope for 0.1.0?
/assign zhilingc Can you run with some of your existing workloads?
@tims done, I think we're good to merge it in
/approve
sorry, sausage fingers :(
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: zhilingc
/lgtm
@tims: you cannot LGTM your own PR.
boo... @pradithya can you lgtm this?
/lgtm
For #88
/wip