Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Persist/restore state for DFA classification #50040

Conversation

dimitris-athanasiou
Copy link
Contributor

@dimitris-athanasiou dimitris-athanasiou commented Dec 10, 2019

This commit adds state persist/restore for data frame analytics classification jobs.

@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (:ml)

Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

}

@Override
public String getStateDocId(String jobId) {
throw new UnsupportedOperationException();
// The state doc id prefix is same as for regression
return jobId + "_regression_state#1";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be "_classification_state" instead?

Copy link
Contributor Author

@dimitris-athanasiou dimitris-athanasiou Dec 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure whether the underlying state was similar so I ignored this. However, after talking to @tveasey about it we decided it'd be clearer to distinguish between them. So, I'll be raising a PR to address this in the c++ side. Thanks for raising it!

@@ -444,4 +448,12 @@ private static void indexData(String sourceIndex, int numTrainingRows, int numNo
assertThat(confusionMatrixResult.getOtherActualClassCount(), equalTo(0L));
}
}

private static void assertModelStatePersisted(String jobId) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it the same method as in RegressionIT? If so, it could be moved to MlNativeDataFrameAnalyticsIntegTestCase, just like assertInferenceModelPersisted.

This commit enables restoring state for data frame analytics
classification jobs.
@dimitris-athanasiou dimitris-athanasiou changed the title [ML] Restore state for DFA classification [ML] Persist/restore state for DFA classification Dec 12, 2019
@dimitris-athanasiou
Copy link
Contributor Author

@droberts195, @przemekwitek I have adjusted the PR to handle different suffix per analysis for the state doc ids. Could you please take another look?

@dimitris-athanasiou
Copy link
Contributor Author

run elasticsearch-ci/2

@dimitris-athanasiou
Copy link
Contributor Author

run elasticsearch-ci/packaging-sample-matrix

Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

It would be good to delete orphaned data frame analytics state documents in the nightly maintenance in a followup PR.

@dimitris-athanasiou dimitris-athanasiou merged commit b805d95 into elastic:master Dec 12, 2019
@dimitris-athanasiou dimitris-athanasiou deleted the restore-classification-state branch December 12, 2019 15:27
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Dec 12, 2019
This commit adds state persist/restore for data frame analytics classification jobs.

Backport of elastic#50040
dimitris-athanasiou added a commit that referenced this pull request Dec 13, 2019
)

This commit adds state persist/restore for data frame analytics classification jobs.

Backport of #50040
Copy link
Contributor

@przemekwitek przemekwitek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

SivagurunathanV pushed a commit to SivagurunathanV/elasticsearch that referenced this pull request Jan 23, 2020
This commit adds state persist/restore for data frame analytics classification jobs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants