[ML] fix x-pack usage regression caused by index migration #36936

hendrikmuhs · 2018-12-21T13:15:20Z

Changes the feature usage retrieval to use the job manager rather than
talking directly to the cluster state, because jobs can now be either in
cluster state or stored in an index.

This is a follow-up of #36702 / #36698

This is a blocker for 6.6: for every cluster that has at least one ML job created with 6.6+ (job configuration persisted in the new config index), the _xpack/usage endpoint is broken, which also breaks the collection of usage data.

Notes:

I am looking into adding an integration test for this, as it has only been found by manual testing.

elasticmachine · 2018-12-21T13:15:21Z

Pinging @elastic/ml-core

droberts195

LGTM

hendrikmuhs · 2018-12-21T16:03:12Z

This PR still fails: JobManager is not created if the ML plugin is disabled, which causes MachineLearningFeatureSet to fail to load:

[2018-12-21T16:45:22,991][ERROR][o.e.b.Bootstrap          ] [node-0] Guice Exception: org.elasticsearch.common.inject.CreationException: Guice creation errors:

1) Could not find a suitable constructor in org.elasticsearch.xpack.ml.job.JobManager. Classes must have either one (and only one) constructor annotated with @Inject or a zero-argument constructor that is not private.
  at org.elasticsearch.xpack.ml.job.JobManager.class(Unknown Source)
  while locating org.elasticsearch.xpack.ml.job.JobManager
    for parameter 4 at org.elasticsearch.xpack.ml.MachineLearningFeatureSet.<init>(Unknown Source)
  at _unknown_

1 error
        at <<<guice>>>
        at org.elasticsearch.node.Node.<init>(Node.java:556)
        at org.elasticsearch.node.Node.<init>(Node.java:250)
        at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:211)
        at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:211)
        at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:325)
        at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159)
        at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150)
        at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86)
        at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124)
        at org.elasticsearch.cli.Command.main(Command.java:90)
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:115)
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:92)

If anyone has a best practice for this case, please let me know.

droberts195 · 2018-12-22T09:55:48Z

I think the solution is to have guice inject a different class, say JobManagerHolder, that contains a reference to a JobManager that is null when ML is disabled. Then MachineLearning.createComponents can return a list containing just a JobManagerHolder (with null JobManager reference) in cases where it currently returns an empty list.
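A minimal sketch of the holder idea described above, not the code that was merged. `JobManager` here is a simplified stand-in for the real class, and `HolderSketch`, `createComponents(boolean)`, `jobCount()` and `usageJobCount` are hypothetical names used only for illustration. The point is that the holder can always be constructed and injected, while the wrapped reference is null when ML is disabled.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Simplified stand-in for org.elasticsearch.xpack.ml.job.JobManager (assumption).
class JobManager {
    int jobCount() {
        return 0; // placeholder: the real class would look up job configurations
    }
}

// Wrapper that is always available for injection, even when ML is disabled.
class JobManagerHolder {
    private final JobManager instance; // null when ML is disabled

    JobManagerHolder() {
        this.instance = null;
    }

    JobManagerHolder(JobManager jobManager) {
        this.instance = jobManager;
    }

    boolean isEmpty() {
        return instance == null;
    }

    JobManager getJobManager() {
        if (instance == null) {
            throw new IllegalStateException("job manager requested although ML is disabled");
        }
        return instance;
    }
}

class HolderSketch {
    // Hypothetical analogue of MachineLearning.createComponents: instead of an
    // empty list, the disabled case returns a holder with a null reference.
    static List<Object> createComponents(boolean mlEnabled) {
        if (!mlEnabled) {
            return Collections.singletonList(new JobManagerHolder());
        }
        JobManager jobManager = new JobManager();
        return Arrays.asList(jobManager, new JobManagerHolder(jobManager));
    }

    // Hypothetical consumer, in the role of the feature set: it checks the
    // holder before touching the job manager.
    static int usageJobCount(JobManagerHolder holder) {
        return holder.isEmpty() ? 0 : holder.getJobManager().jobCount();
    }

    public static void main(String[] args) {
        JobManagerHolder disabled = (JobManagerHolder) createComponents(false).get(0);
        System.out.println("holder empty when ML disabled: " + disabled.isEmpty());
    }
}
```

With this shape, the consumer depends only on `JobManagerHolder`, so the injector never has to construct a `JobManager` on nodes where ML is disabled.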

hendrikmuhs added >bug blocker v7.0.0 :ml Machine learning v6.6.0 v6.7.0 labels Dec 21, 2018

droberts195 approved these changes Dec 21, 2018

hendrikmuhs mentioned this pull request Dec 22, 2018

NullPointerException in MachineLearningFeatureSet$Retriever.addJobsUsage elastic/ml-cpp#351

Closed

Hendrik Muhs added 8 commits December 28, 2018 09:35

fix x-pack usage regression caused by index migration

16a95d1

Changed the feature usage retrieval to use the job manager rather than directly talking to the cluster state, because jobs can now be either in cluster state or stored in an index

repair unit test

7e1fbd9

mock job manager

57f266a

add a holder to workaround service injection problems

2bc1884

adapt unit test

fffa471

fix transport client mode

d493e3d

add test cases for _usage

256e463

fix unit test

1e203a8

hendrikmuhs force-pushed the ml-feature-usage-jindex branch from c2c93aa to 1e203a8 Compare December 28, 2018 08:35

hendrikmuhs merged commit 632c7fb into elastic:master Dec 31, 2018

droberts195 mentioned this pull request Jan 3, 2019

[ML] UnusedStateRemover will remove all state for jobs not in cluster state #37109

Closed

lcawl added the >non-issue label Jan 21, 2019

jimczi added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
