Run core's integration tests with runtime fields #60931

nik9000 · 2020-08-10T19:05:54Z

Adds the x-pack/plugin/runtime-fields/qa/rest project to run core's
integration tests against an index with runtime_scipt fields. This
works by modifying the test configuration on load to replace supported
field types with runtime_script.

Relates to #59332

Adds the `x-pack/plugin/runtime-fields/qa/rest` project to run core's integration tests against an index with `runtime_scipt` fields. This works by modifying the test configuration on load to replace supported field types with `runtime_script`. Relates to elastic#59332

elasticmachine · 2020-08-10T19:05:56Z

Pinging @elastic/es-search (:Search/Search)

nik9000 · 2020-08-10T19:06:10Z

This adds a little under 300 integration tests for runtime fields.

nik9000 · 2020-08-10T19:07:55Z

rest-api-spec/src/main/resources/rest-api-spec/test/search.aggregation/20_terms.yml

@@ -724,7 +724,7 @@ setup:
        body: { "size" : 0, "aggs" : { "no_field_terms" : { "terms" : { "size": 1 } } } }

 ---
-"string profiler":


I wrote this one so I figured I could fix it while I was here. Runtime fields don't have global ordinals so rather than skip the entire test for them I split it into two tests, one that runtime fields can run (below) and this one which runtime fields will skip.

jtibshirani · 2020-08-10T19:16:08Z

This is a nice idea! I can see how this would really increase confidence in 'cross-cutting' functionality for runtime fields.

I think it'd be good to loop in someone from core infra for their thoughts on this testing strategy. One potential concern is that we're running many more REST tests than previously. We followed a similar approach (of running most tests twice) when deprecating document types, but this was clearly temporary and we removed the duplication in 8.0.

nik9000 · 2020-08-10T19:36:08Z

I think it'd be good to loop in someone from core infra for their thoughts on this testing strategy.

I'll do it!

One potential concern is that we're running many more REST tests than previously. We followed a similar approach (of running most tests twice) when deprecating document types, but this was clearly temporary and we removed the duplication in 8.0.

We do it in qa/smoke-test-multinode, qa/mixed-cluster, and x-pack/qa/core-rest-tests-with-security. That isn't to say that we should keep doing it over and over and over again, but it feels like a fairly good strategy here.

I think skipping the test when I'm not able to convert it into a runtime_script speeds up the build quite a bit.

mark-vieira · 2020-08-10T19:55:05Z

x-pack/plugin/runtime-fields/qa/rest/build.gradle

+
+integTest {
+  systemProperty 'tests.rest.blacklist',
+    [


This is quite the list of exclusions. Seems a bit brittle. One issue I anticipate is future newly added tests also being incompatible with this approach for one reason or another.

My goal is to fix the ones between here and line 36 fairly soon. At least theoretically, tests that fetch the mapping are never going to work but other sorts of tests should work. I could scan the test configuration for the command to fetch the mapping and skip those. It wouldn't be too bad and would shrink these.

Another option is to add skip style property so these tests can opt themselves out of running against runtime field.

I like the idea of a skip property -- that way the 'skip reason' is discoverable and right next to the test itself.

We'd have to bring this notion of runtime fields testing up into the test framework itself though, which seems a bit awkward given how this is currently structured as a specific qa project.

mark-vieira · 2020-08-10T20:07:29Z

One potential concern is that we're running many more REST tests than previously.

This is a concern of mine as well. In terms of total build time, it doesn't really add that much but we should still determine whether this coverage is worth the cost, or if there's some subset of this that would be sufficient to give us confidence we haven't introduce some kind of regression.

mark-vieira · 2020-08-10T20:11:17Z

x-pack/plugin/runtime-fields/qa/rest/build.gradle

@@ -0,0 +1,45 @@
+apply plugin: 'elasticsearch.testclusters'


We should replace all of this with the elasticsearch.yaml-rest-test plugin. Additionally the test suite should move from src/test to src/yamlRestTest to comply with new conventions.

nik9000 · 2020-08-10T20:28:14Z

we should still determine whether this coverage is worth the cost, or if there's some subset of this that would be sufficient to give us confidence we haven't introduce some kind of regression.

I think it is worth it. In hacking runtime fields I've found that the way we link fields and the search definition can be a little janky. Sometimes it is nice and as simple as implementing an interface. Other times there are a lot of instanceofs hiding logic and you just have to test it. I wish we had less of that and I've been working to remove it as I go, but we still have quite a bit. In lots of cases I've just had to play along to keep everything working smoothly.

mark-vieira · 2020-08-10T20:41:01Z

I think it is worth it.

I don't have an opinion here. I'll let folks with better knowledge on the behavioral implications (you) make that call.

jtibshirani

... we should still determine whether this coverage is worth the cost, or if there's some subset of this that would be sufficient to give us confidence we haven't introduce some kind of regression.

I also think it's really helpful. I know if we had added this kind of coverage for field aliases we would've caught bugs around interactions with other features like field collapsing.

Would it make sense to limit this to only search-related tests? That would include everything under the search, msearch, and field_caps directories, plus some related ones like explain and suggest (which should maybe live under search anyways...) This is really where I see the most value in terms of test coverage. Plus runtime fields are only interchangeable with 'standard' fields from the perspective of a search request, their behavior is not the same in terms of mappings, CRUD operations, etc.

jtibshirani · 2020-08-10T20:45:17Z

x-pack/plugin/runtime-fields/qa/rest/build.gradle

+
+integTest {
+  systemProperty 'tests.rest.blacklist',
+    [


I like the idea of a skip property -- that way the 'skip reason' is discoverable and right next to the test itself.

jtibshirani · 2020-08-10T20:52:37Z

...mlRestTest/java/org/elasticsearch/xpack/runtimefields/rest/CoreTestsWithRuntimeFieldsIT.java

+        Map<String, Boolean> seenSetups = new HashMap<>();
+        List<Object[]> result = new ArrayList<>();
+        for (Object[] orig : ESClientYamlSuiteTestCase.createParameters()) {
+            ClientYamlTestCandidate candidate = (ClientYamlTestCandidate) orig[0];


Is there ever a time when the rest of the Object array is relevant (beyond the first element)? Maybe we could replace orig[0] and still return orig, or simply assert orig.length == 1 ?

The array only ever has a single entry, but I'll add an assert just to make sure it stays that way. The funny shape of the return here comes from the method being used primarily for junit's parameterized tests.

jtibshirani · 2020-08-10T20:59:59Z

...mlRestTest/java/org/elasticsearch/xpack/runtimefields/rest/CoreTestsWithRuntimeFieldsIT.java

+    }
+
+    private static boolean containsUnsupportedSettings(Map<?, ?> settings) {
+        return


I find the formatting hard to follow, does it work to just do return settings.containsKey("sort.field"); ? Also maybe we could use the phrase 'index sort' to clarify what this setting refers to.

I super thought there'd be other things to add here. But it turned out this was it. I'll remove the method and move the check into the caller.

jtibshirani · 2020-08-10T21:20:10Z

...mlRestTest/java/org/elasticsearch/xpack/runtimefields/rest/CoreTestsWithRuntimeFieldsIT.java

+
+    private static final Map<String, String> PAINLESS_TO_EMIT = Map.ofEntries(
+        // TODO implement dates against the parser
+        Map.entry(


This seems like the best approach for now but feels a bit fragile -- we're essentially duplicating the source parsing logic, which can be pretty nuanced. For example, for boolean fields, the value "" is interpreted as false. We also don't cover cases where the value is malformed or missing from _source.

To make this more robust, I guess we could switch to 'source-only fields' when those are available? That way we could reuse the logic from fields retrieval for parsing values from source. Or we could consider adding a way to access nicely-formatted source fields from painless?

It really is pretty fragile. I was thinking of something similar to what you said - steal the source parsing and expose it to the script. That wouldn't be a "painless feature" so much as a feature of the script context of the runtime field. The advantage of this is that I know exactly which runtime field I'm running against so, for example, dates could use the the field's date parsing logic.

That makes sense. I think this is the right way to go for this feature branch, but it'd be good to follow-up on source parsing improvements at a later point.

nik9000 · 2020-08-11T13:30:32Z

Would it make sense to limit this to only search-related tests?

I think that'd get us a huge way there and skip a lot of unneeded stuff. If we find more things we want to drag in we can do that. I'll give it a shot!

nik9000 · 2020-08-11T13:49:44Z

I think that'd get us a huge way there and skip a lot of unneeded stuff. If we find more things we want to drag in we can do that. I'll give it a shot!

When I limit the list to just search related APIs I still get a ton of coverage and I can drop a bunch of the "strange" exclusions. I'm happy about it!

nik9000 · 2020-08-11T14:24:28Z

@jtibshirani I pushed an update that:

Merges the feature/runtime_fields branch which has a fix for about half of the exclusions in it.
Limited the suites to the search related stuff.
Dropped out excludes that aren't search related.

This leaves me with 2 skip lines that I know I'll never support and 7 that I plan to fix. I think that list is small enough that it'd be ok to keep it in the build file. Hopefully it'll stay that short!

jtibshirani

This looks good to me as an initial version. One question that's still open is whether we'll add a special skip type for runtime fields.

jtibshirani · 2020-08-11T15:56:39Z

...mlRestTest/java/org/elasticsearch/xpack/runtimefields/rest/CoreTestsWithRuntimeFieldsIT.java

+                Object settings = body.get("settings");
+                if (settings instanceof Map && ((Map<?, ?>) settings).containsKey("sort.field")) {
+                    /*
+                     * You can't sort on a runtime_keyword and it is hard to figure out


Small comment, we could say 'index sort' for clarity?

jtibshirani · 2020-08-11T15:59:36Z

...mlRestTest/java/org/elasticsearch/xpack/runtimefields/rest/CoreTestsWithRuntimeFieldsIT.java

+
+    private static final Map<String, String> PAINLESS_TO_EMIT = Map.ofEntries(
+        // TODO implement dates against the parser
+        Map.entry(


That makes sense. I think this is the right way to go for this feature branch, but it'd be good to follow-up on source parsing improvements at a later point.

jtibshirani · 2020-08-11T16:04:34Z

I think that list is small enough that it'd be ok to keep it in the build file. Hopefully it'll stay that short!

Oops, I missed this while preparing my review. That works for me, we can revisit this if it turns out to be hard to maintain.

nik9000 · 2020-08-11T16:18:08Z

Thanks @jtibshirani and @mark-vieira! This'll really help!

javanna · 2020-08-13T14:20:16Z

This is great @nik9000 , I wonder if we can also execute async_search tests, but they may be taken from another place which possibly complicates things.

I think this is the right way to go for this feature branch, but it'd be good to follow-up on source parsing improvements at a later point.

I wonder how this translates to what we need to do once we merge the feature branch to master, which we plan to do very soon.

nik9000 · 2020-08-13T14:25:31Z

This is great @nik9000 , I wonder if we can also execute async_search tests, but they may be taken from another place which possibly complicates things.

Great idea! I'll take a look at that.

I wonder how this translates to what we need to do once we merge the feature branch to master, which we plan to do very soon.

I'm working on something that'll parse dates that might make this cleaner by giving us a path to being more consistent.

Breaks up an integration test into one that runtime fields can run and one that runtime fields have to skip. This is because runtime fields don't have global ords and we assert things *about* global ords in the test we have to skip.

This "borrows" 150 more tests from core for runtime fields, extending the work done in elastic#60931. More precisely, it adds a template to every test that forces dynamic mapping updates to build runtime fields where possible. In particular, `long` and `double` field are created as runtime fields. `string`-typed fields are mimick the out of the box behavior and create a top level `text` field with a `.keyword` multi-field, but this `keyword` multi-field executes a script and loads from source.

…1103) Breaks up an integration test into one that runtime fields can run and one that runtime fields have to skip. This is because runtime fields don't have global ords and we assert things *about* global ords in the test we have to skip.

…er) (elastic#61103) Breaks up an integration test into one that runtime fields can run and one that runtime fields have to skip. This is because runtime fields don't have global ords and we assert things *about* global ords in the test we have to skip.

…1103) (#61114) Breaks up an integration test into one that runtime fields can run and one that runtime fields have to skip. This is because runtime fields don't have global ords and we assert things *about* global ords in the test we have to skip.

This "borrows" 150 more tests from core for runtime fields, extending the work done in #60931. More precisely, it adds a template to every test that forces dynamic mapping updates to build runtime fields where possible. In particular, `long` and `double` field are created as runtime fields. `string`-typed fields are mimick the out of the box behavior and create a top level `text` field with a `.keyword` multi-field, but this `keyword` multi-field executes a script and loads from source.

nik9000 added 2 commits August 10, 2020 15:03

Newline

fb8402f

nik9000 added the :Search/Search Search-related issues that do not fall into other categories label Aug 10, 2020

nik9000 requested a review from jtibshirani August 10, 2020 19:05

elasticmachine added the Team:Search Meta label for search team label Aug 10, 2020

nik9000 commented Aug 10, 2020

View reviewed changes

mark-vieira reviewed Aug 10, 2020

View reviewed changes

New way

b392332

jtibshirani reviewed Aug 10, 2020

View reviewed changes

Just search

ac330ec

Merge branch 'feature/runtime_fields' into runtime_fields_tests

5eccfcd

nik9000 requested a review from jtibshirani August 11, 2020 14:19

javanna mentioned this pull request Aug 11, 2020

Add support for runtime fields #59332

Closed

30 tasks

jtibshirani approved these changes Aug 11, 2020

View reviewed changes

Words

ac5e2e1

nik9000 merged commit eeb4afb into elastic:feature/runtime_fields Aug 11, 2020

nik9000 mentioned this pull request Aug 13, 2020

Borrow more tests for runtime fields #61113

Merged

Run core's integration tests with runtime fields #60931

Run core's integration tests with runtime fields #60931

Conversation

nik9000 commented Aug 10, 2020

elasticmachine commented Aug 10, 2020

nik9000 commented Aug 10, 2020

Choose a reason for hiding this comment

jtibshirani commented Aug 10, 2020

nik9000 commented Aug 10, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mark-vieira commented Aug 10, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nik9000 commented Aug 10, 2020

mark-vieira commented Aug 10, 2020

jtibshirani left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nik9000 commented Aug 11, 2020

nik9000 commented Aug 11, 2020

nik9000 commented Aug 11, 2020

jtibshirani left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jtibshirani commented Aug 11, 2020

nik9000 commented Aug 11, 2020

javanna commented Aug 13, 2020

nik9000 commented Aug 13, 2020