Port BaseTest to v2 engine (attempt two) #5611

stuhood · 2018-03-19T02:22:20Z

This is a revert of the revert of #4867.

The only interesting differences here are in the topmost commit, which hides a few more APIs that we want to deprecate, and fixes the flakiness that caused the revert (which turned out to be leaked references to the scheduler causing "Too many open file handle" issues).

stuhood · 2018-04-05T21:07:52Z

This is now reviewable.

baroquebobcat

Looks ok to me, though I'd prefer if the src changes that seem unrelated to updating the test class were broken out.

From looking at the rest of the patch, it seems like some of those changes are to bring the error messages in line with the existing assertions, but there are some moves that look like they could be broken out.

baroquebobcat · 2018-04-05T21:35:52Z

contrib/go/tests/python/pants_test/contrib/go/tasks/test_go_buildgen.py

@@ -285,7 +285,8 @@ def test_stitch_deps_remote_existing_rev_respected(self):
                     pkg='prod',
                     rev='v1.2.3')
    pre_execute_files = self.stitch_deps_remote(materialize=True)
-    self.build_graph.reset()  # Force targets to be loaded off disk
+    self.reset_build_graph(reset_build_files=True)  # Force targets to be loaded off disk
+    print('>>> {}'.format(self.buildroot_files()))


baroquebobcat · 2018-04-05T21:37:12Z

src/python/pants/build_graph/build_file_address_mapper.py

@@ -253,7 +253,7 @@ def _raise_incorrect_address_error(self, spec_path, wrong_target_name, addresses

    if not addresses:
      raise self.EmptyBuildFileError(
-        '{was_not_found_message}, because that directory contains no BUILD files defining addressable entities.'
+        '{was_not_found_message}, because that directory does not contain any BUILD files defining addressable entities.'


This feels like it's unrelated to this change, but I do like the updated wording better.

It's related to the change because a bunch of tests are swapping from being run in v1 to being run in v2.

baroquebobcat · 2018-04-05T21:37:35Z

src/python/pants/build_graph/build_file_aliases.py

+        'TargetMacro.Factory instances that construct more than one type are no longer supported. '
+        'Consider using a `context_aware_object_factory, which can construct any number of '
+        'different objects.'
+      )


Should this be a separate change?

Maybe, but I no longer remember which test I was fixing when I changed it, so I'd rather land it here.

baroquebobcat · 2018-04-05T21:39:45Z

src/python/pants/engine/build_files.py

@@ -319,27 +319,12 @@ def spec_to_globs(address_mapper, specs):
    elif type(spec) is AscendantAddresses:
      patterns.update(join(f, pattern)
                      for pattern in address_mapper.build_patterns
-                      for f in _recursive_dirname(spec.directory))
+                      for f in recursive_dirname(spec.directory))


Separate change?

The method is now used in file invalidation in test_base.
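
For readers unfamiliar with the helper, a function of this kind walks from a directory up through each of its ancestors. The following is only a minimal sketch under assumptions: `ascending_dirs`, `ancestor_build_globs`, and the `BUILD` patterns are hypothetical names chosen for illustration, not the actual `recursive_dirname` implementation.

```python
import os


def ascending_dirs(path):
    """Yield `path`, then each ancestor directory, ending with '' for a relative path.

    Hypothetical stand-in for a helper like `recursive_dirname`.
    """
    while path:
        yield path
        parent = os.path.dirname(path)
        if parent == path:
            # Reached a filesystem root such as '/', which is its own parent.
            return
        path = parent
    yield ''


def ancestor_build_globs(directory, build_patterns):
    """Join every BUILD-file pattern onto `directory` and each of its ancestors."""
    return {
        os.path.join(d, pattern)
        for pattern in build_patterns
        for d in ascending_dirs(directory)
    }


if __name__ == '__main__':
    # The patterns here are assumed values, used only for illustration.
    for glob in sorted(ancestor_build_globs('src/python', ('BUILD', 'BUILD.*'))):
        print(glob)
```

A test harness could apply the same walk to decide which cached entries to invalidate when a file under a given directory changes, which is presumably why the helper needed to become public.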

baroquebobcat · 2018-04-05T21:43:38Z

tests/python/pants_test/test_base.py

@@ -0,0 +1,573 @@
+# coding=utf-8
+# Copyright 2014 Pants project contributors (see CONTRIBUTORS.md).


Should this have an updated year? Or is it primarily a copy of 2014 code? Looks like the latter.

Mm. Good question. Might as well refresh.

baroquebobcat

Looks good to me!

kwlzn

thanks!

### Problem

`Core` instances are currently being leaked in cases where `Nodes` have not completed running, and thus hold a `Context` in the closure for their `Future`. The cycle is:

```
Core -> Graph -> Node -> Context -> Core
```

### Solution

Clear all `Node` states when we `Drop` a `Scheduler`, breaking their cycles with the `Core`.

### Result

Fixes #5732 by ensuring that the `Store` held via `Scheduler -> Core -> Store` is dropped. It should also unblock #5611.
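
The shape of this leak can be sketched in Python. This is an analogy only, not the Rust engine code: the class names mirror the cycle above, `close()` is a hypothetical stand-in for the `Drop` implementation, and the cyclic garbage collector is disabled so that CPython's reference counting behaves like plain reference-counted pointers.

```python
import gc
import weakref


class Graph(object):
    def __init__(self):
        self.nodes = []

    def clear(self):
        # Dropping the nodes drops their contexts, and with them the back-references.
        del self.nodes[:]


class Core(object):
    def __init__(self):
        self.graph = Graph()


class Context(object):
    def __init__(self, core):
        self.core = core  # Back-reference that closes the cycle.


class Node(object):
    def __init__(self, context):
        self.context = context


class Scheduler(object):
    def __init__(self):
        self.core = Core()

    def start_node(self):
        # An unfinished node keeps holding a Context, which holds the Core.
        self.core.graph.nodes.append(Node(Context(self.core)))

    def close(self):
        # Hypothetical analogue of clearing all Node states on Drop.
        self.core.graph.clear()


def core_survives(break_cycle):
    """Return True if the Core is still alive after its Scheduler is dropped."""
    gc.disable()  # Rely on reference counting alone (CPython-specific behaviour).
    try:
        scheduler = Scheduler()
        scheduler.start_node()
        core_ref = weakref.ref(scheduler.core)
        if break_cycle:
            scheduler.close()
        del scheduler
        return core_ref() is not None
    finally:
        gc.enable()
        gc.collect()  # Clean up any cycle left behind by the leaking case.


if __name__ == '__main__':
    print('leaked without clearing:', core_survives(break_cycle=False))
    print('leaked after clearing:', core_survives(break_cycle=True))
```

In the leaking case, anything the `Core` owns (in the engine, the `Store` and its open files) would stay alive for as long as the cycle does.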

stuhood · 2018-05-23T06:43:42Z

This now depends on #5859.

stuhood · 2018-05-23T07:06:39Z

There are two remaining failures in this job, both of which (to be confirmed shortly) are caused by SIGSEGVs in tests, which should now be more cleanly exposed by #5859. Searching for the phrase `FAILURE: Test was killed` in the logs should show which.

As far as I can tell, the failures occur in stable locations, although I'm unable to reproduce either of them locally:

- `tests/python/pants_test/backend/python/tasks`: These are unit tests using `TestBase`, so it might potentially make sense to see a failure here due to the usage of the new scheduler.
- `contrib/go/tests/python/pants_test/contrib/go/tasks:integration`: These are integration tests (extending `PantsRunIntegrationTest`), and thus should not actually have been affected by this PR at all (integration tests have run with v2 for a while).

At some point in the past I was briefly able to reproduce these failures locally, and running the tests with `./pants test $target -- -s` (to disable all pytest output capturing) exposed a panic due to having run out of file handles. But unfortunately, I have not had luck attempting to reproduce that error today (and it's possible that it has changed to a virtual memory error at this point... unclear).

Having typed all of this out, I think I have one more thing I want to try, which is to see whether we might be able to generically catch panics in the scheduler via https://doc.rust-lang.org/std/panic/fn.catch_unwind.html and convert them directly to error messages.

EDIT: I went ahead and did this here: master...twitter:stuhood/catch-unwind ... it's not pretty, because using `catch_unwind` safely is apparently quite challenging, but it's possible that it will allow us to get the errors in a more useful context?

…5859)

### Problem

While debugging #5611, I determined that when a test was killed by a signal (and thus had a negative return value from `poll()`), `TestRunnerTaskMixin` incorrectly interpreted the failure as a still-living and hung process, and would later try to kill it.

There were a few broken tests allowing for this, but primarily: `test_pytest_run_timeout_cant_terminate` was not waiting long enough for the tested-test to start up, and so was killing it before it had its signal handler in place.

Additionally, the usage of a `Timer` + `poll()` to implement test termination meant that we were guaranteed to wait the full `timeout_terminate_wait` time (10 seconds by default) before `poll()`ing to see whether the test had exited.

### Solution

1. Interpret `poll() == None` as a still running process, and `poll() < 0` as a process killed by a signal.
2. Report processes killed by signals before our initial attempt to kill them (which should expose a SIGSEGV in #5611).
3. Rather than a timer, use `wait()` between `terminate()` and `kill()`, which avoids unnecessary sleeping.

### Result

Improved debuggability for tests that exit abnormally.
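
The return-code handling described above can be sketched with the standard `subprocess` module. This is a rough illustration, not the `TestRunnerTaskMixin` code: `describe_exit` and `terminate_then_kill` are hypothetical names, and the sketch assumes Python 3, where `Popen.wait()` accepts a `timeout`.

```python
import subprocess
import sys


def describe_exit(returncode):
    """Interpret the value returned by Popen.poll()."""
    if returncode is None:
        return 'still running'
    if returncode < 0:
        # On POSIX, a negative value -N means the child was killed by signal N.
        return 'killed by signal {}'.format(-returncode)
    if returncode == 0:
        return 'succeeded'
    return 'failed with exit code {}'.format(returncode)


def terminate_then_kill(process, terminate_wait):
    """Ask the process to exit, and escalate to kill() only if it does not."""
    process.terminate()
    try:
        # wait() returns as soon as the process exits, so there is no fixed sleep.
        process.wait(timeout=terminate_wait)
    except subprocess.TimeoutExpired:
        process.kill()
        process.wait()
    return process.returncode


if __name__ == '__main__':
    child = subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(60)'])
    print(describe_exit(child.poll()))
    print(describe_exit(terminate_then_kill(child, terminate_wait=5)))
```

Checking `poll()` before any termination attempt is what allows a test that already died from a signal to be reported as such, rather than being treated as hung.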

stuhood · 2018-05-26T04:11:57Z

The two failing shards are due to the Yarn outage earlier. Going to go for it.

stuhood changed the title ~~WIP: Port BaseTast to v2 engine (attempt two)~~ WIP: Port BaseTest to v2 engine (attempt two) Mar 19, 2018

stuhood force-pushed the stuhood/test-base-round-2 branch 3 times, most recently from b0aafa8 to bdc4688 Compare March 19, 2018 05:34

stuhood force-pushed the stuhood/test-base-round-2 branch 4 times, most recently from 28ad9c1 to 6347674 Compare April 5, 2018 21:04

stuhood requested review from kwlzn, benjyw and illicitonion April 5, 2018 21:06

stuhood changed the title ~~WIP: Port BaseTest to v2 engine (attempt two)~~ Port BaseTest to v2 engine (attempt two) Apr 5, 2018

stuhood force-pushed the stuhood/test-base-round-2 branch from 6347674 to ad6912b Compare April 5, 2018 21:13

baroquebobcat reviewed Apr 5, 2018

stuhood force-pushed the stuhood/test-base-round-2 branch from ad6912b to 5e21e8b Compare April 5, 2018 22:11

baroquebobcat approved these changes Apr 5, 2018

kwlzn approved these changes Apr 6, 2018

stuhood force-pushed the stuhood/test-base-round-2 branch from 5e21e8b to ee791e0 Compare April 6, 2018 02:10

benjyw approved these changes Apr 6, 2018

stuhood mentioned this pull request Apr 20, 2018

Break a Core / Node cycle #5733

Merged

stuhood force-pushed the stuhood/test-base-round-2 branch from ee791e0 to bb136fb Compare April 21, 2018 02:10

stuhood force-pushed the stuhood/test-base-round-2 branch 4 times, most recently from f383188 to 2e99612 Compare May 11, 2018 04:58

stuhood force-pushed the master branch from b6bb42d to 9e2fdb5 Compare May 11, 2018 23:54

stuhood force-pushed the stuhood/test-base-round-2 branch from 2e99612 to 5984fde Compare May 22, 2018 19:56

stuhood mentioned this pull request May 22, 2018

WIP: SnapshotBackedEagerFilesetWithSpec #5856

Closed

stuhood force-pushed the stuhood/test-base-round-2 branch 2 times, most recently from 9568019 to 6daee11 Compare May 23, 2018 06:24

stuhood mentioned this pull request May 23, 2018

Improve logging/handling of signaled, killed, and terminated tests #5859

Merged

illicitonion mentioned this pull request May 23, 2018

Expose Target's sources attribute as a Snapshot to Tasks #5762

Closed

illicitonion force-pushed the stuhood/test-base-round-2 branch from 6daee11 to 92f1381 Compare May 23, 2018 12:38

stuhood added 8 commits May 24, 2018 09:27

Port BaseTest to v2 engine

d8512c7

This reverts commit 45d0f3a.

Remove redundancy in injection.

64cf744

Hide BuildFileParser and BuildFile, and ensure that schedulers are dr…

7b19968

…opped in tearDown.

Bump deprecation

f30c9bf

Increase some timeouts and fix failing tests.

7bbb810

Avoid an unnecessary request during graph creation.

21bfc67

Class-level scheduler.

bbf0fd0

Update alias_groups to a classmethod.

78cd420

stuhood force-pushed the stuhood/test-base-round-2 branch from e930cb7 to 78cd420 Compare May 24, 2018 20:49

stuhood added 2 commits May 24, 2018 15:25

Test fixes for static scheduler.

4d7c7d2

WIP: Deprecate the existing TaskTestBase rather than changing it in p…

39f7626

…lace.

stuhood mentioned this pull request May 25, 2018

Port test_pytest_run.py to TestBase #5870

Closed

stuhood added 2 commits May 25, 2018 15:26

Update BUILD files and imports for task_test_base split.

a98054e

Argh. Actually use the deprecated adaptor.

4ad67af

stuhood merged commit 2e65f46 into pantsbuild:master May 26, 2018

stuhood deleted the stuhood/test-base-round-2 branch May 26, 2018 04:13
