Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

twitter-commons merge into twitter #1

Closed
jsirois opened this issue Jul 30, 2013 · 4 comments
Closed

twitter-commons merge into twitter #1

jsirois opened this issue Jul 30, 2013 · 4 comments
Assignees

Comments

@jsirois
Copy link
Contributor

jsirois commented Jul 30, 2013

A longstanding merge review should be completed and committed. This blocks changes introducing python goals

@ghost ghost assigned jsirois Jul 30, 2013
@benjyw
Copy link
Contributor

benjyw commented Aug 9, 2013

Any news on this?

@areitz
Copy link
Contributor

areitz commented May 13, 2014

It seems like this is done, now that pants is in it's own repo?

@lahosken
Copy link
Contributor

lahosken commented Aug 5, 2014

It seems like this is done, now that pants has python goals?

WamBamBoozle pushed a commit to WamBamBoozle/pants that referenced this issue Apr 23, 2015
megaserg referenced this issue in megaserg/pants Oct 7, 2015
kwlzn added a commit that referenced this issue Mar 31, 2017
…n test. (#4407)

### Problem

Currently, on Linux the first thin client call to the daemon can deadlock just after the pantsd->fork->pantsd-runner workflow. Connecting to the process with `gdb` reveals a deadlock in the following stack in the `post_fork` `drop` of the `CpuPool`:

```
#0  0x00007f63f04c31bd in __lll_lock_wait () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x00007f63f04c0ded in pthread_cond_signal@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
No symbol table info available.
#2  0x00007f63d3cfa438 in notify_one ()
    at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/sys/unix/condvar.rs:52
No locals.
#3  notify_one () at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/sys_common/condvar.rs:39
No locals.
#4  notify_one () at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/sync/condvar.rs:208
No locals.
#5  std::thread::{{impl}}::unpark () at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/thread/mod.rs:633
No locals.
#6  0x00007f63d3c583d1 in crossbeam::sync::ms_queue::{{impl}}::push<futures_cpupool::Message> (self=<optimized out>, t=...)
    at /home/kwilson/.cache/pants/rust-toolchain/registry/src/github.com-1ecc6299db9ec823/crossbeam-0.2.10/src/sync/ms_queue.rs:178
        guard = <optimized out>
        self = <optimized out>
#7  0x00007f63d3c588ed in futures_cpupool::{{impl}}::drop (self=<optimized out>)
    at /home/kwilson/.cache/pants/rust-toolchain/git/checkouts/futures-rs-a4f11d094efefb0a/f7e6bc8/futures-cpupool/src/lib.rs:236
        self = 0x37547a0
#8  0x00007f63d3be871c in engine::fs::{{impl}}::post_fork (self=0x3754778) at /home/kwilson/dev/pants/src/rust/engine/src/fs.rs:355
        self = 0x3754778
#9  0x00007f63d3be10e4 in engine::context::{{impl}}::post_fork (self=0x37545b0)
    at /home/kwilson/dev/pants/src/rust/engine/src/context.rs:93
        self = 0x37545b0
#10 0x00007f63d3c0de5a in {{closure}} (scheduler=<optimized out>) at /home/kwilson/dev/pants/src/rust/engine/src/lib.rs:275
        scheduler = 0x3740580
#11 with_scheduler<closure,()> (scheduler_ptr=<optimized out>, f=...) at /home/kwilson/dev/pants/src/rust/engine/src/lib.rs:584
        scheduler = 0x3740580
        scheduler_ptr = 0x3740580
#12 engine::scheduler_post_fork (scheduler_ptr=0x3740580) at /home/kwilson/dev/pants/src/rust/engine/src/lib.rs:274
        scheduler_ptr = 0x3740580
#13 0x00007f63d3c1be8c in _cffi_f_scheduler_post_fork (self=<optimized out>, arg0=0x35798f0) at src/cffi/native_engine.c:2234
        _save = 0x34a65a0
        x0 = 0x3740580
        datasize = <optimized out>
#14 0x00007f63f07b5a62 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
```

This presents as a hang in the thin client, because the pailgun socket is left open in the pantsd-runner.

### Solution

Add pre-fork hooks and tear down the `CpuPool` instances prior to forking and rebuilding them.

### Result

Can no longer reproduce the hang.
lenucksi pushed a commit to lenucksi/pants that referenced this issue Apr 25, 2017
…n test. (pantsbuild#4407)

### Problem

Currently, on Linux the first thin client call to the daemon can deadlock just after the pantsd->fork->pantsd-runner workflow. Connecting to the process with `gdb` reveals a deadlock in the following stack in the `post_fork` `drop` of the `CpuPool`:

```
#0  0x00007f63f04c31bd in __lll_lock_wait () from /lib64/libpthread.so.0
No symbol table info available.
pantsbuild#1  0x00007f63f04c0ded in pthread_cond_signal@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
No symbol table info available.
pantsbuild#2  0x00007f63d3cfa438 in notify_one ()
    at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/sys/unix/condvar.rs:52
No locals.
pantsbuild#3  notify_one () at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/sys_common/condvar.rs:39
No locals.
pantsbuild#4  notify_one () at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/sync/condvar.rs:208
No locals.
pantsbuild#5  std::thread::{{impl}}::unpark () at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/thread/mod.rs:633
No locals.
pantsbuild#6  0x00007f63d3c583d1 in crossbeam::sync::ms_queue::{{impl}}::push<futures_cpupool::Message> (self=<optimized out>, t=...)
    at /home/kwilson/.cache/pants/rust-toolchain/registry/src/github.com-1ecc6299db9ec823/crossbeam-0.2.10/src/sync/ms_queue.rs:178
        guard = <optimized out>
        self = <optimized out>
pantsbuild#7  0x00007f63d3c588ed in futures_cpupool::{{impl}}::drop (self=<optimized out>)
    at /home/kwilson/.cache/pants/rust-toolchain/git/checkouts/futures-rs-a4f11d094efefb0a/f7e6bc8/futures-cpupool/src/lib.rs:236
        self = 0x37547a0
pantsbuild#8  0x00007f63d3be871c in engine::fs::{{impl}}::post_fork (self=0x3754778) at /home/kwilson/dev/pants/src/rust/engine/src/fs.rs:355
        self = 0x3754778
pantsbuild#9  0x00007f63d3be10e4 in engine::context::{{impl}}::post_fork (self=0x37545b0)
    at /home/kwilson/dev/pants/src/rust/engine/src/context.rs:93
        self = 0x37545b0
pantsbuild#10 0x00007f63d3c0de5a in {{closure}} (scheduler=<optimized out>) at /home/kwilson/dev/pants/src/rust/engine/src/lib.rs:275
        scheduler = 0x3740580
pantsbuild#11 with_scheduler<closure,()> (scheduler_ptr=<optimized out>, f=...) at /home/kwilson/dev/pants/src/rust/engine/src/lib.rs:584
        scheduler = 0x3740580
        scheduler_ptr = 0x3740580
pantsbuild#12 engine::scheduler_post_fork (scheduler_ptr=0x3740580) at /home/kwilson/dev/pants/src/rust/engine/src/lib.rs:274
        scheduler_ptr = 0x3740580
pantsbuild#13 0x00007f63d3c1be8c in _cffi_f_scheduler_post_fork (self=<optimized out>, arg0=0x35798f0) at src/cffi/native_engine.c:2234
        _save = 0x34a65a0
        x0 = 0x3740580
        datasize = <optimized out>
pantsbuild#14 0x00007f63f07b5a62 in PyEval_EvalFrameEx () from /lib64/libpython2.7.so.1.0
```

This presents as a hang in the thin client, because the pailgun socket is left open in the pantsd-runner.

### Solution

Add pre-fork hooks and tear down the `CpuPool` instances prior to forking and rebuilding them.

### Result

Can no longer reproduce the hang.
Eric-Arellano added a commit that referenced this issue Jan 31, 2019
…7187)

## Problem
There are three issues when trying to run `./pants3` and switching between Python 3.6 and Python 3.7.

1) You must first `rm -rf build-support/pants_dev_deps.py3.venv` and then `rm -rf ~/.cache/pants/python_cache/interpreters` because the cache will have a bad symlink to the prior Python 3 interpreter. This is inconvenient, and the latter is surprising and difficult to remember.
2) If both Python 3.6 and Python 3.7 are discoverable on the `PATH`, there is no reliable way to specify that you want to use Python 3.7. The `virtualenv` code works by invoking `python3`, which defaults in most systems to the minimum version. The only workaround is to ensure `python3` points first to `python3.7` by messing with the PATH. While this is possible, it is not obvious to the user and it would be best to provide a solution that does not require messing with the PATH.
3) When running `./pants3`, we always override `PANTS_PYTHON_SETUP_INTERPRETER_CONSTRAINTS` to allow Python 3.6, so it is impossible at the moment to set a tighter constraint without modifying source code. Even if we resolve issues 1 and 2, we must address this issue to ensure that spawned subprocesses use the correct interpreter.

## Solution 
### Part 1 - renamed `venv` folders
Rename `py2.venv` -> `py27.venv` and split out `py3.venv` into `py36.venv` and `py37.venv`.

This allows us to have a distinct venv for each Python version, which is necessary to avoid redownloading/resymlinking dependencies anytime we change from 3.6 to 3.7. It is also necessary to fix problem #1 of having to run `rm -rf build-support/pants_dev_deps.py3.venv ~/.cache/pants/python_cache/interpreters` whenever making this change.

### Part 2 - `$PY` env var
We refactor `virtualenv` and `pants_venv` to require the caller to pre-set the env var `$PY`. This value stores the bin name, such as `python3.7`, to be used in setting up the venv.

If the user wants to constrain `./pants3` to a specific Python 3 interpreter, they may do so by prefixing the command with `PY=python3.7 ./pants3`. Otherwise, `./pants` will default to `python2.7` and `./pants3` will default to `python3` as before.

### Also changed - renames in `.travis.yml`
Update `ci.sh` and `.travis.yml` to specify the exact Python versions as Python 2.7 and 3.6, rather than Python 2 and Python 3. `ci.sh` does make a semantic change in that it now requires exactly Python 3.6 for Py3 shards, whereas earlier it only used Py3.6 de facto. Meanwhile, `.travis.yml` changes the OSX rust tests to use Python 2.7 to fix an issue with requiring exactly Python 3.6, updates the cache directories to use the new `venv` naming scheme, and renames `py2`->`py27` and `py3`->`py36` everywhere.

This is pre-work to make the Python 3.7 CI pull request easier to review.

## Alternative solutions considered
### `./pants3` interpreter flags
Originally this PR added the CI flags `./pants -6` and `./pants -7` to allow forcing Pants to use only Python 3.6 or 3.7, respectively.

This solution was rejected for being over-constrained and too complex.

### `./pants37`
Originally we considered adding a new script `./pants37` to specify Python 3.7, rather than `./pants3 -7`. 

This was rejected because it results in too many root-level scripts that are prefixed `./pants`, which is confusing. 

Further, it is not often that Pants devs will need to switch between Python 3.6 and 3.7—we certainly will not expect them to do this frequently (whereas we do encourage frequently changing between `./pants` and `./pants3`). We merely want the ability to run with Python 3.7, particularly for CI; it is not important for the option to be especially discoverable. 

## Result
Calling `./pants` or `./pants3` for the first time will delete the legacy venv folders and use the new naming scheme.

Scripts now behave as follows:
* `./pants` will use Python 2.7 as before
* `./pants3` will use the first `python3` bin on the PATH, as before
* `PY=python3.6 ./pants3` will use only Python 3.6
* `PY=python3.7 ./pants3` will use only Python 3.7

Switching between these options will trigger an engine rebuild the first time, and then after that will change the interpreter used with no cost.
stuhood pushed a commit that referenced this issue Jun 6, 2019
…-no-strict` mode (#7864)

### Problem
When running `py-thrift-namespace-clash-check` in Twitter's sandbox with `--no-strict` mode, Pants crashed with the exception:

```
zip_longest argument #1 must support iteration
```

This is because we were trying to log a `NamespaceExtractionError`, and `log()` only understands `Union[str, Tuple[str, str]]`.

### Solution
Fix the issue by using `str()` to get the underlying message.

Also add an assertion to make sure that we don't introduce this issue again. Once we have MyPy set up, we can remove this assertion.
stuhood pushed a commit that referenced this issue Jun 6, 2019
…-no-strict` mode (#7864)

### Problem
When running `py-thrift-namespace-clash-check` in Twitter's sandbox with `--no-strict` mode, Pants crashed with the exception:

```
zip_longest argument #1 must support iteration
```

This is because we were trying to log a `NamespaceExtractionError`, and `log()` only understands `Union[str, Tuple[str, str]]`.

### Solution
Fix the issue by using `str()` to get the underlying message.

Also add an assertion to make sure that we don't introduce this issue again. Once we have MyPy set up, we can remove this assertion.
Eric-Arellano added a commit that referenced this issue Jul 22, 2021
This shows how we can convert types defined in Python into Rust, e.g. `PathGlobs`. 

Whereas we used to use a free function to do the conversion and asked for `PyObject` in the FFI function, now we implement the trait `FromPyObject` on a wrapper type `PyPathGlobs` so that we can request it directly in the function signature. This is more idiomatic and makes the `#[pyfunction]` more clear what it does.

We also now directly use PyO3's `.getattr()` and `.extract()` at callsites, rather than using our custom `getattr()` functions that wrapped those with Rust-Cpython. This removes a layer of indirection and doesn't noticeably increase boilerplate at callsites - it in fact reduces some because it's easier to chain `.getattr()` multiple times.

At least for now, we don't use `Value`, which is an `Arc` around a Rust-CPython `PyObject`. We were using that because cloning with Rust-CPython requires the GIL. Here, we're able to clone w/o requiring the GIL (in `PyPathGlobs.parse()`), so I went with a simpler implementation. Some benefits of avoiding `Value` are a) better type safety and b) avoiding a lock. We may end up needing `Value` at the end of the day, but we can add this back.

### Benchmark

PyO3 does have slightly worse performance with this diff:

```diff
diff --git a/src/python/pants/bin/pants_loader.py b/src/python/pants/bin/pants_loader.py
index 10c8b406b..ec9ef50bf 100644
--- a/src/python/pants/bin/pants_loader.py
+++ b/src/python/pants/bin/pants_loader.py
@@ -126,4 +126,8 @@ def main() -> None:


 if __name__ == "__main__":
-    main()
+    from pants.engine.internals.native_engine import match_path_globs
+    from pants.engine.fs import PathGlobs
+
+    x = match_path_globs(PathGlobs(["**/*.txt"]), tuple(f"foo/{i}.txt" for i in range(1_000_000)))
+    # main()
```

Before:

```
❯ hyperfine --warmup 1 --runs=100 './pants'
Benchmark #1: ./pants
  Time (mean ± σ):     712.3 ms ±   4.6 ms    [User: 587.7 ms, System: 103.3 ms]
  Range (min … max):   701.0 ms … 722.9 ms    100 runs

```

After:

```
❯ hyperfine --warmup 1 --runs=100 './pants'
Benchmark #1: ./pants
  Time (mean ± σ):     732.7 ms ±   8.0 ms    [User: 610.8 ms, System: 100.2 ms]
  Range (min … max):   720.5 ms … 778.0 ms    100 runs
```

Another `after` benchmark run with fewer apps on my machine shows `726.1 ms ±   4.9 ms    [User: 610.0 ms, System: 94.8 ms]`. Note that User is still the same, and only System changes. So this seems to be an actual slowdown.

I am not yet sure why, but can profile if this is concerning.

[ci skip-build-wheels]
williamscs added a commit to williamscs/pants that referenced this issue Sep 9, 2021
Omitting these packages would break backport libraries (such as dataclasses). Performance impact is negligible since the dependency graph has already been gathered.

Benchmarks using `$ hyperfine --warmup=1 --runs=10 './pants --no-remote-cache-{read,write} --no-pantsd dependencies ::'`
Before these changes (omitting stdlib dependencies):
```bash
Benchmark pantsbuild#1: ./pants --no-remote-cache-{read,write} --no-pantsd dependencies ::
  Time (mean ± σ):      8.852 s ±  0.105 s    [User: 8.612 s, System: 5.314 s]
  Range (min … max):    8.702 s …  9.107 s    10 runs
```

After changes (with stdlib dependency inference)
```bash
Benchmark pantsbuild#1: ./pants --no-remote-cache-{read,write} --no-pantsd dependencies ::
  Time (mean ± σ):      8.854 s ±  0.123 s    [User: 8.619 s, System: 5.333 s]
  Range (min … max):    8.682 s …  9.111 s    10 runs
```

[ci skip-rust]
[ci skip-build-wheels]
Eric-Arellano pushed a commit that referenced this issue Sep 9, 2021
…dependency inference (#12818)

Omitting these packages would break backport libraries (such as dataclasses). Performance impact is negligible since the dependency graph has already been gathered.

Benchmarks using 
`$ hyperfine --warmup=1 --runs=10 './pants --no-remote-cache-{read,write} --no-pantsd dependencies ::'`

Before these changes (omitting stdlib dependencies):
```bash
Benchmark #1: ./pants --no-remote-cache-{read,write} --no-pantsd dependencies ::
  Time (mean ± σ):      8.852 s ±  0.105 s    [User: 8.612 s, System: 5.314 s]
  Range (min … max):    8.702 s …  9.107 s    10 runs
```

After changes (with stdlib dependency inference)
```bash
Benchmark #1: ./pants --no-remote-cache-{read,write} --no-pantsd dependencies ::
  Time (mean ± σ):      8.854 s ±  0.123 s    [User: 8.619 s, System: 5.333 s]
  Range (min … max):    8.682 s …  9.111 s    10 runs
```

[ci skip-rust]
[ci skip-build-wheels]
Eric-Arellano referenced this issue in Eric-Arellano/pants Sep 9, 2021
…dependency inference (pantsbuild#12818)

Omitting these packages would break backport libraries (such as dataclasses). Performance impact is negligible since the dependency graph has already been gathered.

Benchmarks using 
`$ hyperfine --warmup=1 --runs=10 './pants --no-remote-cache-{read,write} --no-pantsd dependencies ::'`

Before these changes (omitting stdlib dependencies):
```bash
Benchmark #1: ./pants --no-remote-cache-{read,write} --no-pantsd dependencies ::
  Time (mean ± σ):      8.852 s ±  0.105 s    [User: 8.612 s, System: 5.314 s]
  Range (min … max):    8.702 s …  9.107 s    10 runs
```

After changes (with stdlib dependency inference)
```bash
Benchmark #1: ./pants --no-remote-cache-{read,write} --no-pantsd dependencies ::
  Time (mean ± σ):      8.854 s ±  0.123 s    [User: 8.619 s, System: 5.333 s]
  Range (min … max):    8.682 s …  9.111 s    10 runs
```

[ci skip-rust]
[ci skip-build-wheels]
Eric-Arellano added a commit that referenced this issue Sep 9, 2021
…dependency inference (Cherry-pick of #12818) (#12819)

Omitting these packages would break backport libraries (such as dataclasses). Performance impact is negligible since the dependency graph has already been gathered.

Benchmarks using 
`$ hyperfine --warmup=1 --runs=10 './pants --no-remote-cache-{read,write} --no-pantsd dependencies ::'`

Before these changes (omitting stdlib dependencies):
```bash
Benchmark #1: ./pants --no-remote-cache-{read,write} --no-pantsd dependencies ::
  Time (mean ± σ):      8.852 s ±  0.105 s    [User: 8.612 s, System: 5.314 s]
  Range (min … max):    8.702 s …  9.107 s    10 runs
```

After changes (with stdlib dependency inference)
```bash
Benchmark #1: ./pants --no-remote-cache-{read,write} --no-pantsd dependencies ::
  Time (mean ± σ):      8.854 s ±  0.123 s    [User: 8.619 s, System: 5.333 s]
  Range (min … max):    8.682 s …  9.111 s    10 runs
```

[ci skip-rust]
[ci skip-build-wheels]
Eric-Arellano added a commit that referenced this issue Oct 6, 2021
…ze external packages (#13128)

We were running `go list -json` much more frequently than necessary. We run it once per module to generate `_go_external_package` targets, but then we would rerun it on each package within the module. Those each often took 5-9 seconds, so this was atrocious for performance. Instead, we can simply run it once per module and leverage the engine's memoization to avoid re-running.

This refactors our external module code so that it no longer interacts at all with the Target API, which reduces coupling. We rename things like the file to `external_pkg.py` so that only the `ExternalModuleInfo` and `ExternalPkgInfo` types are exposed. Notably, we no longer use `ResolvedGoPackage` for external code, which allows us to ignore lots of irrelevant fields like `test_imports`.

### `dependencies` benchmark

This requires changing this line to use `Targets` so that generated targets have their dependencies resolved too:

https://github.com/pantsbuild/pants/blob/a744978d2badd06509cc6e8091c14220bed3aff6/src/python/pants/backend/project_info/dependencies.py#L94

Before (I aborted after 1 run because it was so slow): 

```
❯ hyperfine -r 5 './pants_from_sources --no-pantsd --no-process-execution-local-cache dependencies ::'
Current estimate: 344.640 s
```

After:

```
❯ hyperfine -r 5 './pants_from_sources --no-pantsd --no-process-execution-local-cache dependencies ::'
  Time (mean ± σ):     26.501 s ±  1.537 s    [User: 29.554 s, System: 24.115 s]
  Range (min … max):   24.928 s … 28.763 s    5 runs
```

### `package` benchmark

Compilation is faster now too, as we no longer need to run `go list` to determine the `.go` and `.s` files. It's already eagerly computed beforehand.

Before:

```
❯ hyperfine -r 5 './pants_from_sources --no-pantsd --no-process-execution-local-cache package ::'
Benchmark #1: ./pants_from_sources --no-pantsd --no-process-execution-local-cache package ::
  Time (mean ± σ):     54.981 s ±  1.441 s    [User: 70.577 s, System: 81.823 s]
  Range (min … max):   53.426 s … 57.188 s    5 runs
```

After:

```
❯ hyperfine -r 5 './pants_from_sources --no-pantsd --no-process-execution-local-cache package ::'
  Time (mean ± σ):     33.777 s ±  0.248 s    [User: 39.221 s, System: 39.389 s]
  Range (min … max):   33.517 s … 34.062 s    5 runs
```

[ci skip-rust]
[ci skip-build-wheels]
Eric-Arellano added a commit that referenced this issue Oct 6, 2021
Closes #13049. This greatly reduces boilerplate and also allows us to make some fields required like `import_path` and `subpath`, so that we don't have to calculate those in consuming rules like `build_go_pkg.py`.

## The address format

The generated address looks like `project#./dir`. @tdyas offered that this is intuitive for Go developers because they have to say `go build ./dir` already with the leading `./`. 

This solves how to represent when the `_go_internal_package` is in the same dir as the `go_mod`: `project#./`.

This also makes very clear the difference from external packages like `project#rsc.io/quote` vs. internal packages like `project#./dir`.

## Improves dependency inference

Now that the `import_path` field is required for both `_go_internal_package` and `_go_external_package`, we can create a global mapping of `import_path -> pkg_target`. This is necessary for #13114.

This also improves performance. We don't need to call `ResolvedGoPackage` on all the candidate targets a package might depend on to calculate their import paths. We do still need it when resolving the deps of a particular `_go_internal_package`, but we can be lazier when we call that codepath in not evaluating all candidate targets.

### `dependencies` benchmark

As expected, there is no difference because we are finding the dependencies of everything, so we still have to call `ResolvedGoPackage`. The perf gains are only in things sometimes being less eager, which isn't the case here.

Before:

```
❯ hyperfine -r 5 './pants_from_sources --no-pantsd --no-process-execution-local-cache dependencies ::'
  Time (mean ± σ):     26.501 s ±  1.537 s    [User: 29.554 s, System: 24.115 s]
  Range (min … max):   24.928 s … 28.763 s    5 runs
```

After:
```
❯ hyperfine -r 5 './pants_from_sources --no-pantsd --no-process-execution-local-cache dependencies ::'
  Time (mean ± σ):     26.359 s ±  0.526 s    [User: 29.600 s, System: 23.769 s]
  Range (min … max):   25.625 s … 26.993 s    5 runs
```

### `package` benchmark

Before:

```
❯ hyperfine -r 5 './pants_from_sources --no-pantsd --no-process-execution-local-cache package ::'
  Time (mean ± σ):     33.777 s ±  0.248 s    [User: 39.221 s, System: 39.389 s]
  Range (min … max):   33.517 s … 34.062 s    5 runs
```

After:

```
❯ hyperfine -r 5 './pants_from_sources --no-pantsd --no-process-execution-local-cache package ::'
Benchmark #1: ./pants_from_sources --no-pantsd --no-process-execution-local-cache package ::
  Time (mean ± σ):     31.049 s ±  0.702 s    [User: 40.606 s, System: 40.537 s]
  Range (min … max):   30.512 s … 32.273 s    5 runs
```

## TODO: fix `go_binary` inference of `main` field

#13117 added inference of the `main` field for `go_binary`, that it defaults to the `go_package` defined in that directory.

But target generation no longer generates targets actually in each directory. All generated targets are "located" in the BUILD file of the `go_mod`, i.e. their `spec_path` is set to that. So it no longer looks to `AddressSpecs` like there are any targets in each subdirectory, and there are >1 `_go_internal_package` targets in the `go_mod` dir.

Instead, we should use the `subpath` field to determine what directory the targets correspond to.

[ci skip-rust]
[ci skip-build-wheels]
Eric-Arellano added a commit that referenced this issue Oct 13, 2021
…inference, only `python_source` and `python_test` targets (#13231)

Now our Python code operates on `python_source` and `python_test` targets. This causes some changes.

### Dependency inference

Dependency inference no longer runs on `python_sources` and `python_tests`, only `python_source` and `python_test`. That applies to import analysis, conftest.py, and `__init__.py`. 

This actually speeds up dependency inference! We no longer run `import_parser.py` twice on the same files.

Before:

```
❯ hyperfine -r 10 './pants --no-process-execution-local-cache --no-pantsd dependencies src/python::'
Benchmark #1: ./pants --no-process-execution-local-cache --no-pantsd dependencies src/python::
  Time (mean ± σ):      7.650 s ±  0.741 s    [User: 5.794 s, System: 2.141 s]
  Range (min … max):    7.250 s …  9.673 s    10 runs
```

After:

```
❯ hyperfine -r 10 './pants --no-process-execution-local-cache --no-pantsd dependencies src/python::'
Benchmark #1: ./pants --no-process-execution-local-cache --no-pantsd dependencies src/python::
  Time (mean ± σ):      5.998 s ±  0.279 s    [User: 3.524 s, System: 0.903 s]
  Range (min … max):    5.708 s …  6.688 s    10 runs
```

It means that `./pants dependencies src/py:lib` now only has explicitly declared dependencies. You have to use `--transitive` to get the results of dep inference.

Thanks to this change, we can simplify `import_parser.py` to assume there is only one file.

### Lockfile generation

When determining the interpreter constraints to use, we now look at generated targets, not target generators. We only run over the `python_source` and `python_test` targets that actually will be used. 

This makes us future-proof to an `overrides` field adding `skip_tool` to only some of the files. I think it also fixes our dependency resolution for Pylint looking at direct deps?

### `FieldSet.opt_out` no longer worries about file targets

Pytest and MyPy used to skip running on non-file targets. That can be removed now. This unblocks allowing you to explicitly define a `python_source`/`python_test` target.
Eric-Arellano added a commit that referenced this issue Oct 22, 2021
This lets you compile a package without needing to run tests or package a binary. 

Note that you can only directly compile a first-party package - to check if a third-party package is buildable, it must be a dependency of a first-party package. Works around #13175.

This does not yet have an optimal UX. It's overly chatty:

```
❯ ./pants check testprojects/src/go::
13:36:38.57 [INFO] Initializing scheduler...
13:36:39.20 [INFO] Scheduler initialized.
13:36:39.71 [ERROR] Completed: pants.backend.go.goals.check.check_go - go failed (exit code 1).
Partition #1 - github.com/pantsbuild/pants/testprojects/src/go/pants_test:
./testprojects/src/go/pants_test/foo.go:3:1: syntax error: non-declaration statement outside function body

Partition #2 - github.com/pantsbuild/pants/testprojects/src/go/pants_test/bar:



𐄂 go failed.
```

We also should be streaming the result of each compilation as it happens, which has an added benefit of that output happening when compilation happens as a side effect of `run`, `package`, and `test`. Those fixes will be in a followup.

[ci skip-rust]
[ci skip-build-wheels]
Eric-Arellano added a commit that referenced this issue Oct 28, 2021
Before:

```
❯ ./pants check testprojects/src/go::
14:51:39.15 [INFO] Completed: pants.backend.go.goals.check.check_go - go succeeded.
Partition #1 - github.com/pantsbuild/pants/testprojects/src/go/pants_test:

Partition #2 - github.com/pantsbuild/pants/testprojects/src/go/pants_test/bar:



✓ go succeeded.
```
```
❯ ./pants check testprojects/src/go::
14:52:54.54 [ERROR] Completed: pants.backend.go.goals.check.check_go - go failed (exit code 1).
Partition #1 - github.com/pantsbuild/pants/testprojects/src/go/pants_test:
./testprojects/src/go/pants_test/foo.go:3:1: syntax error: non-declaration statement outside function body

Partition #2 - github.com/pantsbuild/pants/testprojects/src/go/pants_test/bar:



𐄂 go failed.
```

After:

```
❯ ./pants check testprojects/src/go::
14:20:40.79 [INFO] Completed: Check Go compilation - go succeeded.

✓ go compile succeeded.
```

```
❯ ./pants check testprojects/src/go::
14:23:16.18 [ERROR] Completed: Compile with Go - github.com/pantsbuild/pants/testprojects/src/go/pants_test failed (exit code 1).
./testprojects/src/go/pants_test/foo.go:3:1: syntax error: non-declaration statement outside function body

14:23:16.18 [ERROR] Completed: Check Go compilation - go failed (exit code 1).

𐄂 go compile failed.
```

That is, we now:

- Stream results for each package, rather than waiting to batch at the very end.
- Don't log the `Partition #1` part, which is verbose and confusing.
- Log success at DEBUG level, rather than INFO level, for less noise. (Reminder: these workunits will show up in the dynamic UI still)

[ci skip-rust]
Eric-Arellano referenced this issue in Eric-Arellano/pants Oct 28, 2021
Before:

```
❯ ./pants check testprojects/src/go::
14:51:39.15 [INFO] Completed: pants.backend.go.goals.check.check_go - go succeeded.
Partition #1 - github.com/pantsbuild/pants/testprojects/src/go/pants_test:

Partition #2 - github.com/pantsbuild/pants/testprojects/src/go/pants_test/bar:



✓ go succeeded.
```
```
❯ ./pants check testprojects/src/go::
14:52:54.54 [ERROR] Completed: pants.backend.go.goals.check.check_go - go failed (exit code 1).
Partition #1 - github.com/pantsbuild/pants/testprojects/src/go/pants_test:
./testprojects/src/go/pants_test/foo.go:3:1: syntax error: non-declaration statement outside function body

Partition #2 - github.com/pantsbuild/pants/testprojects/src/go/pants_test/bar:



𐄂 go failed.
```

After:

```
❯ ./pants check testprojects/src/go::
14:20:40.79 [INFO] Completed: Check Go compilation - go succeeded.

✓ go compile succeeded.
```

```
❯ ./pants check testprojects/src/go::
14:23:16.18 [ERROR] Completed: Compile with Go - github.com/pantsbuild/pants/testprojects/src/go/pants_test failed (exit code 1).
./testprojects/src/go/pants_test/foo.go:3:1: syntax error: non-declaration statement outside function body

14:23:16.18 [ERROR] Completed: Check Go compilation - go failed (exit code 1).

𐄂 go compile failed.
```

That is, we now:

- Stream results for each package, rather than waiting to batch at the very end.
- Don't log the `Partition #1` part, which is verbose and confusing.
- Log success at DEBUG level, rather than INFO level, for less noise. (Reminder: these workunits will show up in the dynamic UI still)

[ci skip-rust]
# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
stuhood pushed a commit that referenced this issue Oct 28, 2021
#13388)

Before:

```
❯ ./pants check testprojects/src/go::
14:51:39.15 [INFO] Completed: pants.backend.go.goals.check.check_go - go succeeded.
Partition #1 - github.com/pantsbuild/pants/testprojects/src/go/pants_test:

Partition #2 - github.com/pantsbuild/pants/testprojects/src/go/pants_test/bar:



✓ go succeeded.
```
```
❯ ./pants check testprojects/src/go::
14:52:54.54 [ERROR] Completed: pants.backend.go.goals.check.check_go - go failed (exit code 1).
Partition #1 - github.com/pantsbuild/pants/testprojects/src/go/pants_test:
./testprojects/src/go/pants_test/foo.go:3:1: syntax error: non-declaration statement outside function body

Partition #2 - github.com/pantsbuild/pants/testprojects/src/go/pants_test/bar:



𐄂 go failed.
```

After:

```
❯ ./pants check testprojects/src/go::
14:20:40.79 [INFO] Completed: Check Go compilation - go succeeded.

✓ go compile succeeded.
```

```
❯ ./pants check testprojects/src/go::
14:23:16.18 [ERROR] Completed: Compile with Go - github.com/pantsbuild/pants/testprojects/src/go/pants_test failed (exit code 1).
./testprojects/src/go/pants_test/foo.go:3:1: syntax error: non-declaration statement outside function body

14:23:16.18 [ERROR] Completed: Check Go compilation - go failed (exit code 1).

𐄂 go compile failed.
```

That is, we now:

- Stream results for each package, rather than waiting to batch at the very end.
- Don't log the `Partition #1` part, which is verbose and confusing.
- Log success at DEBUG level, rather than INFO level, for less noise. (Reminder: these workunits will show up in the dynamic UI still)

[ci skip-rust]
# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
stuhood added a commit that referenced this issue Oct 28, 2021
In pathological cases of very small `@rule` bodies or extreme graph shapes, cycle detection in the `Graph` can be very expensive. This cost can be challenging to detect in production, but it does occasionally show up as hot, particularly while cleaning graphs (since that skips all `@rule` logic). That's precisely when we do not want to be spending any extraneous time on formalities.

This change moves to asynchronously checking for cycles in Running nodes in the `Graph`. Checking in the background has two advantages: 1. cycle detection is batched across the entire graph, rather than per-edge-addition, and 2. in most cases, successful Nodes never need to be checked while running.

We also add two pathological graph shape benchmarks, which show significant improvement:
* `main`:
    ```
    Benchmark #1: ./pants test --force --no-pytest-timeouts src/python/pants/engine/internals/engine_benchmarks_test.py
      Time (mean ± σ):     247.157 s ±  9.095 s    [User: 872.4 ms, System: 305.2 ms]
      Range (min … max):   240.523 s … 266.031 s    10 runs
    ```
* this branch:
    ```
    Benchmark #1: ./pants test --force --no-pytest-timeouts src/python/pants/engine/internals/engine_benchmarks_test.py
      Time (mean ± σ):     34.779 s ±  4.625 s    [User: 889.1 ms, System: 303.5 ms]
      Range (min … max):   32.246 s … 47.491 s    10 runs
    ```
Eric-Arellano added a commit that referenced this issue Nov 14, 2021
See #12110 for the original motivation.

## Why migrate

Beyond the APIs requiring much less boilerplate and being more ergonomic (e.g. the `.call0()` and `call1()` variants of `.call()`), I think the biggest benefit is clearer modeling for when the GIL is held.

With PyO3, the two [core types](https://pyo3.rs/v0.15.0/types.html) are `&PyAny` and `PyObject` (aka `Py<PyAny`). `&PyAny` is always a reference with the same lifetime as the `Python` token, which represents when the GIL is used. Whenever you want to do Python operations in Rust, like `.getattr()`, you use `&PyAny`. `PyObject` and any `Py<T>` are GIL-independent.

In contrast, rust-cpython only had `PyObject` and `&PyObject`. It passed around `Python` as an argument to any Python functions like `.getattr()`.

--

An additional benefit is that using proc macros instead of declarative macros means PyCharm no longer hangs for me when working in the `engine/` crate! PyCharm does much better w/ proc macros, and things feel zippy now like autocomplete working.

## Migration strategy

I tried originally doing an incremental strategy, most recently with #12451, which proved technically infeasible.

This PR completely swaps Rust-Cpython with PyO3, but tries to minimize the diff. It only makes changes when it was too difficult to get working with PyO3. For example, `interface.rs` is far too big, but this keeps the same organization as before.

For now, we still have the `native_engine_pyo3` extension along with `native_engine`. That will be consolidated in a followup.

## Benchmark

Before:
```
❯ hyperfine -w 1 -r 10 './pants --no-pantsd list ::'
Benchmark #1: ./pants --no-pantsd list ::
  Time (mean ± σ):      2.170 s ±  0.025 s    [User: 1.786 s, System: 0.303 s]
  Range (min … max):    2.142 s …  2.227 s    10 runs
```

After:

```
❯ hyperfine -w 1 -r 10 './pants --no-pantsd list ::'
Benchmark #1: ./pants --no-pantsd list ::
  Time (mean ± σ):      2.193 s ±  0.060 s    [User: 1.798 s, System: 0.292 s]
  Range (min … max):    2.144 s …  2.348 s    10 runs
```
Eric-Arellano added a commit that referenced this issue Dec 27, 2021
…and Protobuf (#13981)

Because we know that a `SingleSourceField` has exactly one file—and we can ban `*` so can compute its file path without using the engine—we can simplify dependency inference when creating the global module mapping. There's no need to actually look up the file system.

A downside of this PR is that we have lazier validation that the `source` field matches files on disk, i.e. that the file is there. But there are still plenty of places this gets validated, such as when computing the dependencies of a target because we have to inspect its source code. (Compared to creating a global module mapping for dep inference in general.)

There's a marginal speed up, too:

Before:

```
❯ hyperfine -w 1 -r 25 './pants --no-pantsd dependencies src/python/pants/util/strutil.py:util'
Benchmark #1: ./pants --no-pantsd dependencies src/python/pants/util/strutil.py:util
  Time (mean ± σ):      4.576 s ±  0.139 s    [User: 3.180 s, System: 0.657 s]
  Range (min … max):    4.373 s …  5.003 s    25 runs
```

After:

```
❯ hyperfine -w 1 -r 25 './pants --no-pantsd dependencies src/python/pants/util/strutil.py:util'
Benchmark #1: ./pants --no-pantsd dependencies src/python/pants/util/strutil.py:util
  Time (mean ± σ):      4.487 s ±  0.172 s    [User: 3.019 s, System: 0.628 s]
  Range (min … max):    4.224 s …  5.094 s    25 runs
```

[ci skip-rust]
[ci skip-build-wheels]
@kaos
Copy link
Member

kaos commented Apr 26, 2022

@Eric-Arellano perhaps use some other symbol than # when enumerating bullets..? (to avoid GH treating them as issue references... :P )

stuhood added a commit that referenced this issue May 10, 2023
The Pants native client which was introduced in #11922 has so far only
been usable in the Pants repo when the `USE_NATIVE_PANTS` environment
variable was set.

To make the native client available to end users, this change begins
distributing the native-client binary in Pants wheels. A corresponding
change in the `scie-pants` repo
(pantsbuild/scie-pants#172) uses the native
client to run `pants`.

To reduce the surface area of `scie-pants` (rather than having it be
responsible for handling the special `75` exit code similar to the
`pants` script integration), this PR also moves to having the
native-client execute its own fallback (via `execv`) to the Pants
entrypoint. In future, the `pantsbuild/pants` `pants` script could also
use that facility (e.g. by specifying a separate `pants_server` bash
script as the entrypoint to use as the `_PANTS_SERVER_EXE`).

----

As originally demonstrated on #11831, the native client is still much
faster than the legacy client. Using
pantsbuild/scie-pants#172, the timings look
like:
```
Benchmark #1: PANTS_NO_NATIVE_CLIENT=true PANTS_SHA=836cceb74e6af042e7c82677f3ceb4927efce20e scie-pants-macos-x86_64 help
  Time (mean ± σ):      1.161 s ±  0.067 s    [User: 830.6 ms, System: 79.2 ms]
  Range (min … max):    1.054 s …  1.309 s    10 runs

Benchmark #2: PANTS_SHA=836cceb74e6af042e7c82677f3ceb4927efce20e scie-pants-macos-x86_64 help
  Time (mean ± σ):     271.0 ms ±  30.6 ms    [User: 8.9 ms, System: 6.9 ms]
  Range (min … max):   241.5 ms … 360.6 ms    12 runs

Summary
  'PANTS_SHA=836cceb74e6af042e7c82677f3ceb4927efce20e scie-pants-macos-x86_64 help' ran
    4.29 ± 0.54 times faster than 'PANTS_NO_NATIVE_CLIENT=true PANTS_SHA=836cceb74e6af042e7c82677f3ceb4927efce20e scie-pants-macos-x86_64 help'
```

Fixes #11831.
wisechengyi added a commit that referenced this issue Jun 3, 2023
#19243)

…9047)"

This reverts commit 0aedb6b.

contains breaking change:
```
Error: 3.20 [ERROR] Failed to load the pants.backend.experimental.javascript.register backend: ImportError("cannot import name 'UserChosenNodeJSResolveAliases' from 'pants.backend.javascript.subsystems.nodejs' (/home/gha/actions-runner1/_work/pants/pants/src/python/pants/backend/javascript/subsystems/nodejs.py)")
Traceback (most recent call last):
  File "/home/gha/actions-runner1/_work/pants/pants/src/python/pants/init/extension_loader.py", line 141, in load_backend
    module = importlib.import_module(backend_module)
  File "/home/gha/.pyenv/versions/3.9.13/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/home/gha/actions-runner1/_work/pants/pants/src/python/pants/backend/experimental/javascript/register.py", line 9, in <module>
    from pants.backend.javascript.goals import lockfile, tailor, test
  File "/home/gha/actions-runner1/_work/pants/pants/src/python/pants/backend/javascript/goals/lockfile.py", line 9, in <module>
    from pants.backend.javascript import nodejs_project_environment
  File "/home/gha/actions-runner1/_work/pants/pants/src/python/pants/backend/javascript/nodejs_project_environment.py", line 9, in <module>
    from pants.backend.javascript import package_json, resolve
  File "/home/gha/actions-runner1/_work/pants/pants/src/python/pants/backend/javascript/resolve.py", line 9, in <module>
    from pants.backend.javascript import nodejs_project
  File "/home/gha/actions-runner1/_work/pants/pants/src/python/pants/backend/javascript/nodejs_project.py", line 19, in <module>
    from pants.backend.javascript.subsystems.nodejs import NodeJS, UserChosenNodeJSResolveAliases
ImportError: cannot import name 'UserChosenNodeJSResolveAliases' from 'pants.backend.javascript.subsystems.nodejs' (/home/gha/actions-runner1/_work/pants/pants/src/python/pants/backend/javascript/subsystems/nodejs.py)
```
tobni added a commit to tobni/pants that referenced this issue Jun 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants