Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge GoLibrary and GoSource providers #4030

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

dzbarsky
Copy link
Contributor

@dzbarsky dzbarsky commented Aug 11, 2024

What type of PR is this?
Code simplification

What does this PR do? Why is it needed?
These providers are 1:1, and we always create a library followed by source_from_library. Merging them allows to remove the library field from GoSource as well as the the resolve field, since we don't need to smuggle the closure between the providers anymore.

If we do this, we will also need to fix up Gazelle. We can either have a patch in this repo so CI passes and then followup with the Gazelle fix, or somehow shim the API until we release new versions in lockstep. Open to ideas how to best do this.

Which issues(s) does this PR fix?
One step toward #2578

Fixes #

Other notes for review

"""
library = go.new_library(go)
source = go.library_to_source(go, {}, library, False)
source = new_source(go, {}, coverage_instrumented = False)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i removed this helper from the context since it seemed simpler to just import the code we want to call, but I can put it back if desired

go/private/context.bzl Outdated Show resolved Hide resolved
go/private/context.bzl Outdated Show resolved Hide resolved
@fmeum
Copy link
Member

fmeum commented Aug 12, 2024

Can we shim the library, perhaps by having GoLibrary remain an alias of GoSource for a while? Otherwise this change could be too breaking to dependents.

@dzbarsky
Copy link
Contributor Author

Can we shim the library, perhaps by having GoLibrary remain an alias of GoSource for a while? Otherwise this change could be too breaking to dependents.

Aliasing them is a good idea, I think I'll keep the existing APIs for now but implement library_to_source in terms of new_source and remove the documentation. I think that should avoid breaking downstream repos (unless they are using source.library or something)

@dzbarsky dzbarsky force-pushed the zbarsky/libmerge branch 2 times, most recently from e904d8e to 596038b Compare August 12, 2024 16:47
@sluongng
Copy link
Contributor

From an end user perspective, I would think that a x_library rule should be producing a *Library provider rather than an *Source provider. So I would rather keep GoLibrary and ditch GoSource if possible.

@fmeum
Copy link
Member

fmeum commented Aug 13, 2024

Can we just call it GoInfo and keep aliases around? That's arguably the canonical name.

@sluongng
Copy link
Contributor

Yeah GoInfo works as well, following the convention of CcInfo and PyInfo.

👍 to keep aliases around for backward compatibility

@dzbarsky dzbarsky force-pushed the zbarsky/libmerge branch 3 times, most recently from b6e2b49 to 792404a Compare August 13, 2024 16:38
@dzbarsky
Copy link
Contributor Author

GoInfo is a great idea; updated. I've kept the source param name to GoArchive for now; I'd like to take a look at merging those as well after this lands. Other than that, I've renamed the usage in the repo

Copy link
Member

@fmeum fmeum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tyler-french Could you also test this at Uber?

go/private/rules/test.bzl Show resolved Hide resolved
@dzbarsky dzbarsky force-pushed the zbarsky/libmerge branch 3 times, most recently from 8243f56 to 4fe1b52 Compare August 21, 2024 23:51
@tyler-french
Copy link
Contributor

tyler-french commented Aug 22, 2024

Hey @dzbarsky, I'm having trouble with testing this PR with our thriftrw plugin. Here's the code for that:

def _go_thriftrw_library_impl(ctx):
    go = go_context(ctx)
    go_srcs = go_thriftrw_compile(
        go,
        ...
    )

    library = go.new_library(
        go,
        resolver = _resolve,
        srcs = go_srcs,
    )
    source = go.library_to_source(go, ctx.attr, library, ctx.coverage_instrumented())
    archive = go.archive(go, source)

    return [
        library,
        source,
        archive,
        DefaultInfo(
            files = depset([archive.data.file]),
            runfiles = archive.runfiles,
        ),
        ...


_go_thriftrw_library = rule(
    implementation = _go_thriftrw_library_impl,
    attrs = {
        "deps": attr.label_list(
            providers = [GoInfo], # just changed this
        ),
        ...
        "_go_context_data": attr.label(default = "@io_bazel_rules_go//:go_context_data"),

Here's the error I'm getting:

ERROR: /home/user/go-code/idl/../BUILD.bazel:13:20: in _go_thriftrw_library rule //idl/..:object_config_host_agent_go_thriftrw: 
Traceback (most recent call last):
        File "/home/user/go-code/rules/thrift/go_thriftrw.bzl", line 149, column 25, in _go_thriftrw_library_impl
                archive = go.archive(go, source)
        File "/home/user/.cache/bazel/_bazel_tfrench/b97476d719d716accead0f2d5b93104f/external/io_bazel_rules_go/go/private/actions/archive.bzl", line 76, column 23, in emit_archive
                files.append(a.runfiles)
Error: 'Target' value has no field or method 'runfiles' (did you mean 'files'?)

Do you know why this might be?

Also, I tried using load("@io_bazel_rules_go//go/private:context.bzl", "new_go_info") instead, and it also had the same issue. But, I'm wondering if we should attach new_go_info as a helper to go_context (as is mentioned in the readme).

Let me know if you have any ideas what might be happening. Thanks!

@dzbarsky
Copy link
Contributor Author

@tyler-french Thanks for giving it a shot! I suspect that what might be happening is GoInfo.deps previously could be a list of either GoArchive providers or Targets, but now we expect it to be a list of GoArchive providers. I'm not sure what your _resolve function is doing, but perhaps that needs an update? Alternately, if that's the cause, we may be able to make _merge_embed ensure that the deps are normalized to provider fields, to avoid this being a breaking change. If it's possible to share the actual code, I'm happy to dig deeper!

I recommend to switch to calling new_go_info to future-proof, it should have been exposed in the defs.bzl, that was an oversight I noticed while also drafting release notes, I'll update this PR momentarily and also fix the docs. I would prefer not to put it on go_context for the following reasons:

  1. Loading it directly makes it more understandable for IDE bazel plugins, so you can click through it to jump to def
  2. I think it's weird API design to have to call go.new_go_info(go, ...) and also enforce that you are passing in the same context; it wants to be an instance method but those don't exist
  3. It's slightly more efficient to not add extra fields to the context struct that is created for every action :)

@tyler-french
Copy link
Contributor

tyler-french commented Aug 25, 2024

@dzbarsky @fmeum @linzhp I think I would like to wait to merge this PR until after we release v0.50.0. And then once we do merge this, we can make a v0.51.0-rc1.

The only other breaking changes after 0.49 are changing the inputs to a depset and separating out Nogo.

That was we reduce the amount of breaking changes in this release, and also will soon give users the chance to migrate their providers.

I want to make sure we do our best to make sure consumers of rules_go have a good experience.

@dzbarsky
Copy link
Contributor Author

@tyler-french That plan sounds good to me. BTW I'm not sure if your breakage with the thrift rules is caused by this PR, it might be another one of the PRs. Here's some potential breaking changenots I drafted (though most of these are very minor and unlikely to be an actual issue)

image image

@tyler-french
Copy link
Contributor

BTW I'm not sure if your breakage with the thrift rules is caused by this PR

@dzbarsky You're right, it seems to be a different PR. Here's the full code of the plugin:

load("//rules/thrift:thrift.bzl", "ThriftInfo")
load("@io_bazel_rules_go//go:def.bzl", "GoLibrary", "go_context")
load("@io_bazel_rules_go//go/private/rules:transition.bzl", "go_reset_target")
load("@bazel_skylib//lib:paths.bzl", "paths")

THRIFRW_OUTPUT_FILE = "thriftrw-gen.go"

GoThriftRWPlugin = provider(
    fields = [
        "expected_pkg_files",
        "plugin_name",
        "plugin",
        "plugin_deps",
        "plugin_flags",
    ],
)

def _resolve(go, attr, source, merge):
    plugin_deps = []
    for c in attr.plugins:
        plugin_deps.extend(c[GoThriftRWPlugin].plugin_deps)
    source["deps"] = source["deps"] + plugin_deps + attr._default_deps

def _thrift_file_name(file_path):
    file_name = paths.basename(file_path)
    words = paths.split_extension(file_name)
    if len(words) != 2:
        fail("Unable to parse the thrift file for name: {}".format(file_name))
    return words[0]

def _parse_pkg(package):
    package_tokens = package.split("/")
    if len(package_tokens) == 1:
        return "base", None
    elif len(package_tokens) == 2:
        service_word = package_tokens[1]

        pkg_suffixes = ["client", "server", "fx", "test"]
        for suffix in pkg_suffixes:
            if service_word.endswith(suffix):
                return suffix, service_word[:-len(suffix)]
        fail("unable to parse the type of package: {}".format(package))
    fail("the package: {} has an invalid number of components".format(package))

def _thriftrw_output(go, directory, package, source, plugins):
    """
    _thriftrw_output declares the files that should be generated by ThriftRW
    against a package. This selects the subset of files generated by ThriftRW
    that correspond to a ThriftRW generated package.
    """
    gen_files = []
    pkg_type, thrift_service = _parse_pkg(package)
    output_dir = None

    pkg_tmpls = []
    for m in [{"base": [THRIFRW_OUTPUT_FILE]}] + [p.expected_pkg_files for p in plugins]:
        pkg_tmpls += m.get(pkg_type, [])
    for t in pkg_tmpls:
        rel = paths.join(directory, _thrift_file_name(source.path), t.format(service_name = thrift_service))
        f = go.declare_file(go, rel)
        gen_files += [f]
        output_dir = f.path[:len(f.path) - len(rel)]
    return gen_files, output_dir

def go_thriftrw_compile(go, compiler, thrift_root, plugins, source, transitive_sources, importpath, package):
    """
    go_thriftrw_compile invokes the ThriftRW compiler with a particular plugin.
    Args:
        go: the Go context.
        compiler: the ThritRWCompiler.
        sources: the direct Thrift sources.
        transitive_sources: the transitive Thrift sources.
        importpath: the generated import path.
        package: the generated package.
    Returns:
        The generated Go sources.
    """
    thrift_root = paths.join(source.root.path, thrift_root)
    if not importpath.endswith(package):
        fail("package must be at end of import path:\n  {}\n  {}".format(importpath[-len(package):], package))

    directory = paths.dirname(source.path[len(thrift_root) + 1:])

    gen_files, out_dir = _thriftrw_output(go, directory, package, source, plugins)

    args = go.actions.args()
    args.add("--thrift-root", thrift_root)
    args.add("--output-file", THRIFRW_OUTPUT_FILE)
    args.add("--pkg-prefix", importpath.partition("/")[0])
    args.add("--no-recurse")
    for p in plugins:
        plugin_arg = p.plugin_name
        if p.plugin_flags:
            plugin_arg = "{} {}".format(p.plugin_name, p.plugin_flags)
        args.add("--plugin", plugin_arg)
    args.add("--out", out_dir)
    args.add(source)

    go.actions.run_shell(
        command = """
            export PATH={path}:$PATH &&
            exec {cmd} "$@"
        """.format(
            cmd = compiler.path,
            path = ":".join(["$(pwd)/{}".format(p.plugin.dirname) for p in plugins]),
        ),
        outputs = gen_files,
        inputs = transitive_sources,
        arguments = [args],
        tools = [compiler] + [p.plugin for p in plugins],
        mnemonic = "GoThriftrwGen",
    )

    return gen_files

def _go_thriftrw_library_impl(ctx):
    go = go_context(ctx)
    go_srcs = go_thriftrw_compile(
        go,
        compiler = ctx.executable.thriftrw,
        plugins = [c[GoThriftRWPlugin] for c in ctx.attr.plugins],
        thrift_root = ctx.attr.thrift_root,
        source = ctx.attr.thrift[ThriftInfo].src,
        transitive_sources = ctx.attr.thrift[ThriftInfo].transitive_srcs.to_list(),
        importpath = ctx.attr.importpath,
        package = ctx.attr.package,
    )

    library = go.new_library(
        go,
        resolver = _resolve,
        srcs = go_srcs,
    )
    source = go.library_to_source(go, ctx.attr, library, ctx.coverage_instrumented())
    archive = go.archive(go, source)

    return [
        library,
        source,
        archive,
        DefaultInfo(
            files = depset([archive.data.file]),
            runfiles = archive.runfiles,
        ),
        OutputGroupInfo(
            compilation_outputs = [archive.data.file],
            go_generated_srcs = go_srcs,
        ),
    ]

_go_thriftrw_library = rule(
    implementation = _go_thriftrw_library_impl,
    attrs = {
        "deps": attr.label_list(
            providers = [GoLibrary],
        ),
        "thrift": attr.label(
            mandatory = True,
            providers = [ThriftInfo],
        ),
        "importpath": attr.string(),
        "package": attr.string(),
        "plugins": attr.label_list(
            providers = [GoThriftRWPlugin],
            default = ["//rules/thrift:go_thriftrw_yarpc"],
        ),
        "thrift_root": attr.string(default = "idl"),
        "thriftrw": attr.label(
            executable = True,
            cfg = "exec",
            default = "@org_uber_go_thriftrw//:thriftrw",
        ),
        "_go_context_data": attr.label(default = "@io_bazel_rules_go//:go_context_data"),
        "_default_deps": attr.label_list(
            default = [
                "@org_uber_go_thriftrw//protocol/binary:go_default_library",
                "@org_uber_go_thriftrw//protocol/stream:go_default_library",
            ],
        ),
    },
    toolchains = ["@io_bazel_rules_go//go:toolchain"],
)

def go_thriftrw_library(name, **kwargs):
    if "_thriftrw" not in kwargs:
        kwargs["_thriftrw"] = Label("@org_uber_go_thriftrw//:thriftrw")
    thriftrw = kwargs.pop("_thriftrw")

    reset_plugin_name = name + "_reset_plugin_"
    go_reset_target(
        name = reset_plugin_name,
        dep = thriftrw,
        visibility = ["//visibility:private"],
    )
    _go_thriftrw_library(
        name = name,
        thriftrw = reset_plugin_name,
        **kwargs
    )

def _go_thriftrw_plugin_impl(ctx):
    thriftrw_plugin_prefix = "thriftrw-plugin-"
    plugin_basename = ctx.executable.plugin.basename
    plugin_name_index = plugin_basename.rfind(thriftrw_plugin_prefix)
    plugin_name = plugin_basename[plugin_name_index + len(thriftrw_plugin_prefix):]
    return [
        GoThriftRWPlugin(
            plugin_deps = ctx.attr.plugin_deps,
            plugin_name = plugin_name,
            plugin = ctx.executable.plugin,
            expected_pkg_files = ctx.attr.expected_pkg_files,
            plugin_flags = ctx.attr.plugin_flags,
        ),
    ]

_go_thriftrw_plugin = rule(
    implementation = _go_thriftrw_plugin_impl,
    attrs = {
        "expected_pkg_files": attr.string_list_dict(doc = "A map of package types to expected file outputs from code generation. Example: {'client': ['{service_name}client/client.go']}"),
        "plugin_deps": attr.label_list(providers = [GoLibrary]),
        "plugin": attr.label(
            allow_single_file = True,
            executable = True,
            cfg = "exec",
        ),
        "plugin_flags": attr.string(doc = "An optional flag string the plugin can be invoked with"),
        "_go_context_data": attr.label(default = "@io_bazel_rules_go//:go_context_data"),
    },
    toolchains = ["@io_bazel_rules_go//go:toolchain"],
)

def go_thriftrw_plugin(name, **kwargs):
    plugin = kwargs.pop("plugin")
    reset_plugin_name = name + "_reset_plugin_"
    go_reset_target(
        name = reset_plugin_name,
        dep = plugin,
        visibility = ["//visibility:private"],
    )
    _go_thriftrw_plugin(
        name = name,
        plugin = reset_plugin_name,
        **kwargs
    )

I am slightly concerned about all of these breaking changes here.

@dzbarsky
Copy link
Contributor Author

@tyler-french Can you try replacing your _resolve function with the following?

def _resolve(go, attr, source, merge):
    plugin_deps = []
    for c in attr.plugins:
        plugin_deps.extend(c[GoThriftRWPlugin].plugin_deps)
    source["deps"] += [dep[GoArchive] for dep in plugin_deps + attr._default_deps]

Your concern is fair but i think the list is a bit pedantic and a lot of the changes are to providers that are already declared to be non-public API in the docs. In practice the actual changes that I think may cause some snags:

  1. The SDK depsets
  2. The deps becoming GoArchive
  3. go.mode.link becoming go.mode.linkmode

I think we can smooth out (2) by fixing up source["deps"] after calling the resolver inside the rules, but let's confirm that this fixes things for you first and whether there are other issues

@fmeum
Copy link
Member

fmeum commented Aug 26, 2024

I also like the release plan. We should probably fix #4062 for that release, I'll look into it tomorrow.

The breaking changes David listed are also the three I consider most relevant. As we will document them in the release notes, I think that migration won't be too problematic given how much users can gain.

@dzbarsky Do you have before/after CPU/memory data for your series if patches?

fmeum pushed a commit that referenced this pull request Aug 27, 2024
**What type of PR is this?**
Starlark fixup

**What does this PR do? Why is it needed?**
Make the migration process a little smoother

**Which issues(s) does this PR fix?**
#4030 (comment)

**Other notes for review**
@tyler-french
Copy link
Contributor

Sorry, hitting another failure here:

 % bazel build --nobuild //...
INFO: Writing tracer profile to '/tmp/bazel_20240827160435_o7sby'
INFO: Invocation ID: b049c579-1db2-41b8-a63c-157e0b994068
ERROR: /home/user/go-code/src/../BUILD.bazel:27:14: in _glue_generate rule //src/...:glue_gateway: 
Traceback (most recent call last):
        File "/home/user/go-code/rules/glue/glue.bzl", line 107, column 34, in _glue_generate_impl
                source = go.library_to_source(go, ctx.attr, library, ctx.coverage_instrumented())
        File "/home/user/.cache/bazel/_bazel_tfrench/b97476d719d716accead0f2d5b93104f/external/io_bazel_rules_go/go/private/context.bzl", line 312, column 24, in _library_to_source
                library.resolve(go, attr, source, _merge_embed)
        File "/home/user/go-code/rules/glue/glue.bzl", line 8, column 20, in _resolve
                tdeps = dep[GoSource].deps
Error: type 'struct' has no operator [](Provider)
ERROR: /home/user/go-code/..../BUILD.bazel:27:14: Analysis of target '//src/...:glue_gateway' failed
Analyzing: 486068 targets (21318 packages loaded, 12690 targets configured)
FATAL: bazel ran out of memory and crashed. Printing stack trace:
net.starlark.java.eval.Starlark$UncheckedEvalError: OutOfMemoryError thrown during Starlark evaluation (//idl/code.uber.internal/rds/hubbub:glue_mock)
        at <starlark>.print(<builtin>:0)
        at <starlark>._resolve(/home/user/go-code/rules/glue/mock.bzl:13)
        at <starlark>._library_to_source(/home/user/.cache/bazel/_bazel_tfrench/b97476d719d716accead0f2d5b93104f/external/io_bazel_rules_go/go/private/context.bzl:312)
        at <starlark>._glue_generate_mock_impl(/home/user/go-code/rules/glue/mock.bzl:64)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at java.base/java.util.Arrays.copyOf(Unknown Source)
        at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
        at java.base/java.lang.AbstractStringBuilder.append(Unknown Source)
        at java.base/java.lang.StringBuilder.append(Unknown Source)
        at java.base/java.lang.StringBuilder.append(Unknown Source)
        at java.base/java.lang.AbstractStringBuilder.append(Unknown Source)
        at java.base/java.lang.StringBuilder.append(Unknown Source)
        at net.starlark.java.eval.Printer.append(Printer.java:60)
        at com.google.devtools.build.lib.collect.nestedset.Depset.repr(Depset.java:305)
        at net.starlark.java.eval.Printer.repr(Printer.java:195)
        at com.google.devtools.build.lib.packages.StructImpl.repr(StructImpl.java:136)
        at net.starlark.java.eval.Printer.repr(Printer.java:195)
        at com.google.devtools.build.lib.packages.StructImpl.repr(StructImpl.java:136)
        at net.starlark.java.eval.Printer.repr(Printer.java:195)
        at net.starlark.java.eval.Printer.printList(Printer.java:101)
        at net.starlark.java.eval.StarlarkList.repr(StarlarkList.java:249)
        at net.starlark.java.eval.Printer.repr(Printer.java:195)
        at com.google.devtools.build.lib.packages.StructImpl.repr(StructImpl.java:136)
        at net.starlark.java.eval.Printer.repr(Printer.java:195)
        at net.starlark.java.eval.Printer.printList(Printer.java:101)
        at net.starlark.java.eval.StarlarkList.repr(StarlarkList.java:249)
        at net.starlark.java.eval.Printer.repr(Printer.java:195)
        at com.google.devtools.build.lib.packages.StructImpl.repr(StructImpl.java:136)
        at net.starlark.java.eval.Printer.repr(Printer.java:195)
        at net.starlark.java.eval.Printer.printList(Printer.java:101)
        at net.starlark.java.eval.StarlarkList.repr(StarlarkList.java:249)
        at net.starlark.java.eval.Printer.repr(Printer.java:195)
        at com.google.devtools.build.lib.packages.StructImpl.repr(StructImpl.java:136)
        at net.starlark.java.eval.Printer.repr(Printer.java:195)
        at com.google.devtools.build.lib.packages.StructImpl.repr(StructImpl.java:136)
        at net.starlark.java.eval.Printer.repr(Printer.java:195)
        at net.starlark.java.eval.Printer.printList(Printer.java:101)
WARNING: Waiting for server process to terminate (waited 5 seconds, waiting at most 60)

This is the resolve function here:

def _resolve(go, attr, source, merge):
    libdeps = []
    lib = attr.library
    if GoSource in lib:
        libdeps = lib[GoSource].deps

    transitive = []
    for libdep in libdeps:
        tdeps = libdep[GoSource].deps
        transitive += tdeps

    source["deps"] = source["deps"] + transitive + libdeps + attr._generated_mock_deps

@dzbarsky
Copy link
Contributor Author

@tyler-french No problem, thanks for iterating with me on this :)

If I'm reading correctly, libdeps are a list of GoArchive, so you probably want to change it like so:

-        tdeps = libdep[GoSource].deps
+        tdeps = libdep.source.deps

does that work?

@dzbarsky
Copy link
Contributor Author

@fmeum shall we merge this? Are we waiting for anything else?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants