[Java] Java Cookbook fails on 16.0.0-SNAPSHOT #347

Closed
1 task
Tracked by #348
amoeba opened this issue Apr 11, 2024 · 14 comments · Fixed by #350

@amoeba
Member

amoeba commented Apr 11, 2024

When I run make javatest from the latest main with ARROW_NIGHTLY=1, the Flight cookbook fails with:

Running with Arrow version: 16.0.0-SNAPSHOT
...snip...
Exception java.lang.NoClassDefFoundError: io/grpc/BindableService
      at FlightServer.builder (FlightServer.java:169)
      at (#39:4)
Caused by: java.lang.ClassNotFoundException: io.grpc.BindableService
      at BuiltinClassLoader.loadClass (BuiltinClassLoader.java:641)
      at ClassLoaders$AppClassLoader.loadClass (ClassLoaders.java:188)
      at ClassLoader.loadClass (ClassLoader.java:525)
      ...

I'm not sure what might be causing this but I wonder if it's related to the recent JPMS changes. @jduo do you have any ideas? In case you're not familiar with how the Java cookbooks run, they essentially:

  1. Generate a classpath using a stub pom.xml in https://github.com/apache/arrow-cookbook/blob/main/java/source/demo/pom.xml and by calling maven,
    "mvn",
    "-q",
    "dependency:build-classpath",
    "-DincludeTypes=jar",
    "-Dmdep.outputFile=.cp.tmp",
    f"-Darrow.version={self.env.config.version}",
  2. Run each code snippet with jshell using the above classpath via
    ["jshell", "-R--add-opens=java.base/java.nio=ALL-UNNAMED", "--class-path", stdout_dependency, "-s", "/dev/stdin"],
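The two steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the actual contents of javadoctest.py: the function names and the `pom_dir` parameter are hypothetical, while the Maven and jshell arguments are copied from the lists above.

```python
import os
import subprocess
from typing import List


def build_classpath_cmd(arrow_version: str, output_file: str = ".cp.tmp") -> List[str]:
    """Maven invocation that writes the resolved classpath to output_file."""
    return [
        "mvn",
        "-q",
        "dependency:build-classpath",
        "-DincludeTypes=jar",
        f"-Dmdep.outputFile={output_file}",
        f"-Darrow.version={arrow_version}",
    ]


def build_jshell_cmd(classpath: str) -> List[str]:
    """jshell invocation that reads the snippet from stdin.

    Every JAR is passed via --class-path, so all of them end up in the
    unnamed module.
    """
    return [
        "jshell",
        "-R--add-opens=java.base/java.nio=ALL-UNNAMED",
        "--class-path",
        classpath,
        "-s",
        "/dev/stdin",
    ]


def run_snippet(snippet: str, arrow_version: str, pom_dir: str) -> str:
    """Hypothetical driver: resolve the classpath, then feed the snippet to jshell."""
    subprocess.run(build_classpath_cmd(arrow_version), cwd=pom_dir, check=True)
    with open(os.path.join(pom_dir, ".cp.tmp")) as f:
        classpath = f.read().strip()
    result = subprocess.run(
        build_jshell_cmd(classpath), input=snippet, capture_output=True, text=True
    )
    return result.stdout
```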

TODOs from the thread below:

@amoeba amoeba self-assigned this Apr 11, 2024
@jduo
Member

jduo commented Apr 11, 2024

I haven't seen this error.

It could be related to JPMS. We also updated gRPC a few times, merged some flight modules into one module, and changed some of the shading in Flight (though that may have only affected clients and not the server).

I'll take a peek at the cookbook and let you know how it goes @amoeba

@amoeba
Member Author

amoeba commented Apr 11, 2024

Thanks so much @jduo.

@jduo
Member

jduo commented Apr 11, 2024

@amoeba , I was able to make this work by making some changes to the POM. This seems like the wrong way to resolve the issue though -- we seem to have some problems getting transitive dependencies.

In the demo pom.xml:

  • I changed the importing of flight-core to use the "shaded" classifier instead of the default.
  • I also needed to add the following dependencies explicitly:
        <dependency>
              <groupId>io.netty</groupId>
              <artifactId>netty-codec-http2</artifactId>
              <version>4.1.108.Final</version>
        </dependency>
        <dependency>
              <groupId>io.perfmark</groupId>
              <artifactId>perfmark-api</artifactId>
              <version>0.26.0</version>
        </dependency>

These are both from grpc-java.

  • Finally this wasn't necessary but probably should be done -- I changed the jshell command to match what's mentioned in install.rst for the Arrow project:
"-R--add-reads=org.apache.arrow.flight.core=ALL-UNNAMED", "-R--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED"

@lidavidm , do you know if the flight-core artifact is supposed to be the shaded classifier by default? And any ideas why we might not be inheriting transitive dependencies anymore?

@lidavidm
Member

I don't think shading is meant to be the default. I didn't even realize we shipped a separate shaded version. Regardless, can we just compare 15.0.0 and 16.0.0 and see what might have changed?

@lidavidm
Member

In fact, apache/arrow@92682f0 seems to be the only commit in 16.0.0 touching flight-core.

@jduo
Member

jduo commented Apr 11, 2024

When using the unshaded flight-core artifact, I added the gRPC dependencies explicitly:

        <dependency>
            <groupId>io.grpc</groupId>
            <artifactId>grpc-api</artifactId>
            <version>1.63.0</version>
        </dependency>
        <dependency>
            <groupId>io.grpc</groupId>
            <artifactId>grpc-netty</artifactId>
            <version>1.63.0</version>
        </dependency>
        <dependency>
            <groupId>io.grpc</groupId>
            <artifactId>grpc-stub</artifactId>
            <version>1.63.0</version>
        </dependency>
        <dependency>
            <groupId>io.grpc</groupId>
            <artifactId>grpc-protobuf</artifactId>
            <version>1.63.0</version>
        </dependency>

and this leads to the following error:

Exception java.lang.IllegalAccessError: class org.apache.arrow.flight.impl.Flight$FlightDescriptor tried to access method 'com.google.protobuf.LazyStringArrayList com.google.protobuf.LazyStringArrayList.emptyList()' (org.apache.arrow.flight.impl.Flight$FlightDescriptor and com.google.protobuf.LazyStringArrayList are in unnamed module of loader 'app')
      at Flight$FlightDescriptor.<init> (Flight.java:7034)
      at Flight$FlightDescriptor.<clinit> (Flight.java:7738)
      at FlightServiceGrpc.getGetFlightInfoMethod (FlightServiceGrpc.java:106)
      at FlightServiceGrpc.getServiceDescriptor (FlightServiceGrpc.java:1180)
      at FlightServiceGrpc.bindService (FlightServiceGrpc.java:1059)
      at FlightServiceGrpc$FlightServiceImplBase.bindService (FlightServiceGrpc.java:558)
      at FlightBindingService.bindService (FlightBindingService.java:96)
      at ServerInterceptors.intercept (ServerInterceptors.java:93)
      at FlightServer$Builder.build (FlightServer.java:306)
      at (#39:4)

I'm looking into this. However, it's worth noting that using dependency:build-classpath doesn't separate out JARs that should go on the module-path (i.e. JPMS modules) from JARs that should go on the classpath.

jshell requires you to explicitly put JARs that are to be used as JPMS modules on the module-path parameter. Everything put on the classpath goes into the unnamed module, which is why in the error above, FlightDescriptor is being shown as part of the unnamed module instead of the org.apache.arrow.flight.core module.

This differs from running tests from maven, where maven inspects JAR metadata to figure out if they support JPMS or not.
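To make the module-path versus classpath distinction concrete, here is a rough Python sketch of one way to sort JARs by inspecting their contents. It is a heuristic written for illustration and is not claimed to match what Maven does internally: a JAR is treated as modular if it contains a `module-info.class` (at the root or under a multi-release `META-INF/versions/` directory) or declares `Automatic-Module-Name` in its manifest. The function names are hypothetical.

```python
import zipfile
from typing import Iterable, List, Tuple


def is_modular_jar(jar_path: str) -> bool:
    """Heuristic: True if the JAR has a module descriptor or an Automatic-Module-Name."""
    with zipfile.ZipFile(jar_path) as jar:
        names = jar.namelist()
        # Explicit module: descriptor at the root or in a multi-release directory.
        for name in names:
            if name == "module-info.class":
                return True
            if name.startswith("META-INF/versions/") and name.endswith("/module-info.class"):
                return True
        # Automatic module whose name is declared in the manifest.
        if "META-INF/MANIFEST.MF" in names:
            manifest = jar.read("META-INF/MANIFEST.MF").decode("utf-8", errors="replace")
            return any(
                line.lower().startswith("automatic-module-name:")
                for line in manifest.splitlines()
            )
    return False


def split_paths(jars: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (module_path_jars, class_path_jars), preserving input order."""
    module_path: List[str] = []
    class_path: List[str] = []
    for jar in jars:
        (module_path if is_modular_jar(jar) else class_path).append(jar)
    return module_path, class_path
```

Whether JARs that only declare `Automatic-Module-Name` belong on the module-path is a judgment call; this sketch puts them there, which is an assumption rather than a recommendation from the thread.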

@lidavidm
Member

Hmm so are we just invoking JShell wrongly then?

@jduo
Member

jduo commented Apr 11, 2024

I was able to get past the above error by adding an explicit dependency on protobuf to the demo pom too:

        <dependency>
            <groupId>com.google.protobuf</groupId>
            <artifactId>protobuf-java</artifactId>
            <version>3.23.1</version>
        </dependency>

It seems that unnamed modules cannot access other dependencies in the unnamed module that have been brought in transitively, but can when they have been brought in explicitly:
https://stackoverflow.com/questions/70439672/illegalaccesserror-classa-and-class-b-are-in-unnamed-module-of-loader-app

> Hmm so are we just invoking JShell wrongly then?

The right thing to do is likely to correctly build module-path and classpath separately when using Arrow 16+. I don't see a friendly way to do this compared to using the build-classpath target though. Maybe jshell isn't the way to go either and we should run via maven.
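For illustration, a jshell command that passes the two paths separately might be assembled as below. This is an untested sketch: it assumes the JARs have already been sorted into modular and non-modular lists (how to do that is left open here), it uses jshell's `--module-path` and `--add-modules` options with `ALL-MODULE-PATH` so that every module on the path is resolved, and it has not been verified to resolve the Arrow 16 failure. The function name is hypothetical, and the `-R--add-opens`/`-R--add-reads` flags mentioned earlier in the thread may still be needed on top of this.

```python
import os
from typing import List


def build_modular_jshell_cmd(module_jars: List[str], classpath_jars: List[str]) -> List[str]:
    """Build a jshell command with separate module-path and class-path entries."""
    cmd = ["jshell"]
    if module_jars:
        cmd += [
            "--module-path", os.pathsep.join(module_jars),
            # Resolve every module found on the module path.
            "--add-modules", "ALL-MODULE-PATH",
        ]
    if classpath_jars:
        cmd += ["--class-path", os.pathsep.join(classpath_jars)]
    # Same stdin-driven, silent-feedback mode the cookbook runner already uses.
    cmd += ["-s", "/dev/stdin"]
    return cmd
```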

We probably should step back and think about what we want to support. For example do we intend to allow for users to put Arrow in the unnamed module? If so, how do we make this friendlier -- alternate artifacts that skip adding module-info.java perhaps.

@lidavidm
Member

I'd rather not try to support every possible permutation of ways to do things - unless there's a real need from other users then I'd vote that we try to fix what's happening here instead of adding yet another artifact upstream.

@lidavidm
Member

I think we chose JShell here initially because it had less ceremony required (in particular not requiring nesting everything in a class/main method). If we have to move back to using "regular" classes then so be it.

@amoeba
Member Author

amoeba commented Apr 11, 2024

Since it sounds like there's not an easy way to tweak the current setup (using maven to pass the right info to jshell), it seems like refactoring the Java cookbook to use something like mvn exec:java might be the long-term fix here. Though I think we could keep the examples as-is and avoid wrapping them in a class if we just use some string manipulation.

Does that seem like a good plan? I'm happy to take that on.

@jduo
Member

jduo commented Apr 11, 2024

Yeah that sounds like a good solution. It would work with both pre and post-JPMS builds.
I haven't checked if mvn exec:java would automatically put modules on module-path like surefire does though.

@amoeba
Member Author

amoeba commented May 11, 2024

Today I started in on a rewrite of javadoctest.py that uses exec:java. I have the basic shell of it working so I think the approach will work. What I'm doing so far is basically:

  • Create a tempdir, copy pom.xml and a template .java file in
  • Insert the code, run it with a customized exec:java call and capture output

I should have a PR up early next week.
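A rough sketch of that workflow, for readers following along. It is not the code from the eventual PR: the `// CODE` placeholder, the `Example` class name, the function name, and the standard `src/main/java` layout are all assumptions made for illustration. The `exec.mainClass` property is the exec-maven-plugin's way of naming the class to run.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List, Tuple

PLACEHOLDER = "// CODE"  # hypothetical marker inside the template .java file


def prepare_project(
    pom: Path, template: Path, code: str, main_class: str = "Example"
) -> Tuple[Path, List[str]]:
    """Copy pom.xml and a filled-in template into a fresh tempdir.

    Returns the tempdir and the Maven command to run inside it.
    """
    workdir = Path(tempfile.mkdtemp(prefix="cookbook-"))
    shutil.copy(pom, workdir / "pom.xml")

    # Assumes the standard Maven source layout.
    src_dir = workdir / "src" / "main" / "java"
    src_dir.mkdir(parents=True)
    source = template.read_text().replace(PLACEHOLDER, code)
    (src_dir / f"{main_class}.java").write_text(source)

    cmd = ["mvn", "-q", "compile", "exec:java", f"-Dexec.mainClass={main_class}"]
    return workdir, cmd
```

The returned command could then be run with something like `subprocess.run(cmd, cwd=workdir, capture_output=True, text=True)` to capture the example's output.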

@amoeba
Member Author

amoeba commented May 21, 2024

I put a PR up at #350 that refactors the Java cookbooks to use Maven instead of JShell.

amoeba added a commit that referenced this issue May 23, 2024
In #347 we found the way
we have been running cookbooks for Java (JShell) doesn't work well with
JPMS which was introduced in Arrow 16. This refactors `javadoctest.py`
to run examples directly with Maven using `exec:java` instead of with
JShell. This PR also bumps the Java source/target version to 11 to fix
some compiler errors and fixes a few compilation errors in cookbook
code.

I ran into one snag that will require a follow-up commit to this PR: The way
the examples in
[substrait.rst](https://raw.githubusercontent.com/apache/arrow-cookbook/main/java/source/substrait.rst)
are written doesn't work with my approach. My approach splits each code
snippet into its `import` statements and non-`import` statements, puts
the imports outside the main class definition and puts the non-imports
inside the class's main method. This works fine for every example except
[substrait.rst](https://raw.githubusercontent.com/apache/arrow-cookbook/main/java/source/substrait.rst)
which needs some of its code to be defined in the main class, e.g.,

We probably generally want to support examples that need this so I think
we may need to rewrite all the Java cookbooks to have an explicit main
class. @lidavidm suspected this might be the case in
#347 (comment)
but I do wonder if there is still a way to avoid this. Any ideas
welcome.
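
The import-splitting approach described above can be sketched like this. It is an illustrative reconstruction under stated assumptions, not the code in `javadoctest.py`: the function names and the `Example` class name are hypothetical, and the import detection is a simple line-prefix check.

```python
from typing import List, Tuple


def split_snippet(snippet: str) -> Tuple[List[str], List[str]]:
    """Separate a snippet's lines into import statements and everything else."""
    imports: List[str] = []
    body: List[str] = []
    for line in snippet.splitlines():
        (imports if line.strip().startswith("import ") else body).append(line)
    return imports, body


def wrap_snippet(snippet: str, class_name: str = "Example") -> str:
    """Place imports above a generated class and the remaining code inside main."""
    imports, body = split_snippet(snippet)
    lines = [line.strip() for line in imports]
    lines.append("")
    lines.append(f"public class {class_name} {{")
    lines.append("    public static void main(String[] args) throws Exception {")
    lines += ["        " + line for line in body]
    lines.append("    }")
    lines.append("}")
    return "\n".join(lines) + "\n"
```

The limitation is visible in the sketch: every non-import line lands inside `main`, so a snippet containing a declaration that must sit at class level (a static helper method would be one hypothetical case) produces Java that does not compile.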

Fixes #347
Related #348