Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HIVE-28417: Bump Log4j2 to 2.24.1 to facilitate compilation of GraalVM Native Image #5375

Merged
merged 1 commit into from
Oct 25, 2024

Conversation

linghengqian
Copy link
Member

@linghengqian linghengqian commented Jul 31, 2024

What changes were proposed in this pull request?

  • Bump Log4j2 to 2.24.1 to facilitate compilation of GraalVM Native Image.

Why are the changes needed?

  • This PR is aimed at ensuring that HiveServer2 JDBC Driver is compatible with future versions of LOG4J as much as possible. I created a unit test at https://github.com/linghengqian/log4j2-v2231-graalvm-native-image-test to prove that the HiveServer2 JDBC Driver using apache/logging-log4j2:2.23.1 can be used under the GraalVM Native Image compiled by GraalVM CE For JDK 22.0.2.
sdk install java 8.0.422-tem
sdk install java 22.0.2-graalce
sdk use java 8.0.422-tem
sdk install maven
git clone [email protected]:linghengqian/hive.git -b log4j-bump
cd ./hive/
mvn clean install -DskipTests
mvn clean package -pl packaging -DskipTests -Pdocker
cd ../

git clone [email protected]:linghengqian/log4j2-v2231-graalvm-native-image-test.git
cd ./log4j2-v2231-graalvm-native-image-test/
sdk use java 22.0.2-graalce
./mvnw -PgenerateMetadata -DskipNativeTests -e -T1C clean test native:metadata-copy
./mvnw -PnativeTestInHive -T1C -e clean test
Moved `log4j-api` and `log4j-core` artifacts with classifier `tests` to `log4j-api-test` and `log4j-core-test` respectively. 
I took care of the [release notes](https://logging.apache.org/log4j/2.x/release-notes/2.20.0.html).

The difference in dependencies is caused by Maven:

for the log4j-core artifact with tests classifier Maven pulled the wrong dependencies (compile scope instead of test scope),
for the log4j-core-test artifact the real dependencies are no longer hidden: the artifact mostly contains JUnit 4 and JUnit 5 extensions, so it depends on all those libraries.

Does this PR introduce any user-facing change?

No.

Is the change a dependency upgrade?

How was this patch tested?

sdk install java 8.0.422-tem
sdk use java 8.0.422-tem
sdk install maven
mvn clean install -DskipTests
mvn test -Dtest=TestHplsqlLocal -pl hplsql
mvn test -Dtest=TestHplsqlOffline -pl hplsql

Copy link
Member Author

@linghengqian linghengqian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unexpected exception java.lang.RuntimeException: Process docker failed to run in 30 seconds
 at org.apache.hadoop.hive.ql.externalDB.AbstractExternalDB.runCmd(AbstractExternalDB.java:91)
 at org.apache.hadoop.hive.ql.externalDB.AbstractExternalDB.runCmdAndPrintStreams(AbstractExternalDB.java:106)

@linghengqian linghengqian changed the title [WIP]HIVE-28417: Bump Log4j2 to 2.23.1 to facilitate compilation of GraalVM Native Image HIVE-28417: Bump Log4j2 to 2.23.1 to facilitate compilation of GraalVM Native Image Aug 5, 2024
@linghengqian linghengqian marked this pull request as ready for review August 5, 2024 04:56
@Aggarwal-Raghav
Copy link
Contributor

Had 1 question, added it as comment. Otherwise java and pom changes LGTM (+1 non-binding).

Copy link

github-actions bot commented Oct 6, 2024

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Feel free to reach out on the [email protected] list if the patch is in need of reviews.

@github-actions github-actions bot added the stale label Oct 6, 2024
Copy link
Member Author

@linghengqian linghengqian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to the bugs https://issues.apache.org/jira/browse/HIVE-28559 and https://issues.apache.org/jira/browse/HIVE-28552, I can't actually merge the master branch directly.

@zhangbutao zhangbutao removed the stale label Oct 12, 2024
pom.xml Outdated Show resolved Hide resolved
name = HiveLog4j2Test
packages = org.apache.hadoop.hive.ql.log
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this changed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2024-08-02T10:55:16.634Z main WARN The use of package scanning to locate plugins is deprecated and will be removed in a future release
2024-08-02T10:55:16.650Z main WARN The use of package scanning to locate plugins is deprecated and will be removed in a future release
2024-08-02T10:55:16.670Z main WARN The use of package scanning to locate plugins is deprecated and will be removed in a future release
2024-08-02T10:55:16.686Z main WARN The use of package scanning to locate plugins is deprecated and will be removed in a future release

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://web.archive.org/web/20240823014436/https://logging.apache.org/log4j/2.x/manual/configuration.html#configuration-syntax

Use of the packages attribute is deprecated and will be removed in Log4j 3.0. Plugins should be processed with the Log4j annotation processor. A comma separated list of package names to search for plugins. Plugins are only loaded once per classloader so changing this value may not have any effect upon reconfiguration.

How we replace packages with the new way? I think removing package will cause some unkown issue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How we replace packages with the new way? I think removing package will cause some unkown issue.

  • I would say that even without merging this PR, packages never worked in Hive. Unless Hive has always used log4j1 and never used log4j2. Because the class org.apache.logging.log4j.core.config.plugins.processor.PluginProcessor exists in the JAR of the final binary of Hive, org.apache.logging.log4j.core.config.plugins.processor.PluginProcessor generates Log4j2Plugins.dat, which directly covers the operation of packages. It is located at /hive/ql/target/hive-exec-4.1.0-SNAPSHOT.jar!/META-INF/org/apache/logging/log4j/core/config/plugins/Log4j2Plugins.dat. packages is actually a log4j1 thing.
  • image
  • You can determine from a comment by a log4j2 committer that this property does not actually take effect. See Fix handling of onMatch and onMismatch attributes in the properties configuration format logging-log4j2#2791 (comment) .

There is no problem in Apache Hive: I checked that the appropriate annotation processor is configured in the hive-exec project and there is a Log4j2Plugins.dat file in the JAR. You can safely remove the packages configuration attribute.

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <version>${maven-compiler-plugin.version}</version>
  <executions>
    <execution>
      <id>generate-log4j-plugin-descriptor</id>
      <goals>
        <goal>compile</goal>
      </goals>
      <phase>process-classes</phase>
      <configuration>
        <proc>only</proc>
        <annotationProcessorPaths>
          <path>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>2.24.1</version>
          </path>
        </annotationProcessorPaths>
        <annotationProcessors>
          <processor>org.apache.logging.log4j.core.config.plugins.processor.PluginProcessor</processor>
        </annotationProcessors>
      </configuration>
    </execution>
  </executions>
</plugin>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good explanation!

Since Hive has neither CI for JDK23 and JDK24 nor CI for GraalVM Native Image, we actually don't need to register a org.apache.logging.log4j.core.config.plugins.processor.PluginProcessor in pom.xml as https://logging.apache.org/log4j/2.x/manual/plugins.html#plugin-registry .

BTW, if Hive wants to use JDK23 or JDK24, should we need to register the PluginProcessor ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, if Hive wants to use JDK23 or JDK24, should we need to register the PluginProcessor ?

  • The answer is yes. But I think most people will actually choose to set -proc:full for JDK 23 to make JDK 23 behave the same as JDK8. -proc:full is only available on 11.0.23, 17.0.11, JDK 21 - 24. -proc:full is not available on any OpenJDK 8u, -proc:full is only available on Oracle's closed source JDK 8u411.
  • For JDK23, we can set <maven.compiler.proc>full</maven.compiler.proc> in a separate Maven Profile to make JDK23 behave the same as JDK 8. Refer to Support building Example module with OpenJDK 23 shardingsphere#33224 for the solution.
  • I would say setting <maven.compiler.proc>full</maven.compiler.proc> is not useful for Hive, as I found that the master branch of Hive cannot be compiled with JDK22. This means that none of the unit tests will be affected.

@linghengqian linghengqian changed the title HIVE-28417: Bump Log4j2 to 2.23.1 to facilitate compilation of GraalVM Native Image HIVE-28417: Bump Log4j2 to 2.24.1 to facilitate compilation of GraalVM Native Image Oct 12, 2024
Copy link

sonarcloud bot commented Oct 12, 2024

pom.xml Show resolved Hide resolved
Copy link
Contributor

@zhangbutao zhangbutao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 LGTM

Thanks for the patch!

@zhangbutao zhangbutao merged commit 548990d into apache:master Oct 25, 2024
6 checks passed
@linghengqian linghengqian deleted the log4j-bump branch October 25, 2024 07:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants