Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Download more dependencies to decrease the chance of network errors #15797

Merged
merged 6 commits into from
Feb 9, 2023

Conversation

MiguelWeezardo
Copy link
Member

@MiguelWeezardo MiguelWeezardo commented Jan 20, 2023

Description

A call to dependency:go-offline was recently added to download all dependencies into the local Maven cache.
Unfortunately it turns out the goal has some problems with multi-module projects, and Maven will still attempt to download any missing artifacts during build.

The purpose of this PR is to improve this situation by caching MORE dependencies, as every cached artifact decreases the chance of a Maven Central download failure breaking Maven builds.

Downloading ALL dependencies is NOT a requirement, although it would be ideal.

Other jobs will now depend the maven-checks job as they now need to unpack the cache artifact left by it.

Additional context and related issues

CI workers sometimes fail to download dependencies in the middle of a build.
A workaround was proposed to download as many dependencies as possible at the start of CI pipeline, cache them for other jobs, to limit the use of Maven Central afterwards.

Release notes

(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Jan 20, 2023
@MiguelWeezardo MiguelWeezardo changed the title Add --offline flag to Maven calls after mvn dependency:go-offline Add --offline flag to Maven calls after dependency:go-offline Jan 20, 2023
@MiguelWeezardo MiguelWeezardo changed the title Add --offline flag to Maven calls after dependency:go-offline Add --offline flag to Maven calls after dependency:go-offline Jan 20, 2023
@nineinchnick
Copy link
Member

Does this also solve fetching Maven plugins? Or is that not an issue?

@MiguelWeezardo MiguelWeezardo removed the request for review from findepi January 20, 2023 12:32
@MiguelWeezardo
Copy link
Member Author

Does this also solve fetching Maven plugins? Or is that not an issue?

A hypothesis I'm currently checking is that Maven tries to download POMs while building the module hierarchy.
I see that my .m2 repo contains the packaging-222.pom which was missing in the failed run, so I'm hoping that the --offline mode will cause Maven to use the cached version.

.github/workflows/ci.yml Show resolved Hide resolved
.github/workflows/ci.yml Outdated Show resolved Hide resolved
.github/workflows/ci.yml Outdated Show resolved Hide resolved
@MiguelWeezardo
Copy link
Member Author

It seems that dependency:go-offline doesn't download everything that's required, as subsequent Maven call failed on not being able to download some plugin dependency in offline mode.

@MiguelWeezardo
Copy link
Member Author

I guess we could manually download one dependency, but I see that there may be more missing pieces.

@nineinchnick
Copy link
Member

We can make iterative improvements here.

@MiguelWeezardo
Copy link
Member Author

MiguelWeezardo commented Jan 20, 2023

We can make iterative improvements here.

We'd never catch up - the end of the road is having all dependencies (transitive and direct) from all build plugins downloaded, and unless we manage to do that there's always a window of opportunity for CI breaking. We need something like dependencies:go-offline that actually does what it promises.

@MiguelWeezardo
Copy link
Member Author

I'll make another attempt with common CI profiles enabled during dependency:go-offline

@MiguelWeezardo
Copy link
Member Author

MiguelWeezardo commented Jan 23, 2023

I'm also going to check out https://github.com/qaware/go-offline-maven-plugin

@MiguelWeezardo MiguelWeezardo force-pushed the maven_offline_flag branch 3 times, most recently from 0f54eb4 to e699b36 Compare January 23, 2023 18:36
@MiguelWeezardo
Copy link
Member Author

MiguelWeezardo commented Jan 24, 2023

It seems the new plugin worked (at least up until trino-spi module), but now there's:

Error:  Failed to execute goal org.revapi:revapi-maven-plugin:0.14.7:check (default) on project trino-spi: The following API problems caused the build to fail:
Error:  java.class.externalClassExposedInAPI: class io.airlift.slice.BasicSliceInput: A class from supplementary archives is used in a public capacity in the API. [io.airlift:slice:jar:0.44]
Error:  java.class.externalClassExposedInAPI: class io.airlift.slice.Slice: A class from supplementary archives is used in a public capacity in the API. [io.airlift:slice:jar:0.44]
Error:  java.class.externalClassExposedInAPI: class io.airlift.slice.SliceInput: A class from supplementary archives is used in a public capacity in the API. [io.airlift:slice:jar:0.44]
Error:  java.class.externalClassExposedInAPI: class io.airlift.slice.SliceOutput: A class from supplementary archives is used in a public capacity in the API. [io.airlift:slice:jar:0.44]

@MiguelWeezardo MiguelWeezardo marked this pull request as draft January 24, 2023 21:18
@MiguelWeezardo MiguelWeezardo force-pushed the maven_offline_flag branch 3 times, most recently from e9d020c to 52e8dd4 Compare January 25, 2023 12:19
@MiguelWeezardo MiguelWeezardo changed the title Add --offline flag to Maven calls after dependency:go-offline Download more dependencies to decrease the chance of network errors Jan 25, 2023
@MiguelWeezardo MiguelWeezardo marked this pull request as ready for review January 25, 2023 12:23
@MiguelWeezardo
Copy link
Member Author

Once this gets merged we should purge the GitHub Maven cache.

.github/workflows/ci.yml Outdated Show resolved Hide resolved
.github/workflows/ci.yml Outdated Show resolved Hide resolved
pom.xml Outdated Show resolved Hide resolved
core/trino-spi/pom.xml Show resolved Hide resolved
pom.xml Show resolved Hide resolved
pom.xml Show resolved Hide resolved
@MiguelWeezardo
Copy link
Member Author

Failed due to #14372

@MiguelWeezardo MiguelWeezardo force-pushed the maven_offline_flag branch 2 times, most recently from d043387 to 269a4f2 Compare February 8, 2023 09:20
Maven's built-in `dependency:go-offline` goal doesn't download build
plugin dependencies, we need an additional tool to download them.

The `maven-checks` job is responsible for:
- getting as many dependencies as possible,
- putting them in GitHub cache for other jobs and pipelines.

Experiments have shown the new plugin also has some issues.
For example it only downloads a single version of a dependency,
which breaks on any module which overrides global versions
of some common dependencies - so far `trino-pinot` is the only one.
These overridden versions will be missing.

This is one of the reasons why we can't use Maven's `--offline` flag.

However, a majority of dependencies can still be downloaded at the start
of the pipeline, providing quick feedback in case of network problems.

Anything missing will have to be downloaded in other jobs.

This won't completely eliminate the faulty networking problems, but
should decrease the chances of dependency downloads failing in late
pipeline stages.

Add dynamic dependencies found by CI
Use cache to decrease the number of downloads from Maven Central
which can potentially result in failures.
Revapi tries to compare this and previous version of any API.
If it can't download the old version, if falls back to
comparing with an empty file.

This can lead to error reports which are not shown in daily development.
In this approach each job is responsible for downloading all dependencies.
Retries should only be attempted during dependency downloads.
This explains how to handle dynamic dependencies
that aren't detected by automated tools.
Copy link
Member

@nineinchnick nineinchnick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@findepi LGTM

@MiguelWeezardo
Copy link
Member Author

Unrelated test failure:

tests               | 2023-02-08 22:30:08 WARNING: Unable to execute: ALTER TABLE schema_evolution_rhiw4er6vu REPLACE COLUMNS (float_column float, varchar_column varchar(20)), due: java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Replacing columns cannot drop columns for table default.schema_evolution_rhiw4er6vu. SerDe may be incompatible
tests               | 2023-02-08 22:30:08 WARNING: Unable to execute: ALTER TABLE schema_evolution_rhiw4er6vu REPLACE COLUMNS (int_column int, varchar_column varchar(20)), due: java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Replacing columns cannot drop columns for table default.schema_evolution_rhiw4er6vu. SerDe may be incompatible
tests               | 2023-02-08 22:30:08 WARNING: Unable to execute: ALTER TABLE schema_evolution_rhiw4er6vu REPLACE COLUMNS (int_column int, float_column float), due: java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Replacing columns cannot drop columns for table default.schema_evolution_rhiw4er6vu. SerDe may be incompatible
tests               | 2023-02-08 22:30:08 WARNING: Unable to execute: ALTER TABLE schema_evolution_rhiw4er6vu REPLACE COLUMNS (tiny_column tinyint, int_column int, float_column float, varchar_column varchar(20)), due: java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. The following columns have types incompatible with the existing columns in their respective positions :
tests               | tiny_column,int_column,float_column
tests               | 2023-02-08 22:30:08 WARNING: Unable to execute: ALTER TABLE schema_evolution_rhiw4er6vu REPLACE COLUMNS (int_column int, tiny_column tinyint, float_column float, varchar_column varchar(20)), due: java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. The following columns have types incompatible with the existing columns in their respective positions :
tests               | tiny_column,float_column
presto-master       | 2023-02-08T22:30:08.957+0545	INFO	dispatcher-query-25	io.trino.event.QueryMonitor	TIMELINE: Query 20230208_164508_02152_j74bh :: FINISHED :: elapsed 87ms :: planning 47ms :: waiting 10ms :: scheduling 20ms :: running 17ms :: finishing 3ms :: begin 2023-02-08T22:30:08.869+05:45 :: end 2023-02-08T22:30:08.956+05:45
tests               | 2023-02-08 22:30:11 INFO: not retrying; @Flaky annotation not present
tests               | 2023-02-08 22:30:11 INFO: FAILURE     /    io.trino.tests.product.hive.TestHivePartitionSchemaEvolution.testParquet (Groups: ) took 4.0 seconds
tests               | 2023-02-08 22:30:11 SEVERE: Failure cause:
tests               | io.trino.tempto.query.QueryExecutionException: java.sql.SQLException
tests               | 	at io.trino.tempto.query.JdbcQueryExecutor.execute(JdbcQueryExecutor.java:119)
tests               | 	at io.trino.tempto.query.JdbcQueryExecutor.executeQuery(JdbcQueryExecutor.java:84)
tests               | 	at io.trino.tests.product.hive.TestHivePartitionSchemaEvolution.tryExecuteOnHive(TestHivePartitionSchemaEvolution.java:119)
tests               | 	at io.trino.tests.product.hive.TestHivePartitionSchemaEvolution.testEvolution(TestHivePartitionSchemaEvolution.java:110)
tests               | 	at io.trino.tests.product.hive.TestHivePartitionSchemaEvolution.test(TestHivePartitionSchemaEvolution.java:92)
tests               | 	at io.trino.tests.product.hive.TestHivePartitionSchemaEvolution.testParquet(TestHivePartitionSchemaEvolution.java:58)
tests               | 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
tests               | 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
tests               | 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
tests               | 	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
tests               | 	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:104)
tests               | 	at org.testng.internal.Invoker.invokeMethod(Invoker.java:645)
tests               | 	at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:851)
tests               | 	at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1177)
tests               | 	at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:129)
tests               | 	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:112)
tests               | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
tests               | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
tests               | 	at java.base/java.lang.Thread.run(Thread.java:833)
tests               | Caused by: java.sql.SQLException
tests               | 	at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:275)
tests               | 	at io.trino.tempto.query.JdbcQueryExecutor.executeQueryNoParams(JdbcQueryExecutor.java:128)
tests               | 	at io.trino.tempto.query.JdbcQueryExecutor.execute(JdbcQueryExecutor.java:112)
tests               | 	... 18 more
tests               | 	Suppressed: java.lang.Exception: Query: ALTER TABLE schema_evolution_rhiw4er6vu REPLACE COLUMNS (int_column int, tiny_column tinyint, varchar_column varchar(20))
tests               | 		at io.trino.tempto.query.JdbcQueryExecutor.executeQueryNoParams(JdbcQueryExecutor.java:136)
tests               | 		... 19 more

@findepi findepi merged commit aa3f43e into trinodb:master Feb 9, 2023
@findepi
Copy link
Member

findepi commented Feb 9, 2023

thanks!

@github-actions github-actions bot added this to the 407 milestone Feb 9, 2023
@MiguelWeezardo MiguelWeezardo deleted the maven_offline_flag branch February 9, 2023 13:47
@MiguelWeezardo MiguelWeezardo added the no-release-notes This pull request does not require release notes entry label Mar 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed dx Issues or pull requests related to developer experience no-release-notes This pull request does not require release notes entry tests:all Run all tests
Development

Successfully merging this pull request may close these issues.

3 participants