Trino CI CD

Pipeline performance

Job duration

Building and testing

Attempts at reducing build time:

merged: Only enable check plugins if air.check.skip-* is false - compilation (with checks) and test execution are done in separate steps, so test execution can skip the checks
merged: Run git commit id Maven plugin only once - this plugin was being executed for every module
merged: Enable Maven cache in the CI workflow - TODO: run the same query as in the description to see what impact it actually had (compare before and after 10th Dec)
closed: Speed up builds of freshly checked out branches - this PR tested the impact of adding more Maven flags, like -Dmaven.source.skip=true -Dmaven.site.skip=true -Dmaven.javadoc.skip=true, and the impact was negligible; it also tested disabling test compilation, which turned out not to be possible with a clean local repository

Product tests

The heaviest jobs are product tests because they:

set up complex environments: a Trino cluster, made of a coordinator and some workers, and other dependencies, like Hive
generate test data
execute many and/or complex queries

TODO: to confirm the above, profile the tests to get average times for the different steps.

Possible solutions:

go through every test and see if it can be converted to an integration test - be careful not to reduce test coverage; historically, integration tests in Trino did not use containerized dependencies, so some tests were implemented as product tests, with dependencies running in VMs
for PRs that only modify selected connectors, skip product tests that do not use them, by checking which catalogs are configured in the PT environment; this requires:
- mapping Maven modules to connector names: https://github.com/trinodb/trino/compare/master...nineinchnick:plugin-features
- identifying connectors used in PT environments: https://github.com/trinodb/trino/pull/10761 and/or https://github.com/trinodb/trino/pull/10730
- skipping PTs by passing the list of impacted connectors to the Product Test Launcher (PTL)
move product tests to the modules they test
- How to break dependency cycles? (GIB, the gitflow-incremental-builder Maven extension, doesn't look at dependency scope)
- How to make sure that Maven knows it should rebuild everything before running product tests to avoid using stale artifacts?
- How to handle product tests which touch multiple modules? Which module should they end up in, and how do we run them when one of the other modules changes?
- Convert product tests to run as regular Maven tests so GIB can handle running only the affected product tests.
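The connector-based skipping described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the module-to-connector mapping, and the environment names are all hypothetical and are not taken from the Product Test Launcher. The sketch also assumes that a change to any module that is not a known connector plugin should run every environment.

```python
from typing import Dict, Set


def environments_to_run(
    changed_modules: Set[str],
    module_to_connector: Dict[str, str],
    environment_catalogs: Dict[str, Set[str]],
) -> Set[str]:
    """Return the PT environments that use at least one impacted connector.

    All three inputs are hypothetical data structures used for illustration.
    """
    # A changed module that is not a known connector plugin (for example a
    # core module) may affect everything, so fall back to all environments.
    if any(module not in module_to_connector for module in changed_modules):
        return set(environment_catalogs)

    impacted = {module_to_connector[module] for module in changed_modules}
    # Keep only environments whose configured catalogs overlap the impacted set.
    return {
        env
        for env, connectors in environment_catalogs.items()
        if connectors & impacted
    }


if __name__ == "__main__":
    # Hypothetical mapping and environments, for illustration only.
    mapping = {"plugin/trino-hive": "hive", "plugin/trino-mysql": "mysql"}
    envs = {"env-a": {"hive"}, "env-b": {"mysql"}, "env-c": {"hive", "mysql"}}
    print(sorted(environments_to_run({"plugin/trino-mysql"}, mapping, envs)))
```

Falling back to all environments for unknown modules is the conservative choice: it keeps test coverage intact whenever the mapping is incomplete.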

TODO refine these notes:

split up suites to have more jobs, at least for different connectors (by group?) - everyone needs to clearly see if a test was run or not
if we make it easier to run selected PTs locally, devs would start doing this and not relying completely on the pipeline
reports: if we skip any tests in a PR we should make sure they run on master after merge - compare values from surefire reports
PT launcher should be decoupled from envs and configs - this should make it easier to navigate there
do we want to allow excluding tests from suites when running manually?
we can pass options to PTs to run tests against different versions
PTs are supposed to allow testing a lot of different configurations, like different auth
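For the note about comparing surefire reports, one possible starting point is to list the tests that ran on master but not in the PR. This is a minimal sketch and assumes the usual surefire XML layout, with `testcase` elements carrying `classname` and `name` attributes and a nested `skipped` element for skipped tests. The function names are made up for this example.

```python
import xml.etree.ElementTree as ET
from typing import Set


def executed_tests(report_xml: str) -> Set[str]:
    """Return 'classname.name' for every test case that was not skipped."""
    root = ET.fromstring(report_xml)
    executed = set()
    # iter() walks the whole tree, so a <testsuites> wrapper is handled too.
    for case in root.iter("testcase"):
        if case.find("skipped") is None:
            executed.add(f"{case.get('classname')}.{case.get('name')}")
    return executed


def ran_only_on_master(master_xml: str, pr_xml: str) -> Set[str]:
    """Tests executed on master that did not execute in the PR run."""
    return executed_tests(master_xml) - executed_tests(pr_xml)


if __name__ == "__main__":
    # Tiny hand-written reports, for illustration only.
    pr = (
        '<testsuite name="s" tests="2">'
        '<testcase classname="a.T" name="t1" time="0.1"/>'
        '<testcase classname="a.T" name="t2" time="0"><skipped/></testcase>'
        "</testsuite>"
    )
    master = (
        '<testsuite name="s" tests="2">'
        '<testcase classname="a.T" name="t1" time="0.1"/>'
        '<testcase classname="a.T" name="t2" time="1.2"/>'
        "</testsuite>"
    )
    print(sorted(ran_only_on_master(master, pr)))
```

Tests that are missing from the PR report altogether (for example because a whole suite was skipped) are treated the same as tests marked as skipped.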

Queue

Trino's CI/CD pipelines are implemented as GitHub workflows that use GitHub's public runners. Because there's a limit on the number of concurrent jobs, runs are often queued for a long time before they execute.

The solution is to make the jobs shorter and (in PRs only) skip ones not relevant to the change.

Make CI only build/test modules with any changes
product tests cannot use Maven to identify dependencies, so they need to get a list of impacted connectors to skip irrelevant tests/groups/suites
an attempt at automatically identifying product test dependencies by collecting code coverage: Enable Jacoco coverage for product tests - this generates a lot of overhead (20% more wall time), and the coverage data has to be cached somewhere and refreshed whenever any product test changes
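To illustrate the first step of building/testing only changed modules, here is a rough sketch that maps changed file paths to the Maven module directories that contain them. It is not how GIB works internally; in particular it does not follow dependencies to downstream modules. The function names and the `"."` marker for files outside any module are assumptions made for this example.

```python
from pathlib import PurePosixPath
from typing import Iterable, Optional, Set


def owning_module(path: str, module_dirs: Set[str]) -> Optional[str]:
    """Return the deepest module directory that contains the given file."""
    # parents is ordered from the closest directory up to ".", so the first
    # match is the most specific module.
    for parent in PurePosixPath(path).parents:
        if str(parent) in module_dirs:
            return str(parent)
    return None


def changed_modules(paths: Iterable[str], module_dirs: Set[str]) -> Set[str]:
    """Map changed files to modules; '.' marks files outside every module."""
    modules = set()
    for path in paths:
        module = owning_module(path, module_dirs)
        # A file outside every module (for example the root pom.xml) is
        # reported as "." so the caller can decide to build everything.
        modules.add(module if module is not None else ".")
    return modules


if __name__ == "__main__":
    # Hypothetical module list and change set, for illustration only.
    dirs = {"core/trino-main", "plugin/trino-hive"}
    print(sorted(changed_modules(["plugin/trino-hive/pom.xml", "README.md"], dirs)))
```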

WIP PRs

how to avoid running full pipelines when the author of the PR knows it's not ready - use labels?
start pipelines by a bot instead of workflow triggers?
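If labels were used, the check could look something like the sketch below. It assumes the input is a GitHub `pull_request` event payload already parsed into a dict, where labels appear under `pull_request.labels` with a `name` field. The label name `wip` and the function name are hypothetical.

```python
from typing import Any, Dict


def should_run_full_pipeline(event: Dict[str, Any], skip_label: str = "wip") -> bool:
    """Return False when the PR carries the (hypothetical) skip label."""
    pr = event.get("pull_request")
    if pr is None:
        # Not a pull request event (for example a push to master): always run.
        return True
    labels = {label["name"].lower() for label in pr.get("labels", [])}
    return skip_label.lower() not in labels


if __name__ == "__main__":
    # Hand-written payload fragment, for illustration only.
    event = {"pull_request": {"labels": [{"name": "WIP"}]}}
    print(should_run_full_pipeline(event))
```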
