Trino CI CD

Jan Waś edited this page Aug 22, 2022 · 8 revisions

Pipeline performance

Job duration

Building and testing

Attempts at reducing build time:

Product tests

The heaviest jobs are product tests because they:

  • set up complex environments: a Trino cluster, made of a coordinator and some workers, and other dependencies, like Hive
  • generate test data
  • execute many queries, complex queries, or both

TODO: to confirm the above, profile the tests to get average times for the different steps.
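One way to collect such timings is sketched below. It is only an illustration: the `timed` helper and the step names are hypothetical and not part of the product test launcher, and the real measurements would have to be taken wherever environments are started, data is generated and queries are executed.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from statistics import mean

# Wall-clock samples per step; step names are chosen by the caller.
durations = defaultdict(list)


@contextmanager
def timed(step):
    """Record how long the wrapped block took under the given step name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        durations[step].append(time.perf_counter() - start)


def average_durations(samples):
    """Return the mean duration in seconds for every step."""
    return {step: mean(values) for step, values in samples.items()}
```

Wrapping each phase in `with timed("environment setup"):` and calling `average_durations(durations)` at the end of a run would give the per-step averages needed to confirm or reject the assumptions above.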

Possible solutions:

  • go through every test and see if it can be converted to an integration test, being careful not to reduce test coverage; historically, integration tests in Trino did not use containerized dependencies, so some tests were implemented as product tests, where dependencies ran in VMs
  • for PRs that only modify selected connectors, skip the product tests that don't use them, by checking which catalogs are configured in the PT environment; this requires:
  • Move product tests to modules they test
    • How to break dependency cycles? (GIB, the gitflow-incremental-builder Maven extension, doesn't look at dependency scope)
    • How to make sure that Maven knows it should rebuild everything before running product tests to avoid using stale artifacts?
    • How to handle product tests which touch multiple modules? Which module should they end up in, and how do we run them when one of the other modules changes?
    • Convert product tests to run as regular Maven tests so GIB can handle running only the affected product tests.
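The catalog-based skipping idea could look roughly like the sketch below. It rests on assumptions that are not stated in these notes: that connector modules live under `plugin/trino-<connector>/`, that the connector name matches the catalog configured in the environment, and that a table of catalogs per environment is available. The environment names in `ENVIRONMENT_CATALOGS` are made up for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical table: product test environment -> catalogs it configures.
ENVIRONMENT_CATALOGS = {
    "env-hive": {"hive"},
    "env-postgresql": {"postgresql"},
    "env-all": {"hive", "postgresql", "mysql"},
}


def impacted_connectors(changed_files):
    """Return the connectors touched by a change, or None if it cannot be narrowed."""
    connectors = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        # Assumed layout: plugin/trino-<connector>/...
        if len(parts) >= 2 and parts[0] == "plugin" and parts[1].startswith("trino-"):
            connectors.add(parts[1][len("trino-"):])
        else:
            # Core or shared code changed: every environment may be affected.
            return None
    return connectors


def environments_to_run(changed_files, environment_catalogs):
    """Pick environments whose catalogs overlap the impacted connectors."""
    connectors = impacted_connectors(changed_files)
    if not connectors:
        # Unknown or empty change set: stay safe and run everything.
        return sorted(environment_catalogs)
    return sorted(
        env for env, catalogs in environment_catalogs.items() if catalogs & connectors
    )
```

The fallback to running everything is deliberate: skipping is only safe when every changed file maps to a known connector.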

TODO refine these notes:

  • split up suites to have more jobs, at least for different connectors (by group?) - everyone needs to clearly see if a test was run or not
  • if we make it easier to run selected PTs locally, developers would do this instead of relying completely on the pipeline
  • reports: if we skip any tests in a PR, we should make sure they run on master after merge - compare values from Surefire reports
  • PT launcher should be decoupled from envs and configs - this should make it easier to navigate
  • do we want to allow excluding tests from suites when running manually?
  • we can pass options to PTs to run tests against different versions
  • PTs are supposed to allow testing a lot of different configurations, like different auth
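The report comparison mentioned in the list above could be sketched as follows. It assumes the usual Surefire XML layout, where each `testcase` element carries `classname` and `name` attributes and a skipped test has a `skipped` child element; the function names are hypothetical, and tests that are missing from the PR report entirely are treated as not run.

```python
import xml.etree.ElementTree as ET


def read_outcomes(report_xml):
    """Map 'classname.name' to True if the test ran, False if it was skipped."""
    root = ET.fromstring(report_xml)
    outcomes = {}
    for case in root.iter("testcase"):
        test_id = f"{case.get('classname')}.{case.get('name')}"
        outcomes[test_id] = case.find("skipped") is None
    return outcomes


def not_run_anywhere(pr_outcomes, master_outcomes):
    """List tests that did not run in the PR and did not run on master either."""
    known = set(pr_outcomes) | set(master_outcomes)
    return sorted(
        test
        for test in known
        if not pr_outcomes.get(test, False) and not master_outcomes.get(test, False)
    )
```

An empty result from `not_run_anywhere` means every test skipped in the PR was picked up by the master run after merge.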

Queue

Trino's CI/CD pipelines are implemented as GitHub workflows that use GitHub's public runners. Because there's a limit on concurrent jobs, runs are often queued for a long time before they execute.

The solution is to make the jobs shorter and (in PRs only) skip ones not relevant to the change.

  • Make CI only build/test modules with any changes
  • product tests cannot use Maven to identify dependencies, so they need to get a list of impacted connectors to skip irrelevant tests/groups/suites
  • an attempt at automatically identifying product test dependencies by collecting code coverage: "Enable Jacoco coverage for product tests" - this generates a lot of overhead (20% more wall time), requires the coverage data to be cached somewhere, and requires checking for changes in any product tests
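How cached coverage data might drive test selection is sketched below. This is not what the coverage attempt implements; it only illustrates the two requirements named above. The data shapes are assumptions: `coverage` maps a test name to the source files it exercised, and `test_sources` maps product test file paths to their contents.

```python
import hashlib


def cache_key(test_sources):
    """Fingerprint the product test sources; any change invalidates the cache."""
    digest = hashlib.sha256()
    for path in sorted(test_sources):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        digest.update(test_sources[path])
        digest.update(b"\0")
    return digest.hexdigest()


def select_tests(all_tests, coverage, changed_files):
    """Keep tests that cover a changed file or have no coverage data at all."""
    changed = set(changed_files)
    selected = []
    for test in all_tests:
        covered = coverage.get(test)
        # Missing coverage data means the test cannot be skipped safely.
        if covered is None or changed & set(covered):
            selected.append(test)
    return selected
```

Coverage stored under `cache_key(...)` would be reused only while the product tests themselves are unchanged; otherwise it has to be collected again, which is where the extra wall time comes in.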

WIP PRs

  • how to avoid running full pipelines when the author of the PR knows it's not ready - use labels?
  • start pipelines by a bot instead of workflow triggers?
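If labels were used for this, the decision could be as small as the sketch below. The label name `wip` is hypothetical; the only thing assumed about the input is that a pull request event payload lists labels under `pull_request.labels`, each with a `name` field.

```python
SKIP_LABELS = {"wip"}  # hypothetical label name, not an existing convention


def should_run_full_pipeline(event, skip_labels=SKIP_LABELS):
    """Decide from a workflow event payload whether to run the full pipeline."""
    pull_request = event.get("pull_request")
    if pull_request is None:
        # Not a PR event (for example a push to master): always run.
        return True
    labels = {label["name"] for label in pull_request.get("labels", [])}
    return not (labels & set(skip_labels))
```

A bot or an early workflow job could evaluate this and cancel or skip the remaining jobs when it returns `False`.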