Headless release #5554

tom-seqera · 2024-12-02T15:57:09Z

General design

To try to keep things understandable and maintainable, this change takes the following approach:

Avoid putting everything in a single giant bash script
Try to avoid embedded scripting in the Github Action definitions, since these are harder to test/debug/reuse
Separate the process into more distinct "assemble" and "deploy" phases, and only build the release artifacts once

I've also tried to make some simplifications:

Reduce the surface area of the project's Makefile - removing make targets which were only used for manual releases, hopefully making it a bit less confusing.
Remove some layers of gradle indirection - in particular, some release-related gradle tasks which were just wrapping or emulating shell scripts have been replaced by a shell script.

Usage

Update the version, launcher and changelog files (for Nextflow & plugins) locally, without committing
Run the make release command locally
Wait for the Release workflow to complete
Update the Nextflow Github release page with the release notes

Github Action

The approach is to use Github Actions for orchestration, but keep the execution details in bash scripts. The Github Actions structure (Workflow > Job > Step) gives us some flexibility for how to approach the orchestration:

Add a job to the existing `build.yml` workflow, or create a separate `release.yml` workflow

The release process is quite complicated, with multiple artifacts being deployed to multiple different locations/systems. Given that the existing build workflow is already quite long and complex, a separate release.yml workflow file is used.

A single job, or multiple jobs

Github will automatically run jobs in parallel where possible, whereas running multiple steps in parallel is a bit more awkward. Jobs can also specify dependencies on each other, which Github will use to draw a nice diagram of the workflow.

Jobs are more independent than steps, meaning splitting the workflow into jobs more cleanly separates the different release tasks and the credentials/artifacts required for each one, making the process easier to understand.

This approach does require a bit more yaml boilerplate: each job needs to check out the code, initialise tooling, etc. Structuring the workflow as a single job would remove some of that boilerplate, but it would also remove the nice diagram and make it a bit less clear which dependencies and credentials are required for each step.

Some alternative Github Actions structures are shown in this draft PR and this other draft PR

Workflow trigger

Following the current project convention, the release workflow is triggered by a commit with message containing the text [release]. The commit is created by the make release command.

I couldn't find a way to create a workflow-level filter based on the commit message, only to filter the first job. This would mean the action would still "run" on every push, but then skip all the jobs, which would create a lot of empty skipped workflows in Github.

Instead, release.yml uses a workflow_dispatch trigger which is explicitly activated by a 'Release' job in the existing build.yml when a [release] commit is detected, similar to the Wave process.
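The detection step can be sketched as a small shell function. This is a minimal sketch rather than the actual logic in build.yml: the function name `is_release_commit` is hypothetical, and it assumes the commit message is passed in as a plain string.

```shell
#!/usr/bin/env sh
# Hypothetical helper: succeeds when a commit message contains the
# literal text "[release]" anywhere in it.
is_release_commit() {
    case "$1" in
        *"[release]"*) return 0 ;;
        *)             return 1 ;;
    esac
}

# Example usage (assumes it runs inside a git checkout):
#   if is_release_commit "$(git log -1 --pretty=%B)"; then
#       echo "release commit detected"
#   fi
```

A job in build.yml could then start the dispatch-triggered workflow when the check succeeds, for example with `gh workflow run release.yml`, assuming the GitHub CLI and a suitable token are available in that job.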

Limitations

These changes don't represent full automation of nextflow releases. This is more like a headless version of the build and deploy steps from the current manual release process. It still requires a number of manual steps, notably:

Before running make release:

The VERSION file must be updated manually
The nextflow launch script must also be manually updated to match the VERSION file
The changelog.txt file must be manually updated with change notes
For each core plugin which needs to be released:
- its MANIFEST.MF file must be manually updated with the new version
- its changelog.txt file must be manually updated with change notes

After the release workflow:

Review and publish the draft Github release - the release workflow will try to copy relevant notes for the version from changelog.txt to the automatically created draft Github release
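Copying the notes for one version out of changelog.txt could look roughly like the sketch below. It rests on an assumption about the file layout that should be checked against the real changelog: each release starts with a header line of the form `<version> - <date>`, followed by its entries and then a blank line. The function name `changelog_notes` is hypothetical.

```shell
#!/usr/bin/env sh
# Hypothetical helper: print the changelog entries for one version.
# Assumed layout: a "<version> - <date>" header line, then the entries,
# then a blank line before the next release.
changelog_notes() {
    awk -v ver="$1" '
        index($0, ver " - ") == 1 { found = 1; next }  # header of the requested version
        found && /^[[:space:]]*$/ { exit }             # blank line ends the section
        found { print }                                # an entry line
    ' "$2"
}

# Example usage:
#   changelog_notes "$(cat VERSION)" changelog.txt > release-notes.txt
```

The resulting file could then be handed to something like `gh release create <tag> --draft --notes-file release-notes.txt`, leaving the final review and publish step manual.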

As part of this change, jars are no longer published to maven central - only to the seqera maven repository.

The build-info.properties file contains build metadata, including the timestamp and commit id, and is currently committed to the repo, meaning that it always contains stale data (i.e. the commit id from before the [release] commit). To improve the accuracy of the metadata in the released Nextflow jars, this release workflow re-generates the build-info file when assembling them. However, it doesn't commit the file back to the repo, since this would create an awkward second release commit after the [release] one.

The existing docker image build downloads the Nextflow runtime using the launcher rather than copying in the assembled artifacts. I'm not sure why this is, so I have left it unchanged for now - but it does currently impose a strict ordering requirement between uploading the jars to S3 and building the docker image.

Failure recovery

Rather than implement complex retry/force logic in the workflow or try to anticipate all the possible failure scenarios, an additional strength of using small individual shell scripts for each task is that any specific failure can be debugged and re-run manually as needed. In order to avoid rebuilding the artifacts, they can be downloaded from the Github workflow summary page.

This should be especially useful during the initial transition from manual to automated release while we iron out kinks in the process.

Configuration

This release workflow requires a number of repository variables and secrets to be configured. Although some of them seem to duplicate existing secrets used by the CI build (eg AWS_ACCESS_KEY_ID vs AWS_DEPLOY_ACCESS_KEY_ID), my recommendation would be to use different variables/secrets to better control the different permissions required for a CI build vs a release. Different Github environments could also be used on the project to restrict the release workflow and credentials to specific branches.

TODO list all the required secrets/variables and their values/permissions.

Future enhancements

Add some validation to the make release script to check that the versions match each other, etc
Replace the remaining manual steps (like updating version files and changelogs) with improved automation
Remove build-info.properties from the repo, instead only generating during release and not committing
Move the creation of the [release] commit inside the release workflow, fully automating the process
Create automated, frequent unstable preview builds for testing (eg nightly or weekly)
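As an illustration of the first item, a version check in the make release helper might look like the sketch below. It is only a sketch under stated assumptions: the function names are hypothetical, and it assumes the launcher script declares its version on a line of the form `NXF_VER=${NXF_VER:-'<version>'}`, which should be verified against the real launcher.

```shell
#!/usr/bin/env sh
# Hypothetical helper: extract the version declared in the launcher script.
# Assumption: the launcher has a line like NXF_VER=${NXF_VER:-'1.2.3'}
launcher_version() {
    sed -n "s/^NXF_VER=\${NXF_VER:-'\([^']*\)'}.*/\1/p" "$1" | head -n 1
}

# Hypothetical helper: succeeds only when the VERSION file ($1) and the
# launcher script ($2) declare the same version.
versions_match() {
    [ "$(tr -d '[:space:]' < "$1")" = "$(launcher_version "$2")" ]
}

# Example usage:
#   versions_match VERSION nextflow || { echo "Version mismatch" >&2; exit 1; }
```

The same pattern could be extended to the plugin MANIFEST.MF files once their expected format is pinned down.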

netlify · 2024-12-02T15:57:30Z

✅ Deploy Preview for nextflow-docs-staging canceled.

Name	Link
🔨 Latest commit	`efbb785`
🔍 Latest deploy log	https://app.netlify.com/sites/nextflow-docs-staging/deploys/675c176229187900086a6e67

pditommaso · 2024-12-10T13:46:40Z

I guess this should be marked "Ready for review", right?

pditommaso · 2024-12-10T13:56:01Z

Following the current project convention, the release workflow is triggered by a commit with message containing the text [release]. The commit is created by the make release command.

The release script should make the commit, the other way around, it should be assumed the script is invoked by the release commit.

tom-seqera · 2024-12-10T14:25:54Z

Following the current project convention, the release workflow is triggered by a commit with message containing the text [release]. The commit is created by the make release command.

The release script should make the commit, the other way around, it should be assumed the script is invoked by the release commit.

I think that is how it works. The release commit is still what triggers the Github Action to perform the release. I just created a little make release target which someone doing a release can run locally to guide them through creating and pushing the release commit:

On local machine run make release
make release runs a local helper script which shows you the version you're about to release, and tells you what files you should modify if it's not right
You confirm to the local helper script that you want to do the release
The local helper script creates and pushes a commit with message [release]
The Github Action detects the release commit and executes the release workflow, building and deploying the artifacts
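The confirmation step in that flow can be sketched as follows. This is not the helper script from the PR: `confirm_release` is a hypothetical name, and the git commands in the usage comment are one plausible way to create and push the commit.

```shell
#!/usr/bin/env sh
# Hypothetical helper: show the version about to be released and ask for
# confirmation. Succeeds only on an explicit "y" or "yes".
confirm_release() {
    printf 'About to release version %s. Continue? [y/N] ' "$1" >&2
    read -r answer || return 1   # no input (e.g. closed stdin) counts as "no"
    case "$answer" in
        y|Y|yes|YES) return 0 ;;
        *)           return 1 ;;
    esac
}

# Example usage (assumes a VERSION file in the current directory):
#   version=$(cat VERSION)
#   if confirm_release "$version"; then
#       git commit -a -m "[release]"
#       git push
#   else
#       echo "Aborted. Update VERSION, the launcher and the changelogs, then retry." >&2
#   fi
```

Defaulting to "no" keeps an accidental Enter key press from starting a release.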

I think this local helper script is useful because it provides a nice "entrypoint" to the release process, and acts as documentation for both the required manual steps and how the workflow is triggered.

Over time we can gradually improve the local helper script to automate more of the remaining manual steps. For example, it could prompt for which plugins to release and what versions they should be, and then automatically update the relevant manifest files before creating the release commit.

pditommaso · 2024-12-10T15:33:24Z

Fair enough, it may be useful. Then what's the main entry for the headless part? ~~gradle makeDigest ?~~

tom-seqera · 2024-12-10T15:47:12Z

The headless part is in release.yml, which runs as a workflow in Github Actions. It's triggered by a job in build.yml when a release commit is detected.

pditommaso · 2024-12-10T19:23:58Z

.github/scripts/release.sh

+set -e
+
+# build artifacts
+make distribution


Is this the make file in the project root?

Ah yes. As you can probably tell I haven't actually tested this all-in-one script yet - it was just to show the concept.

pditommaso · 2024-12-10T19:26:04Z

.github/workflows/build.yml

+          GH_ORG: ${{ vars.PLUGINS_GITHUB_ORG }}
+          GH_USER: ${{ vars.DEPLOY_GITHUB_USER }}
+          GH_USER_EMAIL: ${{ vars.DEPLOY_GITHUB_EMAIL }}
+          GH_TOKEN: ${{ secrets.DEPLOY_GITHUB_TOKEN }}
+          MAVEN_PUBLISH_URL: ${{ vars.MAVEN_PLUGINS_PUBLISH_URL }}
+          PLUGINS_INDEX_JSON: ${{ vars.PLUGINS_INDEX_JSON }}
+          S3_RELEASE_BUCKET: ${{ vars.S3_RELEASE_BUCKET }}
+          SEQERA_REGISTRY: ${{ vars.SEQERA_PUBLIC_CR_URL }}


let's add a sensible default for all of these in the release.sh e.g. export GH_ORG=${GH_ORG:-nextflow-io}
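A minimal sketch of that suggestion, assuming it lives near the top of the release script. Only the GH_ORG default comes from the comment above; the function names are hypothetical, and secrets such as GH_TOKEN are deliberately checked rather than defaulted.

```shell
#!/usr/bin/env sh
# Non-secret settings: fall back to a default when unset or empty.
apply_release_defaults() {
    export GH_ORG="${GH_ORG:-nextflow-io}"
}

# Hypothetical helper: fail early if any of the named variables is
# unset or empty (intended for secrets, which have no sensible default).
require_env() {
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "Missing required environment variable: $name" >&2
            return 1
        fi
    done
}

# Example usage:
#   apply_release_defaults
#   require_env GH_TOKEN || exit 1
```

With this in place the workflow only needs to pass values that differ from the defaults, which also makes the script easier to run by hand.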

pditommaso · 2024-12-10T19:28:16Z

.github/scripts/release.sh

Since there are a bunch of scripts it may be convenient to move all of them under <ROOT>/release and call this main.sh.

pditommaso

This looks like excellent work. Is this uploading the plugins to the respective GH projects?

Somewhat unrelated, it looks like the DCO bot is not happy with your commit signature.

tom-seqera · 2024-12-11T09:54:08Z

Is this uploading the plugins to the respective GH projects?

Yes, to control the scope of the changes plugins are still uploaded to their own Github projects. I think we should treat alternative upload destinations as a separate piece of work.

pditommaso · 2024-12-18T14:23:50Z

buildSrc/src/main/groovy/io/nextflow/gradle/tasks/GithubRepositoryPublisher.groovy

                .setPrettyPrinting()
                .disableHtmlEscaping()
                .create()
                .toJson(mainIndex)
+        } else {
+            null


Suggested change

null

return null

Better to add an explicit return, otherwise it could even be interpreted as a type.

pditommaso · 2024-12-18T14:24:02Z

buildSrc/src/main/groovy/io/nextflow/gradle/tasks/GithubRepositoryPublisher.groovy

@@ -97,11 +101,15 @@ class GithubRepositoryPublisher extends DefaultTask {
            }
        }

-        new GsonBuilder()
+        if ( modified ) {
+            new GsonBuilder()


Suggested change

new GsonBuilder()

return new GsonBuilder()

pditommaso · 2024-12-18T14:27:17Z

plugins/build.gradle

+    indexUrl = System.getenv('PLUGINS_INDEX_JSON') ?: 'https://github.com/nextflow-io/plugins/main/plugins.json'
    repos = allPlugins()
-    owner = github_organization
-    githubUser = github_username
-    githubEmail = github_commit_email
-    githubToken = github_access_token
+    owner = System.getenv('GH_ORG') ?: 'nextflow-io'
+    githubUser = System.getenv('GH_USER') ?: project.findProperty('github_username')
+    githubEmail = System.getenv('GH_USER_EMAIL') ?: project.findProperty('github_commit_email')
+    githubToken = System.getenv('GH_TOKEN') ?: project.findProperty('github_access_token')


I was thinking about how to avoid the proliferation of env variables in the Gradle build. What if main.sh wrote a temporary gradle.properties file mapping the variables to the corresponding Gradle properties?

I've had a go at using a temporary gradle.properties but I don't think it's a good idea. It seems like it would be complex and fragile for very minimal benefit.

Ideally:

running in CI, the release should use the values from github (ie the env vars)

running locally, any values from $GRADLE_USER_HOME/gradle.properties should take priority

we don't want to risk accidentally committing secrets to the repo

we should try to keep things relatively simple to understand

One option would be to write the env vars into a gradle.properties in the project root. This would make them available to gradle, but anything in $GRADLE_USER_HOME/gradle.properties would still take priority thanks to gradle's built-in precedence rules. To make this safe, we'd need to add gradle.properties to .gitignore. But there's already a committed gradle.properties in the project so we'd have to modify it and then somehow guarantee that it would also remove any secrets from it (including in failure scenarios) to avoid accidentally committing the changes - which is much too risky.

Another option would be to write to a properties file with a different name (eg .gradle.release.properties) which we could explicitly .gitignore and load into the gradle build if detected. The problem with this is that I think those properties would then override anything from $GRADLE_USER_HOME, so we'd have to add logic to only load properties which don't already exist, or only create the properties file if running in CI - at which point it seems like a lot of unnecessary complexity.

I think we should also take care to try to keep release/main.sh as just an orchestration script and ensure that the sub-scripts can be run independently rather than assuming they will always run as part of a main script. This will be important for testing/debugging, fixing release errors, and for future flexibility.
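To make the "orchestration only" idea concrete, here is a rough sketch of what such a main script could reduce to. The function name `run_steps` and the step file names in the usage comment are hypothetical; the point is only that each step stays a standalone executable that can also be run by hand.

```shell
#!/usr/bin/env sh
# Hypothetical orchestrator: run each step script from a directory in
# order and stop at the first failure. Steps are ordinary executables,
# so any one of them can be re-run on its own after a failure.
run_steps() {
    dir=$1
    shift
    for step in "$@"; do
        echo "==> $step"
        "$dir/$step" || {
            echo "Step failed: $step (it can be re-run on its own)" >&2
            return 1
        }
    done
}

# Example usage (step names are placeholders, not the PR's actual files):
#   run_steps release build.sh upload.sh publish.sh
```

Because the orchestrator passes nothing but the inherited environment, each step has to read its own configuration, which is what keeps the steps independently runnable.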

I do think it's worth switching the priority order in the gradle scripts though, so that it looks for a gradle property first and then falls back to an env var:

githubUser = project.findProperty('github_username') ?: System.getenv('GH_USER')

This way any values in $GRADLE_USER_HOME/gradle.properties will take precedence.

One option would be to write the env vars into a gradle.properties in the project root.

How? or do you mean using only gradle properties?

Yeah, I was just thinking through options for ways to make main.sh write a temporary gradle properties file.

Maybe we should reverse the logic and keep all params (except secrets) in the project gradle.properties.

Those could be "imported" in the shell script with a basic helper, e.g. (courtesy of chatgpt)

# Read the property file line by line
while IFS="=" read -r key value; do
    # Skip empty lines and lines starting with # (comments)
    [[ -z "$key" || "$key" == \#* ]] && continue
    # Convert the key: replace '.' with '_' and convert to uppercase
    env_var_name=$(echo "$key" | tr '.' '_' | tr '[:lower:]' '[:upper:]')
    # Export the variable as an environment variable
    export "$env_var_name=$value"
    echo "Exported: $env_var_name=$value"
done < "$PROPERTY_FILE"

Secrets could be kept either in the user gradle properties or in env vars. GitHub Actions secrets could be easily added to the project gradle.properties.

A bit clunky, but it could do the trick.

tom-seqera added 5 commits December 2, 2024 11:17

Headless release: basic github action structure

100c88b

Signed-off-by: Tom Sellman <[email protected]>

Headless release: upload distribution to S3

1a5c47f

Signed-off-by: Tom Sellman <[email protected]>

Headless release: docker build and push

8c50160

Signed-off-by: Tom Sellman <[email protected]>

Headless release: publish jars to maven

83c817c

The target repository is an S3 bucket, jars are no longer published to maven central repo. Signed-off-by: Tom Sellman <[email protected]>

Headless release: publish plugins & update index json

03f70a2

Signed-off-by: Tom Sellman <[email protected]>

tom-seqera added 2 commits December 4, 2024 10:42

Headless release: wire everything up

13341be

Signed-off-by: Tom Sellman <[email protected]>

Headless release: Automatically copy from 'changelog.txt' into github…

19f01c4

… release notes Signed-off-by: Tom Sellman <[email protected]>

tom-seqera force-pushed the headless-release branch from 8c67025 to 19f01c4 Compare December 4, 2024 10:53

This was referenced Dec 4, 2024

Headless release (onejob) #5568

Closed

Headless release (onescript) #5569

Merged

bentsherman added the github actions Pull requests that update GitHub Actions code label Dec 9, 2024

tom-seqera marked this pull request as ready for review December 10, 2024 13:48

pditommaso reviewed Dec 10, 2024

pditommaso requested changes Dec 10, 2024

tom-seqera added 2 commits December 13, 2024 11:03

Headless release: alternative single script workflow in build.yml

466df18

Signed-off-by: Tom Sellman <[email protected]>

Move release scripts to their own 'release' dir

efbb785

(and do some testing of the github action/scripts) Signed-off-by: Tom Sellman <[email protected]>

tom-seqera force-pushed the headless-release branch from 1b3bfea to efbb785 Compare December 13, 2024 11:15

pditommaso reviewed Dec 13, 2024

pditommaso reviewed Dec 18, 2024
