2023-09-15

Secure by Default Coding Practices for CI

Securing the software supply chain using secure-by-default coding practices. Best practices for creating workflows and plugins for GitHub Actions, CircleCI, Buildkite and Azure Pipelines.

The advice to treat your production pipeline like a production system has been gaining traction recently in the context of software supply chain security. Yet the practices surrounding the development of software pipelines are fundamentally different from how typical software is made. So far, I have never had to secure a production system written entirely in Bash. Still, regardless of the technology used in production, shell scripts and Bash are often found in the process of bringing software to production.

Multiple CI platforms offer a similar feature set that allows organizations to automate the software delivery process. Software pipelines now carry far more responsibility than before as the industry adopts practices such as GitOps and Infrastructure as Code (IaC), making them a prime target for attacks.

Every CI platform has its own format for defining pipelines. To build them more easily, third-party CI plugins can be used as building blocks for common tasks such as installing dependencies, caching and linting. However, guidance on securely creating workflows and plugins, as well as scanner coverage to detect issues, is inconsistent across CI providers. As a result, most workflows and CI plugins use coding practices that are risky. In this article, we go over a few code patterns that are both ubiquitous and dangerous across CI platforms, and suggest ways to write pipelines that avoid these issues entirely.

Variable Substitution

To define a CI pipeline, a YAML configuration file containing the steps needed to build, test or release software is typically added to a repository. The steps can be written in multiple ways, but they often use a scripting language such as Bash or JavaScript. A static file can be limiting when defining complex pipelines, so CI providers offer a templating language that compiles the configuration file before it is executed. As you may imagine, mixing a templating language with a scripting language is a recipe for command injection. This type of issue is pervasive as most CI providers have, unfortunately, designed their workflow manifests in a way that makes it very easy to introduce this class of vulnerability.

In GitHub Actions, a workflow step with this issue looks like the following:

# ❌ Avoid using variables in bash steps
steps:
- run: |
    echo ${{ steps.setup.outputs.value }}

In this case, if the variable ${{steps.setup.outputs.value}} happens to contain Bash code, it will be executed in the context of the pipeline.
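
To make the mechanism concrete, the substitution can be simulated locally. This is only an illustrative sketch, not how GitHub Actions is implemented: `render_step` is a hypothetical helper that pastes a value into the script text the way a templating engine would, and the attacker-controlled value is made up.

```shell
# Hypothetical stand-in for the CI templating step: the value is pasted into
# the script text before any shell gets to parse it.
render_step() {
  printf 'echo %s\n' "$1"
}

# Assumed attacker-controlled step output (illustrative value).
malicious_output='hello; echo INJECTED'

# The rendered script text is now: echo hello; echo INJECTED
rendered_script="$(render_step "$malicious_output")"

# Running the rendered script executes the injected command as well:
# it prints "hello" followed by "INJECTED".
bash -c "$rendered_script"
```

The shell never sees a variable here. It only sees the final script text, so it cannot tell the injected command apart from the one the author wrote.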

This issue is widespread across CI providers, but its impact is highly dependent on the role of the pipeline and the context of the injection. When applying the principle of least privilege to a CI system, the blast radius can be limited to access to the source code and other metadata present in the execution environment. More attention should be paid to production pipelines with elevated permissions granted through long-lived credentials or through the CI’s OIDC provider to access cloud resources. Even if the pipelines require a code review before any modifications, code injections can be used to tamper with the build environment to poison build artifacts or exfiltrate sensitive information.

Finding these exploitable variable substitution bugs can be tricky. Recently, a team of researchers at Purdue University and North Carolina State University published a paper in which they analyzed millions of workflows and thousands of GitHub Actions using their open-source tool called Argus. Their research found numerous critical code injection vulnerabilities. Such tooling for other CI providers is currently lacking.

Recommendation

To prevent this class of vulnerability, adopt safe by default code patterns that avoid any use of variables inside scripts or commands. This can generally be done by placing the variable in an environment variable.
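
The effect of this pattern can be reproduced outside of any CI system. The following is a minimal sketch under assumed values: the untrusted string is invented for illustration, and `bash -c` stands in for the CI runner starting a step.

```shell
# Assumed attacker-controlled value (illustrative only).
untrusted_value='hello; echo INJECTED'

# The script text is a constant. Single quotes keep $PARAM_INPUT literal here,
# so nothing from the untrusted value is ever parsed as shell syntax.
step_script='echo "$PARAM_INPUT"'

# The value reaches the step through its environment and is expanded as data.
# This prints the string unchanged: hello; echo INJECTED
PARAM_INPUT="$untrusted_value" bash -c "$step_script"
```

Because the script text no longer changes with the input, it can also be reviewed and scanned as ordinary Bash.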

GitHub Actions

Variable syntax: ${{ … }}

env:
  PARAM_INPUT: ${{ inputs.username }}
run: |
  echo "$PARAM_INPUT"

Variable substitution should also be avoided in the parameters of actions that execute scripts, such as actions/github-script:

uses: actions/github-script@v6
env:
  PARAM_INPUT: ${{ inputs.username }}
with:
  script: |
    const { PARAM_INPUT } = process.env
    console.log(PARAM_INPUT);

CircleCI

Variable syntax: << … >>

run:
  command: |
    echo "$PARAM_INPUT"
  environment:
    PARAM_INPUT: <<parameters.input>>

Buildkite

Variable syntax: $VARIABLE

Buildkite recommends using scripts instead of commands that use variables, to avoid disclosing sensitive values to Buildkite. If you must use a variable, consider escaping it so that it is evaluated at runtime instead.

steps:
# ❌ API_KEY is sent in plaintext to Buildkite & can evaluate as Bash
- command: ./linter "$API_KEY"

# ✅ API_KEY is evaluated at runtime
- command: ./linter "$$API_KEY"

The Buildkite Agent can also be configured to disable interpolation entirely.

Azure Pipelines

Variable syntax:

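Macro variable (expanded at runtime, just before the task runs): $( … )
Template expression (expanded at compile time): ${{ … }}

Both forms are replaced in the script text before the shell runs, so neither should appear inside a script body.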

steps:
- script: |
    echo "$PARAM_INPUT"
  env:
    PARAM_INPUT: $(variable)

- bash: |
    echo "$PARAM_INPUT"
  env:
    PARAM_INPUT: $(variable)

- task: Bash@3
  env:
    PARAM_INPUT: $(variable)
  inputs:
    targetType: inline
    script: |
      echo "$PARAM_INPUT"

Writing Bash Securely

With variable substitution out of the way, the focus can shift to writing pipeline steps securely. This is easier said than done with scripting languages like Bash. A common Bash mistake is running a command with an unquoted variable that can contain user input. If the variable contains spaces, Bash splits it and each part of the string is interpreted as a separate command argument. Depending on the command being executed, this could compromise the build environment.

# ❌ Avoid using variables without quotes
docker run --rm -v /work:/work $INPUT_IMAGE

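In this example, since this is a Docker command, additional command line flags could be passed to expose sensitive environment variables or mount additional volumes in the container. For instance, if INPUT_IMAGE is set to "-v /:/rootfs -e SECRET_TOKEN image", the command that actually runs is: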

docker run --rm -v /work:/work -v /:/rootfs -e SECRET_TOKEN image

The risk of such an issue once again depends on the context and the command being executed. To avoid this risk altogether, it is best to always surround variables in commands with double quotes.

# ✅ Prefer always quoting arguments that contain a variable
docker run --rm -v /work:/work "$INPUT_IMAGE"
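
The difference between the two forms can be observed without running Docker. In this small sketch, `count_args` is a hypothetical helper that only reports how many arguments it received, and the value of INPUT_IMAGE is an assumed malicious input.

```shell
# Hypothetical helper: prints the number of arguments it was called with.
count_args() {
  echo "$#"
}

# Assumed malicious value (illustrative only).
INPUT_IMAGE='-v /:/rootfs -e SECRET_TOKEN image'

# Unquoted: Bash splits the value on whitespace, giving 5 separate arguments.
count_args $INPUT_IMAGE

# Quoted: the whole value is passed as a single argument.
count_args "$INPUT_IMAGE"
```

With quotes, the command receives one argument instead of a list of extra flags chosen by whoever controls the input.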

Don’t Eval

In CircleCI Orbs, there is a recurring code pattern that allows the user to set an Orb parameter to either a literal value or the name of an environment variable from which the value should be read. While this functionality is convenient, its implementation using eval opens up the possibility of code execution.

# ❌ Avoid evaluating potential user-input
command: |
  API_KEY_VALUE=$(eval echo "${PARAM_API_KEY}")

environment:
  PARAM_API_KEY: <<parameters.api-key>>

One way to safely accomplish this functionality is to use Bash’s variable indirection. It reads a variable by name or, if that variable does not exist, expands to the literal value provided as the parameter. This avoids passing the parameter to eval at all.

# ✅ Prefer using Bash variable indirection
command: |
  API_KEY_VALUE="${!PARAM_API_KEY:-$PARAM_API_KEY}"
environment:
  PARAM_API_KEY: <<parameters.api-key>>
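
One caveat is that indirect expansion expects a valid variable name. When the parameter is a literal that is not one, such as a key containing dashes, Bash can fail with an error, and a name that carries an array subscript may have that subscript evaluated. A more defensive variant, sketched below under the assumption that the step runs in Bash, only attempts indirection when the parameter looks like a plain variable name. `resolve_param` is a hypothetical helper name, not something CircleCI provides.

```shell
# Hypothetical helper: resolve a parameter that is either the name of an
# environment variable or a literal value, without using eval.
resolve_param() {
  local param="$1"
  local name_re='^[A-Za-z_][A-Za-z0-9_]*$'

  if [[ "$param" =~ $name_re ]]; then
    # Looks like a variable name: read it, falling back to the literal
    # when the variable is unset or empty.
    printf '%s\n' "${!param:-$param}"
  else
    # Not a valid variable name: treat it strictly as a literal value.
    printf '%s\n' "$param"
  fi
}

DEMO_API_KEY='value-from-env'     # illustrative variable
resolve_param 'DEMO_API_KEY'      # prints "value-from-env"
resolve_param 'literal-api-key'   # prints "literal-api-key"
```

The regular expression is deliberately strict, so anything containing brackets, spaces or shell metacharacters is never used as a variable name.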

Recommendation

Ensure best practices are applied when writing Bash by using scanners such as ShellCheck and Semgrep.
If possible, use a higher-level language to create CI tasks so that you can leverage existing software development best practices to ensure quality and security. For instance, GitHub Actions and Azure Pipelines both have a TypeScript SDK to create actions.

Unknown Supply Chain Components

Back in 2021, Codecov’s installation script, used in the pipelines of many high-profile organizations, was maliciously modified to extract sensitive credentials, highlighting the impact of unsafe software sourcing on supply chain security.

When a CI plugin needs to process JSON, it may pull a utility binary, such as jq, to help accomplish this task. More often than not, this is done without checking the integrity of the downloaded artifact. The lack of visibility and control over these hidden dependencies is worrying.

Unsafe software sourcing can take many forms. When reviewing a pipeline’s dependencies, there are several supply chain components to consider that can affect the build environment. Ensure these components are trusted and that their integrity is preserved throughout the pipeline.

CI plugins (GitHub Actions, Azure Pipeline Tasks, Buildkite Plugins, CircleCI Orbs)
Container images
External resources or repositories (binary, scripts, configuration files)
Packages and libraries (e.g. npm, pip, RubyGems)

Recommendation

Verify checksums or signatures before consuming software.
Keep an inventory of supply chain components.
Review critical software pipelines and their dependencies.
Ensure CI Plugins are pinned to a specific digest if possible. Avoid using loose version constraints for CI Plugins that only support semantic versioning.
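
As an illustration of the first recommendation, a downloaded artifact can be checked against a digest pinned in the pipeline before it is used. This is a minimal sketch that assumes GNU coreutils' `sha256sum` is available on the build agent. `verify_sha256` is a hypothetical helper name, and the URL and digest in the usage comment are placeholders.

```shell
# Hypothetical helper: compare a file's SHA-256 digest against a pinned value.
# Assumes sha256sum (GNU coreutils) is installed on the build agent.
verify_sha256() {
  local file="$1" expected="$2" actual
  actual="$(sha256sum "$file" | cut -d ' ' -f 1)"
  if [ "$actual" != "$expected" ]; then
    echo "Checksum mismatch for $file" >&2
    return 1
  fi
}

# Hypothetical usage inside a plugin (URL and digest are placeholders):
#   curl -fsSL -o /tmp/tool "https://example.com/tool"
#   verify_sha256 /tmp/tool "<digest pinned in source control>" || exit 1
```

The expected digest should live in the repository next to the pipeline definition, so that changing it goes through code review like any other change.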

GitHub Actions

Actions can be loaded from any Git reference. Because branches and tags can be updated, it is preferable to use a commit SHA to ensure the content does not change.

# ❌ Avoid using the main branch directly
- uses: org/action@main

# ❌ Avoid using mutable tags
- uses: org/action@v1

# ✅ Prefer resolving tags/branches to a commit SHA
- uses: org/action@e69d57023b032c41dd275d26d64989dcf2ed1803 # v1
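
A simple check along these lines can be added to code review or CI. The sketch below is an assumption-laden illustration rather than a complete linter: `find_unpinned_actions` is a hypothetical helper that lists `uses:` lines not pinned to a full 40-character commit SHA. It will also flag local (`./path`) and `docker://` references, which would need separate handling.

```shell
# Hypothetical helper: print every `uses:` line in a workflow file whose
# reference is not a full 40-character hexadecimal commit SHA.
find_unpinned_actions() {
  grep -E '^[[:space:]]*-?[[:space:]]*uses:[[:space:]]*[^[:space:]]+' "$1" \
    | grep -Ev '@[0-9a-f]{40}([[:space:]]|$)'
}

# Hypothetical usage (the path is a placeholder):
#   find_unpinned_actions .github/workflows/ci.yml
```

When every reference is pinned, the function prints nothing and returns a non-zero status, because the final grep finds no lines to report.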

CircleCI

orbs:
  # ❌ Avoid using the latest version directly
  orb-latest: org/orb@volatile

  # ❌ Avoid using the development version
  orb-dev: org/orb@dev

  # ⚠️ Loose version constraint can change unexpectedly
  orb: org/orb@1

  # ✅ Prefer using a specific Orb version
  orb: org/orb@x.y.z

Buildkite

Buildkite downloads plugins from GitHub by default, but can also download them from any Git repository. As with GitHub Actions, it is best to use a commit SHA rather than mutable references like branches or tags.

plugins:
# ❌ Avoid using the main branch directly
- org/repo#main: {}

# ❌ Avoid using mutable tags
- org/repo#v1: {}

# ✅ Prefer resolving tags/branches to a commit SHA
- org/repo#e69d57023b032c41dd275d26d64989dcf2ed1803: {} # v1

Azure Pipelines

steps:
# ⚠️ Loose version constraint can change unexpectedly
- task: SomeTask@1

# ✅ Prefer using a specific Task version
- task: SomeTask@x.y.z

Developing secure CI pipelines is challenging. CI providers unfortunately make it too easy to introduce supply chain security risks, putting the burden on the user. By adopting secure by default coding practices that ensure pipelines execute in predictable ways, we can avoid multiple threat scenarios that can compromise a build process.