GitHub Action to analyze Pull Requests for open-source supply chain issues

phylum-dev/phylum-analyze-pr-action

Phylum Analyze PR action

A GitHub Action to analyze dependencies with Phylum to protect your code against increasingly sophisticated attacks and get peace of mind to focus on your work.

Overview

Phylum provides a complete risk analysis of "open-source packages" (read: untrusted software from random Internet strangers). Phylum evolved from legacy SCA tools to defend against supply-chain malware, malicious open-source authors, and engineering risk, in addition to software vulnerabilities and license risks. To learn more, please see our website.

Once configured for a repository, this action will provide analysis of project dependencies from lockfiles or manifests during a Pull Request (PR) and output the results as a comment on the PR unless the option to skip comments is provided. The CI job will return an error (i.e., fail the build) if any of the newly added/modified dependencies from the PR fail to meet the established policy unless audit mode is specified.

There will be no note if no dependencies were added or modified for a given PR. If one or more dependencies are still processing (no results available), then the note will make that clear and the CI job will only fail if dependencies that have completed analysis results do not meet the active policy.

Prerequisites

The GitHub Actions environment is primarily supported through the use of a Docker image. The prerequisites for using this image are:

  • Ability to run a Docker container action
    • GitHub-hosted runners must use an Ubuntu runner
    • Self-hosted runners must use a Linux operating system and have Docker installed
  • Access to the phylum-dev/phylum-ci Docker image from the GitHub Container Registry
  • A GitHub token with API access
    • Not required when comment generation has been skipped
    • Can be the default GITHUB_TOKEN provided automatically at the start of each workflow run
      • Needs at least write access for pull-requests scope - see documentation
    • Can be a personal access token (PAT) - see documentation
      • Needs the repo scope or minimally the public_repo scope if private repositories are not used
  • A Phylum token with API access
    • Contact Phylum or register to gain access
    • Consider using a bot or group account for this token
    • Forked repos require the pull_request_target event, to allow secret access
  • Access to the Phylum API endpoints
    • That usually means a connection to the internet, optionally via a proxy
    • Support for on-premises installs is not available at this time

Supported Dependency Files

If not explicitly specified, an attempt will be made to automatically detect dependency files. These include both lockfiles and manifests. The basic difference is that manifests are where top-level dependencies are specified in their loose form while lockfiles contain the completely resolved collection of the abstract declarations from a manifest.

Some dependency file types (e.g., Python/pip requirements.txt) are ambiguous in that they can be named differently and may or may not contain strict dependencies. That is, they can be either a lockfile or a manifest. We call these "lockifests." Some dependency files fail to parse as the expected lockfile type (e.g., pip instead of poetry for pyproject.toml manifests).

For these situations, the recommendation is to specify the path and lockfile type explicitly in a .phylum_project file at the root of the project repository. The easiest way to do that is with the Phylum CLI, using the phylum init command and committing the generated .phylum_project file.

The Phylum Knowledge Base contains the list of currently supported lockfiles. It is also where information on lockfile generation can be found for current manifest file support.

Getting Started

Phylum analysis of dependencies can be added to existing CI workflows or on its own with this minimal configuration:

name: Phylum_analyze
on: pull_request
jobs:
  analyze_deps:
    name: Analyze dependencies with Phylum
    permissions:
      contents: read
      pull-requests: write
    runs-on: ubuntu-latest
    steps:
      - name: Checkout the repo
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Analyze dependencies
        uses: phylum-dev/phylum-analyze-pr-action@v2
        with:
          phylum_token: ${{ secrets.PHYLUM_TOKEN }}

This configuration contains a single job, with two steps, that will only run on pull request events. It provides debug output but otherwise does not override any of the phylum-ci arguments, which are all either optional or default to secure values. Let's take a deeper dive into each part of the configuration:

Workflow and Job names

The workflow and job can be given any name, and the steps can instead be included in existing workflows or jobs.

name: Phylum_analyze                        # Name the workflow what you like
on: pull_request
jobs:
  analyze_deps:                             # Name the job what you like
    name: Analyze dependencies with Phylum  # This name is optional (defaults to job name)

Workflow trigger

The Phylum Analyze PR action expects to be run in the context of a pull_request webhook event. This includes both pull_request and pull_request_target events.

# NOTE: These are examples. Only one definition for `on` is expected.

# Specify the `pull_request` event trigger on one line
on: pull_request

# Alternative to specify `pull_request` trigger (e.g., when other triggers are present)
on:
  pull_request:

# Specify specific branches for the `pull_request` trigger to target
on:
  pull_request:
    branches:
      - main
      - develop

Allowing pull requests from forked repositories requires using the pull_request_target event since the Phylum API key is stored as a secret and the pull_request event does not provide access to secrets when the PR comes from a fork.

on:
  pull_request:
  # Allow PRs from forked repos to access secrets, like the Phylum API key
  pull_request_target:

⚠️ WARNING ⚠️

Using the pull_request_target event for forked repositories requires additional configuration when checking out the repo. Be aware that such a configuration has security implications if done improperly. Attackers may be able to obtain repository write permissions or steal repository secrets. Please take the time to understand and mitigate the risks.

Minimal suggestions include:

  • Use a separate workflow for the Phylum Analyze PR action
  • Do not provide access to any secrets beyond the Phylum API key
  • Limit the steps in the job to two: checking out the PR's code and using the Phylum action

Permissions

When using the default GITHUB_TOKEN provided automatically at the start of each workflow run, it is good practice to ensure the actions used in the workflow are given the least privileges needed to perform their intended function. The Phylum Analyze PR action needs at least write access for the pull-requests scope. The actions/checkout action needs at least read access for the contents scope. See the GitHub documentation for more info.

    permissions:                # Ensure least privilege of actions
      contents: read            # For actions/checkout
      pull-requests: write      # For phylum-dev/phylum-analyze-pr-action

When using a personal access token (PAT) instead, the token should be created with the repo scope or minimally with the public_repo scope if private repositories will not be used with the PAT. See the GitHub documentation for more info.

    permissions:                # Ensure least privilege of actions
      contents: read            # For actions/checkout
      # The phylum-dev/phylum-analyze-pr-action does not
      # need the `pull-requests` scope here if using a PAT

Specifying a Runner

The Phylum Analyze PR action is a Docker container action. This requires that GitHub-hosted runners use an Ubuntu runner. Self-hosted runners must use a Linux operating system and have Docker installed.

    runs-on: ubuntu-latest
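
For a self-hosted runner, the runs-on value can select the runner by label instead. This sketch assumes a runner registered with the default self-hosted and linux labels, and with Docker already installed on the machine.

```yaml
    # Assumes a self-hosted Linux runner with Docker installed
    runs-on: [self-hosted, linux]
```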

Checking out the Repository

git is used within the phylum-ci package to do things like determine if there was a dependency file change and, when specified, report on new dependencies only. Therefore, a clone of the repository is required to ensure that the local working copy is always pristine and history is available to pull the requested information.

    steps:
      - name: Checkout the repo
        uses: actions/checkout@v4
        with:
          # Specifying a depth of 0 ensures all history for all branches.
          # This input may not be required when `--all-deps` option is used.
          fetch-depth: 0
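
As the comment above hints, full history may be unnecessary when every dependency is analyzed rather than only the changed ones. The sketch below relies on that hint and on the default shallow clone performed by actions/checkout. Treat it as an assumption to verify in your own repository, not as a guaranteed configuration.

```yaml
    steps:
      - name: Checkout the repo
        # No `fetch-depth` given, so the default shallow clone is used
        uses: actions/checkout@v4
      - name: Analyze dependencies
        uses: phylum-dev/phylum-analyze-pr-action@v2
        with:
          phylum_token: ${{ secrets.PHYLUM_TOKEN }}
          # Consider all dependencies, so full history may not be needed
          # (assumption based on the note in the checkout example)
          cmd: phylum-ci --all-deps
```

If the analysis logs complain about missing history, restoring fetch-depth: 0 is the safe fallback.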

Allowing pull requests from forked repositories requires using the pull_request_target event and checking out the head of the forked repository:

    steps:
      - name: Checkout the repo
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
          # Specifying the head of the forked repository's PR branch
          # is required to get any proposed dependency file changes.
          ref: ${{ github.event.pull_request.head.sha }}

⚠️ WARNING ⚠️

Using the pull_request_target event for forked repositories and checking out the pull request's code has security implications if done improperly. Attackers may be able to obtain repository write permissions or steal repository secrets. Please take the time to understand and mitigate the risks.

Minimal suggestions include:

  • Use a separate workflow for the Phylum Analyze PR action
  • Do not provide access to any secrets beyond the Phylum API key
  • Limit the steps in the job to two: checking out the PR's code and using the Phylum action
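
The suggestions above can be combined into a dedicated workflow file. This is a minimal sketch assembled from the snippets in this document; the workflow name is hypothetical, and the file should be reviewed against the risks described in the warning before use.

```yaml
# Sketch of a separate workflow used only for the Phylum analysis
name: Phylum_analyze_forks      # Hypothetical workflow name
on:
  # Allow PRs from forked repos to access the Phylum API key secret
  pull_request_target:
jobs:
  analyze_deps:
    name: Analyze dependencies with Phylum
    permissions:                # Ensure least privilege of actions
      contents: read
      pull-requests: write
    runs-on: ubuntu-latest
    steps:                      # Exactly two steps: checkout and analysis
      - name: Checkout the repo
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
          # Check out the head of the forked repository's PR branch
          ref: ${{ github.event.pull_request.head.sha }}
      - name: Analyze dependencies
        uses: phylum-dev/phylum-analyze-pr-action@v2
        with:
          # The only secret referenced in this workflow
          phylum_token: ${{ secrets.PHYLUM_TOKEN }}
```

Because no other step runs after the checkout, the untrusted code from the fork is never built or executed by the workflow itself.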

Action Inputs

The action inputs are used to ensure the phylum-ci tool is able to perform its job.

A Phylum token with API access is required to perform analysis on project dependencies. Contact Phylum or register to gain access. See also the phylum auth register command documentation and consider using a bot or group account for this token.

A GitHub token with API access is required to use the API (e.g., to post comments). It is not required when comment generation has been skipped (e.g., when in audit mode). This can be the default GITHUB_TOKEN provided automatically at the start of each workflow run but it will need at least write access for the pull-requests scope (see documentation). Alternatively, it can be a personal access token (PAT) with the repo scope or minimally the public_repo scope, if private repositories are not used.
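
Because the GitHub token is only needed for posting comments, an audit-only job can plausibly run with read-only permissions. The following sketch assumes that no comment is generated in audit mode, as stated above, and therefore omits the pull-requests scope. The job name is hypothetical.

```yaml
jobs:
  audit_deps:                   # Hypothetical job name
    name: Audit dependencies with Phylum
    permissions:
      contents: read            # For actions/checkout
    runs-on: ubuntu-latest
    steps:
      - name: Checkout the repo
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Analyze dependencies
        uses: phylum-dev/phylum-analyze-pr-action@v2
        with:
          phylum_token: ${{ secrets.PHYLUM_TOKEN }}
          # Audit mode gives insight in the logs without failing builds
          cmd: phylum-ci --all-deps --audit
```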

The values for the phylum_token and github_token action inputs can come from repository, environment, or organizational encrypted secrets. Since they are sensitive, care should be taken to protect them appropriately.
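
As one illustration of the environment option, a job can name an environment so that its secrets are only released to that job. The environment name phylum below is hypothetical; it would need to exist in the repository settings with PHYLUM_TOKEN defined as one of its secrets.

```yaml
jobs:
  analyze_deps:
    name: Analyze dependencies with Phylum
    environment: phylum         # Hypothetical environment holding PHYLUM_TOKEN
    permissions:
      contents: read
      pull-requests: write
    runs-on: ubuntu-latest
    steps:
      - name: Checkout the repo
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Analyze dependencies
        uses: phylum-dev/phylum-analyze-pr-action@v2
        with:
          # Resolved from the environment's secrets for this job
          phylum_token: ${{ secrets.PHYLUM_TOKEN }}
```

Environment protection rules, such as required reviewers, then apply before the job can read the token.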

The cmd arguments to the Docker image are the way to exert control over the execution of the Phylum analysis. The phylum-ci script entry point is expected to be called. All of its arguments are optional and default to secure values. To view the arguments, their descriptions, and default values, run the script with the --help option as specified in the Usage section of the phylum-dev/phylum-ci repository's README or, more simply, view the script options output for the latest release.

    steps:
      - name: Analyze dependencies
        uses: phylum-dev/phylum-analyze-pr-action@v2
        with:
          # Contact Phylum (phylum.io/contact-us) or register (app.phylum.io/register)
          # to gain access. Consider using a bot or group account for this token. See:
          # https://docs.phylum.io/knowledge_base/api-keys
          phylum_token: ${{ secrets.PHYLUM_TOKEN }}

          # NOTE: These are examples. Specify at most one `github_token` entry line.
          #
          # Use the default `GITHUB_TOKEN` provided automatically at the start
          # of each workflow run. This entry is optional since it is the default.
          github_token: ${{ secrets.GITHUB_TOKEN }}
          # Use a personal access token (PAT). The secret name is an example;
          # GitHub does not allow secret names that start with `GITHUB_`.
          github_token: ${{ secrets.GH_PAT }}

          # NOTE: These are examples. Only one `cmd` entry line is expected.
          #
          # Use the defaults for all the arguments and provide debug level output.
          # The default behavior is to only analyze newly added dependencies
          # against the active policy set at the Phylum project level.
          # This entry does not have to be specified since it is the default.
          cmd: phylum-ci -vv
          # Same as the previous entry, but without debug level output.
          cmd: phylum-ci
          # Consider all dependencies in analysis results instead of just the
          # newly added ones. The default is to only analyze newly added
          # dependencies, which can be useful for existing code bases that may
          # not meet established policy rules yet, but don't want to make things
          # worse. Specifying `--all-deps` can be useful for casting the widest
          # net for strict adherence to Quality Assurance (QA) standards.
          cmd: phylum-ci --all-deps
          # Force analysis for all dependencies in a manifest file.
          # This is especially useful for *workspace* manifest files where
          # there is no companion lockfile (e.g., libraries).
          cmd: phylum-ci --force-analysis --all-deps --depfile Cargo.toml
          # Some lockfile types (e.g., Python/pip `requirements.txt`) are ambiguous
          # in that they can be named differently and may or may not contain strict
          # dependencies. In these cases it is best to specify an explicit path,
          # either with the `--depfile` option or in a `.phylum_project` file:
          # https://docs.phylum.io/knowledge_base/phylum_project_files
          # The easiest way to do that is with the Phylum CLI, using the
          # `phylum init` (https://docs.phylum.io/cli/commands/phylum_init) command
          # and committing the generated `.phylum_project` file.
          cmd: phylum-ci --depfile requirements-prod.txt
          # Specify multiple explicit dependency file paths.
          cmd: phylum-ci --depfile requirements-prod.txt path/to/dependency.file
          # Exclude dependency files by gitignore-style pattern.
          cmd: phylum-ci --exclude "requirements-*.txt"
          # Specify multiple exclusion patterns.
          cmd: phylum-ci --exclude "build.gradle" "tests/fixtures/"
          cmd: |
            phylum-ci \
              --exclude "/requirements-*.txt" \
              --exclude "build.gradle" "fixtures/"
          # Perform analysis as part of an organization and/or group-owned project.
          # When an org is specified, a group name must also be specified.
          # A paid account is needed to use orgs or groups: https://phylum.io/pricing
          cmd: phylum-ci --org my_org --group my_group
          cmd: phylum-ci --group my_group
          # Analyze all dependencies in audit mode,
          # to gain insight without failing builds.
          cmd: phylum-ci --all-deps --audit
          # Install a specific version of the Phylum CLI.
          cmd: phylum-ci --phylum-release 6.5.0 --force-install
          # Mix and match for your specific use case.
          cmd: |
            phylum-ci \
              -vv \
              --org my_org \
              --group my_group \
              --depfile requirements-dev.txt \
              --depfile requirements-prod.txt path/to/dependency.file \
              --depfile Cargo.toml \
              --force-analysis \
              --all-deps

Exit Codes

The Phylum Analyze PR action will return a zero (0) exit code when it completes successfully and a non-zero code otherwise. The full and current list of exit codes is documented here.
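
Since a non-zero exit code fails the step, later steps can react to it with the standard GitHub Actions status functions. This sketch adds a follow-up step that only runs when the Phylum step failed; the step id phylum and the message text are assumptions for illustration.

```yaml
    steps:
      # The checkout step is omitted here for brevity
      - name: Analyze dependencies
        id: phylum              # Hypothetical id, used to reference this step
        uses: phylum-dev/phylum-analyze-pr-action@v2
        with:
          phylum_token: ${{ secrets.PHYLUM_TOKEN }}
      - name: Point reviewers at the logs
        # Runs only when the Phylum step returned a non-zero exit code
        if: ${{ failure() && steps.phylum.outcome == 'failure' }}
        run: echo "::error::Phylum analysis failed. See the step logs for details."
```

The job still fails in this arrangement; the extra step only adds an annotation to the run summary.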

Example Comments

NOTE: Comments will not be shown when in audit mode or when comments are explicitly skipped. Analysis output will still be available in the logs.


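Phylum OSS Supply Chain Risk Analysis - FAILED

[Screenshot: example PR comment for a failed analysis]

Phylum OSS Supply Chain Risk Analysis - INCOMPLETE WITH FAILURE

[Screenshot: example PR comment for an incomplete analysis with a failure]

Phylum OSS Supply Chain Risk Analysis - INCOMPLETE

[Screenshot: example PR comment for an incomplete analysis]

Phylum OSS Supply Chain Risk Analysis - SUCCESS

[Screenshot: example PR comment for a successful analysis]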


Alternatives

The default phylum-ci Docker image contains git and the installed phylum Python package. It also contains an installed version of the Phylum CLI and all required tools needed for lockfile generation. An advantage of using the default Docker image is that the complete environment is packaged and made available with components that are known to work together.

One disadvantage to the default image is its size. It can take a while to download and may provide more tools than required for your specific use case. Special slim tags of the phylum-ci image are provided as an alternative. These tags differ from the default image in that they do not contain the required tools needed for lockfile generation (with the exception of the pip tool). The slim tags are significantly smaller and allow for faster action run times. They are useful for those instances where no manifest files are present and/or only lockfiles are used.

Using the slim image tags is possible by altering your workflow to use the image directly instead of this GitHub Action. That is possible with either container jobs or container steps.

Container Jobs

GitHub Actions allows for workflows to run a job within a container, using the container: statement in the workflow file. These are known as container jobs. More information can be found in GitHub documentation: "Running jobs in a container". To use a slim tag in a container job, use this minimal configuration:

name: Phylum_analyze
on: pull_request
jobs:
  analyze_deps:
    name: Analyze dependencies with Phylum
    permissions:
      contents: read
      pull-requests: write
    runs-on: ubuntu-latest
    container:
      image: docker://ghcr.io/phylum-dev/phylum-ci:slim
      env:
        GITHUB_TOKEN: ${{ github.token }}
        PHYLUM_API_KEY: ${{ secrets.PHYLUM_TOKEN }}
    steps:
      - name: Checkout the repo
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Analyze dependencies
        run: phylum-ci -vv

The image: value is set to the latest slim image, but other tags are available to ensure a specific release of the phylum-ci project and a specific version of the Phylum CLI. The full list of available phylum-ci image tags can be viewed on GitHub Container Registry (preferred) or Docker Hub.

The GITHUB_TOKEN and PHYLUM_API_KEY environment variables are required to have those exact names. The rest of the options are the same as already documented.

Container Steps

GitHub Actions allows for workflows to run a step within a container, by specifying that container image in the uses: statement of the workflow step. These are known as container steps. More information can be found in GitHub workflow syntax documentation. To use a slim tag in a container step, use this minimal configuration:

name: Phylum_analyze
on: pull_request
jobs:
  analyze_deps:
    name: Analyze dependencies with Phylum
    permissions:
      contents: read
      pull-requests: write
    runs-on: ubuntu-latest
    steps:
      - name: Checkout the repo
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Analyze dependencies
        uses: docker://ghcr.io/phylum-dev/phylum-ci:slim
        env:
          GITHUB_TOKEN: ${{ github.token }}
          PHYLUM_API_KEY: ${{ secrets.PHYLUM_TOKEN }}
        with:
          args: phylum-ci -vv

The uses: value is set to the latest slim image, but other tags are available to ensure a specific release of the phylum-ci project and a specific version of the Phylum CLI. The full list of available phylum-ci image tags can be viewed on GitHub Container Registry (preferred) or Docker Hub.

The GITHUB_TOKEN and PHYLUM_API_KEY environment variables are required to have those exact names. The rest of the options are the same as already documented.

FAQs

💡 INFO 💡

There are more FAQs in the Phylum Knowledge Base.

Why does Phylum report a failing status check if it shows a successful analysis comment?

It is possible to get a successful Phylum analysis comment on the PR and also have the Phylum action report a failing status check. This happens when one or more dependency files fail the filtering process while at least one dependency file passes the filtering process and the Phylum analysis.

The failing status check is meant to serve as an indication to the repository owner that an issue exists with at least one of the dependency files submitted, whether they intended it or not. The reasoning is that it is better to be explicit about possible failures, allowing for review of the logs and correction, than to silently ignore the failure and possibly allow untrusted code into the repository.

There are several reasons a dependency file may fail the filtering process and each failure will be included in the logs as a warning. The file may not exist or it may exist, but only as an empty file. The file may fail to be parsed by Phylum. The dependency files can be manifests or lockfiles and they can either be provided explicitly or automatically detected when not provided. Sometimes the automatic detection will misattribute a file as a manifest or assign the wrong lockfile type. As detailed in the "Supported Dependency Files" section, the recommendation for this situation is to specify the path and lockfile type explicitly in a .phylum_project file at the root of the project repository.

Why does analysis fail for PRs from forked repositories?

Another reason why Phylum reports failing status checks is for pull_request_target events where manifests are provided. Using pull_request_target events for forked repositories has security implications if done improperly. Attackers may be able to obtain repository write permissions or steal repository secrets. A more comprehensive enumeration of the risks can be found here:

This GitHub action disables lockfile generation to prevent arbitrary code execution in an untrusted context, like PRs from forks. This means that provided manifests cannot be parsed by Phylum, since parsing first requires generating a lockfile from the manifest. A unique error code and warning message are provided to better signal the implication: the resolved dependencies from the manifest have NOT been analyzed by Phylum. Care should be taken to inspect changes manually before allowing a manifest to be used in a trusted context.

License

Copyright (C) 2022 Phylum, Inc.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License or any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/gpl.html or write to [email protected] or [email protected]

Contributing

Suggestions and help are welcome. Feel free to open an issue or otherwise contribute. More information is available on the contributing documentation page.

Code of Conduct

Everyone participating in the phylum-analyze-pr-action project, and in particular in the issue tracker and pull requests, is expected to treat other people with respect and more generally to follow the guidelines articulated in the Code of Conduct.

Security Disclosures

Found a security issue in this repository? See the security policy for details on coordinated disclosure.
