Template for getting started with Hybrid Dagster Cloud

dagster-io/dagster-cloud-hybrid-quickstart


Dagster Cloud Hybrid Deployment Quickstart

This template lets you get started using Dagster Cloud with a Hybrid agent.

Note: We recommend first deploying the example project included in this repository, then replacing it with your own Dagster project.

Pre-requisites

What you need to start using this template:

  1. A Dagster Cloud account set up for Hybrid deployments.

  2. A Hybrid agent up and running.

  3. A Docker container registry accessible from the Hybrid agent and from your GitHub workflows.

Step 1. Create a new repository from this template

Click the Use this Template button and provide details for your new repo.


Step 2. Add your Docker registry to dagster_cloud.yaml

The dagster_cloud.yaml file defines the configuration for building and deploying your code locations. For the quickstart_etl code location, specify the Docker registry in the registry: key:

registry: <account-id>.dkr.ecr.us-west-2.amazonaws.com/<image-name>
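
For orientation, the sketch below shows where the registry: key typically sits inside dagster_cloud.yaml. The location_name matches the quickstart_etl example; the package_name and directory values are assumptions about the project layout and may differ in your repository.

```yaml
# dagster_cloud.yaml (sketch; package_name and directory are assumed values)
locations:
  - location_name: quickstart_etl
    code_source:
      package_name: quickstart_etl   # assumed Python package name
    build:
      directory: ./                  # assumed build context, relative to this file
      registry: <account-id>.dkr.ecr.us-west-2.amazonaws.com/<image-name>
```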

Step 3. Modify the GitHub Workflow

Edit the GitHub Workflow at .github/workflows/dagster-cloud-deploy.yml to configure your Dagster Cloud account as well as Docker registry access.

  1. Set the DAGSTER_CLOUD_ORGANIZATION environment variable to the name of your Dagster Cloud organization. If you access Dagster Cloud at https://acme.dagster.cloud, then your organization is acme.

    # The organization name in Dagster Cloud
    DAGSTER_CLOUD_ORGANIZATION: "<organization-name>"

  2. Set the IMAGE_REGISTRY environment variable to the same registry specified in dagster_cloud.yaml:

    # The IMAGE_REGISTRY should match the 'registry:' in dagster_cloud.yaml
    IMAGE_REGISTRY: "<account-id>.dkr.ecr.us-west-2.amazonaws.com/dagster-cloud-image"

  3. Uncomment one of the options for logging into the Docker registry:

    # Building and deploying the docker image requires a login step specific to the container
    # registry.
    # Multiple examples are provided below.
    # # AWS ECR
    # # https://github.com/aws-actions/amazon-ecr-login
    # - name: Configure AWS credentials
    #   if: steps.prerun.outputs.result != 'skip'
    #   uses: aws-actions/configure-aws-credentials@v2
    #   with:
    #     aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
    #     aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    #     aws-region: ${{ secrets.AWS_REGION }}
    # - name: Login to ECR
    #   if: steps.prerun.outputs.result != 'skip'
    #   uses: aws-actions/amazon-ecr-login@v1
    # # DockerHub
    # # https://github.com/docker/login-action#docker-hub
    # - name: Login to Docker Hub
    #   if: steps.prerun.outputs.result != 'skip'
    #   uses: docker/login-action@v1
    #   with:
    #     username: ${{ secrets.DOCKERHUB_USERNAME }}
    #     password: ${{ secrets.DOCKERHUB_TOKEN }}
    # # GitHub Container Registry
    # # https://github.com/docker/login-action#github-container-registry
    # - name: Login to GitHub Container Registry
    #   if: steps.prerun.outputs.result != 'skip'
    #   uses: docker/login-action@v1
    #   with:
    #     registry: ghcr.io
    #     username: ${{ github.actor }}
    #     password: ${{ secrets.GITHUB_TOKEN }}
    # # GCR
    # # https://github.com/docker/login-action#google-container-registry-gcr
    # - name: Login to GCR
    #   if: steps.prerun.outputs.result != 'skip'
    #   uses: docker/login-action@v1
    #   with:
    #     registry: gcr.io
    #     username: _json_key
    #     password: ${{ secrets.GCR_JSON_KEY }}
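
As an illustration, this is roughly how the AWS ECR option looks once uncommented. The indentation shown assumes the steps sit under a job's steps: key; match whatever indentation the neighbouring steps in your workflow file use, and leave the other registry options commented out.

```yaml
      # AWS ECR
      # https://github.com/aws-actions/amazon-ecr-login
      - name: Configure AWS credentials
        if: steps.prerun.outputs.result != 'skip'
        uses: aws-actions/configure-aws-credentials@v2
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ secrets.AWS_REGION }}
      - name: Login to ECR
        if: steps.prerun.outputs.result != 'skip'
        uses: aws-actions/amazon-ecr-login@v1
```

With this option, the three AWS secrets referenced above need to exist as repository secrets.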

Step 4. Set up secrets

Set up secrets on your newly created repository by navigating to the Settings panel in your repo, clicking Secrets on the sidebar, and selecting Actions. Then, click New repository secret. The following secrets are needed.

Name                      Description
DAGSTER_CLOUD_API_TOKEN   An agent token. For more details, see the Dagster Cloud docs.
Docker access secrets     The credentials listed in the workflow file for whichever Docker registry you are using.

Here is an example screenshot showing the secrets for AWS ECR.

[Screenshot: repository secrets configured for AWS ECR]

Step 5. Verify builds are successful

At this point, the workflow run should complete successfully and you should see the quickstart_etl location in https://dagster.cloud. If builds are failing, ensure that your secrets are properly set up.


Add or modify code locations

Once you have the quickstart_etl example deployed, you can replace the sample code with your own Dagster project. You will then need to update the configuration:

  1. Update dagster_cloud.yaml. See documentation for details.

  2. If you have more than one code location, duplicate the build-docker-image and the "ci set-build-output" steps in dagster-cloud-deploy.yml for the new code locations.
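
A minimal sketch of a dagster_cloud.yaml with two code locations is shown below. The second location (my_second_location), both package names, and the directory values are hypothetical placeholders; substitute the names from your own project.

```yaml
# dagster_cloud.yaml (sketch; the second location and all package/directory values are hypothetical)
locations:
  - location_name: quickstart_etl
    code_source:
      package_name: quickstart_etl
    build:
      directory: ./
      registry: <account-id>.dkr.ecr.us-west-2.amazonaws.com/<image-name>
  - location_name: my_second_location       # hypothetical second code location
    code_source:
      package_name: my_second_package       # hypothetical package name
    build:
      directory: ./my_second_location       # hypothetical build directory
      registry: <account-id>.dkr.ecr.us-west-2.amazonaws.com/<image-name>
```

Each location listed here then needs its own image build and "ci set-build-output" step in the workflow, unless the locations share a single image.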

Advanced customization

Disable branch deployments

Branch Deployments are enabled by default. To disable them for your Hybrid agent, comment out the pull_request section in .github/workflows/dagster-cloud-deploy.yml:

pull_request: # For branch deployments
  types: [opened, synchronize, reopened, closed]
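
In a GitHub Actions workflow, the pull_request trigger sits under the top-level on: key. The sketch below shows that section with branch deployments disabled; the push trigger and its branch name are assumptions about the rest of the workflow and may not match your file.

```yaml
on:
  push:                # assumed trigger for full deployments
    branches:
      - "main"         # assumed default branch name
  # pull_request: # For branch deployments
  #   types: [opened, synchronize, reopened, closed]
```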

Customize the Docker build process

A standard Dockerfile is included in this project and used to build the quickstart_etl image. This file is used by the build-push-action:

- name: Build and upload Docker image for "example_location"
  if: steps.prerun.outputs.result != 'skip'
  uses: docker/build-push-action@v4
  with:
    context: .
    push: true
    tags: ${{ env.IMAGE_REGISTRY }}:${{ env.IMAGE_TAG }}-example-location

To customize the Docker image, modify the build-push-action and update the Dockerfile as needed:

  • To use a different directory for the Dockerfile, use the context: input. See build-push-action for more details.
  • To reuse a Docker image for multiple code locations, use a single build-push-action and multiple "ci set-build-output" steps, all using the same image tag.
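
For example, a build step that points at a Dockerfile in a subdirectory might look like the sketch below. The ./my_project paths are hypothetical; context: and file: are inputs of docker/build-push-action, and file: is only needed when the Dockerfile does not sit at the root of the build context.

```yaml
- name: Build and upload Docker image for "example_location"
  if: steps.prerun.outputs.result != 'skip'
  uses: docker/build-push-action@v4
  with:
    context: ./my_project             # hypothetical build context directory
    file: ./my_project/Dockerfile     # hypothetical Dockerfile path
    push: true
    tags: ${{ env.IMAGE_REGISTRY }}:${{ env.IMAGE_TAG }}-example-location
```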

Deploy a subset of code locations

The ci-init step accepts a location_names input: a string containing a JSON list of the location names to deploy. To deploy only specific locations, provide the location_names: input. For example:

      - name: Initialize build session
        id: ci-init
        if: steps.prerun.outputs.result != 'skip'
        uses: dagster-io/dagster-cloud-action/actions/utils/[email protected]
        with:
          project_dir: ${{ env.DAGSTER_PROJECT_DIR }}
          dagster_cloud_yaml_path: ${{ env.DAGSTER_CLOUD_YAML_PATH }}
          deployment: 'prod'
          location_names: '["quickstart_etl1", "location2"]'  # only deploy these two locations