
Make, Workflow, and GitHub Action

Introduction

Modern computational astrophysics is no longer just about interacting with the terminal or writing one-off scripts that generate plots or results. Research projects involve complex pipelines: i) generating synthetic data, ii) calibrating instruments, iii) analyzing observations, iv) running simulations, and v) producing figures and papers. Each step depends on the outputs of earlier steps. And the whole chain may need to be repeated when code changes, new data arrive, or collaborators join the project. Without systematic management, such workflows quickly become fragile, error-prone, and difficult to reproduce.

Workflow management and automation tools address these challenges. They allow us to:

  • Capture dependencies: make sure each step runs only when its inputs are ready.

  • Avoid redundant work: rebuild only the outputs affected by changes.

  • Scale up easily: run dozens or thousands of jobs in parallel on HPC or the cloud.

  • Enable reproducibility: capture all the steps needed to regenerate results.

In practice, we often combine several layers of automation:

  • make: lightweight automation for compiling code, running tests, building documentation, and chaining a few steps together. Make has been around for decades and remains a powerful tool for small pipelines.

  • Continuous Integration / Continuous Deployment (CI/CD): services such as GitHub Actions that automatically run tests, build documentation, and execute reproducible mini-workflows every time code is shared or updated. This ensures that the project stays healthy, reproducible, and transparent.

What You Will Learn

In this lab, we will build a minimal CCD image calibration package and then apply workflow automation to it at multiple levels:

  1. Package & Testing: Write simple calibration functions in a Python package and test them with pytest.

  2. make: Automate local development tasks such as testing, linting, and running a small pipeline.

  3. CI/CD with GitHub Actions: Automate testing and documentation generation whenever code is pushed to GitHub.

By the end of this lab, you will see how automation tools turn individual scripts into a reproducible scientific workflow that is easier to run, easier to share, and easier to trust.

Set up a Python Package with Tests

In real research, we rarely write a single script that does everything. Instead, we build up small, reusable functions: things like “combine all bias frames” or “subtract dark current”. Over time, these functions naturally belong in a package: a collection of modules that can be imported, tested, and reused across multiple projects.

For this lab, let’s create a toy package called ccdmini. Its job is to provide the most basic CCD calibration primitives:

  • median_stack: combine multiple images (e.g., biases, darks, flats) into a single master calibration frame by taking the median pixel-by-pixel.

  • make_master_bias, make_master_dark, make_master_flat: convenience functions that wrap around median_stack and perform normalizations where needed.

  • apply_calibration: apply the standard CCD calibration formula:

    \begin{align} \text{Calibrated Image} = \frac{(\text{Raw Image} - \text{Master Bias} - \text{Master Dark})}{\text{Master Flat}} \end{align}

This is the exact same operation astronomers run on real raw CCD frames. In our case, we will use tiny synthetic arrays to keep things simple and fast.
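Before wrapping this formula in ccdmini, we can sanity-check it directly with NumPy. This is a minimal sketch: the 2×2 shape and the 1000/100/10 count levels are made up for illustration.

```python
import numpy as np

# Tiny constant frames standing in for real calibration products.
bias = np.full((2, 2), 100.0)   # master bias
dark = np.full((2, 2),  10.0)   # master dark
flat = np.full((2, 2),   1.0)   # master flat, normalized to unit median
raw  = 1000.0 + bias + dark     # a raw frame carrying a 1000-count signal

# The calibration formula above, applied pixel-by-pixel.
calibrated = (raw - bias - dark) / flat
print(calibrated[0, 0])  # 1000.0: the calibration recovers the signal
```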

Why packaging?

Even though this is just a mock example, packaging matters because:

  • Reusability: you can use the same functions in multiple projects or scripts.

  • Testability: you can isolate and test each function with pytest.

  • Shareability: once it’s a package, you could publish it to PyPI, or share it within a collaboration with version control.

We will create a tiny mock package that implements median stacking and CCD calibration:

  • median_stack: median-combine many 2D arrays (for master bias/dark/flat).

  • make_master_*: convenience wrappers.

  • apply_calibration: (raw - bias - dark) / flat (with safe division).

We will also add pytest tests to lock in expected behavior.

Below, we use bash cells to create files and directories. All paths are rooted at $REPO.

# Choose where to create the repo (EDIT THIS if you want a different location)

repo = "ccdmini"

from os import environ, path
environ['REPO'] = path.join(environ.get('HOME'), repo)
%%bash

# Create the git repo and a basic tree
git init "$REPO"
echo "Repository root: $REPO"

# Create the minimal Python package structure
mkdir -p "$REPO/src/ccdmini" "$REPO/tests"
%%bash

# Create a "pyproject.toml" file as we did in ASTR 513 homework
# The syntax you see here is called a "heredoc" in `bash`

cat << 'EOF' > "$REPO/pyproject.toml" 
[project]
name = "ccdmini"
version = "0.0.0"
description = "Minimal CCD calibration primitives for ASTR 501"
requires-python = ">=3.8"
dependencies = ["numpy", "pytest"]
EOF
%%bash

# Create a "__init__.py" file.
# It is a required part of a python package.

cat << 'EOF' > "$REPO/src/ccdmini/__init__.py"
"""
ccdmini: Minimal CCD calibration primitives for ASTR 501.

This package intentionally stays tiny to keep the focus on
workflow/automation, while still representing real calibration steps.
"""

from .calib import (
    median_stack,
    make_master_bias,
    make_master_dark,
    make_master_flat,
    apply_calibration,
)
EOF
%%bash

# Implement the core calibration functions
# * median_stack: median-combine a list of 2D arrays
# * make_master_bias/dark/flat: wrappers (flat gets normalized)
# * apply_calibration: (raw - bias - dark) / flat with safe division

cat << 'EOF' > "$REPO/src/ccdmini/calib.py"
import numpy as np

def median_stack(arrays):
    """Median-combine a list of 2D arrays (H, W) to (H, W)."""
    return np.median(np.stack(arrays, axis=0), axis=0)

def make_master_bias(biases):
    """Master bias via median combine."""
    return median_stack(biases)

def make_master_dark(darks):
    """Master dark via median combine (assumes matching exposure)."""
    return median_stack(darks)

def make_master_flat(flats):
    """Master flat via median combine, then normalize to unit median."""
    mf  = median_stack(flats)
    med = float(np.median(mf))
    if med <= 0:
        raise ValueError("Flat median must be positive to normalize.")
    return mf / med


def apply_calibration(raw, mbias, mdark, mflat):
    """Apply CCD calibration: (raw - mbias - mdark) / mflat."""
    denom = np.where(mflat==0, 1.0, mflat)
    return (raw - mbias - mdark) / denom
EOF
%%bash

# Add pytest tests to lock in behavior and catch regressions
cat << 'EOF' > "$REPO/tests/test_calib.py"
import numpy as np

from ccdmini.calib import (
    median_stack,
    make_master_bias,
    make_master_dark,
    make_master_flat,
    apply_calibration,
)

def test_median_stack_is_pixelwise_median():
    a = np.ones((3,3))
    b = np.ones((3,3)) * 3
    out = median_stack([a, b])
    assert np.allclose(out, 2.0) # median of {1,3} is 2 everywhere

def test_master_bias_and_dark_are_medians():
    mb = make_master_bias([np.full((2,2), 100), np.full((2,2), 102)])
    md = make_master_dark([np.full((2,2), 10),  np.full((2,2), 12)])
    assert np.allclose(mb, 101)
    assert np.allclose(md, 11)

def test_master_flat_normalization_to_unit_median():
    mf = make_master_flat([np.full((2,2), 2.0), np.full((2,2), 4.0)])
    assert np.allclose(mf, 1.0)
    assert np.allclose(np.median(mf), 1.0)

def test_apply_calibration_recovers_signal():
    true_signal = np.ones((4,4)) * 1000.0
    mb = np.ones((4,4)) * 100.0
    md = np.ones((4,4)) * 10.0
    mf = np.ones((4,4)) * 1.0
    
    raw = true_signal + mb + md  # construct a raw that should calibrate back to true_signal

    cal = apply_calibration(raw, mb, md, mf)    
    assert np.allclose(cal, true_signal)
EOF
%%bash

# Track changes with git

cd "$REPO"

git add .
git commit -m "Initial commit --- 'ccdmini' for ASTR 501"

git log

Install and test your package

Let’s install ccdmini in “editable” mode. Then run pytest to make sure all the tests pass.

%%bash

cd "$REPO"

python -m pip install -U pip
python -m pip install -e . >/dev/null

pytest

Create Scripts

In order to interact with a Python package, you very often need to write Python scripts. A pile of ad-hoc scripts is not necessarily the best way to develop a pipeline, but it is a natural starting point. Let’s create a few Python scripts that wrap around ccdmini and can be run as standard Unix/Linux (shell) programs, and save them in the $REPO/scripts/ directory.

%%bash

# Make sure the scripts/ and data/ dirs exist
mkdir -p "$REPO/scripts"
%%bash

# Generate tiny synthetic data (NumPy .npy files)

cat << 'EOF' > "$REPO/scripts/mkobs"
#!/usr/bin/env python3

from os import makedirs, path
import numpy as np

rng = np.random.default_rng(13)
makedirs("data/bias", exist_ok=True)
makedirs("data/dark", exist_ok=True)
makedirs("data/flat", exist_ok=True)
makedirs("data/raw",  exist_ok=True)

def save(dir, i, arr):
    np.save(path.join(dir, f"f{i:03d}.npy"), arr)

shape = (64, 64)

# Bias/Dark/Flat
for i in range(10): save("data/bias", i, 100 +     rng.normal(0,1,shape))
for i in range(10): save("data/dark", i,  10 +     rng.normal(0,1,shape))
for i in range(10): save("data/flat", i,   1 + 0.1*rng.normal(0,1,shape))

# Raw frames: a Gaussian "star" + bias + dark + noise
YY, XX = np.indices(shape)
signal = 1000 * np.exp(-((XX-32)**2 + (YY-32)**2)/(2*6**2))

for i in range(100):
    noise = rng.normal(0,5,shape)
    save("data/raw", i, signal + 100 + 10 + noise)
EOF
%%bash

# Build reference

cat << 'EOF' > "$REPO/scripts/mkref"
#!/usr/bin/env python3

from os import path, makedirs
from glob import glob
import numpy as np
from ccdmini.calib import make_master_bias, make_master_dark, make_master_flat

mkref = {
    'bias': make_master_bias,
    'dark': make_master_dark,
    'flat': make_master_flat,    
}

def load_dir(d):
    return [np.load(p) for p in sorted(glob(path.join(d, "*.npy")))]

from sys import argv
if len(argv) < 3:
    print(f'usage: {argv[0]} [bias|dark|flat] DIR')
    exit()

kind = argv[1]
data = argv[2]

makedirs("results/ref", exist_ok=True)
np.save(f"results/ref/{kind}.npy", mkref[kind](load_dir(data)))
EOF
%%bash

# Apply calibration to a small subset (fast demo)

cat << 'EOF' > "$REPO/scripts/calmini"
#!/usr/bin/env python3

from os  import makedirs, path
from sys import argv
import numpy as np

from ccdmini.calib import apply_calibration

from sys import argv
if len(argv) <= 1:
    print(f'usage: {argv[0]} FILE1 FILE2 ... FILEN')
    exit()
files = argv[1:]

rb = np.load("results/ref/bias.npy")
rd = np.load("results/ref/dark.npy")
rf = np.load("results/ref/flat.npy")

makedirs("results", exist_ok=True)
for f in files:
    raw = np.load(f)
    cal = apply_calibration(raw, rb, rd, rf)
    np.save(path.join("results", path.basename(f)), cal)
EOF
%%bash

# Mean stack of calibrated frames as a QA image

cat << 'EOF' > "$REPO/scripts/mkplt"
#!/usr/bin/env python3

from os import makedirs
import numpy as np
import matplotlib.pyplot as plt

from sys import argv
if len(argv) <= 1:
    print(f'usage: {argv[0]} FILE1 FILE2 ... FILEN')
    exit()
files = argv[1:]

stack = np.mean([np.load(p) for p in files], axis=0)

plt.imshow(stack, origin="lower")
plt.colorbar()

makedirs("plots", exist_ok=True)
plt.savefig("plots/mean.png", dpi=150, bbox_inches="tight")
EOF
%%bash

# Make all scripts executable

chmod a+x ${REPO}/scripts/*
%%bash

# Optionally, let's also commit these scripts to git

cd $REPO
git add scripts
git commit -m 'Add calibration scripts'

Test the “pipeline”

We can now run all the python scripts one by one and calibrate an image of the mock star!

%%bash

cd $REPO
./scripts/mkobs
./scripts/mkref bias data/bias
./scripts/mkref dark data/dark
./scripts/mkref flat data/flat
./scripts/calmini data/raw/f*.npy
./scripts/mkplt   results/f*.npy
%%bash

# Uncomment the following to clean up

#cd $REPO && rm -rf data/ results/ plots
%%bash

# HANDSON: how would you automate the above "pipeline"?

Introductory make

In the above hands-on, you probably wrote a bash script, e.g.,

#!/usr/bin/env bash

./scripts/mkobs
./scripts/mkref bias data/bias
./scripts/mkref dark data/dark
./scripts/mkref flat data/flat
./scripts/calmini data/raw/f*.npy
./scripts/mkplt   results/f*.npy

called runall. You run ./runall in the top-level ccdmini repo, and this bash script runs the Python scripts in scripts/ one by one.

It does automate your “pipeline”, but it is fragile: if anything breaks, e.g., an observation fails or a file is corrupted so NumPy cannot read it, bash keeps running the remaining steps anyway and the whole pipeline falls apart.

One simple solution in bash is to chain the different steps with &&. Bash looks at the return value of the program at each step and “short-circuits” the chain when any step fails. You may even || your chain with an echo statement to print an error message.
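The same fail-fast behavior can be sketched in Python with the standard subprocess module; run_chain and the toy commands below are illustrative, not part of ccdmini:

```python
import subprocess
import sys

def run_chain(commands):
    """Run commands in order, stopping at the first non-zero exit
    status: the Python analogue of chaining steps with `&&`."""
    for cmd in commands:
        if subprocess.run(cmd).returncode != 0:
            print(f"step failed: {cmd}")  # the `|| echo ...` analogue
            return False
    return True

# The second toy step exits non-zero, so the third never runs.
ok = run_chain([
    [sys.executable, "-c", "print('step 1')"],
    [sys.executable, "-c", "raise SystemExit(1)"],
    [sys.executable, "-c", "print('step 3')"],
])
print(ok)  # False
```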

%%bash

# HANDSON: try to use `&&` and `||` to chain up multiple Unix/Linux
#          programs and observe the short circuit behavior.

But we can do better than that!

make is a classic tool for automation and workflow management. It was originally designed for compiling software, but the core idea applies to any workflow where some files depend on others.

In make:

  • A target is something you want to build (by default a file).

  • Each target has a list of prerequisites (the inputs it depends on).

  • Each target has a recipe (the commands to run if the target is out of date).

When you run make target, the program:

  1. Checks if the target file exists and whether it is older than its prerequisites.

  2. If the target is missing or stale, it runs the recipe to rebuild it.

  3. This process cascades through the dependency graph.
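The staleness check above can be sketched in a few lines of Python. This is a toy model of make's timestamp comparison, not how GNU make is actually implemented:

```python
from os import path, utime
import tempfile

def is_stale(target, prerequisites):
    """A target must be (re)built if it is missing or older than
    any of its prerequisites."""
    if not path.exists(target):
        return True
    t = path.getmtime(target)
    return any(path.getmtime(p) > t for p in prerequisites)

# Demonstrate with two temporary files standing in for a raw frame
# and a calibrated product.
with tempfile.TemporaryDirectory() as d:
    raw, cal = path.join(d, "raw.npy"), path.join(d, "cal.npy")
    open(raw, "w").close()
    missing = is_stale(cal, [raw])   # True: target does not exist yet
    open(cal, "w").close()
    utime(raw, (1000, 1000)); utime(cal, (2000, 2000))
    fresh = is_stale(cal, [raw])     # False: target is newer than input
    utime(raw, (3000, 3000))
    stale = is_stale(cal, [raw])     # True: prerequisite changed
```

make applies this check recursively: a prerequisite may itself be a target with its own rule, which is how changes cascade through the dependency graph.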

Why use make?

  • Rebuild only what changed: If you touch one raw file, only the products depending on that file are rebuilt.

  • Parallelism for free: Independent targets can run at the same time with make -j.

  • Readable documentation: The Makefile captures your workflow in a structured, repeatable way.

  • Extremely lightweight: No databases, no servers, just a single file that works everywhere.

In practice, make lets us move beyond brittle bash scripts. Instead of rerunning all steps every time, we can express the logical dependencies in our workflow and let make decide what needs updating.

v1. Just Targets (make as a Better bash Runner)

Start by mirroring the bash script with named targets. This is already nicer than a one-off shell script because you can run individual steps (make ref, make cal, etc.).

%%bash

cat << 'EOF' > "$REPO/Makefile"
# v1. Just Targets (`make` as a Better `bash` Runner)

all: obs ref cal plot

obs:  # generate synthetic observations
	./scripts/mkobs

ref:  # build reference bias/dark/flat
	./scripts/mkref bias data/bias
	./scripts/mkref dark data/dark
	./scripts/mkref flat data/flat

cal:  # calibrate all raw frames -> results/f*.npy
	./scripts/calmini data/raw/f*.npy

plot:  # make plot from calibrated frames
	./scripts/mkplt results/f*.npy

clean:
	rm -rf results plots

clean-all:
	rm -rf data results plots
EOF

You may now run just make instead of ./runall. You may also run specific steps, e.g., make ref.

v2. File-Based Targets (make Decides What’s Stale)

Tell make what files are produced and what they depend on. Assume your scripts write:

  • results/ref/bias.npy, results/ref/dark.npy, results/ref/flat.npy

  • results/fNNN.npy

  • plots/mean.png (or similar)

%%bash

cat << 'EOF' > "$REPO/Makefile"
# v2. File-Based Targets (`make` Decides What's Stale)

all: plots/mean.png

data/raw/f000.npy data/bias/f000.npy data/dark/f000.npy data/flat/f000.npy:
	./scripts/mkobs

results/ref/bias.npy: data/bias/f000.npy
	./scripts/mkref bias data/bias

results/ref/dark.npy: data/dark/f000.npy
	./scripts/mkref dark data/dark

results/ref/flat.npy: data/flat/f000.npy
	./scripts/mkref flat data/flat

results/f000.npy: data/raw/f000.npy results/ref/bias.npy results/ref/dark.npy results/ref/flat.npy
	./scripts/calmini data/raw/f000.npy

plots/mean.png: results/f000.npy
	./scripts/mkplt results/f000.npy

clean:
	rm -rf results plots

clean-all:
	rm -rf data results plots
EOF
# HANDSON: try to run `make clean` and then rerun `make`.
#          What scripts are run?
#          Try to run `make clean-all` and then rerun `make`.
#          What scripts are rerun then?
# HANDSON: try to delete "data/bias/f000.npy" and then rerun `make`.
#          What scripts are rerun then?
#          try to delete "data/raw/f000.npy" and then rerun `make`.
#          What scripts are rerun then?

v3. Pattern Rules (Rebuild Files Independently)

Use a pattern rule so each calibrated frame is rebuilt independently from its matching raw plus the ref frames. This enables minimal rebuilds and parallelism.
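To see what a pattern rule does, here is a toy Python model of make's % matching; match_stem is illustrative, and GNU make's real algorithm also handles rule search order:

```python
def match_stem(target, pattern):
    """Return the '%' stem if `target` matches `pattern`, else None
    (a toy model of make's pattern-rule matching)."""
    prefix, percent, suffix = pattern.partition("%")
    if not percent:
        return None  # pattern has no '%'
    if (target.startswith(prefix) and target.endswith(suffix)
            and len(target) > len(prefix) + len(suffix)):
        return target[len(prefix):len(target) - len(suffix)]
    return None

# The stem found in the target is substituted into the prerequisites:
stem = match_stem("results/f007.npy", "results/%.npy")
print(stem)                    # f007
print(f"data/raw/{stem}.npy")  # data/raw/f007.npy, the matching raw frame
```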

%%bash

cat << 'EOF' > "$REPO/Makefile"
# v3. Pattern Rules (Rebuild Files Independently)

all: plots/mean.png

data/raw/%.npy data/bias/%.npy data/dark/%.npy data/flat/%.npy:
	./scripts/mkobs

results/ref/%.npy: data/%/f000.npy
	./scripts/mkref $$(basename $$(dirname $<)) $$(dirname $<)

results/%.npy: data/raw/%.npy results/ref/bias.npy results/ref/dark.npy results/ref/flat.npy
	./scripts/calmini $<

plots/mean.png: results/f000.npy
	./scripts/mkplt $<

clean:
	rm -rf results plots

clean-all:
	rm -rf data results plots
EOF
# HANDSON: Use `make`'s advanced pattern matching to make the above
#          `Makefile` sensitive to per-file changes

With dependencies specified, we can easily use task-based parallelism:

%%bash

cd $REPO
make -j  # run in parallel
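make -j works because independent targets, here the per-frame calibrations, have no ordering constraints between them. The same idea in pure Python looks like mapping a task over a worker pool; calibrate_one below is a stand-in for the real per-frame work:

```python
from concurrent.futures import ThreadPoolExecutor

def calibrate_one(frame_id):
    """Stand-in for calibrating a single raw frame. Each call is
    independent, so calls may run concurrently, just as `make -j`
    builds independent targets in parallel."""
    return f"results/f{frame_id:03d}.npy"

# "Calibrate" 8 frames with up to 4 concurrent workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    outputs = list(pool.map(calibrate_one, range(8)))

print(outputs[0], outputs[-1])  # results/f000.npy results/f007.npy
```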

Introduction to CI/CD and GitHub Actions

When we work on research software, it’s easy to check that things run on our own laptop. But what happens when a collaborator clones the repository, or when we return months later? Will the code still run, will the tests pass, will the workflow produce the same results?

This is where Continuous Integration / Continuous Deployment (CI/CD) comes in.

Continuous Integration (CI)

  • Every time you push new code or open a pull request, CI systems automatically:

    • Install your package

    • Run the test suite

    • Lint or check code style

    • Optionally run small end-to-end workflows (like our calibration pipeline)

  • CI ensures that code remains healthy, reproducible, and trustworthy at all times.

Continuous Deployment (CD)

  • Goes a step further: once code passes CI, it can be automatically deployed or published.

  • Examples: publish a package to PyPI, push docs to GitHub Pages, or build a Docker image.

  • In science, “deployment” often means publishing documentation, figures, or artifacts for collaborators.

GitHub Actions

GitHub provides a built-in CI/CD service called Actions.

  • Workflows are described in simple YAML files under .github/workflows/.

  • Each workflow defines jobs that run on GitHub’s servers whenever certain events occur (e.g. push, pull_request).

  • Jobs run in a clean environment (like Ubuntu or macOS VMs), so they double as a reproducibility check: if it runs on GitHub’s machines, it will likely run for your collaborators too.

In this lab:

  1. We already have a tested Python package (ccdmini).

  2. We’ve automated tasks with make.

  3. With GitHub Actions, we can make sure that:

    • Every commit passes tests.

    • Our calibration pipeline runs end-to-end on sample data.

    • Our results (plots, docs) are stored as artifacts and can be shared.

This way, our code is not only correct today but remains correct, reproducible, and transparent whenever it evolves.

Step 1. Minimal CI: Run Tests

%%bash

# Create the CI folder
mkdir -p "$REPO/.github/workflows"

# v1: minimal tests
cat << 'EOF' > "$REPO/.github/workflows/test.yml"
name: test

on:
  push:
  pull_request:

jobs:
  tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
          cache: pip
      - name: Install
        run: |
          pip install -U pip
          pip install -e .
      - name: Run tests
        run: pytest
EOF
%%bash

# Commit the GitHub action and push to GitHub

cd $REPO
git add .github
git commit -m 'Add test as GitHub Action'
git push

Step 2: Add lint (flake8)

%%bash

# Step 2: Add lint (flake8)

cat << 'EOF' > "$REPO/.github/workflows/lint.yml"
name: lint

on:
  push:
  pull_request:

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
          cache: pip
      - name: Install
        run: |
          pip install -U pip
          pip install flake8
          pip install -e .
      - name: Lint
        run: flake8 src tests --max-line-length=100
EOF
%%bash

# Commit the GitHub action and push to GitHub

cd $REPO
git add .github
git commit -m 'Add lint as GitHub Action'
git push
# HANDSON: lint probably complains about `ccdmini`.
#          Let's fix it so lint passes!

Step 3: Test against multiple Python versions (matrix)

%%bash

# Step 3: Test against multiple Python versions (matrix)

cat << 'EOF' > "$REPO/.github/workflows/ci.yml"
name: CI

on:
  push:
  pull_request:

jobs:
  tests-lint:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.9", "3.10", "3.11", "3.12", "3.13"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
          cache: pip
      - name: Install
        run: |
          pip install -U pip
          pip install flake8
          pip install -e .
      - name: Test
        run: pytest
      - name: Lint
        run: flake8 src tests --max-line-length=100
EOF
%%bash

# Commit the GitHub action and push to GitHub

cd $REPO
rm -f .github/workflows/{lint,test}.yml
git add .github
git commit -m 'Add CI as GitHub Action'
git push

Conclusion

In this lab, we saw how to transform a collection of scripts into a reproducible, automated workflow:

  • We packaged core calibration functions into a small Python package with unit tests.

  • We used make to automate tasks and, by adding dependencies, enabled incremental and parallel builds.

  • Finally, we set up GitHub Actions to automatically run tests, lint code, and execute the pipeline on every push, producing a reproducible artifact.

Together, these tools show the full path from personal scripts to team workflows to community-trustworthy research software. By adopting workflow management and CI/CD practices, you ensure that your computational astrophysics projects are not only correct today, but also reliable, reproducible, and sustainable for collaborators and for your future self.