Using jupytext and pre-commit to version control examples

There are a couple of challenges that I face when writing examples:

  • The examples should be provided as Jupyter Notebooks so that they can include equations, figures, links, etc. and can be included in our documentation.
  • Additionally, the examples should be directly executable so that users can download and run them immediately. Equally important, I can easily import the examples into other scripts without having to rewrite (and maintain) large sections of boilerplate model-setup code.

However, maintaining the examples both as .ipynb and .py files is a PITA.

Enter jupytext. This handy tool solves exactly this problem: it keeps the two versions paired, so that any change in one is automatically transferred to the other.

To install jupytext, run

pip install jupytext

or

[conda/mamba] install -c conda-forge jupytext

There are different ways to sync the files. For a detailed introduction, refer to the jupytext documentation, but here’s the way I do it:

I write my notebooks in MyST (highly recommended!). For this purpose, I add the following header to the file:

---
jupytext:
  formats: md:myst,py:percent
  main_language: python
  text_representation:
    extension: .md
    format_name: myst
    format_version: 0.13
    jupytext_version: 1.14.5
kernelspec:
  display_name: Python 3
  name: python3
---

This header registers the file with jupytext and lists the formats I want to keep in sync. Besides md:myst, I use py:percent, which represents a Jupyter notebook as a plain Python script whose cells are delimited by # %% comments.
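To give a feel for what the py:percent representation looks like, here is a minimal sketch that splits such a script into cells. It is a simplified illustration using only the standard library, not jupytext's own parser; the script content and the helper name `split_percent_cells` are made up for this example.

```python
# Simplified illustration of the py:percent layout; this is NOT jupytext's parser.
# The script below is a hypothetical example, not taken from a real notebook.
SCRIPT = '''\
# %% [markdown]
# # Heat equation
# A short description of the model.

# %%
import math

# %%
print(math.pi)
'''


def split_percent_cells(source):
    """Split a percent-format script into (cell_type, text) pairs."""
    cells = []
    for line in source.splitlines():
        if line.startswith("# %%"):
            # A marker line starts a new cell; "[markdown]" tags a text cell.
            cell_type = "markdown" if "[markdown]" in line else "code"
            cells.append((cell_type, []))
        elif cells:
            # Every other line belongs to the most recently opened cell.
            cells[-1][1].append(line)
    return [(kind, "\n".join(lines).strip()) for kind, lines in cells]


cells = split_percent_cells(SCRIPT)
for kind, text in cells:
    print(kind, "->", repr(text))
```

Because the markers are ordinary comments, the .py file stays a valid Python script that can be run or imported directly, which is the property I care about here.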

To sync, I can run:

jupytext --sync *.md

which will sync every paired notebook matched by the glob, here all Markdown files in the current directory.
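If you would rather trigger the sync from a Python script, the following is a rough sketch of how that could look. It assumes that jupytext is installed and on your PATH, and that every .md file in the directory is a paired notebook; the function names `build_sync_commands` and `sync_directory` are hypothetical helpers, not part of jupytext.

```python
import subprocess
from pathlib import Path


def build_sync_commands(directory="."):
    """Return one `jupytext --sync <file>` command per Markdown file."""
    # Assumption: all .md files in `directory` are paired notebooks.
    paths = sorted(Path(directory).glob("*.md"))
    return [["jupytext", "--sync", str(path)] for path in paths]


def sync_directory(directory="."):
    """Run the sync commands; needs the jupytext CLI on PATH."""
    for command in build_sync_commands(directory):
        # check=True raises CalledProcessError if jupytext exits with an error.
        subprocess.run(command, check=True)
```

Calling `sync_directory()` with the path of one example folder would then run the CLI once per Markdown file in that folder. Keeping the command construction in its own function makes it easy to inspect what would be executed before actually running anything.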

To automate this in my Git workflow, I use pre-commit, which runs this command on every commit and so keeps both versions in sync. The hook is configured in .pre-commit-config.yaml:

# Jupytext
- repo:
  rev: v1.14.5
  hooks:
  - id: jupytext
    files: 'examples/[^/]+/'
    types_or: [markdown, python]
    exclude: |
    args: [--sync]

Note that I used some regex magic to specify which files to sync (and which to ignore).
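To see what the `files` pattern from the hook configuration above actually selects, here is a small sketch. pre-commit treats `files` and `exclude` as Python regular expressions and matches them with `re.search`, so the pattern may match anywhere in the path. The candidate paths below are hypothetical and only serve to illustrate the behaviour.

```python
import re

# The `files` pattern from the jupytext hook configuration.
FILES_PATTERN = re.compile(r"examples/[^/]+/")

# Hypothetical repository paths, chosen only to illustrate the pattern.
CANDIDATES = [
    "examples/heat/notebook.md",
    "examples/heat/notebook.py",
    "examples/README.md",
    "docs/index.md",
    "old_examples/heat/notebook.md",
]


def is_selected(path):
    """Mimic pre-commit's `files` filter, which uses an unanchored search."""
    return FILES_PATTERN.search(path) is not None


selected = [path for path in CANDIDATES if is_selected(path)]
print(selected)
```

Only files inside a subdirectory of examples/ are selected, so examples/README.md is skipped. Note that old_examples/heat/notebook.md is selected as well, because the search is not anchored; prefixing the pattern with `^` would restrict it to the top-level examples/ directory.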

There are also other useful things to do with pre-commit. For example, you could automatically remove trailing whitespace, or check YAML files (who understands YAML syntax anyway? :sweat_smile:).


# pre-commit-hooks
- repo:
  rev: v4.1.0
  hooks:
    - id: check-merge-conflict
    - id: check-yaml
    - id: end-of-file-fixer
    - id: trailing-whitespace