Using jupytext and pre-commit to version control examples

There are a couple of challenges that I face when writing examples:

  • The examples should be provided as Jupyter Notebooks so that they can include equations, figures, links, etc. and can be included in our documentation.
  • Additionally, the examples should be directly executable so that users can download and run them immediately. Equally important, I can easily import the examples into other scripts without having to rewrite (and maintain) large sections of boilerplate model-setup code.

However, maintaining the examples both as .ipynb and .py files is a PITA.

Enter jupytext. This handy tool solves exactly this problem: it pairs several representations of the same notebook and keeps them in sync, so that a change in one version is transferred to the others.

To install jupytext, run

pip install jupytext

or

[conda/mamba] install -c conda-forge jupytext

There are different ways to sync the files. For a detailed introduction, refer to the jupytext documentation, but here’s the way I do it:

I write my notebooks in MyST Markdown (highly recommended!). To pair such a file with jupytext, I add the following YAML header at the very top of the .md file:

---
jupytext:
  formats: md:myst,py:percent
  main_language: python
  text_representation:
    extension: .md
    format_name: myst
    format_version: 0.13
    jupytext_version: 1.14.5
kernelspec:
  display_name: Python 3
  name: python3
---

This header registers the file with jupytext. The formats entry lists the representations that are kept in sync: besides md:myst, I use py:percent, which represents a Jupyter notebook as a plain Python script with cells delimited by # %% comments.
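To give an idea of what the py:percent side looks like, here is a minimal sketch of such a script. The file content, the decay function, and its rate parameter are made up for illustration and are not taken from my actual examples; only the # %% cell markers are part of the format.

```python
# %% [markdown]
# # Hypothetical example: exponential decay
#
# Markdown cells are stored as comment blocks that start with `# %% [markdown]`.

# %%
import math


def decay(t, rate=0.5):
    """Return exp(-rate * t); a stand-in for real model-setup code."""
    return math.exp(-rate * t)


# %%
# Code cells start with a bare `# %%` marker.
if __name__ == "__main__":
    print(decay(2.0))
```

Because this is ordinary Python, another script can import it like any module, e.g. from decay_example import decay (assuming the file is saved under the hypothetical name decay_example.py), without copying the setup code.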

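To sync, I can run:

jupytext --sync *.md

which syncs every paired Markdown file in the current directory with its other representations. Note that jupytext expects the files to sync as arguments.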

To automate this in my Git workflow, I use pre-commit. Once the hook is activated with pre-commit install, it runs jupytext on every commit, ensuring that both versions are synced. The corresponding entry in .pre-commit-config.yaml looks like this:

repos:
[...]
# Jupytext
- repo: https://github.com/mwouts/jupytext
  rev: v1.14.5
  hooks:
  - id: jupytext
    files: 'examples/[^/]+/'
    types_or: [markdown, python]
    exclude: |
        (?x)^(
            README.md|
            examples/[^/]+/index.md|
            examples/[^/]+/index.py
        )$
    args: [--sync]

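Note that I used some regex magic to specify which files to sync (and which to ignore): files restricts the hook to files inside a subdirectory of examples/, while exclude skips README.md and the index.md and index.py files in those subdirectories.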
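If you want to check what such patterns select before committing, a small Python sketch like the following can help. It assumes that pre-commit applies files and exclude with a regex search on the path relative to the repository root, and it approximates types_or with a simple suffix check. The helper is_synced and the sample paths are hypothetical.

```python
import re

# Patterns as in the hook configuration; (?x) is verbose mode, so the
# whitespace and line breaks inside the pattern are ignored.
FILES = re.compile(r"examples/[^/]+/")
EXCLUDE = re.compile(
    r"""(?x)^(
        README.md|
        examples/[^/]+/index.md|
        examples/[^/]+/index.py
    )$"""
)


def is_synced(path):
    """Approximate whether the jupytext hook would pick up this path."""
    # Rough stand-in for types_or: [markdown, python].
    if not path.endswith((".md", ".py")):
        return False
    return bool(FILES.search(path)) and not EXCLUDE.search(path)


if __name__ == "__main__":
    for path in [
        "examples/diffusion/model.md",  # inside a subdirectory: selected
        "examples/diffusion/index.md",  # excluded index file
        "examples/notes.md",            # directly in examples/: not selected
        "README.md",                    # outside examples/: not selected
    ]:
        print(path, is_synced(path))
```

This is only a rough check; running pre-commit run --all-files shows what the hook really does.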

There are also other useful things to do with pre-commit. For example, you could automatically remove trailing whitespace, or check YAML files (who understands YAML syntax anyway? :sweat_smile:).

repos:

# pre-commit-hooks
- repo: https://github.com/pre-commit/pre-commit-hooks
  rev: v4.1.0
  hooks:
    - id: check-merge-conflict
    - id: check-yaml
    - id: end-of-file-fixer
    - id: trailing-whitespace