
jupyterlite blog integration

i've always thought of blog posts as a means, not an end. my dream has always been content that others and i can execute ourselves. often this goal has been hindered by the need for infrastructure. advances in jupyterlite have made it possible to realize this vision without infrastructure.

jupyterlite is jupyterlab running completely in the browser without the need for a local server. this means that folks can redirect from a static post into an interactive version they can try themselves.

jupyterlite notebooks are different

when working with a standard jupyter implementation we are interacting with a server within an implicit environment. when we write our notebooks we assume that we can import modules because they are in our environment. however, in jupyterlite we need to explicitly define our dependencies for every notebook.

we define our dependencies by inserting the pip line magic:

%pip install my dependencies

this statement is superfluous under normal circumstances so it doesn't need to exist in the source. instead we use the depfinder project to infer the packages imported by our notebook. the inferred dependencies are then inserted into the first line of the first code cell of the notebook.
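
to make the inference step concrete, here is a minimal sketch that uses the standard library `ast` module as a stand-in for depfinder. this is not how depfinder works internally, and the helper names `top_level_imports` and `pip_line` are hypothetical; the point is only to show how imports found in a cell's source can become a `%pip install` line.

```python
import ast
import sys


def top_level_imports(source: str) -> set:
    """collect the top-level package names imported by a block of python source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level > 0 marks a relative import, which is never a pip dependency
            found.add(node.module.split(".")[0])
    return found


def pip_line(modules: set) -> str:
    """build the line magic, skipping standard library modules when they are known."""
    # sys.stdlib_module_names only exists on python 3.10+; older versions skip nothing
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    deps = sorted(m for m in modules if m not in stdlib)
    return "%pip install " + " ".join(deps)


source = "import pandas as pd\nfrom bs4 import BeautifulSoup\nimport json\n"
print(pip_line(top_level_imports(source)))
```

note that the import name (`bs4`) is not always the name to install (`beautifulsoup4`), which is why the real implementation keeps a small mapping.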

the doit lite implementation

in the code that follows we define a doit task that:

1. builds a jupyterlite site for this blog
2. makes the notebook dependencies compatible with jupyterlite

    def task_lite():
        """build the jupyter lite site and append requirements"""
        return dict(
            actions=[
                "jupyter lite build --contents tonyfast --output-dir site/run",
                (set_files_imports, (pathlib.Path("site/run/files"),))
            ],
            clean=["rm -rf site/run"]
        )

    import typing, tonyfast, pathlib, textwrap, re, json

discovering imports with depfinder

in the following sections we'll build the methods for discovering imports with depfinder.

set_files_imports iterates through a directory and amends notebooks to work in jupyterlite

    def set_files_imports(FILES: pathlib.Path=(
        FILES := (WHERE := pathlib.Path(tonyfast.__file__).parent.parent) / "site/run/files"
    )) -> None:
        # FILES is a directory; every notebook below it is amended in place
        for file in FILES.rglob("*.ipynb"):
            set_file_imports(file)

get_imports finds the imports in each cell

    def get_imports(cell: dict, pidgy=False) -> set:
        import depfinder
        __import__("requests_cache").install_cache()
        source = "".join(cell["source"])
        if pidgy:
            source = tangle.render(source)
        source = textwrap.dedent(source)
        try:
            found = depfinder.inspection.get_imported_libs(source)
            return found.required_modules.union(found.sketchy_modules)
        except Exception:
            # a cell that depfinder cannot analyze contributes no imports
            return set()

get_deps transforms inputs to dependencies.

some dependencies may require extra features to work in jupyterlite and they are appended here.

    mapping = dict(bs4="beautifulsoup4")
    def get_deps(deps: set) -> set:
        if "requests" in deps: deps.add("pyodide-http")
        if "pandas" in deps: deps.add("jinja2")
        return {
            mapping.get(x, x) for x in deps
            if not x.startswith("_") and x not in {"tonyfast"}
        }
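
a quick, self-contained check of the behaviour described above, assuming the intent is to drop private modules and the blog's own package. the function is restated under the hypothetical name `resolve_deps`, with hypothetical `MAPPING`, `EXTRAS` and `SKIP` tables, so the snippet runs on its own.

```python
MAPPING = {"bs4": "beautifulsoup4"}  # import name -> name to pip install
EXTRAS = {"requests": "pyodide-http", "pandas": "jinja2"}  # companion packages
SKIP = {"tonyfast"}  # the blog's own package is not installed from pip


def resolve_deps(imports: set) -> set:
    names = set(imports)  # copy so the caller's set is not mutated
    names.update(extra for module, extra in EXTRAS.items() if module in imports)
    return {
        MAPPING.get(name, name)
        for name in names
        if not name.startswith("_") and name not in SKIP
    }


print(sorted(resolve_deps({"requests", "bs4", "_thread", "tonyfast"})))
# ['beautifulsoup4', 'pyodide-http', 'requests']
```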

handling pidgy documents

some documents might use [pidgy] syntax, which needs to be tangled to python before the imports can be discovered.

    PIDGY = re.compile(r"^\s*%(re)?load_ext\s*pidgy")
    from midgy import Python; tangle = Python()
    def has_pidgy(nb: dict) -> bool:
        # true when any code cell starts by loading the pidgy extension
        return any(
            PIDGY.match("".join(cell["source"])) for _, cell in iter_code_cells(nb)
        )
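
a small sketch of what the pattern accepts, using only the standard library. the sample cell sources are made up for illustration. because `match` anchors at the start of the string and no `re.MULTILINE` flag is set, the magic is only detected when it is the first thing in a cell.

```python
import re

PIDGY = re.compile(r"^\s*%(re)?load_ext\s*pidgy")

cells = [
    "%load_ext pidgy\n",  # plain load
    "  %reload_ext pidgy\n",  # leading whitespace and the reload variant
    "import pandas\n%load_ext pidgy\n",  # magic is not at the start of the cell
]
print([bool(PIDGY.match(source)) for source in cells])
# [True, True, False]
```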

updating the jupyterlite notebooks

these methods are meant to operate on the contents of a built jupyterlite site, not the raw notebooks.

set_file_imports operates on one file: it discovers the dependencies and writes the install line back to the source.

    def set_file_imports(file: pathlib.Path) -> None:
        data = json.loads(file.read_text())
        deps, first = set(), None
        pidgy = has_pidgy(data)
        for no, cell in iter_code_cells(data):
            if first is None:
                first = no
            if pidgy:
                data["cells"][no]["metadata"].setdefault("jupyter", {})["source_hidden"] = True
            deps.update(get_imports(cell, pidgy) or [])

        deps = get_deps(deps)
        if pidgy:
            deps.add("pidgy")
        if deps and (first is not None):
            cell = data["cells"][first]
            was_str = isinstance(cell["source"], str)
            if was_str:
                cell["source"] = cell["source"].splitlines(keepends=True)
            for i, line in enumerate(list(cell["source"])):
                if (left := line.lstrip()):
                    if left.startswith(("%pip install",)):
                        break
                    indent = len(line) - len(left)
                    if "pyodide-http" in deps:
                        data["cells"][first]["source"].insert(i, " "*indent + "__import__('pyodide_http').patch_all()\n")
                    # sorted so that rebuilding the site writes the same line each time
                    data["cells"][first]["source"].insert(i, " "*indent + "%pip install " + " ".join(sorted(deps)) +"\n")
                    print(F"writing {len(deps)} pip requirements to {file}")
                    file.write_text(json.dumps(data, indent=2))
                    break
        else:
            print(F'no deps for {file}')
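
the insertion step can be hard to follow inside the larger function, so here is a stripped-down sketch of just that part. `insert_pip_line` is a hypothetical helper, not part of the post's code, and it leaves out the json handling and the pyodide-http patch. it places the magic before the first non-blank line, copies that line's indentation, and does nothing when the cell was already patched.

```python
def insert_pip_line(lines: list, deps: set) -> list:
    """return a copy of lines with a %pip install line before the first non-blank line."""
    out = list(lines)
    for i, line in enumerate(lines):
        stripped = line.lstrip()
        if not stripped:
            continue  # skip leading blank lines
        if stripped.startswith("%pip install"):
            return out  # already patched, leave the cell unchanged
        indent = " " * (len(line) - len(stripped))
        out.insert(i, indent + "%pip install " + " ".join(sorted(deps)) + "\n")
        return out
    return out  # a cell with only blank lines is left alone


cell = ["\n", "    import pandas\n", "    pandas.DataFrame()\n"]
patched = insert_pip_line(cell, {"pandas", "jinja2"})
print("".join(patched))
```

running the helper a second time on `patched` returns it unchanged, which is what lets the build task run repeatedly over the same output directory.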

set_files_imports sets the dependencies for a lot of files.

    def set_files_imports(FILES: pathlib.Path=FILES) -> None:
        for file in FILES.rglob("*.ipynb"):
            set_file_imports(file)

iter_code_cells iterates through just the code cells.

    def iter_code_cells(nb: dict) -> typing.Iterator[tuple[int, dict]]:
        for i, cell in enumerate(nb["cells"]):
            if cell["cell_type"] == "code":
                yield i, cell
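
for illustration, here is the generator run against a tiny in-memory notebook. the notebook dict is made up and only carries the keys the function reads. the yielded index is the position in the full cell list, which is what lets set_file_imports write back to `data["cells"][no]`.

```python
import typing


def iter_code_cells(nb: dict) -> typing.Iterator[typing.Tuple[int, dict]]:
    for i, cell in enumerate(nb["cells"]):
        if cell["cell_type"] == "code":
            yield i, cell


nb = {
    "cells": [
        {"cell_type": "markdown", "source": ["# a title"]},
        {"cell_type": "code", "source": ["import pandas"]},
        {"cell_type": "markdown", "source": ["some prose"]},
        {"cell_type": "code", "source": ["pandas.DataFrame()"]},
    ]
}
print([i for i, _ in iter_code_cells(nb)])
# [1, 3]
```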

usage

  • from the tonyfast module, requires deps
    if (I := '__file__' not in locals()):
        !python -m tonyfast tasks info lite
lite

build the jupyter lite site and append requirements

status     : run
 * The task has no dependencies.
  • from post with importnb
    if (I := '__file__' not in locals()):
        !importnb -t 2022-12-21-lite-build.ipynb list
lite   build the jupyter lite site and append requirements
  • run this task from hatch in the root of the project. the hatch environment has all the necessary dependencies defined.
    hatch run lite:build