@tonyfast s notebooks

notebook summary
title: transforming mast notebooks into nbconvert templates.
description: this is a rough first pass of an end-to-end sketch of the MAST notebooks using dataframes and nbconvert a11y templates.
cells: 23 total, 18 code
state: executed out of order
kernel: Python [conda env:p311] *
language: python
name: conda-env-p311-py
lines of code: 133
outputs: 19
1

transforming mast notebooks into nbconvert templates.

this is a rough first pass of an end-to-end sketch of the MAST notebooks using dataframes and nbconvert a11y templates.

to build docs or a book from nbconvert templates we need to either shim into mkdocs or build our own thing. sphinx is not an option because of how it uses docutils.

it's just easier to build our own thing. hubris, eh? let's find out.

2
    import nbconvert, nbformat, io, midgy
    from toolz.curried import *
    from nobook import *
    from IPython.display import *
3 1 outputs.

start with a mapping from the github repository url to its directory on disk

s = Series({(repo := "https://github.com/spacetelescope/mast_notebooks"): "mast_notebooks"}).path()
if not s.path.exists().any():

the whole repository is pretty tedious to clone. lots of pictures? do notebooks bloat that badly over a long time? better to use the --depth argument and make a shallow clone

    !git clone $repo --depth 1
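the same guard can be written without the `!` shell escape. this is a minimal sketch, assuming `git` is on the path; `shallow_clone_command` and `ensure_clone` are hypothetical helper names, not part of the notebook.

```python
import subprocess
from pathlib import Path
from typing import List


def shallow_clone_command(repo: str, depth: int = 1) -> List[str]:
    """Build the git command for a shallow clone of `repo`."""
    return ["git", "clone", repo, "--depth", str(depth)]


def ensure_clone(repo: str, target: str, depth: int = 1) -> bool:
    """Clone `repo` into `target` only when `target` does not exist yet.

    Returns True when a clone was run and False when the directory was already there.
    """
    if Path(target).exists():
        return False
    subprocess.run(shallow_clone_command(repo, depth) + [target], check=True)
    return True
```

calling `ensure_clone("https://github.com/spacetelescope/mast_notebooks", "mast_notebooks")` a second time is a no-op, which keeps the cell safe to re-run.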
4 1 outputs.

this example reuses the mast notebook table of contents for jupyter book. on a mkdocs site we could do the same with mkdocs.yml.

toc = (
    await Index(["mast_notebooks/_toc.yml"], name="path").apath().apath.load()
).aseries()
5
    config = (
        await Index(["mast_notebooks/_config.yml"], name="path").apath().apath.load()
    ).aseries().T.iloc[:,0]
6
    chapters = toc.parts.enumerate("chapter").series()
    sections = chapters.chapters.enumerate("section").series()   
    files = sections.sections.dropna().enumerate("section").series().combine_first(
        sections[["file"]].set_index(Index([0]*len(sections), name="section"), append=True)
    )
7 3 outputs.

explode the chapters, sections, files, and ultimately discover the paths of the notebooks in the documents.

chapters = toc.parts.enumerate("chapter").series()
sections = chapters.chapters.enumerate("section").aseries()   
files = sections.sections.dropna().enumerate("subsection").aseries().combine_first(
    sections[["file"]].set_index(Index([0]*len(sections), name="subsection"), append=True)
)
paths = ("mast_notebooks" / files.file.apath())
print(F"{(~paths.path().path.exists()).sum()} files missing")
paths = paths[await paths.apath().apath.exists()].pipe(Index)
files
4 files missing

path                     chapter  section  subsection  file
mast_notebooks/_toc.yml  0        0        0           notebooks/astrocut/making_tess_cubes_and_cutou...
                         1        0        0           notebooks/astroquery/intro.md
                                  1        0           notebooks/astroquery/beginner_search/beginner_...
                                           1           notebooks/astroquery/beginner_zcut/beginner_zc...
                                           2           notebooks/astroquery/large_downloads/large_dow...
                         ...      ...      ...         ...
                         10       1        2           notebooks/TESS/asteroid_rotation/asteroid_rota...
                                  2        0           notebooks/TESS/interm_tesscut_dss_overlay/inte...
                                           1           notebooks/TESS/interm_tesscut_requests/interm_...
                                  3        0           notebooks/TESS/interm_tess_prf_retrieve/interm...
                                           1           notebooks/TESS/removing_scattered_light_using_...

65 rows × 1 columns
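for readers without the dataframe accessors, the explode step can be approximated in plain python. this is a rough sketch, assuming the jupyter book `parts` / `chapters` / `sections` nesting; the `TOC` contents and the `walk` helper are made up for illustration, and the numbering scheme (a chapter's own file at section 0, nested files from 1) is a simplification that may not match the dataframe above.

```python
from typing import Iterator, Tuple

# hypothetical, trimmed-down table of contents in the jupyter book `_toc.yml` shape
TOC = {
    "format": "jb-book",
    "root": "intro",
    "parts": [
        {"caption": "first", "chapters": [{"file": "a/intro"}]},
        {"caption": "second", "chapters": [
            {"file": "b/intro", "sections": [{"file": "b/one"}, {"file": "b/two"}]},
        ]},
    ],
}


def walk(toc: dict) -> Iterator[Tuple[int, int, int, str]]:
    """Yield (part, chapter, section, file) rows; section 0 is the chapter's own file."""
    for p, part in enumerate(toc.get("parts", [])):
        for c, chapter in enumerate(part.get("chapters", [])):
            yield p, c, 0, chapter["file"]
            # nested sections are numbered from 1 so they never collide with the parent
            for s, section in enumerate(chapter.get("sections", []), start=1):
                yield p, c, s, section["file"]


rows = list(walk(TOC))
for row in rows:
    print(row)
```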

8

gathering and executing notebooks

9
    notebooks = (await paths.apath().apath.load())
10

filter down to the notebooks and, for now, ignore the markdown files in the mix. a markdown file can be represented as a notebook with a single markdown cell.
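wrapping a markdown file as a notebook could look like the sketch below. it builds the nbformat 4 dictionary by hand so it has no dependencies; `markdown_to_notebook` is a hypothetical helper, and the sample text is invented.

```python
import json


def markdown_to_notebook(source: str) -> dict:
    """Wrap markdown text in a minimal nbformat 4 notebook with one markdown cell."""
    return {
        "cells": [{"cell_type": "markdown", "metadata": {}, "source": source}],
        "metadata": {},
        "nbformat": 4,
        # minor version 4 predates the required cell ids of nbformat 4.5
        "nbformat_minor": 4,
    }


nb = markdown_to_notebook("# intro\n\nsome prose about the notebooks")
print(json.dumps(nb, indent=1))
```

with nbformat installed, `nbformat.v4.new_notebook` and `nbformat.v4.new_markdown_cell` build the same structure and take care of the version details.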

11
    import nbformat, nbclient
12
    notebooks = notebooks[notebooks.index.path.suffix.eq(".ipynb")].apply(
        nbformat.from_dict
    ).to_frame("nb")
13 1 outputs.

dependencies

i wanted to see if i could build an environment that all these notebooks could run in. we collect the dependencies from the requirements.txt files in the mast notebook directory. we structure the dependencies using a regular expression so we can extract version information too.

dependencies = await (await (
    Index(["mast_notebooks/"]).apath().apath.rglob("requirements.txt")
)).pipe(Index).apath.read_text()
versions = dependencies.apply(str.splitlines).explode().str.extract(
    r"^(?P<package>[A-Za-z0-9_.-]+)\s*(?P<constraint>[<>=!~]*)\s*(?P<version>\S*)"
)
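to check what a pattern like this captures, it helps to try it on a few lines with the standard `re` module. this is a small sketch, not the notebook's code: the pattern is written in the same spirit as the one in the cell, as a raw string that also allows `-` and `.` in package names, and the requirement lines are invented.

```python
import re
from typing import Optional

# named groups mirror the columns extracted in the notebook: package, constraint, version
REQUIREMENT = re.compile(
    r"^(?P<package>[A-Za-z0-9_.-]+)\s*(?P<constraint>[<>=!~]*)\s*(?P<version>\S*)"
)


def parse(line: str) -> Optional[dict]:
    """Return the package, constraint, and version of one requirements.txt line."""
    match = REQUIREMENT.match(line.strip())
    return match.groupdict() if match else None


# hypothetical requirement lines; the comment line does not match at all
for line in ["astropy>=5.0", "numpy == 1.26", "scikit-image", "# a comment"]:
    print(line, "->", parse(line))
```

lines that do not start with a package name, such as comments, come back as `None`, which plays the same role as the missing rows that `dropna` removes later.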
14 1 outputs.

create an environment.yml file from the version information previously collected

import yaml; from pathlib import Path
deps = versions.package.dropna().drop_duplicates().tolist()
deps = [{"git": "GitPython"}.get(x,x) for x in deps ]
Path("environment.yml").write_text(yaml.safe_dump(dict(
    name="mast_notebooks",
    channels=["conda-forge"],
    dependencies=["pip", dict(
        pip=deps+ ["ipykernel", "astrocut"]
    )]
)))

uncomment the code below to create or update the environment

mamba env create -p.mast_nb -f environment.yml
mamba env update -p.mast_nb -f environment.yml

create a kernel for the environment to run the notebooks in

mamba run  -p.mast_nb python -m ipykernel install --user --name mast_nb
15 2 outputs.

executing the notebooks

execute the notebooks using the nbclient library

client = notebooks.nb.apply(nbformat.from_dict).apply(
    nbclient.NotebookClient, kernel_name="mast_nb", allow_errors=True
)

recombine the executed notebooks with the original ones.

notebooks.nb = (
    await client.head(3).apply(nbclient.NotebookClient.async_execute).gather()
).combine_first(notebooks.nb)
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
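the execute-a-few-then-recombine pattern can be sketched with only `asyncio`. this is an illustration under assumptions: `execute` is a hypothetical stand-in for an awaitable such as `NotebookClient.async_execute`, and plain dictionaries stand in for the series.

```python
import asyncio


async def execute(name: str) -> str:
    """Hypothetical stand-in for an awaitable notebook execution."""
    await asyncio.sleep(0)  # yield to the event loop the way real kernel I/O would
    return f"{name} (executed)"


async def execute_some(notebooks: dict, limit: int = 3) -> dict:
    """Run the first `limit` entries concurrently and keep the rest unchanged."""
    selected = list(notebooks)[:limit]
    results = await asyncio.gather(*(execute(notebooks[key]) for key in selected))
    # executed results take priority; untouched entries fall back to the originals
    return {**notebooks, **dict(zip(selected, results))}


originals = {f"nb{i}.ipynb": f"nb{i}" for i in range(5)}
combined = asyncio.run(execute_some(originals))
print(combined)
```

the final dictionary merge plays the role of `combine_first`: every key is still present, and only the executed entries are replaced.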

16 1 outputs.

merge our default async jinja environment with the one from the legacy nbconvert exporter. the async template is a massive improvement over the sync nbconvert.

exporter = nbconvert.get_exporter("a11y")()

generate the default resources object for templating the notebook

_, resources = exporter.from_notebook_node(nbformat.reads("""{"cells": [], "metadata": {}}""", 4))    
ours,theirs = Series().template.environment,exporter.environment
ours.loader = theirs.loader
ours.filters.update({**theirs.filters, **ours.filters})
ours.globals.update({**theirs.globals, **ours.globals})
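the order inside `{**theirs, **ours}` decides who wins when both environments define the same name. a tiny sketch with placeholder strings instead of real filter functions:

```python
# placeholder values; in the notebook these are jinja filter callables
theirs = {"markdown": "legacy markdown filter", "highlight": "legacy highlighter"}
ours = {"markdown": "our markdown filter"}

# later entries win in a dict display, so our filters override theirs on a clash
ours.update({**theirs, **ours})
print(ours)
```

our `markdown` entry survives while `highlight` is inherited, which is the behavior the merge above relies on.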
17 2 outputs.

create the footer for all of the pages. currently there is no specific license identified, but if there were we should use the license microformat.

footer = ours.filters["markdown"](F"""

By {config.author}

© Copyright {config.copyright}

""")
footer
'<p>By STScI</p>\n<p>© Copyright 2022-2024</p>\n'
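if a license were identified, the footer could carry it as a `rel="license"` link. this is a hedged sketch that builds the html directly instead of going through the markdown filter; `footer_html` is a hypothetical helper and the license url in the usage note is a placeholder.

```python
from html import escape
from typing import Optional


def footer_html(author: str, copyright: str, license_url: Optional[str] = None,
                license_name: str = "license") -> str:
    """Build a page footer; the license link is added only when a url is known."""
    parts = [f"<p>By {escape(author)}</p>", f"<p>© Copyright {escape(copyright)}</p>"]
    if license_url:
        # rel="license" marks the link target as the license of the current page
        link = f'<a rel="license" href="{escape(license_url, quote=True)}">{escape(license_name)}</a>'
        parts.append(f"<p>{link}</p>")
    return "\n".join(parts) + "\n"


print(footer_html("STScI", "2022-2024"))
```

passing something like `license_url="https://example.org/license"` appends a third paragraph with the tagged link and leaves the rest of the footer unchanged.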
18 1 outputs.

add site navigation in the post processing step. previous and next links still need to be added, and they are relative to each page. we'll want the config to do this work too. it needs to be passed to the template to construct license and link information
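one way the relative previous and next links could be computed from the ordered pages is sketched below. the `neighbors` helper and the page paths are hypothetical; the real order would come from the table of contents.

```python
import posixpath
from typing import Dict, List


def neighbors(paths: List[str], index: int) -> Dict[str, str]:
    """Return links to the previous and next pages, relative to `paths[index]`."""
    here = posixpath.dirname(paths[index])
    links = {}
    if index > 0:
        links["previous"] = posixpath.relpath(paths[index - 1], here)
    if index < len(paths) - 1:
        links["next"] = posixpath.relpath(paths[index + 1], here)
    return links


# hypothetical page order, as it might come from the table of contents
pages = [
    "astrocut/cubes/cubes.html",
    "astroquery/search/search.html",
    "astroquery/zcut/zcut.html",
]
print(neighbors(pages, 1))
```

the first and last pages simply omit the missing direction, so a template can test for the key before rendering a link.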

19
    htmls = (
        await notebooks.head().template.render_template(
            "a11y/table.html.j2", resources=resources, config=config,
            footer=footer
        )
    ).apply(exporter.post_process_html)
    htmls.to_frame().T
1 outputs.
file mast_notebooks/notebooks/astrocut/making_tess_cubes_and_cutouts/making_tess_cubes_and_cutouts.ipynb mast_notebooks/notebooks/astroquery/beginner_search/beginner_search.ipynb mast_notebooks/notebooks/astroquery/beginner_zcut/beginner_zcut.ipynb mast_notebooks/notebooks/astroquery/large_downloads/large_downloads.ipynb mast_notebooks/notebooks/astroquery/historic_quasar_observations/historic_quasar_observations.ipynb
0 <!DOCTYPE html>\n<html lang="en">\n <head>\n ... <!DOCTYPE html>\n<html lang="en">\n <head>\n ... <!DOCTYPE html>\n<html lang="en">\n <head>\n ... <!DOCTYPE html>\n<html lang="en">\n <head>\n ... <!DOCTYPE html>\n<html lang="en">\n <head>\n ...
20

view the generated documentation as inline iframes

21
    iframe = """<iframe height=600 width="100%" srcdoc="{}"></iframe>"""
    iframes = htmls.apply(compose_left(__import__("html").escape, iframe.format))
    display(*iframes.apply(HTML))
5 outputs.
22

closing thoughts

these notebooks have unexecuted code and nbconvert-a11y had to be updated to handle that case. cf https://github.com/deathbeds/nbconvert-a11y/issues/33

currently i cannot handle site navigation ~~and licensing~~, but that is on the roadmap. (we want to think about them accessibly.) i'd like to use the jupyter toc as the toc to generate these templates. this format for the documentation is a lot more flexible to modify than standard documentation systems. we are dealing with our documentation as an intermediate table that offers introspection and manipulation.

search is the last thing to implement. are the colab and binder links even valid? we'll need templates to handle myst admonitions.

23