# transforming mast notebooks into nbconvert templates.

this is a rough first pass at an end-to-end sketch of the MAST notebooks using dataframes and nbconvert a11y templates.

to build docs or a book from nbconvert templates we need to either shim into mkdocs
or build our own thing. sphinx can't because of how `docutils` is used.

it's just easier to build our own thing. hubris, eh? let's find out.

    import nbconvert, nbformat, io, midgy
    from toolz.curried import *
    from nobook import *
    from IPython.display import *

%%
start with a string mapping the github repository to disc

    s = Series({(repo := 
https://github.com/spacetelescope/mast_notebooks
              
    ): "mast_notebooks"}).path()
    if not s.path.exists().any():
the whole repository is pretty tedious to clone. lots of pictures? do notebooks suck that bad over a long time?
better to use the `--depth` arg for control
        
        !git clone $repo --depth 1
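
outside of a notebook the same guard can be written without the `!` shell escape. this is a minimal sketch using only the standard library; `shallow_clone` and its `run` parameter are hypothetical names for illustration, not part of the tooling used here.

```python
import subprocess
from pathlib import Path


def shallow_clone(repo, target, depth=1, run=subprocess.run):
    """clone `repo` into `target` unless the directory already exists."""
    target = Path(target)
    if target.exists():
        return False  # already on disk, nothing to do
    # --depth truncates history, which keeps a large repository quick to fetch
    run(["git", "clone", "--depth", str(depth), repo, str(target)], check=True)
    return True
```

calling `shallow_clone("https://github.com/spacetelescope/mast_notebooks", "mast_notebooks")` would mirror the cell above. the `run` argument only exists so the command can be swapped out when testing.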

%%
this example reuses the mast notebook table of contents for jupyter book. on a mkdocs site we could do the same with mkdocs.yml.

    toc = (
        await Index(["mast_notebooks/_toc.yml"], name="path").apath().apath.load()
    ).aseries()

    config = (
        await Index(["mast_notebooks/_config.yml"], name="path").apath().apath.load()
    ).aseries().T.iloc[:,0]

    chapters = toc.parts.enumerate("chapter").series()
    sections = chapters.chapters.enumerate("section").series()   
    files = sections.sections.dropna().enumerate("section").series().combine_first(
        sections[["file"]].set_index(Index([0]*len(sections), name="section"), append=True)
    )

%%
explode the chapters, sections, files, and ultimately discover the paths of the notebooks in the documents.

    chapters = toc.parts.enumerate("chapter").series()
    sections = chapters.chapters.enumerate("section").aseries()   
    files = sections.sections.dropna().enumerate("subsection").aseries().combine_first(
        sections[["file"]].set_index(Index([0]*len(sections), name="subsection"), append=True)
    )
    paths = ("mast_notebooks" / files.file.apath())
    print(F"{(~paths.path().path.exists()).sum()} files missing")
    paths = paths[await paths.apath().apath.exists()].pipe(Index)
    files

4 files missing

| path | chapter | section | subsection | file |
| --- | --- | --- | --- | --- |
| mast_notebooks/_toc.yml | 0 | 0 | 0 | notebooks/astrocut/making_tess_cubes_and_cutou... |
| | 1 | 0 | 0 | notebooks/astroquery/intro.md |
| | | 1 | 0 | notebooks/astroquery/beginner_search/beginner_... |
| | | | 1 | notebooks/astroquery/beginner_zcut/beginner_zc... |
| | | | 2 | notebooks/astroquery/large_downloads/large_dow... |
| | ... | ... | ... | ... |
| | 10 | 1 | 2 | notebooks/TESS/asteroid_rotation/asteroid_rota... |
| | | 2 | 0 | notebooks/TESS/interm_tesscut_dss_overlay/inte... |
| | | 2 | 1 | notebooks/TESS/interm_tesscut_requests/interm_... |
| | | 3 | 0 | notebooks/TESS/interm_tess_prf_retrieve/interm... |
| | | 3 | 1 | notebooks/TESS/removing_scattered_light_using_... |

65 rows × 1 columns
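
for readers without the dataframe accessors, the same explode can be sketched in plain python. this walks parts, then chapters, then nested sections, and emits one row per file. the level names follow the cell above (parts are enumerated as "chapter", and so on). the toc below is made up, and the numbering of nested files from 1 is an assumption that may not match what the dataframe version produces.

```python
def flatten_toc(toc):
    """yield (chapter, section, subsection, file) rows from a jupyter book toc."""
    for c, part in enumerate(toc.get("parts", [])):
        for s, chapter in enumerate(part.get("chapters", [])):
            # subsection 0 is the entry's own file
            if "file" in chapter:
                yield c, s, 0, chapter["file"]
            # nested sections are numbered from 1 (an assumption of this sketch)
            for n, section in enumerate(chapter.get("sections", []), start=1):
                yield c, s, n, section["file"]


# a tiny, made-up table of contents for illustration
example_toc = {
    "format": "jb-book",
    "root": "intro",
    "parts": [
        {"caption": "a", "chapters": [{"file": "a/one"}]},
        {
            "caption": "b",
            "chapters": [
                {"file": "b/intro"},
                {"file": "b/two", "sections": [{"file": "b/two/x"}, {"file": "b/two/y"}]},
            ],
        },
    ],
}
rows = list(flatten_toc(example_toc))
```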

## gathering and executing notebooks

    notebooks = (await paths.apath().apath.load())

filter down to the notebooks and, for now, ignore the markdown files in the mix.
the markdown files could be represented as a notebook with a single markdown cell.

    import nbformat, nbclient

    notebooks = notebooks[notebooks.index.path.suffix.eq(".ipynb")].apply(
        nbformat.from_dict
    ).to_frame("nb")

%%
## dependencies

i wanted to see if i could build an environment that all these notebooks could run in.
we collect the dependencies from the requirements.txt files in the mast notebook directory.
we structure the dependencies using a regular expression so we can extract version information too.
    
    dependencies = await (await (
        Index(["mast_notebooks/"]).apath().apath.rglob("requirements.txt")
    )).pipe(Index).apath.read_text()
    versions = dependencies.apply(str.splitlines).explode().str.extract(
        r"^(?P<package>[A-Za-z0-9_-]+)\s*(?P<constraint>[<>=]*)\s*(?P<version>\S*)"
    )
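
to see what this kind of pattern extracts, here is a small standalone sketch using the standard `re` module on single lines. the pattern is an assumption written for this example (it also accepts dots in names and the `!` and `~` operators), and `parse_requirement` is a hypothetical helper.

```python
import re

# assumption: names are letters, digits, underscore, hyphen or dot,
# optionally followed by a comparison operator and a version
REQUIREMENT = re.compile(
    r"^(?P<package>[A-Za-z0-9_.-]+)\s*(?P<constraint>[<>=!~]*)\s*(?P<version>\S*)"
)


def parse_requirement(line):
    """return package, constraint and version for one requirements.txt line."""
    match = REQUIREMENT.match(line.strip())
    return match.groupdict() if match else None


pinned = parse_requirement("astropy>=5.0")
bare = parse_requirement("scikit-image")
comment = parse_requirement("# a comment")  # no package name, so no match
```

lines that do not start with a package name, such as comments, come back as `None`, which plays the same role as the `NaN` rows dropped later with `dropna`.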

%%
create an environment.yml file from the version information previously collected
    
    import yaml; from pathlib import Path
    deps = versions.package.dropna().drop_duplicates().tolist()
    deps = [{"git": "GitPython"}.get(x,x) for x in deps ]
    Path("environment.yml").write_text(yaml.safe_dump(dict(
        name="mast_notebooks",
        channels=["conda-forge"],
        dependencies=["pip", dict(
            pip=deps+ ["ipykernel", "astrocut"]
        )]
    )))

run the commands below to create or update the environment

```bash
mamba env create -p .mast_nb -f environment.yml
mamba env update -p .mast_nb -f environment.yml
```

create a kernel for the environment to run the notebooks in

```bash
mamba run -p .mast_nb python -m ipykernel install --user --name mast_nb
```

%%
### executing the notebooks

execute the notebooks using the `nbclient` library

    client = notebooks.nb.apply(nbformat.from_dict).apply(
        nbclient.NotebookClient, kernel_name="mast_nb", allow_errors=True
    )

execute only the first three notebooks for now (`head(3)`), then recombine the executed notebooks with the original ones.

    notebooks.nb = (
        await client.head(3).apply(nbclient.NotebookClient.async_execute).gather()
    ).combine_first(notebooks.nb)
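
the execute-a-few-then-recombine pattern can be sketched with plain `asyncio` and dictionaries. `execute_some` and `fake_execute` are hypothetical names; `fake_execute` is only a stand-in for `nbclient.NotebookClient.async_execute` so the sketch runs without a kernel.

```python
import asyncio


async def execute_some(notebooks, execute, limit=3):
    """run `execute` on the first `limit` notebooks and keep the rest unchanged."""
    names = list(notebooks)[:limit]
    # run the selected notebooks concurrently
    results = await asyncio.gather(*(execute(notebooks[name]) for name in names))
    # executed results replace the originals; untouched notebooks fall through
    return {**notebooks, **dict(zip(names, results))}


async def fake_execute(nb):
    # stand-in for a real notebook execution
    await asyncio.sleep(0)
    return {**nb, "executed": True}


originals = {f"nb{i}.ipynb": {"cells": []} for i in range(5)}
combined = asyncio.run(execute_some(originals, fake_execute))
```

inside a notebook, where an event loop is already running, `await execute_some(...)` would be used in place of `asyncio.run`.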

%%
merge our default async jinja environment with the environment of the legacy nbconvert exporter. the async templating is a massive improvement over synchronous nbconvert.

    
    exporter = nbconvert.get_exporter("a11y")()
generate the default `resources` object for templating the notebook

    _, resources = exporter.from_notebook_node(nbformat.reads("""{"cells": [], "metadata": {}}""", 4))    
    ours,theirs = Series().template.environment,exporter.environment
    ours.loader = theirs.loader
    ours.filters.update({**theirs.filters, **ours.filters})
    ours.globals.update({**theirs.globals, **ours.globals})
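
the `{**theirs, **ours}` idiom decides who wins when both environments define the same name. a small illustration with plain dictionaries standing in for the jinja filter tables (the filter names and values here are made up):

```python
theirs = {"markdown": "nbconvert markdown", "highlight": "nbconvert highlight"}
ours = {"markdown": "async markdown"}

# later keys win in a dict display, so our entries override theirs,
# while anything only nbconvert defines is carried over
ours.update({**theirs, **ours})
```

after the update, `ours` keeps its own `markdown` entry and gains `highlight` from `theirs`.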

%%
create the footer for all of the pages. currently there is no specific license identified, 
but if there were we should use the [license microformat](https://microformats.org/wiki/rel-license).

    footer = ours.filters["markdown"](F"""
By {config.author}
    
© Copyright {config.copyright}

    """)
    footer

'<p>By STScI</p>\n<p>© Copyright 2022-2024</p>\n'
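
if a license were identified, the footer could carry it as a `rel="license"` link. a rough sketch of what that might look like; `license_footer` is a hypothetical helper, and the license name and url are placeholders, not the project's actual license.

```python
from html import escape


def license_footer(author, years, license_name, license_url):
    """render a footer whose license link carries rel="license"."""
    return (
        f"<p>By {escape(author)}</p>\n"
        f"<p>© Copyright {escape(years)}</p>\n"
        f'<p><a rel="license" href="{escape(license_url, quote=True)}">'
        f"{escape(license_name)}</a></p>\n"
    )


# placeholder license values for illustration only
footer_html = license_footer(
    "STScI", "2022-2024", "Example License", "https://example.org/license"
)
```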

%%
add site navigation in the post-processing step.
previous and next links still need to be added, and they should be relative to the current page.
we'll want the config to do this work too: it needs to be passed to the template to construct license and link information.
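
one way the relative previous and next links could be computed from an ordered list of pages, as a sketch only. `navigation` is a hypothetical helper and the page paths are made up; nothing like this exists in the templates yet.

```python
import posixpath


def navigation(paths):
    """map each page to relative links for its previous and next neighbours."""
    nav = {}
    for i, path in enumerate(paths):
        here = posixpath.dirname(path)
        prev = paths[i - 1] if i > 0 else None
        nxt = paths[i + 1] if i + 1 < len(paths) else None
        nav[path] = {
            # links are relative to the directory holding the current page
            "previous": prev and posixpath.relpath(prev, here),
            "next": nxt and posixpath.relpath(nxt, here),
        }
    return nav


# hypothetical output pages, in table-of-contents order
pages = [
    "notebooks/astrocut/a/a.html",
    "notebooks/astroquery/b/b.html",
    "notebooks/astroquery/c/c.html",
]
nav = navigation(pages)
```

the first page has no previous link and the last page has no next link; both come back as `None` so a template can skip them.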

    htmls = (
        await notebooks.head().template.render_template(
            "a11y/table.html.j2", resources=resources, config=config,
            footer=footer
        )
    ).apply(exporter.post_process_html)
    htmls.to_frame().T

| file | mast_notebooks/notebooks/astrocut/making_tess_cubes_and_cutouts/making_tess_cubes_and_cutouts.ipynb | mast_notebooks/notebooks/astroquery/beginner_search/beginner_search.ipynb | mast_notebooks/notebooks/astroquery/beginner_zcut/beginner_zcut.ipynb | mast_notebooks/notebooks/astroquery/large_downloads/large_downloads.ipynb | mast_notebooks/notebooks/astroquery/historic_quasar_observations/historic_quasar_observations.ipynb |
| --- | --- | --- | --- | --- | --- |
| 0 | `<!DOCTYPE html>\n<html lang="en">\n <head>\n ...` | `<!DOCTYPE html>\n<html lang="en">\n <head>\n ...` | `<!DOCTYPE html>\n<html lang="en">\n <head>\n ...` | `<!DOCTYPE html>\n<html lang="en">\n <head>\n ...` | `<!DOCTYPE html>\n<html lang="en">\n <head>\n ...` |

view the generated documentation as inline iframes

    iframe = """<iframe height="600" srcdoc="{}" width="100%"></iframe>"""
    iframes = htmls.apply(compose_left(__import__("html").escape, iframe.format))
    display(*iframes.apply(HTML))

## closing thoughts 

these notebooks have unexecuted code and `nbconvert-a11y` had to be updated to handle that case. cf https://github.com/deathbeds/nbconvert-a11y/issues/33

currently i cannot handle site navigation~~ and licensing~~, but that is on the roadmap. (we want to think about them accessibly).
i'd like to use the jupyter toc as the toc to generate these templates.
this format for the documentation is a lot more flexible to modify than standard documentation systems.
we are dealing with our documentation as an intermediate table that offers introspection and manipulation.

search is the last thing to implement.
are the colab and binder links even valid?
we'll need templates to handle myst admonitions.
