# transforming MAST notebooks into nbconvert templates

this is a rough, first-pass, end-to-end sketch of building the MAST notebooks using dataframes and nbconvert a11y templates.
to build docs or a book from nbconvert templates we need to either shim into mkdocs
or build our own thing. sphinx can't do it because of how `docutils` is used.
it's just easier to build our own thing. hubris, eh? let's find out.
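for orientation, a minimal sketch of the stock nbconvert export this riffs on; `index.ipynb` is a hypothetical path:

```python
# baseline: a stock nbconvert export, before any custom a11y templates
import nbconvert

# template_name picks a built-in template; "index.ipynb" is hypothetical
html, resources = nbconvert.HTMLExporter(template_name="lab").from_filename("index.ipynb")
```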
```python
import nbconvert, nbformat, io, midgy
from toolz.curried import *
from nobook import *
from IPython.display import *
```
```python
%%
start with a string mapping the github repository to disc

    s = Series({(repo :=
        "https://github.com/spacetelescope/mast_notebooks"
    ): "mast_notebooks"}).path()
    if not s.path.exists().any():

the whole repository is pretty tedious to clone. lots of pictures? do notebooks suck that bad over a long time?
better to use the `--depth` arg for control

        !git clone $repo --depth 1
```
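for reference, the same shallow clone works outside IPython with plain `subprocess`:

```python
# the same shallow clone without the ! line magic, e.g. from a plain script
import subprocess

subprocess.run(
    ["git", "clone", "--depth", "1",
     "https://github.com/spacetelescope/mast_notebooks"],
    check=True,
)
```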
```python
%%
this example reuses the mast notebook table of contents for jupyter book. on a mkdocs site we could do the same with mkdocs.yml.

    toc = (
        await Index(["mast_notebooks/_toc.yml"], name="path").apath().apath.load()
    ).aseries()
```
```python
config = (
    await Index(["mast_notebooks/_config.yml"], name="path").apath().apath.load()
).aseries().T.iloc[:, 0]
```
```python
chapters = toc.parts.enumerate("chapter").series()
sections = chapters.chapters.enumerate("section").series()
files = sections.sections.dropna().enumerate("section").series().combine_first(
    sections[["file"]].set_index(Index([0] * len(sections), name="section"), append=True)
)
```
```python
%%
explode the chapters, sections, files, and ultimately discover the paths of the notebooks in the documents.

    chapters = toc.parts.enumerate("chapter").series()
    sections = chapters.chapters.enumerate("section").aseries()
    files = sections.sections.dropna().enumerate("subsection").aseries().combine_first(
        sections[["file"]].set_index(Index([0] * len(sections), name="subsection"), append=True)
    )
    paths = ("mast_notebooks" / files.file.apath())
    print(F"{(~paths.path().path.exists()).sum()} files missing")
    paths = paths[await paths.apath().apath.exists()].pipe(Index)
    files
```
4 files missing
| path | chapter | section | subsection | file |
|---|---|---|---|---|
| mast_notebooks/_toc.yml | 0 | 0 | 0 | notebooks/astrocut/making_tess_cubes_and_cutou... |
| | 1 | 0 | 0 | notebooks/astroquery/intro.md |
| | | 1 | 0 | notebooks/astroquery/beginner_search/beginner_... |
| | | | 1 | notebooks/astroquery/beginner_zcut/beginner_zc... |
| | | | 2 | notebooks/astroquery/large_downloads/large_dow... |
| | ... | ... | ... | ... |
| | 10 | 1 | 2 | notebooks/TESS/asteroid_rotation/asteroid_rota... |
| | | 2 | 0 | notebooks/TESS/interm_tesscut_dss_overlay/inte... |
| | | | 1 | notebooks/TESS/interm_tesscut_requests/interm_... |
| | | 3 | 0 | notebooks/TESS/interm_tess_prf_retrieve/interm... |
| | | | 1 | notebooks/TESS/removing_scattered_light_using_... |

65 rows × 1 columns
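for reference, the same walk without the dataframe helpers is a short loop over the `_toc.yml` dict; a sketch assuming the standard jupyter book toc shape (parts → chapters → sections, each entry with an optional `file` key):

```python
# plain-python sketch of the same walk, assuming the jupyter book _toc.yml shape
import yaml
from pathlib import Path

toc_dict = yaml.safe_load(Path("mast_notebooks/_toc.yml").read_text())
flat = [
    entry["file"]
    for part in toc_dict.get("parts", [])
    for chapter in part.get("chapters", [])
    for entry in (chapter, *chapter.get("sections", []))
    if "file" in entry
]
print(len(flat), "documents")
```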
## gathering and executing notebooks
```python
notebooks = (await paths.apath().apath.load())
```
filter to the notebooks and, for now, ignore the markdown files in the mix.
a markdown file can be represented as a notebook with a single markdown cell.
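a minimal sketch of that representation with `nbformat`'s v4 builders; the path is one of the files from the toc above:

```python
# sketch: wrap a markdown file as a one-cell notebook
import nbformat.v4
from pathlib import Path

text = Path("mast_notebooks/notebooks/astroquery/intro.md").read_text()
nb = nbformat.v4.new_notebook(cells=[nbformat.v4.new_markdown_cell(text)])
```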
```python
import nbformat, nbclient
```
```python
notebooks = notebooks[notebooks.index.path.suffix.eq(".ipynb")].apply(
    nbformat.from_dict
).to_frame("nb")
```
```python
%%
## dependencies

i wanted to see if i could build an environment that all these notebooks could run in.
we collect the dependencies from the requirements.txt files in the mast notebook directory.
we structure the dependencies using a regular expression so we can extract version information too.

    dependencies = await (await (
        Index(["mast_notebooks/"]).apath().apath.rglob("requirements.txt")
    )).pipe(Index).apath.read_text()
    versions = dependencies.apply(str.splitlines).explode().str.extract(
        r"^(?P<package>[A-Za-z0-9_\-]+)\s*(?P<constraint>[><=]*)?\s*(?P<version>\S*)?"
    )
```
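a quick, illustrative check of the pattern against a few typical requirement lines:

```python
# illustrative: the named groups the pattern extracts
import re

pattern = re.compile(r"^(?P<package>[A-Za-z0-9_\-]+)\s*(?P<constraint>[><=]*)\s*(?P<version>\S*)")
for line in ["astropy>=5.0", "numpy", "astroquery==0.4.6"]:
    print(pattern.match(line).groupdict())
```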
````python
%%
create an environment.yml file from the version information previously collected

    import yaml; from pathlib import Path
    deps = versions.package.dropna().drop_duplicates().tolist()
    deps = [{"git": "GitPython"}.get(x, x) for x in deps]
    Path("environment.yml").write_text(yaml.safe_dump(dict(
        name="mast_notebooks",
        channels=["conda-forge"],
        dependencies=["pip", dict(
            pip=deps + ["ipykernel", "astrocut"]
        )]
    )))

run the code below to create or update the environment

```bash
mamba env create -p .mast_nb -f environment.yml
mamba env update -p .mast_nb -f environment.yml
```

create a kernel for the environment to run the notebooks in

```bash
mamba run -p .mast_nb python -m ipykernel install --user --name mast_nb
```
````
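a quick way to verify the kernelspec registered, using `jupyter_client` (a dependency of ipykernel):

```python
# check that the mast_nb kernelspec is visible to jupyter
from jupyter_client.kernelspec import KernelSpecManager

print("mast_nb" in KernelSpecManager().find_kernel_specs())
```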
```python
%%
### executing the notebooks

execute the notebooks using the `nbclient` library

    client = notebooks.nb.apply(nbformat.from_dict).apply(
        nbclient.NotebookClient, kernel_name="mast_nb", allow_errors=True
    )

recombine the executed notebooks with the original ones.

    notebooks.nb = (
        await client.head(3).apply(nbclient.NotebookClient.async_execute).gather()
    ).combine_first(notebooks.nb)
```
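for reference, the same execution for a single notebook straight through `nbclient`; the path is one of the files gathered above:

```python
# minimal sketch: execute a single notebook with nbclient directly
import nbformat, nbclient

nb = nbformat.read(
    "mast_notebooks/notebooks/astroquery/beginner_search/beginner_search.ipynb",
    as_version=4,
)
nbclient.NotebookClient(nb, kernel_name="mast_nb", allow_errors=True).execute()
```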
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
```python
%%
merge our default async jinja exporter with the legacy nbconvert exporter. the async template is a massive improvement over the sync nbconvert one.

    exporter = nbconvert.get_exporter("a11y")()

generate the default `resources` object for templating the notebook

    _, resources = exporter.from_notebook_node(nbformat.reads("""{"cells": [], "metadata": {}}""", 4))
    ours, theirs = Series().template.environment, exporter.environment
    ours.loader = theirs.loader
    ours.filters.update({**theirs.filters, **ours.filters})
    ours.globals.update({**theirs.globals, **ours.globals})
```
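the merge order matters: theirs spread first, ours last, so our filters and globals win on name collisions. a toy illustration with plain jinja2 and a made-up `shout` filter:

```python
# toy example of the filter merge with plain jinja2
import jinja2

ours, theirs = jinja2.Environment(), jinja2.Environment()
theirs.filters["shout"] = str.upper
ours.filters.update({**theirs.filters, **ours.filters})  # ours take precedence
print(ours.from_string("{{ 'accessible' | shout }}").render())  # ACCESSIBLE
```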
```python
%%
create the footer for all of the pages. currently there is no specific license identified,
but if there were we should use the [license microformat](https://microformats.org/wiki/rel-license).

    footer = ours.filters["markdown"](F"""
By {config.author}

© Copyright {config.copyright}
""")
    footer
```
'<p>By STScI</p>\n<p>© Copyright 2022-2024</p>\n'
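for the record, a sketch of what that rel-license markup might look like; the license here is hypothetical:

```python
# hypothetical: a footer license link using the rel-license microformat
license_html = '<a rel="license" href="https://opensource.org/licenses/BSD-3-Clause">BSD 3-Clause</a>'
```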
add site navigation in the post-processing step.
previous and next links still need to be added; previous and next are relative.
we'll want the config to do this work too. it needs to be passed to the template to construct license and link information.
```python
htmls = (
    await notebooks.head().template.render_template(
        "a11y/table.html.j2", resources=resources, config=config,
        footer=footer
    )
).apply(exporter.post_process_html)
htmls.to_frame().T
```
| file | 0 |
|---|---|
| mast_notebooks/notebooks/astrocut/making_tess_cubes_and_cutouts/making_tess_cubes_and_cutouts.ipynb | `<!DOCTYPE html>\n<html lang="en">\n <head>\n ...` |
| mast_notebooks/notebooks/astroquery/beginner_search/beginner_search.ipynb | `<!DOCTYPE html>\n<html lang="en">\n <head>\n ...` |
| mast_notebooks/notebooks/astroquery/beginner_zcut/beginner_zcut.ipynb | `<!DOCTYPE html>\n<html lang="en">\n <head>\n ...` |
| mast_notebooks/notebooks/astroquery/large_downloads/large_downloads.ipynb | `<!DOCTYPE html>\n<html lang="en">\n <head>\n ...` |
| mast_notebooks/notebooks/astroquery/historic_quasar_observations/historic_quasar_observations.ipynb | `<!DOCTYPE html>\n<html lang="en">\n <head>\n ...` |
view the generated documentation as inline iframes
```python
iframe = """<iframe height=600 width="100%" srcdoc="{}"></iframe>"""
iframes = htmls.apply(compose_left(__import__("html").escape, iframe.format))
display(*iframes.apply(HTML))
```
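the `html.escape` step matters because `srcdoc` is an attribute: quotes inside the embedded document must be entity-escaped or they'd end the attribute early:

```python
# why the escape step matters for srcdoc
import html

print(html.escape('<p class="demo">hi</p>'))
# -> &lt;p class=&quot;demo&quot;&gt;hi&lt;/p&gt;
```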
## closing thoughts
these notebooks have unexecuted code and `nbconvert-a11y` had to be updated to handle that case. cf https://github.com/deathbeds/nbconvert-a11y/issues/33
currently i cannot handle site navigation ~~and licensing~~, but that is on the roadmap (we want to think about them accessibly).
i'd like to use the jupyter toc as the toc to generate these templates.
this format for the documentation is a lot more flexible to modify than standard documentation systems.
we are dealing with our documentation as an intermediate table that offers introspection and manipulation.
search is the last thing to implement.
are the colab and binder links even valid?
we'll need templates to handle myst admonitions.