
@tonyfast s notebooks


automating blog posts to work with jupyter-lite

it is rare that i write notebooks completely in lite. most often i write in a conventional environment and then need to amend the content so it works in jupyterlite.

%reload_ext pidgy

what do we need to do to make a post work in lite?

  • explicitly defined dependencies.

    working on a virtual file system is different from working on a standard file system. normally we don't have to define our environment each time, but in the browser, without a real file system, we need to install packages each time.

  • patching shit

    if we use requests then we should use https://github.com/koenvo/pyodide-http

  • dealing with pidgy and extensions.

  • some packages won't work in lite, so we will throw a warning when we know this.

    we can infer this information or provide it explicitly in the metadata
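the checklist above could be sketched as one function that writes the first cell of a lite-ready notebook. this is only a sketch under assumptions: `lite_preamble` and its `unsupported` argument are hypothetical names, the import names are treated as if they were package names (not true for every package), and it relies on the `%pip` magic being available in the lite kernel. `pyodide_http.patch_all()` is the call documented by pyodide-http.

```python
import warnings


def lite_preamble(modules, unsupported=frozenset()):
    """build the source of a first cell that prepares a jupyterlite kernel.

    modules: import names found in the notebook.
    unsupported: names the caller already knows fail in the browser,
    for example read from notebook metadata (an assumption of this sketch).
    """
    modules = sorted(set(modules))
    for name in modules:
        if name in unsupported:
            warnings.warn(f"{name} is not expected to work in jupyterlite")
    packages = [name for name in modules if name not in unsupported]
    uses_requests = "requests" in packages
    if uses_requests:
        # pyodide-http patches requests so it can run in the browser
        packages.append("pyodide-http")
    lines = []
    if packages:
        lines.append("%pip install " + " ".join(packages))
    if uses_requests:
        lines.append("import pyodide_http")
        lines.append("pyodide_http.patch_all()")
    return "\n".join(lines)


# playwright is only an illustrative guess at a package that fails in lite
preamble = lite_preamble(["requests", "pandas", "playwright"], unsupported={"playwright"})
print(preamble)
```

keeping the unsupported names as an argument leaves room for either approach mentioned above: inferring them or declaring them in the metadata.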


sometimes i forget imports


depfinder

some of my personal style choices might fail, like when i use __import__. maybe this is a way to cut dependencies from the list.
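if `__import__` calls do slip past the analysis, they can be collected separately. a minimal sketch with the standard library `ast` module; `dynamic_imports` is a hypothetical helper and not part of depfinder, and it only sees names passed as string literals.

```python
import ast


def dynamic_imports(source):
    """return the names passed as string literals to __import__ in source."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "__import__"
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        ):
            names.append(node.args[0].value)
    return names


found = dynamic_imports('__import__("requests_cache").install_cache()')
print(found)
```

the result could be merged into the dependency list, or used to decide which names to leave out.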

from pathlib import Path
import functools, json, textwrap
import depfinder, pandas
__import__("requests_cache").install_cache()

def get_files(dir="", glob="*.ipynb") -> pandas.Index:
    # every notebook below dir, as a pandas index named "files"
    return pandas.Index(Path(dir).rglob(glob)).rename("files")

def get_cells(files: pandas.Index) -> pandas.DataFrame:
    # one row per (file, cell position), one column per cell attribute
    df = (
        files.to_series().apply(Path.read_text)
        .apply(json.loads).apply(pandas.Series)
        .cells.apply(pandas.Series).stack().apply(pandas.Series)
    )
    return df.join(get_has_pidgy(df))

can haz pidgy?


some of these posts are in pidgy; i use %reload_ext pidgy when that is the situation. peek in the cells to find pidgy notebooks.

def get_has_pidgy(cells):
    # one boolean per notebook: does any code cell load the pidgy extension?
    return cells[cells.cell_type.eq("code")].source.apply("".join).groupby(
        pandas.Grouper(level=0)
    ).apply(lambda df: df.str.contains("%[re]*load_ext pidgy").any()).rename("pidgy")

cells = get_cells(get_files())
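the same peek can be sketched for a single notebook without pandas. `has_pidgy` is a hypothetical helper; its pattern is a little stricter than `%[re]*load_ext pidgy`, which would also accept spellings such as `%eeload_ext`.

```python
import re

# matches a line that starts with %load_ext pidgy or %reload_ext pidgy
PIDGY = re.compile(r"^\s*%(?:re)?load_ext\s+pidgy\b", re.MULTILINE)


def has_pidgy(notebook):
    """True when any code cell of a notebook dict loads the pidgy extension."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if not isinstance(source, str):
            # notebooks usually store the source as a list of lines
            source = "".join(source)
        if PIDGY.search(source):
            return True
    return False


example = {"cells": [{"cell_type": "code", "source": ["%reload_ext pidgy"]}]}
print(has_pidgy(example))
```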

get the imports

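def get_import(row: pandas.Series) -> dict:

get_import normalizes the cell source code for analysis by depfinder. pidgy cells are rendered to python with midgy first. when depfinder cannot analyze the source, the method catches the failure and returns None; otherwise it returns the attributes of depfinder.inspection.ImportFinder.

    source = "".join(row.source)
    if row.pidgy:
        # pidgy cells are markdown, so render them to python source first
        source = midgy.python.Python().render(source)
    try:
        return vars(depfinder.inspection.get_imported_libs(textwrap.dedent(source), row.name[0]))
    except Exception:
        # cells that depfinder cannot analyze contribute no imports
        return None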
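to see why the dedent and the try block matter, here is a standard library stand-in for the same idea. it is a sketch, not depfinder: `imported_names` is a hypothetical helper that keeps only top-level package names, while the snapshot in this post keeps dotted names such as urllib.parse.

```python
import ast
import textwrap


def imported_names(source):
    """top-level module names imported by source, or None when it cannot be parsed."""
    try:
        # indented cell bodies only parse once the common indentation is removed
        tree = ast.parse(textwrap.dedent(source))
    except SyntaxError:
        # magics and other non-python lines end up here
        return None
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level 0 skips relative imports like "from . import x"
            names.add(node.module.split(".")[0])
    return names


print(imported_names("    import pandas\n    from pathlib import Path\n"))
print(imported_names("%reload_ext pidgy"))
```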

evaluate the sources

import depfinder, pandas, midgy
__import__("requests_cache").install_cache()
# Ø is true only in an interactive session: run as __main__ with no __file__ defined
Ø = __name__ == "__main__" and "__file__" not in locals()

def get_modules(cells):
    # run get_import on every code cell and expand each result into columns
    results = (
        cells[cells.cell_type.eq("code")].apply(get_import, axis=1)
        .dropna().apply(functools.partial(pandas.Series, dtype="O"))
    )
    # keep only the columns whose names end with _modules
    return results[results.columns[results.columns.str.endswith("_modules")]]

a snapshot of the modules imported within the content

modules, ordered by how often they appear:
pandas pathlib requests tonyfast IPython json midgy functools toolz re depfinder shlex importnb dataclasses typer pidgy nbconvert ipywidgets info textwrap typing traitlets doit dask pluggy sys requests_cache jsonref playwright nbformat inspect types ast uritemplate __ jinja2 numpy io gc pyld orjson jsonpointer urllib.parse linecache operator poser matplotlib rich mpl_toolkits importlib anyio unittest.mock __static_notebook_tags nbconvert_html5 bs4 mkdocs icalendar yaml hvplot jupyter_core shutil arrow __11_12_async_import doctest unittest __09_pyproject_analysis html tomli pytest __12_09_pyproject_analysis warnings _022_10_21_markdown_future a_little_markdown_program click markdown sympy __better_dask_shape asyncio dis traceback micropip __pycache__ ibis duckdb pyarrow nest_asyncio collections abc tqdm
count for each module, in the same order (row 0 of the table):
41 30 14 14 14 13 12 12 11 10 10 7 7 7 7 6 6 6 6 6 6 5 5 5 5 5 5 5 5 5 4 4 4 4 4 4 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1
if Ø:
    from IPython.display import HTML, display
    cells = get_cells(get_files())
    cells = cells.join(get_modules(cells))
    modules = cells[cells.columns[cells.columns.str.endswith("_modules")]]
    modules = modules.stack().apply(list).apply(pandas.Series, dtype="O").stack()
    # display explicitly; a bare return is a SyntaxError in plain python
    display(HTML(modules.value_counts().to_frame().T.to_html()))

todo

  • inject the imports back into the notebooks. where though?
  • find magics
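both todo items can be sketched with the standard library. everything here is an assumption: `find_magics` is a line-based regex heuristic rather than a real ipython parser, and `inject_code_cell` puts the new cell at index 0 only as a default, since where it belongs is still an open question. the cell layout follows the nbformat 4 code cell fields.

```python
import re

# a line that starts with % or %% followed by a name looks like a magic
MAGIC = re.compile(r"^\s*(%{1,2}[A-Za-z_]\w*)", re.MULTILINE)


def find_magics(source):
    """return the line and cell magics that appear at the start of a line."""
    return MAGIC.findall(source)


def inject_code_cell(notebook, source, index=0):
    """return a copy of a notebook dict with a new code cell inserted at index."""
    cell = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source.splitlines(keepends=True),
    }
    cells = list(notebook.get("cells", []))
    cells.insert(index, cell)
    # the original notebook dict is left untouched
    return {**notebook, "cells": cells}


print(find_magics("%reload_ext pidgy\nx = 1 % 2\n%%time\n"))
notebook = {"cells": [{"cell_type": "markdown", "metadata": {}, "source": ["# title"]}]}
patched = inject_code_cell(notebook, "%pip install pandas")
print(len(patched["cells"]))
```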