notebook toolbar

Activate

index execution_count cell_type toolbar started_at completed_at source loc metadata outputs

markdown

# automating blog posts to work with jupyter-lite

there is a rare occasion that i write notebooks completely in lite.
most often i write in a conventional environment then need to ammend the content
to work when we are in jupyterlite.

# automating blog posts to work with jupyter-lite

there is a rare occasion that i write notebooks completely in lite.
most often i write in a conventional environment then need to ammend the content
to work when we are in jupyterlite.


        {'data': {'text/html': 'automating blog posts to work with jupyter-lite
\nthere is a rare occasion that i write notebooks completely in lite.\nmost often i write in a conventional environment then need to ammend the content\nto work when we are in jupyterlite.\n'}}

automating blog posts to work with jupyter-lite

there is a rare occasion that i write notebooks completely in lite. most often i write in a conventional environment then need to ammend the content to work when we are in jupyterlite.

In 1

code

    %reload_ext pidgy

    %reload_ext pidgy

%reload_ext pidgy

In 2

code

## what do we need to do to make a post work in lite?

* explicitly defined dependencies.

    working on a virtual file system is different than your standard file system.
    normally we don't have to define our environment each time,
    but independent of a real file system - in the browser - we need to install packages each time.
    
* patching shit

    if we use requests then we should used https://github.com/koenvo/pyodide-http

* dealing with `pidgy` and extensions.
    
* some packages won't work in lite so we will throw a warning when we know this fo


    we can infer this information or provide it explicitly in the metadata

## what do we need to do to make a post work in lite?

* explicitly defined dependencies.

    working on a virtual file system is different than your standard file system.
    normally we don't have to define our environment each time,
    but independent of a real file system - in the browser - we need to install packages each time.
    
* patching shit

    if we use requests then we should used https://github.com/koenvo/pyodide-http

* dealing with `pidgy` and extensions.
    
* some packages won't work in lite so we will throw a warning when we know this fo


    we can infer this information or provide it explicitly in the metadata

what do we need to do to make a post work in lite?

explicitly defined dependencies.

working on a virtual file system is different than your standard file system. normally we don't have to define our environment each time, but independent of a real file system - in the browser - we need to install packages each time.
patching shit

if we use requests then we should used https://github.com/koenvo/pyodide-http
dealing with pidgy and extensions.
some packages won't work in lite so we will throw a warning when we know this fo

we can infer this information or provide it explicitly in the metadata

In 3

code

sometimes i forget imports

sometimes i forget imports

sometimes i forget imports

In 4

code

## [`depfinder`](https://github.com/ericdill/depfinder) to find packages in a notebook for python

some of my personal style choices might fail like when i use `__import__`, maybe this is a way to cut dependencies from the list.

## [`depfinder`](https://github.com/ericdill/depfinder) to find packages in a notebook for python

some of my personal style choices might fail like when i use `__import__`, maybe this is a way to cut dependencies from the list.

`depfinder`

some of my personal style choices might fail like when i use __import__ , maybe this is a way to cut dependencies from the list.

In 5

code

    from pathlib import Path
    import depfinder, pandas
    __import__("requests_cache").install_cache()

    from pathlib import Path
    import depfinder, pandas
    __import__("requests_cache").install_cache()

from pathlib import Path
import depfinder, pandas
__import__("requests_cache").install_cache()

In 6

code

    def get_files(dir="", glob="*.ipynb") -> pandas.Index:
        return pandas.Index(Path(dir).rglob(glob)).rename("files")

    def get_files(dir="", glob="*.ipynb") -> pandas.Index:
        return pandas.Index(Path(dir).rglob(glob)).rename("files")

def get_files(dir="", glob="*.ipynb") -> pandas.Index:
    return pandas.Index(Path(dir).rglob(glob)).rename("files")

In 7

code

    def get_cells(files: pandas.Index) -> pandas.DataFrame:
        df = (
            files.to_series().apply(Path.read_text)
            .apply(json.loads).apply(pandas.Series)
            .cells.apply(pandas.Series).stack().apply(pandas.Series)
        )
        return df.join(get_has_pidgy(df))

    def get_cells(files: pandas.Index) -> pandas.DataFrame:
        df = (
            files.to_series().apply(Path.read_text)
            .apply(json.loads).apply(pandas.Series)
            .cells.apply(pandas.Series).stack().apply(pandas.Series)
        )
        return df.join(get_has_pidgy(df))

def get_cells(files: pandas.Index) -> pandas.DataFrame:
    df = (
        files.to_series().apply(Path.read_text)
        .apply(json.loads).apply(pandas.Series)
        .cells.apply(pandas.Series).stack().apply(pandas.Series)
    )
    return df.join(get_has_pidgy(df))

markdown

### can haz pidgy?

### can haz pidgy?

can haz pidgy?

markdown

some of these posts are in `pidgy`, i'll use `%reload_ext pidgy` when that is the situation.
peek in the `cells` to find `pidgy` notebooks.

some of these posts are in `pidgy`, i'll use `%reload_ext pidgy` when that is the situation.
peek in the `cells` to find `pidgy` notebooks.


        {'data': {'text/html': "some of these posts are in pidgy, i'll use %reload_ext pidgy when that is the situation.\npeek in the cells to find pidgy notebooks.\n"}}

some of these posts are in pidgy , i'll use %reload_ext pidgy when that is the situation. peek in the cells to find pidgy notebooks.

In 8

code

    def get_has_pidgy(cells):
        return cells[cells.cell_type.eq("code")].source.apply("".join).groupby(
            pandas.Grouper(level=0)
        ).apply(lambda df: df.str.contains("%[re]*load_ext pidgy").any()).rename("pidgy")

    def get_has_pidgy(cells):
        return cells[cells.cell_type.eq("code")].source.apply("".join).groupby(
            pandas.Grouper(level=0)
        ).apply(lambda df: df.str.contains("%[re]*load_ext pidgy").any()).rename("pidgy")

def get_has_pidgy(cells):
    return cells[cells.cell_type.eq("code")].source.apply("".join).groupby(
        pandas.Grouper(level=0)
    ).apply(lambda df: df.str.contains("%[re]*load_ext pidgy").any()).rename("pidgy")

In 9

code

    cells = get_cells(get_files())

    cells = get_cells(get_files())

cells = get_cells(get_files())

In 10

code

### get the imports
    def get_import(row: pandas.Series) -> dict:
`get_import` normalizes the cell source code for analysis by `depfinder`.
this method catches those situations or returns the attributes of `depfinder.inspection.ImportFinder`

        source = "".join(row.source)
        if row.pidgy:
            source = midgy.python.Python().render(source)
        try:
            return vars(depfinder.inspection.get_imported_libs(textwrap.dedent(source), row.name[0]))
        except BaseException as e:
            return None

### get the imports
    def get_import(row: pandas.Series) -> dict:
`get_import` normalizes the cell source code for analysis by `depfinder`.
this method catches those situations or returns the attributes of `depfinder.inspection.ImportFinder`

        source = "".join(row.source)
        if row.pidgy:
            source = midgy.python.Python().render(source)
        try:
            return vars(depfinder.inspection.get_imported_libs(textwrap.dedent(source), row.name[0]))
        except BaseException as e:
            return None

get the imports

def get_import(row: pandas.Series) -> dict:

get_import normalizes the cell source code for analysis by depfinder . this method catches those situations or returns the attributes of depfinder.inspection.ImportFinder

    source = "".join(row.source)
    if row.pidgy:
        source = midgy.python.Python().render(source)
    try:
        return vars(depfinder.inspection.get_imported_libs(textwrap.dedent(source), row.name[0]))
    except BaseException as e:
        return None

markdown

### evaluate the sources

### evaluate the sources

evaluate the sources

In 11

code

    import depfinder, pandas, midgy
    __import__("requests_cache").install_cache()
    Ø = __name__ == "__main__" and "__file__" not in locals()

    import depfinder, pandas, midgy
    __import__("requests_cache").install_cache()
    Ø = __name__ == "__main__" and "__file__" not in locals()

import depfinder, pandas, midgy
__import__("requests_cache").install_cache()
Ø = __name__ == "__main__" and "__file__" not in locals()

In 12

code

    def get_modules(cells):
        return (
            (
                results:=
                cells[cells.cell_type.eq("code")].apply(get_import, axis=1)
                .dropna().apply(functools.partial(pandas.Series, dtype="O"))
            )[results.columns[results.columns.str.endswith("_modules")]]
        )

    def get_modules(cells):
        return (
            (
                results:=
                cells[cells.cell_type.eq("code")].apply(get_import, axis=1)
                .dropna().apply(functools.partial(pandas.Series, dtype="O"))
            )[results.columns[results.columns.str.endswith("_modules")]]
        )

def get_modules(cells):
    return (
        (
            results:=
            cells[cells.cell_type.eq("code")].apply(get_import, axis=1)
            .dropna().apply(functools.partial(pandas.Series, dtype="O"))
        )[results.columns[results.columns.str.endswith("_modules")]]
    )

markdown

a snapshot of the modules import within the content

a snapshot of the modules import within the content

a snapshot of the modules import within the content

In 13

code

    if Ø:
        (cells := get_cells(get_files()))
        (cells := cells.join(get_modules(cells)))
        modules = cells[cells.columns[cells.columns.str.endswith("_modules")]]
        modules = modules.stack().apply(list).apply(pandas.Series, dtype="O").stack()
        return HTML(modules.value_counts().to_frame().T.to_html())

    if Ø:
        (cells := get_cells(get_files()))
        (cells := cells.join(get_modules(cells)))
        modules = cells[cells.columns[cells.columns.str.endswith("_modules")]]
        modules = modules.stack().apply(list).apply(pandas.Series, dtype="O").stack()
        return HTML(modules.value_counts().to_frame().T.to_html())

	pandas	pathlib	requests	tonyfast	IPython	json	midgy	functools	toolz	re	depfinder	shlex	importnb	dataclasses	typer	pidgy	nbconvert	ipywidgets	info	textwrap	typing	traitlets	doit	dask	pluggy	sys	requests_cache	jsonref	playwright	nbformat	inspect	types	ast	uritemplate	__	jinja2	numpy	io	gc	pyld	orjson	jsonpointer	urllib.parse	linecache	operator	poser	matplotlib	rich	mpl_toolkits	importlib	anyio	unittest.mock	__static_notebook_tags	nbconvert_html5	bs4	mkdocs	icalendar	yaml	hvplot	jupyter_core	shutil	arrow	__11_12_async_import	doctest	unittest	__09_pyproject_analysis	html	tomli	pytest	__12_09_pyproject_analysis	warnings	_022_10_21_markdown_future	a_little_markdown_program	click	markdown	sympy	__better_dask_shape	asyncio	dis	traceback	micropip	__pycache__	ibis	duckdb	pyarrow	nest_asyncio	collections	abc	tqdm
0	41	30	14	14	14	13	12	12	11	10	10	7	7	7	7	6	6	6	6	6	6	5	5	5	5	5	5	5	5	5	4	4	4	4	4	4	3	3	3	3	3	3	3	3	3	2	2	2	2	2	2	2	2	2	2	2	2	2	2	2	2	2	2	2	2	2	2	2	2	2	2	2	2	2	2	1	1	1	1	1	1	1	1	1	1	1	1	1	1

if Ø:
    (cells := get_cells(get_files()))
    (cells := cells.join(get_modules(cells)))
    modules = cells[cells.columns[cells.columns.str.endswith("_modules")]]
    modules = modules.stack().apply(list).apply(pandas.Series, dtype="O").stack()
    return HTML(modules.value_counts().to_frame().T.to_html())

markdown

## todo

* inject the imports back into the notebooks. where though?
* find magics

## todo

* inject the imports back into the notebooks. where though?
* find magics

todo

inject the imports back into the notebooks. where though?
find magics