
extracting a dependency graph from importlib_metadata

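the cells below lean on a handful of imports; a minimal sketch of what they assume (compose_left and first are assumed to come from toolz, and pkg_resources ships with setuptools):

import functools
import importlib_metadata
import pandas
import pkg_resources
import networkx
import matplotlib.pyplot
from toolz import compose_left, first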
@functools.lru_cache  # cache this because the result will always be the same and parsing can be costly
def get_tidy_dist() -> pandas.DataFrame:
    """get_tidy_dist creates a tidy dataframe of the required distributions in this environment"""
    return get_dists().loc["Requires-Dist"].apply(
        compose_left(pkg_resources.parse_requirements, first, vars, pandas.Series)
    )
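
a hedged sketch of what that composition does to one requirement string (the string here is an assumed example; the attributes of the parsed pkg_resources requirement become the columns of the tidy frame):

req = first(pkg_resources.parse_requirements("jupyterlab ~=3.3.0"))
pandas.Series(vars(req))[["name", "specifier", "specs"]]
# name "jupyterlab", specifier "~=3.3.0", specs [('~=', '3.3.0')]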
def get_dists():
    """get_dists iterates through the importlib_metadata.distributions extracting the known metadata."""
    return pandas.Series(
        dict((x.name, x.metadata._headers) for x in importlib_metadata.distributions())
    ).rename_axis(index=["project"]).explode().apply(
        pandas.Series, index=["key", "value"]
    ).set_index("key", append=True).reorder_levels((1, 0), 0)["value"]
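
x.metadata._headers is the private list of (field, value) pairs on the email.message.Message backing each distribution's metadata; a small assumed peek at the raw material get_dists reshapes:

dist = first(importlib_metadata.distributions())
dist.name, dist.metadata._headers[:3]  # e.g. the Metadata-Version, Name, and Version fields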

applying the functions


generate the pandas.DataFrame of the dependency graph and metadata

the metadata associated with my known python dependencies

project   name               url   extras  specifier  marker  key                specs
retrolab  jupyterlab         None  ()      ~=3.3.0    None    jupyterlab         [('~=', '3.3.0')]
retrolab  jupyterlab-server  None  ()      ~=2.3      None    jupyterlab-server  [('~=', '2.3')]
retrolab  jupyter-server     None  ()      ~=1.4      None    jupyter-server     [('~=', '1.4')]
retrolab  nbclassic          None  ()      ~=0.2      None    nbclassic          [('~=', '0.2')]
retrolab  tornado            None  ()      >=6.1.0    None    tornado            [('>=', '6.1.0')]

(df := get_tidy_dist()).head().style.set_caption("the metadata associated with my known python dependencies")
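
the project name stays on the index, so the frame can be sliced per project; a hypothetical follow-up listing what one project requires:

df.loc["retrolab", ["name", "specifier"]]  # every requirement recorded for the retrolab distribution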

cast the tidy data as a networkx graph

G = df.reset_index().pipe(networkx.from_pandas_edgelist, source="project", target="name")
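
from_pandas_edgelist builds an undirected graph by default, so a node's neighbors mix what it requires with what requires it; a hypothetical query:

sorted(G["retrolab"])  # includes jupyter-server, jupyterlab, jupyterlab-server, nbclassic, tornado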
a table counting the frequency of specific distributions.

pytest              147
sphinx               78
pytest-cov           70
matplotlib           55
flake8               51
numpy                48
coverage             45
requests             41
pre-commit           36
typing-extensions    34
pandas               33
importlib-metadata   32
ipython              31
six                  30
packaging            30
black                29
pyyaml               28
jinja2               27
click                25
ipywidgets           25

df.name.value_counts().to_frame("count").head(20).T.style.set_caption("a table counting the frequency of specific distributions.")

draw the graph in matplotlib

(image: the full dependency graph drawn with networkx on a 20 by 20 inch matplotlib figure)
matplotlib.pyplot.gcf().set_size_inches((20, 20))
networkx.draw_networkx(G)
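
with this many nodes the default drawing gets crowded; an optional variant (an assumption, not part of the original cell) that fixes the layout seed and shrinks the markers and labels:

matplotlib.pyplot.figure(figsize=(20, 20))
networkx.draw_networkx(G, pos=networkx.spring_layout(G, seed=2), node_size=20, font_size=6)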