index
execution_count
cell_type
toolbar
started_at
completed_at
source
loc
metadata
outputs
1
unexecuted
In
[
]
markdown
# analyzing `pyproject.toml` configurations of popular projects
metadata
1
2
unexecuted
In
[
]
markdown
`get_pyproject_config_data` loads a dataframe created in a [previous post].
it contains the results of a <var>graphql</var> query posted to the github api
that returns the `pyproject.toml` files for some of the most popular python projects on github.
[previous post]: 2022-12-09-pyproject-analysis.ipynb
metadata
5
3
executed
In
[
1
]
code
def get_pyproject_config_data() -> "pandas.DataFrame":
    # import the helpers defined in the previous post's notebook
    with importnb.Notebook():
        from __09_pyproject_analysis import tidy_configs, tidy_responses, gather, pyproject_query
    return tidy_configs(df := tidy_responses(responses := gather(pyproject_query, max=15)))
metadata
3
0 outputs.
Out
[
1
]
4
unexecuted
In
[
]
markdown
the shape of our dataframe - `df` from `get_pyproject_config_data` - is:
* on the rows: one project per row, indexed by its github `url`
* on the columns: the top-level keys found in the projects' `pyproject.toml` files
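
to make that shape concrete, here is a minimal sketch in plain python, without pandas. the two configs and their urls are made-up stand-ins, not rows from the dataset; a key that a project does not define becomes `None`, much like it shows up as `NaN` in the frame.

```python
# hypothetical parsed pyproject.toml files, keyed by a made-up project url
configs = {
    "https://example.com/project-a": {"tool": {"black": {"line-length": 88}}},
    "https://example.com/project-b": {
        "build-system": {"requires": ["setuptools", "wheel"]},
        "project": {"name": "project-b"},
    },
}

# columns: every top-level key seen in any file, in first-seen order
columns = list(dict.fromkeys(key for config in configs.values() for key in config))

# rows: one per project; keys a project does not define are filled with None
rows = {
    url: {column: config.get(column) for column in columns}
    for url, config in configs.items()
}

for url, row in rows.items():
    print(url, row)
```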
metadata
4
5
executed
In
[
2
]
code
import importnb, pandas
display((df := get_pyproject_config_data()).head(3))
F"there are {len(df)} `pyproject.toml` files in the dataset."
metadata
3
2 outputs.
Out
[
2
]
tool
build-system
project
mypy
flake8
url
https://github.com/open-telemetry/opentelemetry-python
{'black': {'line-length': 79, 'exclude': '(
/( # generated files
.tox|
venv|
.*/build/lib/.*|
exporter/opentelemetry-exporter-jaeger-proto-grpc/src/opentelemetry/exporter/jaeger/proto/grpc/gen|
exporter/opentelemetry-exporter-jaeger-thrift/src/opentelemetry/exporter/jaeger/thrift/gen|
exporter/opentelemetry-exporter-zipkin-proto-http/src/opentelemetry/exporter/zipkin/proto/http/v2/gen|
opentelemetry-proto/src/opentelemetry/proto/.*/.*|
scripts
)/
)
'}, 'pytest': {'ini_options': {'addopts': '-rs -v', 'log_cli': True, 'log_cli_level': 'warning'}}}
NaN
NaN
NaN
NaN
https://github.com/freemocap/freemocap
{'taskipy': {'tasks': {'setup': 'pre-commit install', 'test': 'python -m unittest src/tests/**/test_*', 'installer': './bin/installer.sh', 'format': 'black src/'}}}
NaN
NaN
NaN
NaN
https://github.com/3b1b/manim
NaN
{'requires': ['setuptools', 'wheel']}
NaN
NaN
NaN
'there are 234 `pyproject.toml` files in the dataset.'
6
unexecuted
In
[
]
markdown
### what tools are used most?
PEP 518 defines the `tool` table as the place where third-party tools can store their configuration.
metadata
3
7
unexecuted
In
[
]
markdown
when we expand `df.tool` into `tools` we get a frame with one column for every third-party tool named in the files.
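
a plain python sketch of that expansion, assuming three hypothetical projects: each project's `tool` dict contributes its keys as columns, and projects without a `tool` table are dropped first. the urls and settings are invented for illustration.

```python
from collections import Counter

# hypothetical `tool` tables for three projects (None: the file has no tool table)
tool_tables = {
    "https://example.com/project-a": {"black": {"line-length": 79}, "pytest": {}},
    "https://example.com/project-b": {"black": {}, "isort": {"profile": "black"}},
    "https://example.com/project-c": None,
}

# drop projects without a tool table, as `dropna` does for the frame
present = {url: table for url, table in tool_tables.items() if table is not None}

# one column per tool name seen in any remaining project
columns = sorted({name for table in present.values() for name in table})

# the same structure also gives a tally of how many projects define each tool
counts = Counter(name for table in present.values() for name in table)

print(columns)
print(counts.most_common())
```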
metadata
1
8
executed
In
[
3
]
code
(tools := df.tool.dropna().apply(pandas.Series)).head(3).fillna("")
metadata
1
1 outputs.
Out
[
3
]
black
pytest
taskipy
pyright
hatch
isort
mutmut
check-wheel-contents
flit
coverage
...
jupyter-releaser
check-manifest
vendoring
commitizen
scriv
autoflake
tbump
autopub
poetry-version-plugin
typeshed
url
https://github.com/open-telemetry/opentelemetry-python
{'line-length': 79, 'exclude': '(
/( # generated files
.tox|
venv|
.*/build/lib/.*|
exporter/opentelemetry-exporter-jaeger-proto-grpc/src/opentelemetry/exporter/jaeger/proto/grpc/gen|
exporter/opentelemetry-exporter-jaeger-thrift/src/opentelemetry/exporter/jaeger/thrift/gen|
exporter/opentelemetry-exporter-zipkin-proto-http/src/opentelemetry/exporter/zipkin/proto/http/v2/gen|
opentelemetry-proto/src/opentelemetry/proto/.*/.*|
scripts
)/
)
'}
{'ini_options': {'addopts': '-rs -v', 'log_cli': True, 'log_cli_level': 'warning'}}
...
https://github.com/freemocap/freemocap
{'tasks': {'setup': 'pre-commit install', 'test': 'python -m unittest src/tests/**/test_*', 'installer': './bin/installer.sh', 'format': 'black src/'}}
...
https://github.com/openai/gym
{'ini_options': {'filterwarnings': ['ignore:.*step API.*:DeprecationWarning']}}
{'include': ['gym/**', 'tests/**'], 'exclude': ['**/node_modules', '**/__pycache__'], 'strict': [], 'typeCheckingMode': 'basic', 'pythonVersion': '3.6', 'pythonPlatform': 'All', 'typeshedPath': 'typeshed', 'enableTypeIgnoreComments': True, 'reportMissingImports': 'none', 'reportMissingTypeStubs': False, 'reportInvalidTypeVarUse': 'none', 'reportGeneralTypeIssues': 'none', 'reportUntypedFunctionDecorator': 'none', 'reportPrivateUsage': 'warning', 'reportUnboundVariable': 'warning'}
...
3 rows × 54 columns
9
executed
In
[
4
]
code
F"there are {len(tools.columns)} tools used in the {len(df)} pyproject.toml files."
metadata
1
1 outputs.
Out
[
4
]
'there are 54 tools used in the 234 pyproject.toml files.'
10
unexecuted
In
[
]
markdown
the `top12` most frequently configured tools across the `pyproject.toml` files are:
metadata
1
11
executed
In
[
8
]
code
# count the projects that define each tool (non-missing cells per column)
tool_counts = tools.notna().sum().sort_values(ascending=False)
(top12 := tool_counts.iloc[:12]).to_frame("counts").T
metadata
2
1 outputs.
Out
[
8
]
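| | black | isort | pytest | mypy | coverage | poetry | setuptools_scm | hatch | setuptools | pylint | towncrier | pyright |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| counts | 123 | 85 | 67 | 42 | 34 | 32 | 21 | 15 | 14 | 14 | 11 | 10 |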
12
unexecuted
In
[
]
markdown
from the perspective of these popular projects:
* there is strong community adoption of `black` (123 of 234 projects) and `isort` (85 of 234).
  this data suggests that formatting your code and sorting your imports is a recommended convention.
* `pytest`'s third place popularity suggests that we should test our projects.
* `mypy` suggests that type hinting is a feature of some popular projects.
* `coverage`
* `poetry`
* `setuptools_scm`
* `hatch`
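
to put the leading counts in proportion, a short sketch that turns them into shares of the 234 files. the counts are copied from the `top12` output; the rounding to one decimal is a choice made here for readability.

```python
# counts copied from the `top12` output; 234 is the number of files in `df`
total = 234
counts = {"black": 123, "isort": 85, "pytest": 67, "mypy": 42}

# share of projects whose pyproject.toml configures each tool, in percent
shares = {tool: round(100 * count / total, 1) for tool, count in counts.items()}

for tool, share in shares.items():
    print(f"{tool}: {share}%")
```

these shares only reflect configuration stored in `pyproject.toml`.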
next we find
metadata
12
13
unexecuted
In
[
]
markdown
metadata
1
14
executed
In
[
9
]
code
df["build-system"].dropna().apply(pandas.Series)
metadata
1
1 outputs.
Out
[
9
]
| url | requires | build-backend | dependencies |
|---|---|---|---|
| https://github.com/3b1b/manim | [setuptools, wheel] | NaN | NaN |
| https://github.com/deepmind/hanabi-learning-environment | [setuptools, wheel, scikit-build, cmake, ninja] | NaN | NaN |
| https://github.com/miguelgrinberg/Flask-SocketIO | [setuptools>=42, wheel] | setuptools.build_meta | NaN |
| https://github.com/pypa/pipx | [hatchling>=0.15.0] | hatchling.build | NaN |
| https://github.com/py-pdf/PyPDF2 | [flit_core >=3.2,<4] | flit_core.buildapi | NaN |
| ... | ... | ... | ... |
| https://github.com/deepset-ai/haystack | [hatchling>=1.8.0] | hatchling.build | NaN |
| https://github.com/dedupeio/dedupe | [setuptools==63, wheel, cython] | setuptools.build_meta | NaN |
| https://github.com/sktime/sktime | [setuptools>61, wheel, toml, build] | setuptools.build_meta | NaN |
| https://github.com/enthought/mayavi | [oldest-supported-numpy, setuptools, vtk, wheel] | NaN | NaN |
| https://github.com/holoviz/datashader | [param, pyct, setuptools] | setuptools.build_meta | NaN |

173 rows × 3 columns
15
unexecuted
In
[
]
markdown
metadata
1
16
executed
In
[
10
]
code
df.project.dropna().apply(pandas.Series).head(0)
metadata
1
1 outputs.
Out
[
10
]
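| url | name | description | readme | license | requires-python | keywords | authors | classifiers | dependencies | dynamic | urls | scripts | maintainers | optional-dependencies | entry-points | gui-scripts | version |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|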