adding an http attribute to pandas series
i really like fluent programming styles, which is why i love using pandas objects as first class citizens in my programs. i've been adding accessors to pandas over time to improve my practice. they do things like jinja templates, subprocess calls, and awaitables.
a feature i have not added yet is the ability to call http methods on a series of urls. an appropriately schematized dataframe could also support more fine tuning, but we ignore the dataframe case in this notebook and work off the series case.
import midgy
from nobook.utils import *
from nobook.utils import Accessor, SERIES, INDEX, loads
from functools import partial
use aiohttp as an async aware http requests tool, with a caching client from aiohttp_client_cache for edge buzz buzz.
class _http(Accessor, types=SERIES | INDEX, name="http"):
    def __init__(self, object):
        import aiohttp, aiohttp_client_cache

        super().__init__(object)
        # plain session, and a cached session backed by a local sqlite file
        self._client = aiohttp.ClientSession
        self._client_cache = partial(
            aiohttp_client_cache.CachedSession, cache=aiohttp_client_cache.SQLiteBackend(".pandas.cache.sqlite")
        )

    def client(self, cache=True):
        if cache:
            return self._client_cache()
        return self._client()

    async def get(self, mime=None, cache=True):
        # one session is shared by every url in the series
        async with self.client(cache) as session:
            data = await apply(self.object, self._get_one, mime=mime, session=session)
        return data

    async def _get_one(self, url, mime=None, session=None):
        async with session.get(url) as response:
            if mime is None:
                mime = response.headers.get("content-type", "text/plain")
            # drop parameters such as "; charset=utf-8"
            mime, *_ = mime.partition(";")
            data = loads(await response.text(), mime)
        return data
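to make the flow of `get` easier to follow, here is a minimal sketch of the same idea using only the standard library. it is not the accessor itself: `fake_fetch` stands in for `session.get` so nothing touches the network, `loads_by_mime` is a hypothetical stand-in for `nobook.utils.loads` (the real one may handle many more types), and `asyncio.gather` is my assumption about how `apply` could fan the requests out.

```python
import asyncio
import json


def split_mime(content_type):
    """drop parameters such as '; charset=utf-8' from a content-type value."""
    mime, _, _ = content_type.partition(";")
    return mime.strip().lower()


def loads_by_mime(text, mime):
    """hypothetical stand-in for nobook.utils.loads: parse text by mime type."""
    if split_mime(mime) == "application/json":
        return json.loads(text)
    return text  # fall back to the raw text for anything unrecognized


async def fake_fetch(url):
    """hypothetical stand-in for session.get: returns (content-type, body)."""
    await asyncio.sleep(0)  # yield to the event loop like a real request would
    return "application/json; charset=utf-8", json.dumps({"url": url})


async def get_one(url, mime=None):
    content_type, text = await fake_fetch(url)
    # an explicit mime wins, otherwise trust the response header
    return loads_by_mime(text, mime or content_type)


async def get_all(urls, mime=None):
    # gather runs the requests concurrently and keeps the input order
    return await asyncio.gather(*(get_one(url, mime) for url in urls))


urls = [f"https://example.invalid/?page={i}" for i in range(1, 5)]
results = asyncio.run(get_all(urls))
```

inside a notebook there is already a running event loop, so there you would write `results = await get_all(urls)` instead of calling `asyncio.run`.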
example grabbing data about my gist history.
gists = (
    await (
        "https://api.github.com/users/tonyfast/gists" + Index(range(1, 5)).map("?page={}".format)
    ).http.get("application/json")
).explode().series().set_index("id", append=True)
(files := gists.files.series().stack().series())
conclusion
- this is the densest i've ever written this practice. i love it.
- aiohttp includes a server that could also be attached to the dataframe to serve it; imagine df.serve().
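a rough sketch of what serving a dataframe could look like, assuming the server side of aiohttp (`aiohttp.web`). nothing here exists in the notebook: `render_records`, `make_app`, and `serve` are hypothetical names, and the choice to return the frame as json records at `/` is only one possible design.

```python
def render_records(frame):
    """serialize a frame as a json array of row records."""
    return frame.to_json(orient="records")


def make_app(frame):
    """hypothetical helper: build an aiohttp application serving the frame at '/'."""
    from aiohttp import web  # imported lazily, like the accessor does

    async def index(request):
        # render on every request so later changes to the frame show up
        return web.Response(text=render_records(frame), content_type="application/json")

    app = web.Application()
    app.add_routes([web.get("/", index)])
    return app


def serve(frame, port=8080):
    """hypothetical df.serve(): blocks until the process is interrupted."""
    from aiohttp import web

    web.run_app(make_app(frame), port=port)
```

`web.run_app` starts its own event loop, so from inside a notebook the app would more likely be started with `web.AppRunner` and `web.TCPSite` on the loop that is already running.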