proper html tables with multiple indexes¤

our goal is to reduce the empty cells in tables, especially where headers should be. empty cells diminish the experience for assistive technology users. through this study we'll design some accessible options we could use generically to represent dataframes.

    import pandas, bs4, enum, numpy, midgy
    get_ipython().display_formatter.formatters["text/html"].for_type(bs4.BeautifulSoup, str);
%%
<style>
:is(.jp-OutputArea-output.jp-RenderedHTMLCommon, .nb-outputs) :is(td,th) {
    border: 1px solid;
}
</style>

create a sample dataframe to work with that has multiple indexes on both axes. this facilitates our study because it is easier to remove axes than add them later. the code snippet below provides our expected outcome.

    index = pandas.MultiIndex.from_product([
        ["A", "Z"], ["M", "N", "O"], [1, 2, 3]
    ], names=[*"JKL"])
    (df := pandas.DataFrame(columns=index, index=index).rename_axis(columns=[10, 100, 1000]).head())
    single = df.droplevel((0, 1), 0).droplevel((0, 1), 1).rename_axis(None, axis=1).rename_axis(None, axis=0)
    df
10 A Z
100 M N O M N O
1000 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
J K L
A M 1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
N 1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

accessibility html recommendations¤

html tables have a long history. they have existed since html 3.2 in january 1997, and these standards precede much of the history of mass data literacy. tables were introduced to present 2-D data structures.

> VISICALC represented a new idea of a way to use a computer and a new way of thinking about the world. Where conventional programming was thought of as a sequence of steps, this new thing was no longer sequential in effect: When you made a change in one place, all other things changed instantly and automatically. >> Ted Nelson[13]

considering the accessibility of tables means we need to extend the visual representation to the tactile and audible experiences.

let's start with some popular advice from rachele ditullio about 5 ways to improve table accessibility

  1. Caption that table
  2. Include header text for every column
  3. Use alt attributes meaningfully
  4. Have data in every table cell
  5. Check your (con)text

this is where we start because these recommendations put users' needs first. accessibility requires us to center a user's visual, audible, and tactile experience when working with data.

testing an actual dataframe¤

based on these suggestions we can connect dataframe parlance to the consistent standards of html. what follows are comments on how each of the 5 suggestions applies to pandas.DataFrame objects.

  1. pandas tables typically lack a caption unless the code author is aware of df.style.set_caption. the caption element gives the table an accessible name, which gives assistive technology users more context as they navigate information.

        df.style.set_caption("the public api for adding a caption to a dataframe.")
    
  2. as of pandas v2.2, some configurations of columns and indexes generate representations containing empty headers. an accessible, assistive experience will avoid empty cells, especially header cells. the first cell in the table is empty for the cases where count_empty_cell reveals non-zero results. this means that assistive technology users will find empty cells in most of the pandas dataframe representations available online. this oversight is costly because technologies like screen readers and braille displays require parsing information serially, unlike the parallel experience of vision.

        def count_empty_cell(df):

    count_empty_cell counts the empty th and td elements in a rendered dataframe. it was created to demonstrate the different conditions on dataframe indexes and columns that influence the current visual form of the dataframe.

    • our test dataframe has empty cells: with both axes named, the index names are given their own row and the header rows are padded with empty cells.

          >>> assert count_empty_cell(df) > 0

    • a dataframe with a named column index has no empty cells.

          >>> assert count_empty_cell(single.rename_axis(columns="upper")) == 0

    • there are empty cells when the index is named because the index name is given its own row.

          >>> assert count_empty_cell(single.rename_axis(index="lower")) > 0

            table_cells = pandas.Series(bs4.BeautifulSoup(df.to_html(), features="lxml").select("th,td"))
            return table_cells.apply(any).__invert__().sum()
      
  3. adding alt text to images is out of scope for this investigation. it is very valid. pandas dataframes may contain various mimetypes of content and their representations should be assistive. however, for this study, the index and columns are the primary axes for building an accessible table substrate.

  4. null and NaN need semantically meaningful representations. the programming colloquialisms for empty content may not translate to assistive technology. what about braille? it is unlikely there is one best placeholder for these values, so the placeholder should be configurable.

        placeholder = "not a number"
        df.fillna(F'<span class="sro">{placeholder}</span>').style
    
  5. yes, abbreviations and punctuation should be considered, but this is an advanced technique that requires literacy in manual screen reader testing.
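the empty cell count from item 2 can also be sketched without pandas or bs4. this is a minimal sketch using only the standard library; `EmptyCellCounter`, `count_empty_cells`, and the two hand-written html strings are hypothetical names and inputs made up for illustration. unlike `count_empty_cell`, it treats whitespace-only cells as empty and it does not handle nested tables.

```python
from html.parser import HTMLParser


class EmptyCellCounter(HTMLParser):
    """Count th and td elements that contain no visible text."""

    def __init__(self):
        super().__init__()
        self.empty = 0
        self._text = None  # text of the currently open cell, None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag in ("th", "td"):
            self._text = ""

    def handle_data(self, data):
        if self._text is not None:
            self._text += data

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._text is not None:
            if not self._text.strip():
                self.empty += 1
            self._text = None


def count_empty_cells(html):
    """Return the number of empty th/td cells in an html string."""
    parser = EmptyCellCounter()
    parser.feed(html)
    parser.close()
    return parser.empty


# hand-written stand-ins for rendered tables (assumed, not pandas output)
unnamed = "<table><tr><th></th><th>1</th></tr><tr><th>1</th><td>NaN</td></tr></table>"
named = "<table><tr><th>upper</th><th>1</th></tr><tr><th>1</th><td>NaN</td></tr></table>"
```

here `count_empty_cells(unnamed)` finds the empty top-left header, while the `named` variant fills that cell with the column index name.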

the comparison between rachele's advice and pandas dataframes is just a start down the rabbit hole. we'll begin to bring in other articles, standards, and specifications to design ARIA first rule implementations of pandas tidy frames.
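returning to item 4, a configurable placeholder could look like the sketch below. it assumes the `sro` class from the snippet above is styled elsewhere (the source does not define it), and `render_cell` is a hypothetical helper, not part of pandas.

```python
import html
import math


def render_cell(value, placeholder="not a number"):
    """Return the inner html for one data cell.

    Missing values (None or a float NaN) become a configurable placeholder
    wrapped in a span; everything else is escaped text.
    """
    missing = value is None or (isinstance(value, float) and math.isnan(value))
    if missing:
        return f'<span class="sro">{html.escape(placeholder)}</span>'
    return html.escape(str(value))
```

the placeholder is an argument so that authors can choose wording that suits their audience, for example `render_cell(None, "missing")`.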

more accessible tables¤

our next resource provides more dos and don'ts that correspond to accessible table experiences.

  • ✅ Designate at least one row and/or column header using the table formatting tools in your web content management system or document creation software.

    pandas dataframes should be represented with a named index or column.

  • ✅ Use the <th> element to mark up table headers in HTML.

  • ❌ Table headers should never be empty. This is particularly of concern for the top-left cell of some tables.

    our primary task is to remove empty cells from the thead portion of the dataframe representation. currently, it is very common for screen reader users to find empty first cells in a table. imagine how much it sucks.

  • ✅ If you do create a complex data table on a webpage, use the <scope> tag to programmatically associate the data cells with the appropriate headers.

  • ❌ Don't merge cells.

    merging cells creates ambiguities. tds in dataframes will NEVER be spanned.
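the dos and don'ts above can be sketched for a simple table with one header row and one header column. this is only an illustration built with the standard library; `simple_table` and its arguments are hypothetical, and filling the top-left cell with the column index name is one possible choice, not a settled design.

```python
import xml.etree.ElementTree as ET


def simple_table(caption, column_name, columns, rows):
    """Build a table with a caption, scoped th headers, and no merged cells."""
    table = ET.Element("table")
    ET.SubElement(table, "caption").text = caption
    head = ET.SubElement(table, "tr")
    # the top-left cell names the column index instead of being left empty
    ET.SubElement(head, "th", scope="row").text = column_name
    for label in columns:
        ET.SubElement(head, "th", scope="col").text = str(label)
    for label, values in rows:
        tr = ET.SubElement(table, "tr")
        ET.SubElement(tr, "th", scope="row").text = str(label)
        for value in values:
            ET.SubElement(tr, "td").text = str(value)
    return table


table = simple_table("a small example", "upper", [1, 2], [("a", [10, 20]), ("b", [30, 40])])
markup = ET.tostring(table, encoding="unicode")
```

every header carries a `scope` and some text, and no cell uses `colspan` or `rowspan`.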

Accessible table do and don't¤

https://accessibility.umn.edu/what-you-can-do/start-7-core-skills/tables

Use Tables to Display Data

  • ✅ Use tables to present information in a grid, or matrix, with columns or rows that show the meaning of the information.
  • ❌ Don't use tables to make your webpage look a particular way. Layout tables on webpages do not pose inherent accessibility issues, but it is more difficult to make sure screen reader software reads the cells in the proper order.
  • ❌ Never use tables as a means of laying out a page in a Google or Microsoft Word document. While these tables can be hidden from visual users by simply eliminating the borders between cells, they cannot be hidden from screen readers.

Designate Row and/or Column Headers

  • ✅ Designate at least one row and/or column header using the table formatting tools in your web content management system or document creation software.
  • ✅ Use the `<th>` element to mark up table headers in HTML.
  • ❌ Don't create tables without table headers.
  • ❌ Don't just change the visual formatting of the text, such as the font size or color, to visually indicate table header rows and/or columns. Screen readers will not be able to associate the headers with the correct cells.
  • ❌ Table headers should never be empty. This is particularly of concern for the top-left cell of some tables.

Avoid or Simplify Complex Tables

  • ✅ Include a maximum of one header row and one header column.
  • ✅ Spell out abbreviations or acronyms, or use the `<abbr>` or `<acronym>` tags in HTML to ensure accessibility.
  • ✅ If your table has multiple header rows, merged cells, or another table embedded in it, split it into two or more simple tables.
  • ✅ If you do create a complex data table on a webpage, use the `<scope>` tag to programmatically associate the data cells with the appropriate headers.
  • ❌ Don't merge cells.
  • ❌ Don't include a table within another table.

Provide Contextual Information

  • ✅ Associate descriptive text about a table with its respective table by including a `<caption>` element in HTML or alt text in Microsoft Word. Captions are not necessary for each table, but can be helpful for screen reader users. The caption can be visually formatted and positioned above or below the table as needed, but on webpages, the `<caption>` element must be the first one after the opening `<table>` tag.
  • ✅ You may use `<thead>`, `<tbody>`, and `<tfoot>` tags in HTML tables so that the head and/or foot rows repeat at the top or bottom of the table when it is printed, but these do not provide any additional accessibility benefits.
  • ❌ Don't repeat the same text in the caption that appears in a heading preceding the table.
  • ❌ You may provide a summary of the structure of the data table (not of the content) using the `summary` attribute, but screen reader support for it varies, and it is not part of the HTML5 specification, so WebAim does not recommend it.
  • ❌ If both a caption and summary are provided for one table, the summary should not duplicate information present in the caption.

Include Content in All Cells

  • ✅ Include text such as "not applicable," "none," etc. to indicate that there is no data in empty cells.
  • ❌ Don't leave any table cells empty.

better tables¤

%%
    def index_span(index: pandas.Index) -> pandas.DataFrame: 
we need to tidy indexes whose levels may contain grouped, repeated labels. 
`index_span` defines the logic for diffing and labelling the index to measure the column
and row spans for an index.

        return pandas.concat(
            dict(
                diff=(diff := index.to_frame().pipe(diff_shift)),
                label=(label := diff.cumsum()),
                span=label.apply(
                    lambda s: s.drop_duplicates().apply(s.value_counts().get), axis=0
                )
            ), axis=1
        ).replace({numpy.nan: None})

    def diff_shift(df: pandas.DataFrame) -> pandas.DataFrame:
shift the data by one row to find where each index level changes; these change points are used to compute the spanning metrics.

        return pandas.DataFrame(
            numpy.concatenate((numpy.array([[True]*df.shape[1]]), df.values[:-1] != df.values[1:]), 0), 
            columns=df.columns
        )
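the diff, label, and span steps amount to a run-length encoding of each index level. a minimal sketch of the same idea follows, assuming plain tuples in place of a `pandas.MultiIndex` and using `itertools.groupby` instead of the diff and cumsum above; `level_spans` and `index_spans` are hypothetical names. like `diff_shift`, each level is compared independently of its parent levels.

```python
from itertools import groupby


def level_spans(labels):
    """Run lengths for one index level.

    The span is recorded on the first row of each run and None on the rows
    that the spanning header cell would cover.
    """
    spans = []
    for _, run in groupby(labels):
        length = len(list(run))
        spans.extend([length] + [None] * (length - 1))
    return spans


def index_spans(tuples):
    """Spans for every level of an index given as a list of tuples."""
    return [level_spans(level) for level in zip(*tuples)]


# the row index of the sample dataframe after head()
rows = [("A", "M", 1), ("A", "M", 2), ("A", "M", 3), ("A", "N", 1), ("A", "N", 2)]
spans = index_spans(rows)
```

for these rows the first level spans all five rows, the second level spans three rows and then two, and the last level never spans.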
%%
    def column_major(df: pandas.DataFrame, caption=None, SPAN=True) -> bs4.BeautifulSoup:
convert a dataframe to a `column_major` html representation that presents the column index names first.

        soup = bs4.BeautifulSoup(features="html.parser")
        soup.append(table := soup.new_tag("table"))
        if caption:
            table.append(cap := soup.new_tag("caption"))
            cap.append(caption)
        ROWS, COLS = any(df.index.names), any(df.columns.names)

pre-compute the grouping structure of the indexes

        row_span, col_span = index_span(df.index), index_span(df.columns)

        for col_level, col_name in enumerate(df.columns.names):
1. show the column index names

            table.append(tr := soup.new_tag("tr"))
            if COLS:
                attrs = dict(scope="row")
                if df.index.nlevels > 1:
                    attrs.update(colspan=df.index.nlevels)
                tr.append(th := soup.new_tag("th", attrs=attrs))
                # str(None) is truthy, so test for a missing name explicitly
                th.append(F"level {col_level}" if col_name is None else str(col_name))

            for col_index, col_value in enumerate(df.columns.get_level_values(col_level)):
1. show the column index values

                attrs = dict(scope="col")
                span = col_span["span"].iloc[col_index, col_level] if SPAN else 1
                if span:
                    if span > 1:
                        attrs.update(colspan=int(span))
                    tr.append(th := soup.new_tag("th", attrs=attrs))
                    th.append(str(col_value))
        if ROWS:
1. insert the row names below the column names 

            table.append(tr := soup.new_tag("tr"))
            attrs = dict(scope="col")
            for row_level, row_name in enumerate(df.index.names):
                tr.append(th := soup.new_tag("th", attrs=attrs))
                th.append(F"index {row_level}" if row_name is None else str(row_name))

            for col_value in df.columns.get_level_values(col_level):
   followed by a row of blank cells; a blank row is suboptimal for assistive technology.

                attrs = dict(scope="col")
                tr.append(td := soup.new_tag("td"))

        for row_index in range(df.shape[0]):
1. write the row index headers

            table.append(tr := soup.new_tag("tr"))
            for row_level in range(df.index.nlevels):
                span = row_span["span"].iloc[row_index, row_level] if SPAN else 1
                if span:
                    attrs = dict(scope="row")
                        if span > 1:
                        attrs.update(rowspan=int(span))
                    tr.append(th := soup.new_tag("th", attrs=attrs))
                    th.append(str(df.index.get_level_values(row_level)[row_index]))

            for value in df.iloc[row_index].values:
1. write the values of the dataframe

                tr.append(td := soup.new_tag("td"))
                td.append(str(value))
        return soup
%%
    def row_major(df, caption=None, SPAN=True):
a `row_major` representation that presents the row index names first.

        soup = bs4.BeautifulSoup(features="lxml")
        soup.append(table := soup.new_tag("table"))

        if caption:
            table.append(cap := soup.new_tag("caption"))
            cap.append(caption)

        ROWS, COLS = any(df.index.names), any(df.columns.names)
1. precompute the row and column index spans

        row_span, col_span = index_span(df.index), index_span(df.columns)

        for col_level, col_name in enumerate(df.columns.names):
            table.append(tr := soup.new_tag("tr"))
            if not col_level:
1. write the index names on the first pass of the header rows.

                if ROWS or not COLS:
                    attrs = dict(scope="col")
                    if df.columns.nlevels > 1:
                        attrs.update(rowspan=df.columns.nlevels) 
                    for row_level, row_name in enumerate(df.index.names):
                        tr.append(th := soup.new_tag("th", attrs=attrs))
                        th.append(F"index {row_level}" if row_name is None else str(row_name))

            if COLS:
1. include the column index names if they exist

                attrs = dict(scope="row")
                if not ROWS and df.index.nlevels > 1:
                    attrs.update(colspan=df.index.nlevels)
                tr.append(th := soup.new_tag("th", attrs=attrs))
                th.append(F"level {col_level}" if col_name is None else str(col_name))

            for col_index, col_value in enumerate(df.columns.get_level_values(col_level)):
1.  write the values for the column index

                attrs = dict(scope="col")
                span = col_span["span"].iloc[col_index, col_level] if SPAN else 1
                if span:
                    attrs = dict(scope="col")
                    if span > 1:
                        attrs.update(colspan=int(span))
                    tr.append(th := soup.new_tag("th", attrs=attrs))
                    th.append(str(col_value))


        for row_index in range(df.shape[0]):
1.  write the index header values

            table.append(tr := soup.new_tag("tr"))
            for row_level in range(df.index.nlevels):
                span = row_span["span"].iloc[row_index, row_level] if SPAN else 1
                if span:
                    attrs = dict(scope="row")
                    if span > 1:
                        attrs.update(rowspan=int(span))
                    tr.append(th := soup.new_tag("th", attrs=attrs))
                    th.append(str(df.index.get_level_values(row_level)[row_index]))

            if ROWS and COLS:
1.  insert an empty column if we have column names

                tr.append(td := soup.new_tag("td"))

            for value in df.iloc[row_index].values:
1.  write the data

                tr.append(td := soup.new_tag("td"))
                td.append(str(value))
        return soup
single index names¤

    row_major(df.head().rename_axis((None, None, None), axis=1).droplevel((0, 1), axis=1).droplevel((0,1), axis=0),
             "a single index row major")
a single index row major
L 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
1 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
2 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
3 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
1 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
2 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
    column_major(df.head().rename_axis((None, None, None), axis=0).droplevel((0, 1), axis=0).droplevel((0,1), axis=1),
             "a single index column major")
a single index column major
1000 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
1 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
2 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
3 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
1 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
2 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
    row_major(df.head().rename_axis((None, None, None), axis=1).droplevel(0, axis=1).droplevel((0,1), axis=0),
             "a multi index row major")
a multi index row major
L M N O M N O
1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
1 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
2 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
3 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
1 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
2 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
    row_major(df.head().rename_axis((None, None, None), axis=0).droplevel(0, axis=0).droplevel((0,1), axis=1),
             "a multi index column major")
a multi index column major
1000 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
M 1 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
2 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
3 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
N 1 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
2 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan

one named index spanning¤

    row_major(df.head().rename_axis((None, None, None), axis=1), "spanning multiple index row major")
spanning multiple index row major
J K L A Z
M N O M N O
1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
A M 1 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
2 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
3 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
N 1 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
2 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
    column_major(df.head().rename_axis((None, None, None), axis=0), "spanning multiple index column major")
spanning multiple index column major
10 A Z
100 M N O M N O
1000 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
A M 1 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
2 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
3 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
N 1 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
2 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan

spanning two named indexes¤

    row_major(df.head(), "spanning multiple indexes row major")
spanning multiple indexes row major
J K L 10 A Z
100 M N O M N O
1000 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
A M 1 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
2 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
3 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
N 1 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
2 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
    column_major(df.head(), "spanning multiple index column major")
spanning multiple index column major
10 A Z
100 M N O M N O
1000 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
J K L
A M 1 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
2 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
3 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
N 1 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
2 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan

it is hard to imagine a way of constructing a table for a dataframe with two named indexes without empty cells. we'll likely include an empty row or column. this might seem like an authoring choice, but we can't know our viewers' intent when they interrogate data. the most flexible compromise would be to allow this to change on the client, where the author's only choice sets the steady state.

non-spanning frames¤

so far we have only illustrated spanning examples, meaning that a row or column index value may span multiple rows or columns. only headers span in these dataframes; data cells do not. spanning can really suck for assistive technology users because it introduces navigation ambiguities and complications.

    row_major(df.head(), "non-spanning multiple indexes row major", False)
non-spanning multiple indexes row major
J K L 10 A A A A A A A A A Z Z Z Z Z Z Z Z Z
100 M M M N N N O O O M M M N N N O O O
1000 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
A M 1 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
A M 2 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
A M 3 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
A N 1 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
A N 2 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
    column_major(df.head(), "non-spanning multiple index column major", False)
non-spanning multiple index column major
10 A A A A A A A A A Z Z Z Z Z Z Z Z Z
100 M M M N N N O O O M M M N N N O O O
1000 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
J K L
A M 1 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
A M 2 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
A M 3 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
A N 1 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
A N 2 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
visual and nonvisual shape¤

row and column references in a table are not the same as in a dataframe. convention holds that dataframes show their shape, the shape of the data. the nominal references we use for dataframes become shifted ordinal references when the shape is shared alongside a screen reader. to the screen reader, the shape of the table includes the header rows and columns. to assistive technology, the shape of the dataframe is computed by:

    df.shape[0] + df.columns.nlevels, df.shape[1] + df.index.nlevels

this naive heuristic is only true for certain combinations of multi indexes. a more rigorous implementation would handle these edge cases.

these inconsistencies mean that screen reader users may be referencing a different indexing system than sighted users.

we can improve the captioning by stating the nominal shape alongside the actual shape. mentioning the row and column levels would help readers parse this content. if we are requiring folks to do math in their heads then we'll want an adaptive approach to discussing shape.
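as a worked instance of the naive heuristic, the sketch below computes the table shape from plain numbers and writes both shapes into a caption. `table_shape` and `shape_caption` are hypothetical helpers and the caption wording is only a suggestion. the heuristic ignores extra rows such as the row of index names in the column major layout.

```python
def table_shape(data_shape, column_levels=1, index_levels=1):
    """Naive table shape: the data plus its header rows and header columns."""
    rows, columns = data_shape
    return rows + column_levels, columns + index_levels


def shape_caption(data_shape, column_levels=1, index_levels=1):
    """Describe the nominal dataframe shape next to the shape of the table."""
    rows, columns = data_shape
    table_rows, table_columns = table_shape(data_shape, column_levels, index_levels)
    return (
        f"{rows} rows by {columns} columns of data, "
        f"presented as a table of {table_rows} rows by {table_columns} columns "
        f"with {column_levels} column levels and {index_levels} row levels"
    )


# the sample dataframe: 5 rows, 18 columns, 3 levels on each axis
sample = table_shape((5, 18), column_levels=3, index_levels=3)
```

for the sample dataframe the nominal shape is 5 by 18, while the table presented to a screen reader is 8 by 21 under this heuristic.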

todo¤

  • AT shape vs nominal shape information
  • dataframes larger than truncated thresholds where we will have to introduce aria