@tonyfast s notebooks

running the w3c validator

TLDR: the python package doesn't bundle an up-to-date validator, so we use node.

i use the w3c validator a lot in my accessibility testing. it always asks if i am a robot now. i thought it would be better to have a local version i could use. this document is how i got it working after significant troubleshooting.



installing the w3c nu validator is a good candidate for conda because we'll use python, java, and node. i tried using python's html5validator, but the version of the checker that it bundles is old. the best solution for me was to use node because i can figure it out.

use a conda environment.yml with at least:

    channels:
      - conda-forge
    dependencies:
      - python=3.11
      - openjdk
      - nodejs

install the jar distributed on npm

npm install -g vnu-jar
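before going further it can help to confirm that the tools from the environment are actually on the PATH. this is a minimal sketch and not part of the original notebook; `missing_tools` is a hypothetical helper built on the standard library's `shutil.which`.

```python
import shutil


def missing_tools(tools=("java", "node", "npm")):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# an empty list means the conda environment provides everything we need
print("missing:", missing_tools() or "nothing")
```

if anything is reported missing, the conda environment is probably not activated.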


running the validator


get the path to the jar from npm's global package directory

    import itertools, operator, functools, collections, exceptiongroup, re
    import pathlib, pandas, json, subprocess, shlex

    # `npm root -g` prints the global node_modules directory, where `npm install -g` put vnu-jar
    VNU_JAR = pathlib.Path(subprocess.check_output(
        shlex.split("npm root -g")
    ).strip().decode()) / "vnu-jar/build/dist/vnu.jar"
    assert VNU_JAR.exists()

validate_html runs the checker and returns the parsed json payload.

    def validate_html(*files: pathlib.Path) -> dict:
        # --exit-zero-always keeps check_output from raising when the page has validation errors
        return json.loads(subprocess.check_output(
            shlex.split(F"java -jar {VNU_JAR} --stdout --format json --exit-zero-always") + list(map(str, files))
        ))
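one thing to watch with a formatted command string is a jar or file path that contains spaces. a minimal sketch of building the argument list directly, which sidesteps quoting altogether; `vnu_command` is a hypothetical helper, and the flags are the ones used above.

```python
import pathlib


def vnu_command(jar, *files):
    """Build the argument list for running the vnu jar with json output."""
    return [
        "java", "-jar", str(jar),
        "--stdout", "--format", "json", "--exit-zero-always",
        *map(str, files),
    ]


# the jar path here is made up to show that spaces survive as a single argument
print(vnu_command(pathlib.Path("/tmp/my tools/vnu.jar"), pathlib.Path("page.html")))
```

the resulting list can be passed straight to `subprocess.check_output`.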

explore the data as a pandas DataFrame

    HTML = pathlib.Path("../../../notebooks-for-all/tests/exports/html/lorenz-executed.html")
    df = pandas.DataFrame(pandas.Series(validate_html(HTML)).messages)
    del df["url"]
|  | type | lastLine | lastColumn | firstColumn | subType | message | extract | hiliteStart | hiliteLength | firstLine |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | info | 81 | 27 | 4 | warning | Section lacks heading. Consider using “h2”-“h6... | ader">\n <section id="skip-link">\n < | 10 | 24 | NaN |
| 1 | error | 295 | 126 | 11 | NaN | The “role” attribute must not be used on a “tr... | </tr>\n <tr aria-labelledby="nb-cell-... | 10 | 127 | 294.0 |
| 2 | error | 296 | 40 | 127 | NaN | The “role” attribute must not be used on a “td... | listitem">\n <td class="nb-anchor" role="... | 10 | 41 | 295.0 |
| 3 | error | 301 | 49 | 12 | NaN | The “role” attribute must not be used on a “td... | </td>\n <td class="nb-execution_coun... | 10 | 50 | 300.0 |
| 4 | error | 306 | 43 | 12 | NaN | The “role” attribute must not be used on a “td... | </td>\n <td class="nb-cell_type" rol... | 10 | 44 | 305.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 173 | error | 1869 | 54 | 12 | NaN | The “role” attribute must not be used on a “td... | </td>\n <td class="nb-end" id="cell-... | 10 | 55 | 1868.0 |
| 174 | error | 1874 | 40 | 12 | NaN | The “role” attribute must not be used on a “td... | </td>\n <td class="nb-source" role="... | 10 | 41 | 1873.0 |
| 175 | error | 1897 | 42 | 12 | NaN | The “role” attribute must not be used on a “td... | </td>\n <td class="nb-metadata" role... | 10 | 43 | 1896.0 |
| 176 | error | 1912 | 54 | 12 | NaN | The “role” attribute must not be used on a “td... | </td>\n <td class="nb-loc" id="cell-... | 10 | 55 | 1911.0 |
| 177 | error | 1915 | 41 | 12 | NaN | The “role” attribute must not be used on a “td... | </td>\n <td class="nb-outputs" role=... | 10 | 42 | 1914.0 |

178 rows × 10 columns
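the frame above makes it easy to see which messages dominate. a rough stdlib-only equivalent is sketched below, assuming the payload keeps its `messages` list with `type` and `message` keys; `count_messages` and the sample payload are made up for illustration.

```python
import collections


def count_messages(payload):
    """Count validator messages by (type, message) pair."""
    return collections.Counter(
        (item["type"], item["message"]) for item in payload.get("messages", [])
    )


# hypothetical payload that mimics the shape of the vnu json output
sample = {"messages": [
    {"type": "error", "message": "bad role"},
    {"type": "error", "message": "bad role"},
    {"type": "info", "message": "Section lacks heading."},
]}

for (kind, message), count in count_messages(sample).most_common():
    print(count, kind, message)
```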


throwing exceptions

we need to collect these results and raise exceptions.


we need to organize the results into something that can be reported.

    results = validate_html(HTML)

group the results by severity and by the nu error message.

    def organize_validator_results(results):
        # nested mapping: severity ("error", "info", ...) -> message text -> list of occurrences
        collect = collections.defaultdict(functools.partial(collections.defaultdict, list))
        for (error, msg), group in itertools.groupby(results["messages"], key=operator.itemgetter("type", "message")):
            for item in group:
                collect[error][msg].append(item)
        return collect
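one detail worth knowing: `itertools.groupby` only groups consecutive items, so the same message can show up in several groups when the validator output is not sorted. accumulating into the nested defaultdict merges those groups again. a small sketch with made-up messages; `organize` is a stand-in for illustration, not the notebook's function.

```python
import collections
import functools
import itertools
import operator

# made-up messages where the two identical errors are not adjacent
messages = [
    {"type": "error", "message": "a", "extract": "1"},
    {"type": "info", "message": "b", "extract": "2"},
    {"type": "error", "message": "a", "extract": "3"},
]
key = operator.itemgetter("type", "message")

# groupby alone yields three groups here, one per run of equal keys
runs = [k for k, _ in itertools.groupby(messages, key=key)]
print(len(runs))


def organize(messages):
    """Collect items as severity -> message -> list, merging repeated runs."""
    collect = collections.defaultdict(functools.partial(collections.defaultdict, list))
    for (kind, msg), group in itertools.groupby(messages, key=key):
        collect[kind][msg].extend(group)
    return collect


organized = organize(messages)
print(len(organized["error"]["a"]))
```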

the page we are testing overrides the roles of table elements, and the validator reports each override as an error. this is a known issue, so we already have to ignore some results.

    EXCLUDE = re.compile(
        """or with a “role” attribute whose value is “table”, “grid”, or “treegrid”.$"""
    )

    def raise_if_errors(results, exclude=EXCLUDE):
        collect = organize_validator_results(results)
        exceptions = []
        for msg in collect["error"]:
            # keep a message when there is no exclusion pattern or the pattern doesn't match it
            if not exclude or not exclude.search(msg):
                exceptions.append(exceptiongroup.ExceptionGroup(msg, [Exception(x["extract"]) for x in collect["error"][msg]]))
        if exceptions:
            raise exceptiongroup.ExceptionGroup("nu validator errors", exceptions)
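to see what the exclusion pattern does, here is a minimal sketch that runs it against one of the known table-role messages and against an unrelated message. the `other` message is a made-up example of something that should still be reported.

```python
import re

EXCLUDE = re.compile(
    "or with a “role” attribute whose value is “table”, “grid”, or “treegrid”.$"
)

known = (
    "The “role” attribute must not be used on a “td” element which has a “table” "
    "ancestor with no “role” attribute, or with a “role” attribute whose value is "
    "“table”, “grid”, or “treegrid”."
)
# hypothetical message, only here to show a non-match
other = "Element “div” not allowed as child of element “span” in this context."

for message in (known, other):
    # True means the message would be ignored
    print(bool(EXCLUDE.search(message)), message[:45])
```

because the pattern is anchored at the end of the message, it covers both the “tr” and the “td” variants with a single expression.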

since i've been validating by hand, my page doesn't raise any errors except for the excluded ones. i'm really proud of that.


if we include all the validator errors then we raise an exception group

    raise_if_errors(results, None)
  + Exception Group Traceback (most recent call last):
  |   File "/home/tbone/mambaforge/envs/test-nbconvert-a11y/lib/python3.11/site-packages/IPython/core/", line 3548, in run_code
  |     exec(code_obj, self.user_global_ns, self.user_ns)
  |   File "/tmp/ipykernel_394727/", line 1, in <module>
  |     raise_if_errors(results, None)
  |   File "/tmp/ipykernel_394727/", line 8, in raise_if_errors
  |     raise exceptiongroup.ExceptionGroup("nu validator errors", exceptions)
  | ExceptionGroup: nu validator errors (2 sub-exceptions)
  +-+---------------- 1 ----------------
    | ExceptionGroup: The “role” attribute must not be used on a “tr” element which has a “table” ancestor with no “role” attribute, or with a “role” attribute whose value is “table”, “grid”, or “treegrid”. (16 sub-exceptions)
    +-+---------------- 1 ----------------
      | Exception:      </tr>
      |      <tr aria-labelledby="nb-cell-label 1 cell-1-cell_type" class="cell markdown" data-index="1" data-loc="1" role="listitem">
      +---------------- 2 ----------------
      | Exception:      </tr>
      |      <tr aria-labelledby="nb-cell-label 2 cell-2-cell_type" class="cell markdown" data-index="2" data-loc="1" role="listitem">
      +---------------- 3 ----------------
      | Exception:      </tr>
      |      <tr aria-labelledby="nb-cell-label 3 cell-3-cell_type" class="cell code" data-index="3" data-loc="2" role="listitem">
      +---------------- 4 ----------------
      | Exception:      </tr>
      |      <tr aria-labelledby="nb-cell-label 4 cell-4-cell_type" class="cell markdown" data-index="4" data-loc="11" role="listitem">
      +---------------- 5 ----------------
      | Exception:      </tr>
      |      <tr aria-labelledby="nb-cell-label 5 cell-5-cell_type" class="cell code" data-index="5" data-loc="3" role="listitem">
      +---------------- 6 ----------------
      | Exception:      </tr>
      |      <tr aria-labelledby="nb-cell-label 6 cell-6-cell_type" class="cell markdown" data-index="6" data-loc="1" role="listitem">
      +---------------- 7 ----------------
      | Exception:      </tr>
      |      <tr aria-labelledby="nb-cell-label 7 cell-7-cell_type" class="cell markdown" data-index="7" data-loc="1" role="listitem">
      +---------------- 8 ----------------
      | Exception:      </tr>
      |      <tr aria-labelledby="nb-cell-label 8 cell-8-cell_type" class="cell code" data-index="8" data-loc="1" role="listitem">
      +---------------- 9 ----------------
      | Exception:      </tr>
      |      <tr aria-labelledby="nb-cell-label 9 cell-9-cell_type" class="cell code" data-index="9" data-loc="1" role="listitem">
      +---------------- 10 ----------------
      | Exception:      </tr>
      |      <tr aria-labelledby="nb-cell-label 10 cell-10-cell_type" class="cell markdown" data-index="10" data-loc="1" role="listitem">
      +---------------- 11 ----------------
      | Exception:      </tr>
      |      <tr aria-labelledby="nb-cell-label 11 cell-11-cell_type" class="cell code" data-index="11" data-loc="1" role="listitem">
      +---------------- 12 ----------------
      | Exception:      </tr>
      |      <tr aria-labelledby="nb-cell-label 12 cell-12-cell_type" class="cell code" data-index="12" data-loc="1" role="listitem">
      +---------------- 13 ----------------
      | Exception:      </tr>
      |      <tr aria-labelledby="nb-cell-label 13 cell-13-cell_type" class="cell markdown" data-index="13" data-loc="1" role="listitem">
      +---------------- 14 ----------------
      | Exception:      </tr>
      |      <tr aria-labelledby="nb-cell-label 14 cell-14-cell_type" class="cell code" data-index="14" data-loc="1" role="listitem">
      +---------------- 15 ----------------
      | Exception:      </tr>
      |      <tr aria-labelledby="nb-cell-label 15 cell-15-cell_type" class="cell code" data-index="15" data-loc="2" role="listitem">
      +---------------- ... ----------------
      | and 1 more exception
    +---------------- 2 ----------------
    | ExceptionGroup: The “role” attribute must not be used on a “td” element which has a “table” ancestor with no “role” attribute, or with a “role” attribute whose value is “table”, “grid”, or “treegrid”. (160 sub-exceptions)
    +-+---------------- 1 ----------------
      | Exception: listitem">
      |       <td class="nb-anchor" role="none">
      +---------------- 2 ----------------
      | Exception:      </td>
      |       <td class="nb-execution_count" role="none">
      +---------------- 3 ----------------
      | Exception:      </td>
      |       <td class="nb-cell_type" role="none">
      +---------------- 4 ----------------
      | Exception:      </td>
      |       <td class="nb-toolbar" role="none">
      +---------------- 5 ----------------
      | Exception:      </td>
      |       <td class="nb-start" id="cell-1-start" role="none">
      +---------------- 6 ----------------
      | Exception:      </td>
      |       <td class="nb-end" id="cell-1-end" role="none">
      +---------------- 7 ----------------
      | Exception:      </td>
      |       <td class="nb-source" role="none">
      +---------------- 8 ----------------
      | Exception:      </td>
      |       <td class="nb-metadata" role="none">
      +---------------- 9 ----------------
      | Exception:      </td>
      |       <td class="nb-loc" id="cell-1-loc" role="none">
      +---------------- 10 ----------------
      | Exception:      </td>
      |       <td class="nb-outputs" role="none">
      +---------------- 11 ----------------
      | Exception: listitem">
      |       <td class="nb-anchor" role="none">
      +---------------- 12 ----------------
      | Exception:      </td>
      |       <td class="nb-execution_count" role="none">
      +---------------- 13 ----------------
      | Exception:      </td>
      |       <td class="nb-cell_type" role="none">
      +---------------- 14 ----------------
      | Exception:      </td>
      |       <td class="nb-toolbar" role="none">
      +---------------- 15 ----------------
      | Exception:      </td>
      |       <td class="nb-start" id="cell-2-start" role="none">
      +---------------- ... ----------------
      | and 145 more exceptions