index
execution_count
cell_type
toolbar
started_at
completed_at
source
loc
metadata
outputs
1
unexecuted
In
[
]
markdown
# running the w3c validator
TLDR: the python package doesn't bundle the up-to-date validator, so we use node.
i use https://validator.w3.org/ a lot in my accessibility testing. it always asks if i am a robot now. i thought it would be better to have a local version i could use. this document is how i got it working after significant troubleshooting.
metadata
5
2
unexecuted
In
[
]
markdown
## installation
installing the w3c nu validator https://github.com/validator/validator/ is a good candidate for `conda` because we'll
use python, java, and node. i tried using python's `html5validator`, but the
version of the checker that it bundles is old. the best solution for me was to use
`node` because i could figure it out.
use a conda `environment.yml` with at least:
```yml
channels:
- conda-forge
dependencies:
- python=3.11
- openjdk
- nodejs
```
install the jar distributed on `npm`
`npm install -g vnu-jar`
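before going further it can help to confirm the environment actually exposes the three tools. this is a minimal sketch using only the standard library; the helper names `find_tools` and `missing_tools` are made up for illustration and are not part of the notebook.

```python
import shutil


def find_tools(names=("java", "node", "npm")):
    """Map each executable name to its resolved path, or None when it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names=("java", "node", "npm")):
    """List the executables that could not be found on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]
```

an empty list from `missing_tools()` suggests the conda environment is active and the later `java -jar` and `npm` calls have a chance of working.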
metadata
20
3
unexecuted
In
[
]
markdown
metadata
1
4
unexecuted
In
[
]
markdown
get the path to the jar from the `npm` install location
metadata
1
5
executed
In
[
1
]
code
2023-11-16T01:46:29.530029+00:00
2023-11-16T01:46:29.968247+00:00
import itertools, operator, functools, collections, exceptiongroup, re
import pathlib, pandas, json, subprocess, shlex
VNU_JAR = pathlib.Path(subprocess.check_output(
    shlex.split(
        "npm root vnu-jar"
    )
).strip().decode()) / "vnu-jar/build/dist/vnu.jar"
assert VNU_JAR.exists()
metadata
8
0 outputs.
Out
[
1
]
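the install step above uses `-g`, while a plain `npm root` reports the local `node_modules` directory, so the lookup can miss a globally installed jar. a hedged sketch that tries both locations follows. `npm root -g` is the standard way to print the global root; the helper names `npm_roots` and `first_existing_jar` are hypothetical.

```python
import pathlib
import subprocess

# relative location of the jar inside node_modules, as used in this notebook
JAR_RELATIVE = pathlib.Path("vnu-jar/build/dist/vnu.jar")


def npm_roots():
    """Yield the local and then the global node_modules directories reported by npm."""
    for args in (["npm", "root"], ["npm", "root", "-g"]):
        yield pathlib.Path(subprocess.check_output(args).decode().strip())


def first_existing_jar(roots, relative=JAR_RELATIVE):
    """Return the first jar that exists under the given roots, or None."""
    for root in roots:
        candidate = pathlib.Path(root) / relative
        if candidate.exists():
            return candidate
    return None
```

with this, `VNU_JAR = first_existing_jar(npm_roots())` would return `None` instead of failing an assertion when the package is not installed in either place.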
6
unexecuted
In
[
]
markdown
`validate_html` runs the checker and returns the parsed json payload.
metadata
1
7
executed
In
[
2
]
code
2023-11-16T01:46:29.969581+00:00
2023-11-16T01:46:29.971404+00:00
def validate_html(*files: pathlib.Path) -> dict:
    # run the checker on the files and parse the json report it prints to stdout
    return json.loads(subprocess.check_output(
        shlex.split(
            f"java -jar {VNU_JAR} --stdout --format json --exit-zero-always"
        ) + list(files)
    ).decode())
metadata
6
0 outputs.
Out
[
2
]
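the payload is a dictionary with a `messages` list, and each message carries a `type` plus fields like `message` and `extract`. a small sketch of summarizing such a payload by severity, without running java, is below. the `sample` payload is hand written for illustration and `count_by_type` is a hypothetical helper.

```python
import collections

# hypothetical payload, shaped like the checker's json report
sample = {
    "messages": [
        {"type": "info", "subType": "warning", "message": "example warning"},
        {"type": "error", "message": "example error"},
        {"type": "error", "message": "example error"},
    ]
}


def count_by_type(payload):
    """Count the messages in a checker payload by their `type` field."""
    return collections.Counter(m["type"] for m in payload.get("messages", []))


counts = count_by_type(sample)
```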
8
unexecuted
In
[
]
markdown
## explore the data as a `pandas.DataFrame`
metadata
1
9
executed
In
[
3
]
code
2023-11-16T01:46:29.972257+00:00
2023-11-16T01:46:31.187254+00:00
HTML = pathlib.Path("../../../notebooks-for-all/tests/exports/html/lorenz-executed.html")
df = pandas.DataFrame(pandas.Series(validate_html(HTML)).messages)
del df["url"]
df
metadata
4
1 outputs.
Out
[
3
]
| | type | lastLine | lastColumn | firstColumn | subType | message | extract | hiliteStart | hiliteLength | firstLine |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | info | 81 | 27 | 4 | warning | Section lacks heading. Consider using “h2”-“h6... | `ader">\n <section id="skip-link">\n <` | 10 | 24 | NaN |
| 1 | error | 295 | 126 | 11 | NaN | The “role” attribute must not be used on a “tr... | `</tr>\n <tr aria-labelledby="nb-cell-...` | 10 | 127 | 294.0 |
| 2 | error | 296 | 40 | 127 | NaN | The “role” attribute must not be used on a “td... | `listitem">\n <td class="nb-anchor" role="...` | 10 | 41 | 295.0 |
| 3 | error | 301 | 49 | 12 | NaN | The “role” attribute must not be used on a “td... | `</td>\n <td class="nb-execution_coun...` | 10 | 50 | 300.0 |
| 4 | error | 306 | 43 | 12 | NaN | The “role” attribute must not be used on a “td... | `</td>\n <td class="nb-cell_type" rol...` | 10 | 44 | 305.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 173 | error | 1869 | 54 | 12 | NaN | The “role” attribute must not be used on a “td... | `</td>\n <td class="nb-end" id="cell-...` | 10 | 55 | 1868.0 |
| 174 | error | 1874 | 40 | 12 | NaN | The “role” attribute must not be used on a “td... | `</td>\n <td class="nb-source" role="...` | 10 | 41 | 1873.0 |
| 175 | error | 1897 | 42 | 12 | NaN | The “role” attribute must not be used on a “td... | `</td>\n <td class="nb-metadata" role...` | 10 | 43 | 1896.0 |
| 176 | error | 1912 | 54 | 12 | NaN | The “role” attribute must not be used on a “td... | `</td>\n <td class="nb-loc" id="cell-...` | 10 | 55 | 1911.0 |
| 177 | error | 1915 | 41 | 12 | NaN | The “role” attribute must not be used on a “td... | `</td>\n <td class="nb-outputs" role=...` | 10 | 42 | 1914.0 |

178 rows × 10 columns
10
unexecuted
In
[
]
markdown
## throwing exceptions
we need to collect these results and raise exceptions.
metadata
3
11
unexecuted
In
[
]
markdown
we need to organize the `results` into something that can be reported.
metadata
1
12
executed
In
[
4
]
code
2023-11-16T01:46:31.188534+00:00
2023-11-16T01:46:32.371238+00:00
results = validate_html(HTML)
metadata
1
0 outputs.
Out
[
4
]
13
unexecuted
In
[
]
markdown
group the `results` by severity and by the nu error message.
metadata
1
14
executed
In
[
5
]
code
2023-11-16T01:46:32.373148+00:00
2023-11-16T01:46:32.376384+00:00
def organize_validator_results(results):
    # nested mapping: severity -> message text -> list of matching items
    collect = collections.defaultdict(functools.partial(collections.defaultdict, list))
    for (error, msg), group in itertools.groupby(results["messages"], key=operator.itemgetter("type", "message")):
        for item in group:
            collect[error][msg].append(item)
    return collect
metadata
6
0 outputs.
Out
[
5
]
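one detail worth spelling out: `itertools.groupby` only groups consecutive items, so the same message can show up in several runs. the function still works because every run appends into the same nested list. a plain loop gives the same structure, sketched here with a made-up `sample` payload and the hypothetical name `organize_by_type_and_message`.

```python
import collections


def organize_by_type_and_message(results):
    """Build severity -> message -> [items] with a plain loop instead of groupby."""
    collect = collections.defaultdict(lambda: collections.defaultdict(list))
    for item in results["messages"]:
        collect[item["type"]][item["message"]].append(item)
    return collect


# hypothetical payload where the same error message is not consecutive
sample = {
    "messages": [
        {"type": "error", "message": "a", "extract": "first"},
        {"type": "info", "message": "b", "extract": "second"},
        {"type": "error", "message": "a", "extract": "third"},
    ]
}
organized = organize_by_type_and_message(sample)
```

both occurrences of message `a` end up in the same list even though an `info` item sits between them.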
15
unexecuted
In
[
]
markdown
the page we are testing overrides the roles of `table` elements, and the validator reports those overrides as errors. this is a known issue (https://github.com/validator/validator/issues/1125), so we already have to ignore some results.
metadata
1
16
executed
In
[
6
]
code
2023-11-16T01:46:32.377740+00:00
2023-11-16T01:46:32.380457+00:00
EXCLUDE = re.compile(
    """or with a “role” attribute whose value is “table”, “grid”, or “treegrid”.$"""
    # https://github.com/validator/validator/issues/1125
)
metadata
4
0 outputs.
Out
[
6
]
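to see what the pattern does, here is a small check against the full `td` message that appears in the traceback at the end of this notebook, plus a made-up message that should not match. `is_excluded` is a hypothetical helper that mirrors the condition used when filtering.

```python
import re

EXCLUDE = re.compile(
    """or with a “role” attribute whose value is “table”, “grid”, or “treegrid”.$"""
)

# the "td" message reported by the checker for this page
excluded = (
    "The “role” attribute must not be used on a “td” element which has a “table” "
    "ancestor with no “role” attribute, or with a “role” attribute whose value is "
    "“table”, “grid”, or “treegrid”."
)
# hypothetical message, only here to show a non-match
kept = "a hypothetical message about something else"


def is_excluded(message, exclude=EXCLUDE):
    """True when a pattern is given and it matches the message."""
    return bool(exclude and exclude.search(message))
```

passing `None` as the pattern turns the filter off, which is how every error gets reported.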
17
executed
In
[
7
]
code
2023-11-16T01:46:32.381889+00:00
2023-11-16T01:46:32.383992+00:00
def raise_if_errors(results, exclude=EXCLUDE):
    collect = organize_validator_results(results)
    exceptions = []
    for msg in collect["error"]:
        # keep the message unless an exclude pattern is given and matches it
        if not exclude or not exclude.search(msg):
            exceptions.append(exceptiongroup.ExceptionGroup(msg, [Exception(x["extract"]) for x in collect["error"][msg]]))
    if exceptions:
        raise exceptiongroup.ExceptionGroup("nu validator errors", exceptions)
metadata
8
0 outputs.
Out
[
7
]
18
unexecuted
In
[
]
markdown
since i've been hand validating, my page doesn't raise any errors except for the excluded ones. i'm really proud of that.
metadata
1
19
executed
In
[
8
]
code
2023-11-16T01:46:32.385000+00:00
2023-11-16T01:46:32.386664+00:00
raise_if_errors(results)
metadata
1
0 outputs.
Out
[
8
]
20
unexecuted
In
[
]
markdown
if we include all the validator errors, by passing `None` as `exclude`, then we raise an exception group
metadata
1
21
executed
In
[
9
]
code
2023-11-16T01:46:32.387758+00:00
2023-11-16T01:46:32.391729+00:00
raise_if_errors(results, None)
metadata
1
1 outputs.
Out
[
9
]
+ Exception Group Traceback (most recent call last):
| File "/home/tbone/mambaforge/envs/test-nbconvert-a11y/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3548, in run_code
| exec(code_obj, self.user_global_ns, self.user_ns)
| File "/tmp/ipykernel_394727/1257168619.py", line 1, in <module>
| raise_if_errors(results, None)
| File "/tmp/ipykernel_394727/522998515.py", line 8, in raise_if_errors
| raise exceptiongroup.ExceptionGroup("nu validator errors", exceptions)
| ExceptionGroup: nu validator errors (2 sub-exceptions)
+-+---------------- 1 ----------------
| ExceptionGroup: The “role” attribute must not be used on a “tr” element which has a “table” ancestor with no “role” attribute, or with a “role” attribute whose value is “table”, “grid”, or “treegrid”. (16 sub-exceptions)
+-+---------------- 1 ----------------
| Exception: </tr>
| <tr aria-labelledby="nb-cell-label 1 cell-1-cell_type" class="cell markdown" data-index="1" data-loc="1" role="listitem">
|
+---------------- 2 ----------------
| Exception: </tr>
| <tr aria-labelledby="nb-cell-label 2 cell-2-cell_type" class="cell markdown" data-index="2" data-loc="1" role="listitem">
|
+---------------- 3 ----------------
| Exception: </tr>
| <tr aria-labelledby="nb-cell-label 3 cell-3-cell_type" class="cell code" data-index="3" data-loc="2" role="listitem">
|
+---------------- 4 ----------------
| Exception: </tr>
| <tr aria-labelledby="nb-cell-label 4 cell-4-cell_type" class="cell markdown" data-index="4" data-loc="11" role="listitem">
|
+---------------- 5 ----------------
| Exception: </tr>
| <tr aria-labelledby="nb-cell-label 5 cell-5-cell_type" class="cell code" data-index="5" data-loc="3" role="listitem">
|
+---------------- 6 ----------------
| Exception: </tr>
| <tr aria-labelledby="nb-cell-label 6 cell-6-cell_type" class="cell markdown" data-index="6" data-loc="1" role="listitem">
|
+---------------- 7 ----------------
| Exception: </tr>
| <tr aria-labelledby="nb-cell-label 7 cell-7-cell_type" class="cell markdown" data-index="7" data-loc="1" role="listitem">
|
+---------------- 8 ----------------
| Exception: </tr>
| <tr aria-labelledby="nb-cell-label 8 cell-8-cell_type" class="cell code" data-index="8" data-loc="1" role="listitem">
|
+---------------- 9 ----------------
| Exception: </tr>
| <tr aria-labelledby="nb-cell-label 9 cell-9-cell_type" class="cell code" data-index="9" data-loc="1" role="listitem">
|
+---------------- 10 ----------------
| Exception: </tr>
| <tr aria-labelledby="nb-cell-label 10 cell-10-cell_type" class="cell markdown" data-index="10" data-loc="1" role="listitem">
|
+---------------- 11 ----------------
| Exception: </tr>
| <tr aria-labelledby="nb-cell-label 11 cell-11-cell_type" class="cell code" data-index="11" data-loc="1" role="listitem">
|
+---------------- 12 ----------------
| Exception: </tr>
| <tr aria-labelledby="nb-cell-label 12 cell-12-cell_type" class="cell code" data-index="12" data-loc="1" role="listitem">
|
+---------------- 13 ----------------
| Exception: </tr>
| <tr aria-labelledby="nb-cell-label 13 cell-13-cell_type" class="cell markdown" data-index="13" data-loc="1" role="listitem">
|
+---------------- 14 ----------------
| Exception: </tr>
| <tr aria-labelledby="nb-cell-label 14 cell-14-cell_type" class="cell code" data-index="14" data-loc="1" role="listitem">
|
+---------------- 15 ----------------
| Exception: </tr>
| <tr aria-labelledby="nb-cell-label 15 cell-15-cell_type" class="cell code" data-index="15" data-loc="2" role="listitem">
|
+---------------- ... ----------------
| and 1 more exception
+------------------------------------
+---------------- 2 ----------------
| ExceptionGroup: The “role” attribute must not be used on a “td” element which has a “table” ancestor with no “role” attribute, or with a “role” attribute whose value is “table”, “grid”, or “treegrid”. (160 sub-exceptions)
+-+---------------- 1 ----------------
| Exception: listitem">
| <td class="nb-anchor" role="none">
|
+---------------- 2 ----------------
| Exception: </td>
| <td class="nb-execution_count" role="none">
|
+---------------- 3 ----------------
| Exception: </td>
| <td class="nb-cell_type" role="none">
|
+---------------- 4 ----------------
| Exception: </td>
| <td class="nb-toolbar" role="none">
|
+---------------- 5 ----------------
| Exception: </td>
| <td class="nb-start" id="cell-1-start" role="none">
|
+---------------- 6 ----------------
| Exception: </td>
| <td class="nb-end" id="cell-1-end" role="none">
|
+---------------- 7 ----------------
| Exception: </td>
| <td class="nb-source" role="none">
|
+---------------- 8 ----------------
| Exception: </td>
| <td class="nb-metadata" role="none">
|
+---------------- 9 ----------------
| Exception: </td>
| <td class="nb-loc" id="cell-1-loc" role="none">
|
+---------------- 10 ----------------
| Exception: </td>
| <td class="nb-outputs" role="none">
|
+---------------- 11 ----------------
| Exception: listitem">
| <td class="nb-anchor" role="none">
|
+---------------- 12 ----------------
| Exception: </td>
| <td class="nb-execution_count" role="none">
|
+---------------- 13 ----------------
| Exception: </td>
| <td class="nb-cell_type" role="none">
|
+---------------- 14 ----------------
| Exception: </td>
| <td class="nb-toolbar" role="none">
|
+---------------- 15 ----------------
| Exception: </td>
| <td class="nb-start" id="cell-2-start" role="none">
|
+---------------- ... ----------------
| and 145 more exceptions
+------------------------------------
22
unexecuted
In
[
None
]
code
metadata
0
Out
[
None
]