Commit Graph

2061 Commits

Author SHA1 Message Date
Rhodri James
b003c9fea5 Test two colons in the document name in a doctype is rejected 2017-07-22 22:49:20 +02:00
Rhodri James
b93735f15b Test invalid character in document local name is faulted 2017-07-22 22:49:20 +02:00
Rhodri James
1346f5cd74 Test UTF-16 non-zero high byte characters in document local name
Tests DOCTYPE handling of non-ASCII characters leading the local
element (i.e. the part after the colon) of a document name
2017-07-22 22:49:20 +02:00
Rhodri James
32e023fca4 Test invalid UTF-16 characters in DOCTYPE 2017-07-22 22:49:20 +02:00
Rhodri James
301cf609ad Test invalid characters in a DOCTYPE 2017-07-22 22:49:20 +02:00
Rhodri James
330c9b298d Refactor: move element event handlers together 2017-07-22 22:49:20 +02:00
Rhodri James
b9727a7c44 Test UTF-16 attribute description keyword with non-zero high bytes.
The #keyword consists of valid name-worthy characters, so is
rejected as a syntax error not an invalid token.
2017-07-22 22:49:20 +02:00
Rhodri James
63b215def5 Test attribute list with bad keyword starting with non-name character 2017-07-22 22:49:19 +02:00
Rhodri James
365a61e52d Test parameter entity with UTF-16 name with top bytes non-zero 2017-07-22 22:49:19 +02:00
Rhodri James
71a293259c Test that UTF-16 element local names with non-zero top byte are accepted 2017-07-22 22:49:19 +02:00
Rhodri James
2cde440699 Test that non-name characters after a colon in an element are rejected 2017-07-22 22:49:19 +02:00
Rhodri James
98c1385bba Test that more than one colon in an element name is rejected 2017-07-22 22:49:19 +02:00
Rhodri James
9f718ef37c Test that text after a / is faulted 2017-07-22 22:49:19 +02:00
Rhodri James
c0593524ab Test second attribute with UTF-16 top byte non-zero is valid
This exercises a different code path to the first attribute
2017-07-22 22:49:19 +02:00
Rhodri James
e4caa983df Test invalid character in attribute value is faulted 2017-07-22 22:49:19 +02:00
Rhodri James
303e6c5499 Test UTF-16 attribute names with non-zero top byte are accepted 2017-07-22 22:49:19 +02:00
Rhodri James
30b15c43b1 Test that non-name characters after a colon are rejected
I.e. test that leafnames have to be valid names
2017-07-22 22:49:19 +02:00
Rhodri James
45da5d094d Test more than one colon in an attribute name is rejected
I.e. test that namespaces don't nest
2017-07-22 22:49:19 +02:00
Rhodri James
2bb87d2586 Test attribute with high-byte-set UTF-16 character name 2017-07-22 22:49:19 +02:00
Rhodri James
d14ba8e58c Test non-ASCII characters in entity names
Use UTF-16 character with value > 0x00ff for extra code coverage
2017-07-22 22:49:19 +02:00
Rhodri James
bfbafbcef4 Test invalid character in a hex character reference 2017-07-22 22:49:19 +02:00
Rhodri James
e597621187 Test trailing spaces in element names are ignored in start and end tags 2017-07-22 22:49:19 +02:00
Rhodri James
6ec40ff522 Test single stray ] doesn't close a CDATA section
Also uses multi-byte characters around the ] to exercise more code.
2017-07-22 22:49:19 +02:00
Rhodri James
7e7cb2c560 Test that UTF-8 safely passes through a CDATA section.
Specifically test that multi-byte characters are correctly tokenised
2017-07-22 22:49:19 +02:00
Rhodri James
4545eb67bb Test PI in UTF-16 with non-ASCII target name 2017-07-22 22:49:19 +02:00
Rhodri James
caa0b46c6a Test PI with target of "xmm" 2017-07-22 22:49:19 +02:00
Rhodri James
9b1a9dd790 Test PI with a target of "xnl" 2017-07-22 22:49:19 +02:00
Rhodri James
1c6675e302 Test PI with a target of "yml" 2017-07-22 22:49:18 +02:00
Rhodri James
8380c39318 Refactor test_comment_handled_in_default()
Eliminates an unnecessary handler when we can use
accumulate_characters instead.
2017-07-22 22:49:18 +02:00
Rhodri James
6822bea985 Refactor test_pi_handled_in_default to use accumulate_characters() 2017-07-22 22:49:18 +02:00
Rhodri James
278545e976 Test unknown encoding handler in a parser with namespaces 2017-07-22 22:49:18 +02:00
Rhodri James
0940aa04ee Test unannounced UTF-16, no BOM, leading space to hide '<' 2017-07-22 22:49:18 +02:00
Rhodri James
431a7b30bc Test UTF-16 where UTF-8 is expected is rejected 2017-07-22 22:49:18 +02:00
Rhodri James
2f40b6d5fa Test UTF-16BE vs explicit UTF-16LE encoding 2017-07-22 22:49:18 +02:00
Rhodri James
6526fb993d Test UTF-8 character nearly the BOM (for code path coverage) 2017-07-22 22:49:18 +02:00
Rhodri James
e0bcf3f674 Test UTF-16LE vs explicit UTF-16BE encoding 2017-07-22 22:49:18 +02:00
Rhodri James
d4ce36970b Test UTF-16 BE BOM vs explicit Latin-1 encoding parsing a full buffer 2017-07-22 22:49:18 +02:00
Rhodri James
4f9f6651b7 Test UTF-16 big-endian BOM vs explicit Latin-1 encoding 2017-07-22 22:49:18 +02:00
Rhodri James
afe76e8bbf Test UTF-16 BOM vs explicit latin-1 encoding parsing a full buffer 2017-07-22 22:49:18 +02:00
Rhodri James
8d6bb0fd1a Test UTF-16 BOM vs explicit latin-1 encoding 2017-07-22 22:49:18 +02:00
Rhodri James
a2104666fb Test specifying a UTF-16 encoding when creating the parser works 2017-07-22 22:49:18 +02:00
Rhodri James
5aa3ead193 Test unknown encoding with 17-bit top-bit-set character is rejected 2017-07-22 22:49:18 +02:00
Rhodri James
9bd8391e2c Test top-bit-set character mimicking surrogate high is rejected 2017-07-22 22:49:18 +02:00
Rhodri James
67aabeb2b8 Test top-bit set character mimicking ASCII is rejected in encodings 2017-07-22 22:49:18 +02:00
Rhodri James
a330a21859 Test unknown encoding with invalid multi-byte length field 2017-07-22 22:49:18 +02:00
Rhodri James
2ce358b203 More refactoring of unknown encoding handlers 2017-07-22 22:49:18 +02:00
Rhodri James
834eb92bdb Refactor to use encoding to switch UnknownEncoding handler behaviour 2017-07-22 22:49:17 +02:00
Rhodri James
36d8911ecb Test ASCII-like unknown encoding correctly rejects invalid characters 2017-07-22 22:49:17 +02:00
Rhodri James
0f950f4b9d Check that ASCII-like encoding is valid 2017-07-22 22:49:17 +02:00
Rhodri James
32ade2af40 Test unknown encoding not compatible with ASCII is rejected 2017-07-22 22:49:17 +02:00