2068 Commits

Author SHA1 Message Date
Rhodri James
8be7b89885 Comment change: exclude paranoia check from coverage 2017-07-22 22:49:20 +02:00
Rhodri James
36e47058dc Comment change: exclude paranoia check from coverage 2017-07-22 22:49:20 +02:00
Rhodri James
c53aba8bf7 Test multi-byte characters in ATTLIST default attribute value 2017-07-22 22:49:20 +02:00
Rhodri James
ca1cb78164 Fix error messages in test_bad_doctype_* tests 2017-07-22 22:49:20 +02:00
Rhodri James
c73e63de42 Test document name of "1?" is rejected in doctype declaration 2017-07-22 22:49:20 +02:00
Rhodri James
61db6a8a93 Test document name of "1*" is rejected in a doctype declaration 2017-07-22 22:49:20 +02:00
Rhodri James
bb60b30a68 Test document name of "1+" is rejected in doctype
Following a codepath that requires a character valid in a name but
not at the start (or '-' or ':') followed by a '+'
2017-07-22 22:49:20 +02:00
Rhodri James
b003c9fea5 Test two colons in the document name in a doctype is rejected 2017-07-22 22:49:20 +02:00
Rhodri James
b93735f15b Test invalid character in document local name is faulted 2017-07-22 22:49:20 +02:00
Rhodri James
1346f5cd74 Test UTF-16 non-zero high byte characters in document local name
Tests DOCTYPE handling of non-ASCII characters leading the local
element (i.e. the part after the colon) of a document name
2017-07-22 22:49:20 +02:00
Rhodri James
32e023fca4 Test invalid UTF-16 characters in DOCTYPE 2017-07-22 22:49:20 +02:00
Rhodri James
301cf609ad Test invalid characters in a DOCTYPE 2017-07-22 22:49:20 +02:00
Rhodri James
330c9b298d Refactor: move element event handlers together 2017-07-22 22:49:20 +02:00
Rhodri James
b9727a7c44 Test UTF-16 attribute description keyword with non-zero high bytes.
The #keyword consists of valid name-worthy characters, so is
rejected as a syntax error not an invalid token.
2017-07-22 22:49:20 +02:00
Rhodri James
63b215def5 Test attribute list with bad keyword starting with non-name character 2017-07-22 22:49:19 +02:00
Rhodri James
365a61e52d Test parameter entity with UTF-16 name with top bytes non-zero 2017-07-22 22:49:19 +02:00
Rhodri James
71a293259c Test that UTF-16 element local names with non-zero top byte are accepted 2017-07-22 22:49:19 +02:00
Rhodri James
2cde440699 Test that non-name characters after a colon in an element are rejected 2017-07-22 22:49:19 +02:00
Rhodri James
98c1385bba Test that more than one colon in an element name is rejected 2017-07-22 22:49:19 +02:00
Rhodri James
9f718ef37c Test that text after a / is faulted 2017-07-22 22:49:19 +02:00
Rhodri James
c0593524ab Test second attribute with UTF-16 top byte non-zero is valid
This exercises a different code path to the first attribute
2017-07-22 22:49:19 +02:00
Rhodri James
e4caa983df Test invalid character in attribute value is faulted 2017-07-22 22:49:19 +02:00
Rhodri James
303e6c5499 Test UTF-16 attribute names with non-zero top byte are accepted 2017-07-22 22:49:19 +02:00
Rhodri James
30b15c43b1 Test that non-name characters after a colon are rejected
I.e. test that leafnames have to be valid names
2017-07-22 22:49:19 +02:00
Rhodri James
45da5d094d Test more than one colon in an attribute name is rejected
I.e. test that namespaces don't nest
2017-07-22 22:49:19 +02:00
Rhodri James
2bb87d2586 Test attribute with high-byte-set UTF-16 character name 2017-07-22 22:49:19 +02:00
Rhodri James
d14ba8e58c Test non-ASCII characters in entity names
Use UTF-16 character with value > 0x00ff for extra code coverage
2017-07-22 22:49:19 +02:00
Rhodri James
bfbafbcef4 Test invalid character in a hex character reference 2017-07-22 22:49:19 +02:00
Rhodri James
e597621187 Test trailing spaces in element names are ignored in start and end tags 2017-07-22 22:49:19 +02:00
Rhodri James
6ec40ff522 Test single stray ] doesn't close a CDATA section
Also uses multi-byte characters around the ] to exercise more code.
2017-07-22 22:49:19 +02:00
Rhodri James
7e7cb2c560 Test that UTF-8 safely passes through a CDATA section.
Specifically test that multi-byte characters are correctly tokenised
2017-07-22 22:49:19 +02:00
Rhodri James
4545eb67bb Test PI in UTF-16 with non-ASCII target name 2017-07-22 22:49:19 +02:00
Rhodri James
caa0b46c6a Test PI with target of "xmm" 2017-07-22 22:49:19 +02:00
Rhodri James
9b1a9dd790 Test PI with a target of "xnl" 2017-07-22 22:49:19 +02:00
Rhodri James
1c6675e302 Test PI with a target of "yml" 2017-07-22 22:49:18 +02:00
Rhodri James
8380c39318 Refactor test_comment_handled_in_default()
Eliminates an unnecessary handler when we can use
accumulate_characters instead.
2017-07-22 22:49:18 +02:00
Rhodri James
6822bea985 Refactor test_pi_handled_in_default to use accumulate_characters() 2017-07-22 22:49:18 +02:00
Rhodri James
278545e976 Test unknown encoding handler in a parser with namespaces 2017-07-22 22:49:18 +02:00
Rhodri James
0940aa04ee Test unannounced UTF-16, no BOM, leading space to hide '<' 2017-07-22 22:49:18 +02:00
Rhodri James
431a7b30bc Test UTF-16 where UTF-8 is expected is rejected 2017-07-22 22:49:18 +02:00
Rhodri James
2f40b6d5fa Test UTF-16BE vs explicit UTF-16LE encoding 2017-07-22 22:49:18 +02:00
Rhodri James
6526fb993d Test UTF-8 character nearly the BOM (for code path coverage) 2017-07-22 22:49:18 +02:00
Rhodri James
e0bcf3f674 Test UTF-16LE vs explicit UTF-16BE encoding 2017-07-22 22:49:18 +02:00
Rhodri James
d4ce36970b Test UTF-16 BE BOM vs explicit Latin-1 encoding parsing a full buffer 2017-07-22 22:49:18 +02:00
Rhodri James
4f9f6651b7 Test UTF-16 big-endian BOM vs explicit Latin-1 encoding 2017-07-22 22:49:18 +02:00
Rhodri James
afe76e8bbf Test UTF-16 BOM vs explicit latin-1 encoding parsing a full buffer 2017-07-22 22:49:18 +02:00
Rhodri James
8d6bb0fd1a Test UTF-16 BOM vs explicit latin-1 encoding 2017-07-22 22:49:18 +02:00
Rhodri James
a2104666fb Test specifying a UTF-16 encoding when creating the parser works 2017-07-22 22:49:18 +02:00
Rhodri James
5aa3ead193 Test unknown encoding with 17-bit top-bit-set character is rejected 2017-07-22 22:49:18 +02:00
Rhodri James
9bd8391e2c Test top-bit-set character mimicking surrogate high is rejected 2017-07-22 22:49:18 +02:00