Commit Graph

2048 Commits

Author SHA1 Message Date
Rhodri James
c0593524ab Test second attribute with UTF-16 top byte non-zero is valid
This exercises a different code path to the first attribute
2017-07-22 22:49:19 +02:00
Rhodri James
e4caa983df Test invalid character in attribute value is faulted 2017-07-22 22:49:19 +02:00
Rhodri James
303e6c5499 Test UTF-16 attribute names with non-zero top byte are accepted 2017-07-22 22:49:19 +02:00
Rhodri James
30b15c43b1 Test that non-name characters after a colon are rejected
I.e. test that leafnames have to be valid names
2017-07-22 22:49:19 +02:00
Rhodri James
45da5d094d Test more than one colon in an attribute name is rejected
I.e. test that namespaces don't nest
2017-07-22 22:49:19 +02:00
Rhodri James
2bb87d2586 Test attribute with high-byte-set UTF-16 character name 2017-07-22 22:49:19 +02:00
Rhodri James
d14ba8e58c Test non-ASCII characters in entity names
Use UTF-16 character with value > 0x00ff for extra code coverage
2017-07-22 22:49:19 +02:00
Rhodri James
bfbafbcef4 Test invalid character in a hex character reference 2017-07-22 22:49:19 +02:00
Rhodri James
e597621187 Test trailing spaces in element names are ignored in start and end tags 2017-07-22 22:49:19 +02:00
Rhodri James
6ec40ff522 Test single stray ] doesn't close a CDATA section
Also uses multi-byte characters around the ] to exercise more code.
2017-07-22 22:49:19 +02:00
Rhodri James
7e7cb2c560 Test that UTF-8 safely passes through a CDATA section.
Specifically test that multi-byte characters are correctly tokenised
2017-07-22 22:49:19 +02:00
Rhodri James
4545eb67bb Test PI in UTF-16 with non-ASCII target name 2017-07-22 22:49:19 +02:00
Rhodri James
caa0b46c6a Test PI with target of "xmm" 2017-07-22 22:49:19 +02:00
Rhodri James
9b1a9dd790 Test PI with a target of "xnl" 2017-07-22 22:49:19 +02:00
Rhodri James
1c6675e302 Test PI with a target of "yml" 2017-07-22 22:49:18 +02:00
Rhodri James
8380c39318 Refactor test_comment_handled_in_default()
Eliminates an unnecessary handler when we can use
accumulate_characters instead.
2017-07-22 22:49:18 +02:00
Rhodri James
6822bea985 Refactor test_pi_handled_in_default to use accumulate_characters() 2017-07-22 22:49:18 +02:00
Rhodri James
278545e976 Test unknown encoding handler in a parser with namespaces 2017-07-22 22:49:18 +02:00
Rhodri James
0940aa04ee Test unannounced UTF-16, no BOM, leading space to hide '<' 2017-07-22 22:49:18 +02:00
Rhodri James
431a7b30bc Test UTF-16 where UTF-8 is expected is rejected 2017-07-22 22:49:18 +02:00
Rhodri James
2f40b6d5fa Test UTF-16BE vs explicit UTF-16LE encoding 2017-07-22 22:49:18 +02:00
Rhodri James
6526fb993d Test UTF-8 character nearly the BOM (for code path coverage) 2017-07-22 22:49:18 +02:00
Rhodri James
e0bcf3f674 Test UTF-16LE vs explicit UTF-16BE encoding 2017-07-22 22:49:18 +02:00
Rhodri James
d4ce36970b Test UTF-16 BE BOM vs explicit Latin-1 encoding parsing a full buffer 2017-07-22 22:49:18 +02:00
Rhodri James
4f9f6651b7 Test UTF-16 big-endian BOM vs explicit Latin-1 encoding 2017-07-22 22:49:18 +02:00
Rhodri James
afe76e8bbf Test UTF-16 BOM vs explicit Latin-1 encoding parsing a full buffer 2017-07-22 22:49:18 +02:00
Rhodri James
8d6bb0fd1a Test UTF-16 BOM vs explicit Latin-1 encoding 2017-07-22 22:49:18 +02:00
Rhodri James
a2104666fb Test specifying a UTF-16 encoding when creating the parser works 2017-07-22 22:49:18 +02:00
Rhodri James
5aa3ead193 Test unknown encoding with 17-bit top-bit-set character is rejected 2017-07-22 22:49:18 +02:00
Rhodri James
9bd8391e2c Test top-bit-set character mimicking surrogate high is rejected 2017-07-22 22:49:18 +02:00
Rhodri James
67aabeb2b8 Test top-bit-set character mimicking ASCII is rejected in encodings 2017-07-22 22:49:18 +02:00
Rhodri James
a330a21859 Test unknown encoding with invalid multi-byte length field 2017-07-22 22:49:18 +02:00
Rhodri James
2ce358b203 More refactoring of unknown encoding handlers 2017-07-22 22:49:18 +02:00
Rhodri James
834eb92bdb Refactor to use encoding to switch UnknownEncoding handler behaviour 2017-07-22 22:49:17 +02:00
Rhodri James
36d8911ecb Test ASCII-like unknown encoding correctly rejects invalid characters 2017-07-22 22:49:17 +02:00
Rhodri James
0f950f4b9d Check that ASCII-like encoding is valid 2017-07-22 22:49:17 +02:00
Rhodri James
32ade2af40 Test unknown encoding not compatible with ASCII is rejected 2017-07-22 22:49:17 +02:00
Rhodri James
a88c4cff6d Test unknown encoding conversion buffer overflow not on convert() 2017-07-22 22:49:17 +02:00
Rhodri James
186945d238 Test unknown encoding conversion buffer overflow on convert() 2017-07-22 22:49:17 +02:00
Rhodri James
8c4249685f Test bad non-start of name in unknown encoding 2017-07-22 22:49:17 +02:00
Rhodri James
edc300c876 Test bad start of name with an unknown encoding 2017-07-22 22:49:17 +02:00
Rhodri James
e8999ace2e Test converted start of name with unknown encoding 2017-07-22 22:49:17 +02:00
Rhodri James
7c11f77a03 Comment change: exclude more unreachable code from coverage 2017-07-22 22:49:17 +02:00
Rhodri James
c0b7b55567 Test a valid (if odd) encoding 2017-07-22 22:49:17 +02:00
Rhodri James
6fd7365bfb Comment change: exclude unreachable condition from coverage 2017-07-22 22:49:17 +02:00
Rhodri James
19d0115623 Refactor to get rid of data/code pointer conversion warning 2017-07-22 22:49:17 +02:00
Rhodri James
b2ba9d7911 Add test for encoding conversion function returning errors 2017-07-22 22:49:17 +02:00
Rhodri James
fc06a529a3 Reject multi-byte encodings with no converter function.
Previously, omitting a character encoding convert function when one
was needed caused a segfault if a multi-byte character was
encountered. Such encodings are now rejected with an error at the start of the parse instead.
2017-07-22 22:49:17 +02:00
Rhodri James
63de667ed0 Add test of multi-byte encoding without a conversion function
At this commit, the tests segfault
2017-07-22 22:49:17 +02:00
Rhodri James
69b864b9b1 Test XML declaration with missing value for 'standalone' attr 2017-07-22 22:49:17 +02:00