diff --git a/HACKING b/HACKING
index 9fd20c9..a314bfd 100644
--- a/HACKING
+++ b/HACKING
@@ -88,10 +88,10 @@ I had a flash of inspiration as to how I could run the real compile function in
 a "fake" mode that enables it to compute how much memory it would need, while
 in most cases only ever using a small amount of working memory, and without too
 many tests of the mode that might slow it down. So I refactored the compiling
-functions to work this way. This got rid of about 600 lines of source. It
-should make future maintenance and development easier. As this was such a major
-change, I never released 6.8, instead upping the number to 7.0 (other quite
-major changes were also present in the 7.0 release).
+functions to work this way. This got rid of about 600 lines of source and made
+further maintenance and development easier. As this was such a major change, I
+never released 6.8, instead upping the number to 7.0 (other quite major changes
+were also present in the 7.0 release).
 
 A side effect of this work was that the previous limit of 200 on the nesting
 depth of parentheses was removed. However, there was a downside: compiling ran
@@ -122,7 +122,7 @@ all the named subpatterns and their corresponding group numbers. This means
 that the actual compile (both the memory-computing dummy run and the real
 compile) has full knowledge of group names and numbers throughout. Several
 dozen lines of messy code were eliminated, though the new pre-pass was not
-short. In particular, parsing and skipping over [] classes is complicated.
+short. In particular, parsing and skipping over [] classes was complicated.
 
 While working on 10.22 I realized that I could simplify yet again by moving
 more of the parsing into the pre-pass, thus avoiding doing it in two places, so
@@ -162,7 +162,7 @@ simpler than before.
 Most errors can be diagnosed during the parsing scan. For those that cannot
 (for example, "lookbehind assertion is not fixed length"), the parsed code
 contains offsets into the pattern so that the actual compiling code can
-identify where errors occur.
+report where errors are.
 
 The elements of the parsed pattern vector
@@ -217,10 +217,10 @@ The following have data in the lower 16 bits, and may be followed by other data
 elements:
 
 META_ALT              | alternation
-META_BACKREF
-META_CAPTURE
-META_ESCAPE
-META_RECURSE
+META_BACKREF          back reference
+META_CAPTURE          start of capturing group
+META_ESCAPE           non-literal escape sequence
+META_RECURSE          recursion call
 
 If the data for META_ALT is non-zero, it is inside a lookbehind, and the data
 is the length of its branch, for which OP_REVERSE must be generated.
@@ -232,8 +232,8 @@ META_BACKREF is followed by an offset if the back reference group number is 10
 or more. The offsets of the first ocurrences of references to groups whose
 numbers are less than 10 are put in cb->small_ref_offset[] (only the first
 occurrence is useful). On 64-bit systems this avoids using more than two parsed
-pattern elements for items such as \3. The offset is used when an error is
-given for a reference to a non-existent group.
+pattern elements for items such as \3. The offset is used when an error occurs
+because the reference is to a non-existent group.
 
 META_RECURSE is always followed by an offset, for use in error messages.
 
@@ -286,7 +286,7 @@ group; this is used when generating OP_REVERSE for that branch.
 
 META_LOOKBEHIND       (?<=
 META_LOOKBEHINDNOT    (?