Tiny documentation/comment fixes #585

Merged
merged 1 commit into from
Nov 27, 2024
34 changes: 17 additions & 17 deletions doc/pcre2pattern.3
@@ -1543,8 +1543,8 @@ something AND NOT ...".
.P
The metacharacters that are recognized in character classes are backslash,
hyphen (when it can be interpreted as specifying a range), circumflex
-(only at the start), and the terminating closing square bracket. An opening
-square bracket is also special when it can be interpreted as introducing a
+(only at the start), and the terminating closing square bracket. An opening
+square bracket is also special when it can be interpreted as introducing a
POSIX class (see
.\" HTML <a href="#posixclasses">
.\" </a>
@@ -1555,15 +1555,15 @@ below), or a special compatibility feature (see
.\" </a>
"Compatibility feature for word boundaries"
.\"
-below. Escaping any non-alphanumeric character in a class turns it into a
+below. Escaping any non-alphanumeric character in a class turns it into a
literal, whether or not it would otherwise be a metacharacter.
.
.
.SH "PERL EXTENDED CHARACTER CLASSES"
.rs
.sp
From release 10.45 PCRE2 supports Perl's (?[...]) extended character class
-syntax. This can be used to perform set operations such intersection on
+syntax. This can be used to perform set operations such as intersection on
character classes.
.P
The syntax permitted within (?[...]) is quite different to ordinary character
@@ -1604,7 +1604,7 @@ syntax, allowing instead extended class behaviour inside ordinary [...]
character classes. This altered syntax for [...] classes is loosely described
by the Unicode standard UTS#18. The PCRE2_ALT_EXTENDED_CLASS option does not
prevent use of (?[...]) classes; it just changes the meaning of all
-[...] classes that are not nested inside a Perl (?[...]) class.
+[...] classes that are not nested inside a Perl (?[...]) class.
.P
Firstly, in ordinary Perl [...] syntax, an expression such as "[a[]" is a
character class with two literal characters "a" and "[", but in UTS#18 extended
@@ -1614,22 +1614,22 @@ denoting the start of a nested class, so a literal "[" must be escaped as "\e[".
Secondly, within the UTS#18 extended syntax, there are operators "||", "&&",
"--" and "~~" which denote character class union, intersection, subtraction,
and symmetric difference respectively. In standard Perl syntax, these would
-simply be needlessly-repeated literals (except for "--" which could be the
-start of a range). In UTS#18 extended classes these operators can be used in
-constructs such as [\ep{L}--[QW]] for "Unicode letters, other than Q and W". A
-literal "-" at the end of a range must be escaped, so while "[--1]" in Perl
-syntax is the range from hyphen to "1", it must be escaped as "[\e--1]" in
-UTS#18 extended classes.
-.P
-Unlike Perl's (?[...]) extended classes, the PCRE2_EXTENDED_MORE option to
-ignore space and tab characters is not automatically enabled for UTS#18
+simply be needlessly-repeated literals (except for "--" which could be the
+start or end of a range). In UTS#18 extended classes these operators can be used
+in constructs such as [\ep{L}--[QW]] for "Unicode letters, other than Q and W".
+A literal "-" at the start or end of a range must be escaped, so while "[--1]"
+in Perl syntax is the range from hyphen to "1", it must be escaped as "[\e--1]"
+in UTS#18 extended classes.
+.P
+Unlike Perl's (?[...]) extended classes, the PCRE2_EXTENDED_MORE option to
+ignore space and tab characters is not automatically enabled for UTS#18
extended classes, but it is honoured if set.
.P
Extended UTS#18 classes can be nested, and nested classes are themselves
extended classes (unlike Perl, where nested classes must be simple classes).
-For example, [\ep{L}&&[\ep{Thai}||\ep{Greek}]] matches any letter that is in
-the Thai or Greek scripts. Note that this means that no special grouping
-characters (such as the parentheses used in Perl's (?[...]) class syntax) are
+For example, [\ep{L}&&[\ep{Thai}||\ep{Greek}]] matches any letter that is in
+the Thai or Greek scripts. Note that this means that no special grouping
+characters (such as the parentheses used in Perl's (?[...]) class syntax) are
needed.
.P
Individual class items (literal characters, literal ranges, properties such as
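The four UTS#18 operators described in the pcre2pattern.3 hunk above ("||", "&&", "--" and "~~") are ordinary set operations on class membership. The following is a minimal sketch of those semantics only, restricted to ASCII, and is not how PCRE2 implements them. The names `charset`, `set_op`, `set_of`, `set_letters` and `set_combine` are hypothetical and exist only for this illustration; `set_letters` uses `isalpha` as an ASCII stand-in for \p{L}.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>

#define NCHARS 128 /* ASCII only, to keep the sketch small */

/* Hypothetical membership table: in[c] is true when character c is in the set. */
typedef struct { bool in[NCHARS]; } charset;

/* One value per UTS#18 operator: "||", "&&", "--", "~~". */
typedef enum { SET_UNION, SET_INTERSECT, SET_SUBTRACT, SET_SYMDIFF } set_op;

/* Build a set from the characters of a string, e.g. "QW" for [QW]. */
static charset set_of(const char *chars) {
    charset r = {{false}};
    for (; *chars != '\0'; chars++) {
        unsigned char c = (unsigned char)*chars;
        if (c < NCHARS) r.in[c] = true;
    }
    return r;
}

/* ASCII stand-in for \p{L}: every character for which isalpha() is true. */
static charset set_letters(void) {
    charset r = {{false}};
    for (int c = 0; c < NCHARS; c++) r.in[c] = isalpha(c) != 0;
    return r;
}

/* Combine two sets member by member according to the chosen operator. */
static charset set_combine(charset a, set_op op, charset b) {
    charset r = {{false}};
    for (int c = 0; c < NCHARS; c++) {
        bool x = a.in[c], y = b.in[c];
        switch (op) {
            case SET_UNION:     r.in[c] = x || y;  break;
            case SET_INTERSECT: r.in[c] = x && y;  break;
            case SET_SUBTRACT:  r.in[c] = x && !y; break;
            case SET_SYMDIFF:   r.in[c] = x != y;  break;
        }
    }
    return r;
}
```

Under these assumptions, `set_combine(set_letters(), SET_SUBTRACT, set_of("QW"))` models the ASCII part of [\ep{L}--[QW]]: it contains "A" but neither "Q" nor "W".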
5 changes: 3 additions & 2 deletions src/pcre2_compile.c
@@ -4140,7 +4140,7 @@ while (ptr < ptrend)
(c == CHAR_PLUS || c == CHAR_VERTICAL_LINE || c == CHAR_MINUS ||
c == CHAR_AMPERSAND || c == CHAR_CIRCUMFLEX_ACCENT))
{
-/* Check for a preceding operand. */
+/* Check that there was a preceding operand. */
if (class_op_state != CLASS_OP_OPERAND)
{
errorcode = ERR109;
@@ -4172,7 +4172,8 @@ while (ptr < ptrend)
else if (class_mode_state == CLASS_MODE_PERL_EXT &&
c == CHAR_EXCLAMATION_MARK)
{
-/* Check for no preceding operand. */
+/* Check that the "!" has not got a preceding operand (i.e. it's the
+start of the class, or follows an operator). */
if (class_op_state == CLASS_OP_OPERAND)
{
errorcode = ERR113;
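The two reworded comments in the pcre2_compile.c hunks describe complementary checks: a binary operator needs an operand before it, while "!" must not have one. The sketch below is a standalone illustration of that rule and is not the PCRE2 code. The function `check_operator_placement`, the `op_state` and `check_result` enums, and the single-character operand model are all assumptions made for this example; the real compiler tracks `class_op_state` and reports ERR109 or ERR113.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical states, loosely mirroring class_op_state in the diff. */
typedef enum { ST_START, ST_OPERATOR, ST_OPERAND } op_state;

/* Hypothetical result codes; the real code sets ERR109 / ERR113 instead. */
typedef enum {
    CHECK_OK,
    CHECK_MISSING_OPERAND, /* binary operator without a preceding operand */
    CHECK_UNEXPECTED_NOT   /* "!" directly after an operand */
} check_result;

/* Walk a simplified expression in which spaces are ignored, "+|-&^" are
   binary operators, "!" is a prefix operator, and any other character
   counts as one operand. Only operator placement is checked; a trailing
   operator or two adjacent operands are not diagnosed by this sketch. */
static check_result check_operator_placement(const char *expr) {
    op_state state = ST_START;
    for (const char *p = expr; *p != '\0'; p++) {
        char c = *p;
        if (c == ' ') continue;
        if (strchr("+|-&^", c) != NULL) {
            /* Check that there was a preceding operand. */
            if (state != ST_OPERAND) return CHECK_MISSING_OPERAND;
            state = ST_OPERATOR;
        } else if (c == '!') {
            /* "!" must be at the start or follow an operator. */
            if (state == ST_OPERAND) return CHECK_UNEXPECTED_NOT;
            state = ST_OPERATOR;
        } else {
            state = ST_OPERAND;
        }
    }
    return CHECK_OK;
}
```

With this model, `check_operator_placement("! a & ! b")` is accepted, `"+ a"` is rejected for a missing operand, and `"a ! b"` is rejected because "!" follows an operand.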