Some explanations concerning PCRE

PCRE stands for “Perl Compatible Regular Expressions”. The runtime library provided, “pcre.dll”, is a special Delphi-optimized DLL from Renato Mancuso. The main change I made for my software was to switch the binding from static to dynamic, so a missing pcre.dll will not prevent my software from running. There are other ways (components) to get regular expression support, but this solution has always been stable, so why should I change it?
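
The idea behind dynamic binding can be sketched outside Delphi as well. The following Python fragment is only an illustration, not the code used in my software: the helper name load_optional_library is hypothetical, and it assumes that the standard ctypes module raises OSError when a library cannot be loaded.

```python
import ctypes


def load_optional_library(name):
    """Try to load a shared library at run time; return None if it is missing."""
    try:
        return ctypes.CDLL(name)
    except OSError:
        # The library is absent or unloadable: the program keeps running,
        # only the regular expression features stay switched off.
        return None


pcre = load_optional_library("pcre.dll")
regex_available = pcre is not None
```

With static binding the program would refuse to start when the DLL is missing; here the absence only shows up as regex_available being False.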
In his header file “pcre_dll.pas”, Renato Mancuso gives some valuable hints about the Compile Options and Match Options.

ANCHORED:
If this bit is set, the pattern is forced to be “anchored”,
that is, it is constrained to match only at the start of the
string which is being searched (the “subject string”).
This effect can also be achieved by appropriate constructs in the pattern itself, which is the only way to do it in Perl.
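
As a rough analogy only (Python's re module is not PCRE and has no ANCHORED bit), the difference between an unanchored and an anchored match can be seen by comparing search() with match(), or by writing the anchor into the pattern:

```python
import re

pattern = re.compile(r"\d+")
subject = "abc 123"

# search() scans the whole subject, like an unanchored match.
unanchored = pattern.search(subject)

# match() only tries position 0, comparable to setting the ANCHORED bit.
anchored = pattern.match(subject)

# The same restriction expressed by a construct in the pattern itself.
in_pattern = re.search(r"\A\d+", subject)

print(unanchored.group(), anchored, in_pattern)
```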

CASELESS:
If this bit is set, letters in the pattern match both upper and lower case letters. It is equivalent to Perl’s /i option.
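
Python's re module has a comparable flag, re.IGNORECASE, which can serve as a small illustration of the effect (this is Python, not the PCRE DLL itself):

```python
import re

subject = "Uses PCRE.dll"

# Without the flag, the lower-case pattern does not match the upper-case text.
case_sensitive = re.search(r"pcre", subject)

# With the flag, letters match regardless of case.
caseless = re.search(r"pcre", subject, re.IGNORECASE)

print(case_sensitive, caseless.group())
```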

DOLLAR_ENDONLY:
If this bit is set, a dollar metacharacter in the pattern matches
only at the end of the subject string.
Without this option, a dollar also matches immediately before
the final character if it is a newline (but not before any other newlines).
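
Python's re module has no DOLLAR_ENDONLY flag, but its default dollar behaves as described above, and its \Z assertion matches only at the very end. The following sketch uses that pair as a stand-in for "option off" and "option on":

```python
import re

subject = "line\n"

# Default dollar: also matches just before a final newline.
default_dollar = re.search(r"line$", subject)

# \Z matches only at the very end of the subject, which is the
# behaviour DOLLAR_ENDONLY gives to the dollar in PCRE.
end_only = re.search(r"line\Z", subject)

# A newline that is not the final character does not count.
inner_newline = re.search(r"a$", "a\nb\n")

print(default_dollar is not None, end_only, inner_newline)
```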

DOTALL:
If this bit is set, a dot metacharacter in the pattern matches all
characters, including newlines. Without it, newlines are excluded.
This option is equivalent to Perl’s /s option.
A negative class such as [^a] always matches a newline character,
independent of the setting of this option.
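
The same behaviour can be tried with Python's re.DOTALL flag, used here only as an analogy to the PCRE option:

```python
import re

subject = "a\nb"

# Without DOTALL the dot refuses to match the newline.
without_dotall = re.search(r"a.b", subject)

# With DOTALL the dot matches every character, including the newline.
with_dotall = re.search(r"a.b", subject, re.DOTALL)

# A negative class matches the newline regardless of the flag.
negated_class = re.search(r"a[^x]b", subject)

print(without_dotall, with_dotall is not None, negated_class is not None)
```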

EXTENDED:
If this bit is set, whitespace data characters in the pattern are
totally ignored except when escaped or inside a character class,
and characters between an unescaped # outside a character class
and the next newline character, inclusive, are also ignored.
This is equivalent to Perl’s /x option, and makes it possible to
include comments inside complicated patterns.
Note, however, that this applies only to data characters.
Whitespace characters may never appear within special character sequences in a pattern, for example within the sequence
(?( which introduces a conditional subpattern.
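
A commented pattern of this kind might look as follows. The sketch uses Python's re.VERBOSE flag, which I assume to be close enough to EXTENDED for illustration; the date format is just an invented example.

```python
import re

date = re.compile(
    r"""
    (\d{4})   # year
    -         # separator
    (\d{2})   # month
    [ ]       # a space inside a character class is still data
    (\d{2})   # day
    """,
    re.VERBOSE,
)

# Whitespace and comments in the pattern are ignored; only the space
# inside the character class has to appear in the subject.
match = date.fullmatch("2024-05 17")
print(match.groups())
```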

EXTRA:
This option was invented in order to turn on additional functionality of PCRE that is incompatible with Perl, but it is currently of very little use. When set, any backslash in a pattern that is followed by a letter that has no special meaning causes an error, thus reserving these combinations for future expansion.
By default, as in Perl, a backslash followed by a letter with no
special meaning is treated as a literal.
There are at present no other features controlled by this option.
It can also be set by a (?X) option setting within a pattern.

NO_AUTO_CAPTURE:
If this option is set, it disables the use of numbered capturing
parentheses in the pattern. Any opening parenthesis that is not
followed by ? behaves as if it were followed by ?: but named
parentheses can still be used for capturing (and they acquire numbers in the usual way). There is no equivalent of this option in Perl.
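
Python's re module has no such option, so the following sketch writes out by hand what NO_AUTO_CAPTURE does automatically: the plain parenthesis is turned into (?: and only the named group captures, still receiving a number.

```python
import re

# The first group is written as non-capturing, as if NO_AUTO_CAPTURE
# had rewritten "(" into "(?:". The named group still captures.
pattern = re.compile(r"(?:\d{4})-(?P<month>\d{2})")
match = pattern.search("2024-05")

print(pattern.groups)          # number of capturing groups in the pattern
print(match.group("month"))    # access by name
print(match.group(1))          # the named group is also group number 1
```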

UNGREEDY:
This option inverts the “greediness” of the quantifiers so that they are not greedy by default, but become greedy if followed by “?”. It is not compatible with Perl.
It can also be set by a (?U) option setting within the pattern.
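
Python's re module has no counterpart to UNGREEDY, but the two behaviours that the option swaps can be shown with an ordinary greedy and lazy quantifier:

```python
import re

subject = "<b>bold</b>"

# Greedy by default: the quantifier takes as much as it can.
greedy = re.search(r"<.*>", subject).group()

# Lazy when followed by "?": it takes as little as it can.
lazy = re.search(r"<.*?>", subject).group()

print(greedy)
print(lazy)
```

With UNGREEDY set, PCRE would give the plain .* the lazy behaviour and .*? the greedy one.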

NOTBOL:
The first character of the string is not the beginning of a line,
so the circumflex metacharacter should not match before it.
Setting this without PCRE_MULTILINE (at compile time) causes
circumflex never to match.
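
There is no NOTBOL flag in Python's re module. A loosely similar situation arises when a compiled pattern is searched from a start index: as far as I know, the circumflex then still refers to the real start of the string, not to the index where the search begins. The sketch below relies on that assumption.

```python
import re

subject = "ab\ncd"
single_line = re.compile(r"^\w+")
multi_line = re.compile(r"^\w+", re.MULTILINE)

# Searching from index 1: that position is not a line start, and without
# MULTILINE there is no other place where the circumflex could match.
no_match = single_line.search(subject, 1)

# With MULTILINE the circumflex can still match after the newline.
after_newline = multi_line.search(subject, 1)

# Slicing instead makes "b" the real start of a new subject string.
sliced = single_line.search(subject[1:])

print(no_match, after_newline.group(), sliced.group())
```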

NOTEOL:
The end of the string is not the end of a line, so the dollar
metacharacter should not match it nor (except in multiline mode) a
newline immediately before it. Setting this without PCRE_MULTILINE (at compile time) causes dollar never to match.

NOTEMPTY:
An empty string is not considered to be a valid match if this option is set. If there are alternatives in the pattern, they are tried.
If all the alternatives match the empty string, the entire match
fails. For example, if the pattern a?b? is applied to a string not beginning with “a” or “b”, it matches the empty string at the start of the subject. With PCRE_NOTEMPTY set, this match is not valid, so PCRE searches further into the string for occurrences of “a” or “b”.
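
The a?b? example can be approximated in Python, which has no NOTEMPTY option. The helper first_nonempty_match below is hypothetical and only emulates the effect by skipping empty matches; it is not how PCRE implements the option internally.

```python
import re


def first_nonempty_match(pattern, subject):
    """Return the first match that is not the empty string, or None."""
    for match in re.finditer(pattern, subject):
        if match.group() != "":
            return match
    return None


subject = "xxab"

# A plain search is satisfied with the empty string at position 0.
plain = re.search(r"a?b?", subject)

# Skipping empty matches finds "ab" further into the string.
skipping = first_nonempty_match(r"a?b?", subject)

print(repr(plain.group()), plain.start())
print(repr(skipping.group()), skipping.start())
```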
