System.Text.RegularExpressions
Classes related to Regular Expressions are contained in a sub-namespace of the
general text manipulation namespace, System.Text. The
System.Text.RegularExpressions namespace contains eight classes, one delegate,
and one enumeration that are used to leverage the power of Regular Expressions.
These types are listed below, and are discussed in more detail in subsequent
articles.
Regex
The Regex class represents a regular expression. It contains static and
instance methods for testing input strings for a match, retrieving objects for
each of the matches made, replacing matched strings, and splitting the string
based on the pattern. The following code sample shows the instantiation of a
Regex object, first in C# and then in Visual Basic.
Regex rex = new Regex(@"\D");
Dim rex as Regex = new Regex("\D")
The regular expression represented by the Regex class is immutable, meaning
that the pattern loaded during instantiation cannot be changed. You can use the
static methods to pass a regular expression pattern and string to test without
having to manually instantiate a Regex object for each pattern.
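As a minimal sketch of the difference between the two styles, the following C# compares an instance call with the equivalent static call. The class and method names (StaticVersusInstance, HasNonDigitInstance, HasNonDigitStatic) are made up for illustration; Regex.IsMatch is the only library call involved.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StaticVersusInstance
{
    // Instance style: the pattern is fixed when the Regex object is constructed.
    public static bool HasNonDigitInstance(string input)
    {
        Regex rex = new Regex(@"\D");
        return rex.IsMatch(input);
    }

    // Static style: the input and the pattern are passed together in one call,
    // so no Regex object has to be created by hand.
    public static bool HasNonDigitStatic(string input)
    {
        return Regex.IsMatch(input, @"\D");
    }

    public static void Main()
    {
        Console.WriteLine(HasNonDigitInstance("12345")); // False
        Console.WriteLine(HasNonDigitStatic("123a5"));   // True
    }
}
```

The instance form is usually the better fit when the same pattern is applied to many inputs, while the static form is convenient for one-off tests.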
The Regex class is fundamental to working with Regular Expressions, and is
covered in more detail in its own section.
Match & MatchCollection
A Match is the result of a regular expression pattern match. It encapsulates
information about whether a match was found, its location within the string,
and its value. A single match can be obtained by calling the Match() method on
a Regex object. If the Success property of the returned object is true, then a
match was found.
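A short sketch of reading a single Match, reusing the \D pattern from the earlier sample. The helper name DescribeFirstNonDigit is hypothetical; Success, Index, and Value are properties of the Match class.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchDemo
{
    // Returns a description of the first non-digit character, if any.
    public static string DescribeFirstNonDigit(string input)
    {
        Regex rex = new Regex(@"\D");
        Match m = rex.Match(input);

        // Success is false when the pattern was not found anywhere in the input.
        if (!m.Success)
        {
            return "no match";
        }

        // Index is the zero-based position of the match; Value is the matched text.
        return "found '" + m.Value + "' at index " + m.Index;
    }

    public static void Main()
    {
        Console.WriteLine(DescribeFirstNonDigit("123abc")); // found 'a' at index 3
        Console.WriteLine(DescribeFirstNonDigit("123456")); // no match
    }
}
```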
For instances where a pattern finds multiple matches, a collection can be
returned by calling the Matches() method on the Regex object. This
MatchCollection is a collection of Match objects, and implements ICollection
and IEnumerable, allowing you to walk through the collection using the foreach
syntax.
foreach(Match m in rex.Matches("abcdefghi"))
{
 Console.WriteLine(m.Value);
}
For Each m as Match in rex.Matches("abcdefghi")
 Console.WriteLine(m.Value)        
Next
In our section on Groups, you'll learn the syntax for identifying sub-patterns within your regular
expression. The groups for a particular match can be extracted using the Match
object's Groups property.
Processing matches is covered in the section on the Match
and MatchCollection classes.
Group & GroupCollection
Regular Expressions allow you to designate a subset of a pattern as a group.
These groups can then be accessed for each Match. In our example on
extracting URLs from an HTML document, we'll see how groups allow us to
extract just the URL portion of a match on href="" attributes.
The GroupCollection contains Group objects, and like the MatchCollection,
implements the ICollection and IEnumerable interfaces. A GroupCollection can be
obtained through the Groups property of a Match object.
We discuss the Group and GroupCollection objects in our section on
Groups.
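As a rough illustration of reading a group from each Match, the sketch below pulls the quoted value out of href="" attributes. The pattern and the ExtractUrls helper are assumptions made for this sketch, not the article's own URL example, and the pattern only handles double-quoted attribute values.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Illustrative pattern: the parentheses define group 1, which holds
    // only the text between the double quotes.
    private static readonly Regex HrefPattern =
        new Regex(@"href\s*=\s*""([^""]*)""");

    public static List<string> ExtractUrls(string html)
    {
        List<string> urls = new List<string>();
        foreach (Match m in HrefPattern.Matches(html))
        {
            // Groups[0] is the whole match; Groups[1] is the first
            // parenthesized sub-pattern.
            urls.Add(m.Groups[1].Value);
        }
        return urls;
    }

    public static void Main()
    {
        string html = "<a href=\"http://example.com/a\">A</a> <a href=\"/b\">B</a>";
        foreach (string url in ExtractUrls(html))
        {
            Console.WriteLine(url);
        }
    }
}
```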
Capture & CaptureCollection
A Match can have Groups, and Groups can have Captures. When a group is followed
by a quantifier (such as * or +), it can match several times within a single
Match, and the CaptureCollection provides the mechanism to access each Capture
substring. For the most part, Matches and Groups will be sufficient.
Because these objects only come into play with specific advanced regular
expressions, and the processing is very similar to processing a Match, we do
not cover the Capture and CaptureCollection in detail.
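For readers who do need it, here is a small sketch of a repeated group. The pattern and the CaptureDemo names are assumptions chosen for illustration: group 1 matches once per number in a comma-separated list, so the group's Value holds only the last number while its Captures collection holds all of them.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // (\d+) is group 1; the enclosing (?: ... )+ repeats it, so group 1
    // is captured once per number in a comma-separated list.
    private static readonly Regex NumberList = new Regex(@"(?:(\d+),?)+");

    public static List<string> AllCaptures(string input)
    {
        List<string> values = new List<string>();
        Match m = NumberList.Match(input);
        if (m.Success)
        {
            foreach (Capture c in m.Groups[1].Captures)
            {
                values.Add(c.Value);
            }
        }
        return values;
    }

    // The Group itself only reports its most recent capture.
    public static string LastCapture(string input)
    {
        return NumberList.Match(input).Groups[1].Value;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", AllCaptures("10,20,30").ToArray())); // 10 20 30
        Console.WriteLine(LastCapture("10,20,30"));                             // 30
    }
}
```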
RegexCompilationInfo
Regular Expressions can be compiled to assemblies for increased performance.
The RegexCompilationInfo class provides information to the compiler when
creating a stand-alone assembly. The Pattern property, for example, indicates
the regular expression to compile.
RegexOptions
RegexOptions is an enumeration used to set properties of the regular expression
matching engine. For example, supplying the IgnoreCase option makes pattern
matching case-insensitive. The Regex constructor and the static methods of the
Regex class are overloaded to accept a RegexOptions value.
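A brief sketch of passing an option, assuming a made-up helper class OptionsDemo. It uses the static Regex.IsMatch overload that takes a RegexOptions value; the Regex constructor has a matching overload.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionsDemo
{
    // Default behavior: matching is case-sensitive.
    public static bool ContainsHello(string input)
    {
        return Regex.IsMatch(input, "hello");
    }

    // With IgnoreCase, "HELLO", "Hello", and "hello" all match.
    public static bool ContainsHelloAnyCase(string input)
    {
        return Regex.IsMatch(input, "hello", RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(ContainsHello("HELLO world"));        // False
        Console.WriteLine(ContainsHelloAnyCase("HELLO world")); // True

        // The constructor accepts the same enumeration; values can be
        // combined with the bitwise OR operator.
        Regex rex = new Regex("hello", RegexOptions.IgnoreCase | RegexOptions.Multiline);
        Console.WriteLine(rex.IsMatch("Hello"));                // True
    }
}
```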
MatchEvaluator
MatchEvaluator is a delegate type used when replacing text. Through this
callback mechanism, you can run processing logic against each match at
runtime, and return the replacement text to the regular expression engine.
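The callback can be sketched as follows. The EvaluatorDemo class and its DoubleNumber method are hypothetical; the library call is the Regex.Replace overload that accepts a MatchEvaluator. The pattern uses [0-9] rather than \d so that int.Parse only ever sees ASCII digits, and the sketch assumes the numbers are small enough to fit in an int.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class EvaluatorDemo
{
    // Called once per match; the returned string replaces the matched text.
    private static string DoubleNumber(Match m)
    {
        int value = int.Parse(m.Value, CultureInfo.InvariantCulture);
        return (value * 2).ToString(CultureInfo.InvariantCulture);
    }

    public static string DoubleAllNumbers(string input)
    {
        return Regex.Replace(input, "[0-9]+", new MatchEvaluator(DoubleNumber));
    }

    public static void Main()
    {
        Console.WriteLine(DoubleAllNumbers("3 apples and 10 pears")); // 6 apples and 20 pears
    }
}
```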