I have this regex:
<DIV(?:(?!</DIV>).)*?"(http://www\.foo\.com(?:\\.|[^"\\])*)"
And I am trying to use it in C#:
@"<DIV(?:(?!</DIV>).)*?""(http://www\.foo\.com(?:\\.|[^""\\])*)"""
But this outputs everything from <DIV onward. I want it to return only the part inside the quotes, like the original regex does.
Since you are using capture groups (the parts of the pattern enclosed in a pair of parentheses), you will have to use Groups[#] to get at the individual captures. Groups[1] will have the value of the first capture group.
If you are using:
@"<DIV(?:(?!</DIV>).)*?""(http://www\.foo\.com(?:\\.|[^""\\])*)"""
You will get the text between the quotes by using Groups[1].Value, and the whole match in Groups[0].Value.
Example:
@"a(b(cd)(ef))"
Here you have 3 capture groups because there are 3 pairs of parentheses. After matching this against the string abcdef, if you use:
Console.WriteLine(match.Groups[0].Value);
Console.WriteLine(match.Groups[1].Value);
Console.WriteLine(match.Groups[2].Value);
Console.WriteLine(match.Groups[3].Value);
You get:
abcdef
bcdef
cd
ef
If that's a little confusing, maybe this breakdown can help:
a(b(cd)(ef))
 1 2   3
   ^--|^--|
 ^---------|
Each number and ^ marks where a capture group begins, and each | marks where it ends. Groups are numbered in the order their opening parentheses appear.
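If you want to check the numbering yourself, here is a minimal sketch that loops over every group of the example pattern and prints where each one starts in the input. The class name GroupDemo and the method Describe are made up for illustration; Regex.Match, Match.Groups, Group.Index and Group.Value are the standard System.Text.RegularExpressions members.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Hypothetical helper: one line per group with its number,
    // its start offset in the input, and the text it captured.
    public static string[] Describe(string input)
    {
        Match match = Regex.Match(input, @"a(b(cd)(ef))");
        var lines = new string[match.Groups.Count];
        for (int i = 0; i < match.Groups.Count; i++)
        {
            Group g = match.Groups[i];
            lines[i] = $"Groups[{i}] starts at {g.Index}: {g.Value}";
        }
        return lines;
    }

    public static void Main()
    {
        foreach (string line in Describe("abcdef"))
            Console.WriteLine(line);
        // Groups[0] starts at 0: abcdef
        // Groups[1] starts at 1: bcdef
        // Groups[2] starts at 2: cd
        // Groups[3] starts at 4: ef
    }
}
```

Note that the offsets are positions in the input string abcdef, not in the pattern, so they will not line up with the columns in the diagram.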
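// Needs: using System; using System.Text.RegularExpressions;
string str = "<DIV src=\"http://www.foo.com\"></DIV>";
// The \/ and \"" escapes below are redundant but harmless; the commented-out line is the equivalent pattern without them.
Regex re = new Regex(@"<DIV(?:(?!<\/DIV>).)*?\""(http:\/\/www\.foo\.com(?:\\.|[^\""\\])*)\""");
// or Regex re = new Regex(@"<DIV(?:(?!</DIV>).)*?""(http://www\.foo\.com(?:\\.|[^""\\])*)""");
Match match = re.Match(str);
Console.Write(match.Groups[1].Value); // Prints http://www.foo.com (group 1 only, without the surrounding quotes)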
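One thing the snippet above does not cover is input that does not match at all. In that case Groups[1].Value is just an empty string, so it can be worth checking match.Success first. A rough sketch, assuming you want null when there is no match; the class UrlExtractor and the method ExtractUrl are hypothetical names, and the pattern is the one from the question:

```csharp
using System;
using System.Text.RegularExpressions;

public static class UrlExtractor
{
    // Same pattern as in the question, written as a C# verbatim string.
    private static readonly Regex Pattern = new Regex(
        @"<DIV(?:(?!</DIV>).)*?""(http://www\.foo\.com(?:\\.|[^""\\])*)""");

    // Hypothetical helper: returns the text captured by group 1,
    // or null when the input does not match.
    public static string ExtractUrl(string html)
    {
        Match match = Pattern.Match(html);
        return match.Success ? match.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Matches: prints http://www.foo.com
        Console.WriteLine(ExtractUrl("<DIV src=\"http://www.foo.com\"></DIV>") ?? "(no match)");
        // Different host, so no match: prints (no match)
        Console.WriteLine(ExtractUrl("<DIV src=\"http://www.bar.com\"></DIV>") ?? "(no match)");
    }
}
```

Returning null is only one option; returning an empty string or throwing would work just as well, depending on how the caller wants to handle a miss.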