Regex Groups and Find/Replace Tagged Expressions

Every once in a while I have a need to use regular expressions in code. I love using them in Visual Studio find/replace, and one of my favorite features in the find/replace dialog is tagged expressions. They let me make a regular expression match and at the same time extract pieces of it for use in my "Replace with" expression.

So today, I'm toying around with Regex and find a need to have the same behavior in code. The difference is that tagged expressions in code are referred to as matching groups. Normally you would use braces for your tagged expressions in a find/replace dialog, but in code you would use parentheses instead.

(BTW, the expressions below are very simplified and by no means an end-all match expression for this scenario.)

For today's search, I need an expression that gives me the table name from an insert or update T-SQL statement. Simple enough, right? With this expression: "insert into [a-zA-Z0-9]+|update [a-zA-Z0-9]+ " I can match either of the two statements. But I really want the table name used in this match. In a tagged expression, I would've used "insert into {[a-zA-Z0-9]+}|update {[a-zA-Z0-9]+} " so I could use \1 (the insert table) and \2 (the update table) for my table name in the replace expression. But in code, we'll need to use groups. So the regex will look like this: "insert into ([a-zA-Z0-9]+)|update ([a-zA-Z0-9]+) ".
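To make the find/replace parallel concrete, here is a minimal sketch of the same idea in code using Regex.Replace, where $1 and $2 stand in for \1 and \2 from the dialog. The input string, the class name ReplaceDemo, and the "[table: ...]" replacement text are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class ReplaceDemo
{
    static void Main()
    {
        // Hypothetical input, made up for this sketch.
        string sql = "update Customers set Name = 'x'";

        // The grouped pattern from the paragraph above, trailing space included.
        string pattern = "insert into ([a-zA-Z0-9]+)|update ([a-zA-Z0-9]+) ";

        // $1 and $2 play the role of \1 and \2 in the find/replace dialog.
        // The group that did not take part in the match substitutes as an empty string.
        string result = Regex.Replace(sql, pattern, "[table: $1$2] ", RegexOptions.IgnoreCase);

        Console.WriteLine(result); // [table: Customers] set Name = 'x'
    }
}
```

Writing "$1$2" works here only because at most one of the two groups can participate in any single match.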

Fortunately, a statement in our search text can be only an insert or an update, not both. So in code, I could match a statement like this:

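Regex regex = new Regex("insert into ([a-zA-Z0-9_]+)|update ([a-zA-Z0-9_]+) ", RegexOptions.IgnoreCase);

// Match once and reuse the result rather than calling IsMatch and then Match.
Match match = regex.Match(sql_command);
if (match.Success)
{
    // Group 1 is filled by the insert branch, group 2 by the update branch.
    Group table = match.Groups[1].Success ? match.Groups[1] : match.Groups[2];
    string table_name = table.Value;
}

The Groups collection holds the whole match at index 0, and the sub-matches start with the second element, Groups[1]. This is handy, since you don't have to re-search your match with a second expression. In our scenario only one of the two groups will ever match, Groups[1] for an insert and Groups[2] for an update, so checking Success on the first and falling back to the second gets me the table name.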
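Because each alternative carries its own group, the index that holds the table name depends on which branch matched. One way to sidestep that, sketched here as an assumption rather than the only approach, is to put the keywords in a non-capturing (?:...) alternation and capture the table name in a single named group. The names TableNameDemo, TableRegex, GetTableName, and the group name "table" are hypothetical, as are the sample statements.

```csharp
using System;
using System.Text.RegularExpressions;

class TableNameDemo
{
    // Non-capturing (?:...) alternation for the keyword, one named group for the table.
    static readonly Regex TableRegex = new Regex(
        @"(?:insert\s+into|update)\s+(?<table>[a-zA-Z0-9_]+)",
        RegexOptions.IgnoreCase);

    // Returns the table name, or null when the statement is neither an insert nor an update.
    static string GetTableName(string sqlCommand)
    {
        Match match = TableRegex.Match(sqlCommand);
        return match.Success ? match.Groups["table"].Value : null;
    }

    static void Main()
    {
        // Hypothetical statements, made up for this sketch.
        Console.WriteLine(GetTableName("INSERT INTO Orders (Id) VALUES (1)"));   // Orders
        Console.WriteLine(GetTableName("update order_items set Qty = 2"));       // order_items
        Console.WriteLine(GetTableName("select * from Orders") ?? "(no match)"); // (no match)
    }
}
```

This version is still very simplified: it does not handle schema-qualified or bracketed names such as [dbo].[Orders].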

You can get more info from the MSDN docs.