What are back-references?
A back-reference is a regular expression token such as $0 or $1 that refers to whatever part of the text was matched by the capture group in that position within the regular expression.
Back-references are used whenever you want the output or interpretation to resemble the original match: they insert a substring of the original matching text. Like other regular expression features, back-references mean that you do not have to maintain a large, cumbersome list of every possible URL or HTML permutation, variation, or translation when using features such as custom attack signatures, rewriting, or auto-learning.
To invoke a substring, use $n (0 <= n <= 9), where n is the order of appearance of the capture group in the regular expression, counted from left to right, from outside to inside, then from top to bottom.
For example, regular expressions in a condition table in this order:
(a)(b)(c(d))(e)
would result in back-reference variables (e.g. $0) with the following values:
• $0 — a
• $1 — b
• $2 — cd
• $3 — d
• $4 — e
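The numbering above can be reproduced outside the appliance as a rough check. The following is a minimal sketch that uses Python's re module, which is an assumption made for illustration and not the engine described here. Python numbers capture groups from 1 and reserves 0 for the entire match, so the sketch shifts each index down by one to mirror the zero-based $n names. The variable names pattern, match, and back_refs are hypothetical.

```python
import re

# Same expression as in the example above.
pattern = re.compile(r"(a)(b)(c(d))(e)")
match = pattern.fullmatch("abcde")

# match.groups() lists the captures in order of their opening parenthesis:
# left to right, and outer groups before the groups nested inside them.
# enumerate() starts at 0, which mirrors the zero-based $n naming used here
# (Python itself would call these groups 1 through 5).
back_refs = {f"${index}": text for index, text in enumerate(match.groups())}

for name, text in back_refs.items():
    print(name, text)
```

Note that the nested group (d) is numbered after its enclosing group (c(d)), which is why $2 holds cd and $3 holds d.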
Should you use $0 or /0 to refer back to a substring? Something else? That depends.
• /0 — An earlier part in the current string, such as when you have a URL that repeats: (/[^/]*)/0/0/0/0
• $0 — A part of the previous match string, such as when using part of the originally matched domain name to rewrite the new domain name: $0\.example\.co\.jp where $0 contains www, ftp, or whichever prefix matched the first capture group in the match test regular expression, ([^.]*)\.example\.com
• $+ — The highest-numbered capture group of the previous match string: if the capture groups were numbered 0-9, this would be equivalent to $9.
• $& — The entire match string.
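The difference between referring back within the current string and referring back to a previous match can be sketched in Python's re module. This is an illustration under stated assumptions, not the syntax of this product: Python writes an in-pattern back-reference as \1 and a replacement back-reference as \g<1>, both numbered from 1 instead of 0. The sketch also assumes the example expressions are meant to use negated character classes ([^/]* and [^.]*). All variable names are hypothetical.

```python
import re

# Role of /0: refer back to an earlier capture inside the SAME expression.
# Matches a URL path whose first segment repeats four more times.
repeating = re.compile(r"(/[^/]*)\1\1\1\1")
is_repeat = repeating.fullmatch("/a/a/a/a/a") is not None
is_not_repeat = repeating.fullmatch("/a/b/a/a/a") is None

# Role of $0: reuse a capture from the PREVIOUS match in a rewrite.
# The host prefix (www, ftp, ...) is carried into the new domain name.
host_pattern = re.compile(r"([^.]*)\.example\.com")
rewritten = host_pattern.sub(r"\g<1>.example.co.jp", "www.example.com")

# Role of $+: the highest-numbered capture group.
# Pattern.groups is the number of capture groups in the expression.
m = host_pattern.fullmatch("ftp.example.com")
highest_group = m.group(host_pattern.groups)

# Role of $&: the entire match string, which Python exposes as group 0.
entire_match = m.group(0)

print(is_repeat, is_not_repeat, rewritten, highest_group, entire_match)
```

The in-pattern form constrains what the expression will match, while the replacement form only copies text that has already matched, which is why the two use different prefixes.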