What are back-references?

A back-reference is a regular expression token such as $0 or $1 that refers to whatever part of the text was matched by the capture group in that position within the regular expression.

Back-references are used whenever you want the output/interpretation to resemble the original match: they insert a substring of the original matching text. Like other regular expression features, back-references help to ensure that you do not have to maintain a large, cumbersome list of all possible URL or HTML permutations and their variations or translations when using features such as custom attack signatures, rewriting, or auto-learning.

To invoke a substring, use $n (0 <= n <= 9), where n is the position of the capture group in the regular expression, counted from left to right, from outside to inside, then from top to bottom.

For example, regular expressions in a condition table in this order:

(a)(b)(c(d))(e)

would result in back-reference variables (e.g. $0) with the following values:

• $0 — a

• $1 — b

• $2 — cd

• $3 — d

• $4 — e
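The numbering above can be checked with a short sketch. It uses Python's re module purely for illustration, which is an assumption: Python numbers capture groups from 1 and reserves group 0 for the whole match, so the hypothetical helper backrefs shifts each index down by one to mimic the $0-based numbering described here.

```python
import re

def backrefs(pattern: str, text: str) -> dict:
    """Return {'$0': ..., '$1': ...} for the first match of pattern in text.

    Hypothetical helper: it mimics the 0-based $n numbering described above
    on top of Python's 1-based group numbering.
    """
    match = re.search(pattern, text)
    if match is None:
        return {}
    # match.groups() lists groups in the order of their opening parenthesis,
    # that is, left to right, with an outer group before the groups inside it.
    return {f"${i}": value for i, value in enumerate(match.groups())}

refs = backrefs(r"(a)(b)(c(d))(e)", "abcde")
for name, value in refs.items():
    print(name, "=", value)
```

Because the nested group (d) opens after its enclosing group (c(d)), it receives the next number after it, which is why $2 holds cd and $3 holds d.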


Note: Numbering of back-references to capture groups starts from 0: to refer to the first substring, use $0 or /0, not $1 or /1.

Should you use $0 or /0 to refer back to a substring? Something else? That depends.

• /0 — An earlier part in the current string, such as when you have a URL that repeats: (/[^/]*)/0/0/0/0

• $0 — A part of the previous match string, such as when using part of the originally matched domain name to rewrite the new domain name: $0\.example\.co\.jp where $0 contains www, ftp, or whichever prefix matched the first capture group in the match test regular expression, ([^.]*)\.example\.com

• $+ — The highest-numbered capture group of the previous match string: if the capture groups were numbered 0-9, this would be equivalent to /9.

• $& — The entire match string.
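The four forms above can be approximated in Python's re module, as a rough sketch rather than a description of this product's engine. Several things here are assumptions: Python writes both the in-pattern reference and the replacement reference as \1 (1-based), the patterns use the character classes [^/]* and [^.]* as the presumed intent of the examples, and the variable names are illustrative only.

```python
import re

# /0 analogue: refer to an earlier capture group inside the same pattern.
# Python spells this \1 because its group numbering starts at 1.
repeating = re.compile(r"(/[^/]*)\1\1\1\1")
same_segments = repeating.fullmatch("/a/a/a/a/a") is not None
mixed_segments = repeating.fullmatch("/a/b/a/a/a") is not None
print(same_segments, mixed_segments)

# $0 analogue: reuse part of the previous match when building new text.
match_test = re.compile(r"([^.]*)\.example\.com")
rewritten = match_test.sub(r"\1.example.co.jp", "www.example.com")
print(rewritten)

m = match_test.fullmatch("ftp.example.com")
whole_match = m.group(0)               # $& analogue: the entire match
highest = m.group(match_test.groups)   # $+ analogue: highest-numbered group
print(whole_match, highest)
```

The sketch looks up the highest-numbered group through the compiled pattern's groups count rather than the match object's lastindex, because lastindex reports the last group that matched, which is not always the highest-numbered one when groups are nested.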