When web applications have dynamic URLs or unusual parameter styles, you must adapt auto-learning to recognize them.
By default, auto-learning assumes that your web applications use the most common URL structure:
- Parameters follow a question mark ( ? ). They do not follow a hash ( # ) or other separator character.
- Multiple parameters are separated by an ampersand ( & ). They are not separated by a semi-colon ( ; ) or other separator character.
- All paths before the question mark ( ? ) are static: they do not change based upon input, blending the path with parameters (sometimes called a dynamic URL).
For example, the page at:
/app/main
always has that same path. After a person logs in, the page’s URL doesn’t become:
/app/marco/main
or
/app#deepa
For another example, the URL does not dynamically reflect inventory, such as:
/app/sprockets/widget1024894
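To make the default assumption concrete, here is a minimal Python sketch that splits a URL into a static path and its parameters using the standard library. It only illustrates the URL structure, not FortiWeb's own parser, and the parameter names `user` and `item` are hypothetical.

```python
from urllib.parse import urlsplit, parse_qsl


def split_standard_url(url):
    """Split a URL into (path, parameters), assuming ?name=value&name=value."""
    parts = urlsplit(url)
    return parts.path, parse_qsl(parts.query)


# Conventional structure: static path, parameters after "?", separated by "&".
print(split_standard_url("/app/main?user=marco&item=widget1024894"))
# ('/app/main', [('user', 'marco'), ('item', 'widget1024894')])

# Dynamic URL: the same kind of information is hidden in the path,
# so a parser that expects the default structure finds no parameters.
print(split_standard_url("/app/marco/main"))
# ('/app/marco/main', [])
```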
Some web applications, however, embed parameters within the path structure of the URL, or use unusual or non-uniform parameter separator characters. If you do not configure URL replacers for such applications, your FortiWeb appliance can gather auto-learning data incorrectly.
For example, with Microsoft Outlook Web App (OWA), the user’s login name could be embedded within the path structure of the URL, such as:
/owa/tom/index.html
/owa/mary/index.html
instead of suffixed as a parameter, such as:
/owa/index.html?username=tom
/owa/index.html?username=mary
Auto-learning would continue to create new URLs as new users are added to OWA. Auto-learning would also expend extra resources learning about URLs and parameters that are actually the same. Additionally, auto-learning may not be able to fully learn the application structure, as each user may not request the same URLs.
To solve this, you would create a URL replacer that recognizes the user name within the OWA URL as if it were a standard, suffixed parameter value so that auto-learning can function properly.
When using auto-learning, you must define how to interpret dynamic URLs and URLs that include parameters in non-standard ways, such as with different parameter separators ( ; or # , for example) or by embedding the parameter within the URL’s path structure.
In the web UI, these interpreter plug-ins are called “URL replacers.”
URL replacers match the URL as it appears in the HTTP header of the client’s request (using the regular expression in URL Path) and interpret it into this standard URL formulation:
New URL?New Param=Param Change
For example, if the URL is:
/application/value
and the URL replacer settings are:
Setting name | Value |
---|---|
Type | Custom-Defined |
URL Path | (/application)/([^/]+) |
New URL | $0 |
Param Change | $1 |
New Param | setting |
$0 holds this part of the matched URL:
/application
and $1 holds this part of the matched URL:
value
so then the URL will be understood by auto-learning, and displayed in the report, as:
/application?setting=value
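The interpretation above can be approximated in a few lines of Python. This is a sketch under stated assumptions, not FortiWeb's implementation: the helper name `apply_replacer` is hypothetical, the expression is assumed to match the whole URL, and `$0` is mapped to the first capture group as described in the text.

```python
import re


def apply_replacer(url, url_path, new_url, new_param, param_change):
    """Rewrite url as 'New URL?New Param=Param Change', or return None on no match."""
    match = re.fullmatch(url_path, url)  # assumption: the whole URL must match
    if match is None:
        return None

    def expand(template):
        # $0 is the first capture group here, which Python calls group(1).
        return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1)) + 1), template)

    return f"{expand(new_url)}?{expand(new_param)}={expand(param_change)}"


print(apply_replacer("/application/value", r"(/application)/([^/]+)", "$0", "setting", "$1"))
# /application?setting=value
```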
Need a refresher on regular expressions? See Regular expression syntax, What are back-references?, and Cookbook regular expressions. You can also use the examples in this section, such as Example: URL interpreter for WordPress.
1. Go to Auto Learn > Application Templates > URL Replacer.
To access this part of the web UI, your administrator’s account access profile must have Read and Write permission to items in the Autolearn Configuration category. For details, see Permissions.
2. Click Create New.
3. Configure these settings:
Name | Type a unique name that can be referenced by other parts of the configuration. Do not use spaces or special characters. The maximum length is 35 characters. |
Type | Select either Predefined or Custom-Defined. |
4. If you selected Predefined in Type, also configure Application Type (for example, JSP).
5. If you selected Custom-Defined in Type, configure these settings:
URL Path | Type a regular expression, such as (/application)/([^/]+), that matches the URLs to which this URL replacer should apply. The pattern does not require a slash ( / ). However, it must at least match URLs that begin with a slash as they appear in the HTTP header, such as /application/value. For examples, see Example: URL interpreter for WordPress. To test the regular expression against sample text, click the >> (test) icon. This opens the Regular Expression Validator window where you can fine-tune the expression (see Regular expression syntax, What are back-references?, and Cookbook regular expressions). Note: If this URL replacer will be used sequentially in its set of URL replacers, instead of being mutually exclusive, this regular expression should match the URL produced by the previous interpreter, not the original URL from the request. |
New URL | Type either a literal URL, such as /index, or a back-reference (such as $0) to part of the URL matched by URL Path. Note: Back-references can only refer to capture groups (parts of the expression surrounded with parentheses) within the same URL replacer. Back-references cannot refer to capture groups in other URL replacers. |
Param Change | Type either the parameter’s literal value, such as user1, or a back-reference (such as $0) defining how the value will be interpreted. |
New Param | Type either the parameter’s literal name, such as username1, or a back-reference (such as $0) defining how the name will be interpreted. Note: Back-references can only refer to capture groups (parts of the expression surrounded with parentheses) within the same URL replacer. Back-references cannot refer to capture groups in other URL replacers. |
6. Click OK.
7. Group the URL replacers in an application policy (see Grouping URL interpreters).
8. Select the application policy in one or more auto-learning profiles (see Configuring an auto-learning profile).
9. Select the auto-learning profiles in server policies (see Configuring a server policy).
The HTTP request URL from a client is:
/app/login.jsp;jsessionid=xxx;p1=111;p2=123?p3=5555&p4=66aaaaa
which uses semi-colons ( ; ) as parameter separators in the URL, a behavior typical of JSP applications. You would create a URL replacer that recognizes the JSP application’s parameter separators: the semi-colons.
Setting name | Value |
---|---|
Type | Predefined |
Application Type | JSP |
The predefined JSP interpreter plug-in will interpret the URL as:
/app/login.jsp?p4=66aaaaa&p1=111&p2=123&p3=5555
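As a rough illustration of what such an interpretation involves, the following Python sketch treats semi-colon-delimited path parameters like ordinary query parameters. It is not the predefined JSP plug-in: the function name `interpret_jsp_url` is hypothetical, and dropping `jsessionid` is an assumption made only to mirror the interpreted URL shown above. Parameter order is not modeled.

```python
def interpret_jsp_url(url, ignored=("jsessionid",)):
    """Return (path, params), treating ';name=value' like '&name=value'."""
    path_part, _, query = url.partition("?")
    path, *path_params = path_part.split(";")
    pairs = path_params + (query.split("&") if query else [])
    params = {}
    for pair in pairs:
        name, _, value = pair.partition("=")
        # Assumption: the session ID is not kept as a learned parameter.
        if name and name not in ignored:
            params[name] = value
    return path, params


path, params = interpret_jsp_url(
    "/app/login.jsp;jsessionid=xxx;p1=111;p2=123?p3=5555&p4=66aaaaa"
)
print(path)    # /app/login.jsp
print(params)  # {'p1': '111', 'p2': '123', 'p3': '5555', 'p4': '66aaaaa'}
```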
When a client sends requests to Microsoft Outlook Web App (OWA), many of its URLs use structures like this:
/exchange/jane.doe/memo.EML
/exchange/qinlu/2012/1.html
These have user name parameters embedded in the URL. In order for auto-learning to recognize the parameters, you must configure a URL replacer.
A custom URL replacer for those URLs could look like this:
URL interpreter | |
---|---|
Setting name | Value |
Name | OWAusername1 |
Type | Custom-Defined |
URL Path | (/exchange/)([^/]+)/(.*) |
New URL | $0$2 |
Param Change | $1 |
New Param | username1 |
Then the URLs would be recognized by auto-learning as if OWA used a more conventional parameter structure like this:
/exchange/index.html?username1=tom
/exchange/memo.EML?username1=jane.doe
/exchange/2012/1.html?username1=qinlu
Notably, OWA can also include other parameters in the URL, such as a mail folder’s name. Also, OWA can include the user name and folder in more than one way. Therefore multiple URL interpreters are required to match all possible URL structures. In addition to the first URL replacer, you would also configure the following URL replacers and group them into a single set (an auto-learning “application policy”) in order to recognize all possible URLs.
Sample URL | /exchange/archive-folders/2011 |
---|---|
URL interpreter | |
Setting name | Value |
Name | OWAfoldername1 |
Type | Custom-Defined |
URL Path | (/exchange/)([^/]+/)(.*) |
New URL | $0 |
Param Change | $1$2 |
New Param | folder1 |
Results | /exchange/?folder1=archive-folders/2011 |

Sample URL | /exchange/jane.doe |
---|---|
URL interpreter | |
Setting name | Value |
Name | OWAusername2 |
Type | Custom-Defined |
URL Path | (/exchange/)([^/]+\.[^/]+) |
New URL | $0 |
Param Change | $1 |
New Param | username2 |
Results | /exchange/?username2=jane.doe |

Sample URL | /public/imap-share-folders/memos |
---|---|
URL interpreter | |
Setting name | Value |
Name | OWAfoldername2 |
Type | Custom-Defined |
URL Path | (/public/)([^/]+/)(.*) |
New URL | $0 |
Param Change | $1$2 |
New Param | folder2 |
Results | /public/?folder2=imap-share-folders/memos |
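The four OWA URL replacers can be checked against their sample URLs with a short Python sketch. It is an approximation only: `apply_replacer` is a hypothetical helper, whole-URL matching is an assumption, and each replacer is tried on its own sample URL rather than as an ordered set.

```python
import re


def apply_replacer(url, url_path, new_url, new_param, param_change):
    """Rewrite url as 'New URL?New Param=Param Change', or return None on no match."""
    match = re.fullmatch(url_path, url)  # assumption: whole-URL match
    if match is None:
        return None

    def expand(template):
        # $0 is the first capture group, so shift by one for Python's numbering.
        return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1)) + 1), template)

    return f"{expand(new_url)}?{expand(new_param)}={expand(param_change)}"


# name: (sample URL, URL Path, New URL, New Param, Param Change), from the tables above
OWA_REPLACERS = {
    "OWAusername1": ("/exchange/jane.doe/memo.EML", r"(/exchange/)([^/]+)/(.*)", "$0$2", "username1", "$1"),
    "OWAfoldername1": ("/exchange/archive-folders/2011", r"(/exchange/)([^/]+/)(.*)", "$0", "folder1", "$1$2"),
    "OWAusername2": ("/exchange/jane.doe", r"(/exchange/)([^/]+\.[^/]+)", "$0", "username2", "$1"),
    "OWAfoldername2": ("/public/imap-share-folders/memos", r"(/public/)([^/]+/)(.*)", "$0", "folder2", "$1$2"),
}

results = {name: apply_replacer(*settings) for name, settings in OWA_REPLACERS.items()}
for name, result in results.items():
    print(name, "->", result)
```

Note that in this sketch the OWAusername1 expression also matches /exchange/archive-folders/2011, so the order in which the replacers are evaluated within their set affects which interpretation is applied.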
If the HTTP request URL from a client is a slash-delimited chain of multiple parameters, such as:
/index/province/ontario/city/ottawa/street/moodie
then the format is either of these:
/wordpress/value1/value2/value3
/index/param1/value1/param2/value2/param3/value3
In this URL format, there are 3 parameter values (with or without their names) in the URL:
param1
param2
param3
Because each interpreter can only extract a single parameter, you would create 3 URL interpreters, and group them into a set where they are used sequentially — a chain. Each interpreter would use the interpreted output of the previous one as its input, until all parameters had been extracted, at which point the last interpreter would output both the last parameter and the final interpreted URL. FortiWeb would then append parameters back onto the interpreted URL in the standard structure before storing them in the auto-learning data set.
This configuration requires that, for every request, all parameters are present and appear in the same order. If parameter order or existence vary, this URL interpreter will not work: requests will not match the URL interpreter set if either differs. Loosening the regular expressions to tolerate such variation would make them too flexible: auto-learning might mistakenly match and learn incorrect data.
Setting name | Value |
---|---|
Name | slash-parameter3 |
Type | Custom-Defined |
URL Path | /index/param1/(.*)/param2/(.*)/param3/(.*)/ |
New URL | /index/param1/$0/param2/$1/ |
Param Change | $2 |
New Param | param3 |

Setting name | Value |
---|---|
Name | slash-parameter2 |
Type | Custom-Defined |
URL Path | /index/param1/(.*)/param2/(.*)/ |
New URL | /index/param1/$0/ |
Param Change | $1 |
New Param | param2 |

Setting name | Value |
---|---|
Name | slash-parameter1 |
Type | Custom-Defined |
URL Path | /index/param1/(.*)/ |
New URL | /index |
Param Change | $0 |
New Param | param1 |
Until you add the URL interpreters to a group, FortiWeb doesn’t know the sequential order.
These URL interpreters will not function correctly if they are not used in that order, because each interpreter’s input is the output from the previous one. So you must set the priorities correctly when referencing each of those interpreters in the set of URL interpreters (see Grouping URL interpreters).
Setting name | Value |
---|---|
Priority | 0 |
Type | URL REPLACER |
Plugin Name | slash-parameter3 |

Setting name | Value |
---|---|
Priority | 1 |
Type | URL REPLACER |
Plugin Name | slash-parameter2 |

Setting name | Value |
---|---|
Priority | 2 |
Type | URL REPLACER |
Plugin Name | slash-parameter1 |
Then the URL will be interpreted by auto-learning as if the application used a more conventional and easily understood URL/parameter structure:
/index?param1=value1&param2=value2&param3=value3
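The sequential chain can be traced with a small Python simulation. This is a sketch under stated assumptions rather than FortiWeb's implementation: the helper names are hypothetical, each expression is assumed to match the whole URL, and the sample request ends with a slash because the URL Path expressions as written end with one. Parameters are appended in name order only to mirror the interpreted URL in the text.

```python
import re


def apply_replacer(url, url_path, new_url, new_param, param_change):
    """Return (interpreted URL, (name, value)), or None if url_path does not match."""
    match = re.fullmatch(url_path, url)  # assumption: whole-URL match
    if match is None:
        return None

    def expand(template):
        # $0 is the first capture group, which Python calls group(1).
        return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1)) + 1), template)

    return expand(new_url), (expand(new_param), expand(param_change))


# (priority, URL Path, New URL, New Param, Param Change); deliberately listed out of order
CHAIN = [
    (2, r"/index/param1/(.*)/", "/index", "param1", "$0"),
    (0, r"/index/param1/(.*)/param2/(.*)/param3/(.*)/", "/index/param1/$0/param2/$1/", "param3", "$2"),
    (1, r"/index/param1/(.*)/param2/(.*)/", "/index/param1/$0/", "param2", "$1"),
]


def interpret(url):
    params = []
    for _, *settings in sorted(CHAIN):  # lowest priority number runs first
        result = apply_replacer(url, *settings)
        if result is not None:
            url, param = result  # the output becomes the next replacer's input
            params.append(param)
    if not params:
        return url
    query = "&".join(f"{name}={value}" for name, value in sorted(params))
    return f"{url}?{query}"


print(interpret("/index/param1/value1/param2/value2/param3/value3/"))
# /index?param1=value1&param2=value2&param3=value3
```

Reversing the priorities breaks the chain in this sketch: slash-parameter1 would not match the full request, which is why the order in the set matters.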
In order to use URL interpreters with an auto-learning profile, you must group URL replacers into sets.
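Sets can be mutually exclusive, where each URL replacer is an alternative for a different URL structure, or sequential, where each URL replacer interprets the output of the previous one.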
1. Before you create an application policy, first create the URL replacers that it will include (see Configuring URL interpreters).
2. Go to Auto Learn > Application Templates > Application Policy.
To access this part of the web UI, your administrator’s account access profile must have Read and Write permission to items in the Autolearn Configuration category. For details, see Permissions.
3. Click Create New.
A dialog appears.
4. In Name, type a name that can be referenced by other parts of the configuration. Do not use spaces or special characters. The maximum length is 35 characters.
5. Click OK.
6. Click Create New.
A dialog appears.
7. From Plugin Name, select an existing URL replacer from the drop-down list.
Rule order affects URL replacer matching and behavior. FortiWeb appliances evaluate URLs for a matching URL replacer starting with the smallest ID number (greatest priority) rule in the list, and continue towards the largest number in the list.
8. Click OK.
9. Repeat the previous steps for each URL replacer you want added to the policy.
10. Select the application policy in an auto-learning profile (see Configuring an auto-learning profile).
11. Select the auto-learning profiles in server policies (see Configuring a server policy).