Configure machine learning policies

Machine-learning policies are part of a server policy. They are created on the Policy > Server Policy page. All machine-learning policies that you create appear on the Machine Learning > Machine Learning Policy page, where you can configure or edit them.

To configure a machine-learning policy:

  1. Click Machine Learning > Machine Learning Policy.
  2. Double-click the machine learning policy that you want to configure (or highlight it and then click the Edit button at the top of the page). The Edit Machine Learning page opens. It divides the machine learning profile into several sections, each with parameters that you can use to configure the profile.
  3. Follow the instructions in the following subsections to configure a machine learning profile.
  4. Click OK when done.

 

Sections & Parameters Function
HMM Parameter Model Update
Sample Collection mode

Normal: up to 5000 samples will be collected to build a machine learning model for the parameter. The default sample collection mode is Normal.

Fast: up to 2500 samples will be collected to build a machine learning model for the parameter.

2500 (Fast) and 5000 (Normal) are maximum numbers. If the system observes an obvious pattern of HTTP request behavior for this parameter, or has enough valid samples to build a machine learning model, it stops collecting and starts building the model even if the number of samples hasn't reached the maximum.
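The stop condition described above can be sketched as follows. This is only an illustration of the rule, not FortiWeb code: the function name, the mode strings, and the `pattern_is_stable` flag (standing in for "an obvious pattern was observed or enough valid samples exist") are assumptions. Only the 5000 and 2500 caps come from the text.

```python
# Hypothetical sketch: only the two caps are taken from the documentation.
MAX_SAMPLES = {"normal": 5000, "fast": 2500}


def should_stop_collecting(mode: str, collected: int, pattern_is_stable: bool) -> bool:
    """Return True when sample collection for one parameter can end.

    Collection ends early if a stable pattern has been observed;
    otherwise it ends when the per-mode cap is reached.
    """
    cap = MAX_SAMPLES[mode]
    return pattern_is_stable or collected >= cap
```

For example, `should_stop_collecting("normal", 100, True)` returns `True` because the early-stop condition applies before the cap is reached.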

Dynamically update when parameters change

Applications change frequently as new URLs are added and existing parameters take on new functions. As a result, the mathematical model of a parameter might differ from what FortiWeb originally observed during the collection phase. In this case, FortiWeb needs to re-learn the parameter and update its mathematical model.

Enable this option to automatically update the mathematical models of parameters when the parameters change.

Application Change Sensitivity

This option appears when you enable Dynamically update when parameters change.

The system uses boxplots to determine whether a parameter has changed. A boxplot displays the probability distribution of the parameter value. During the sample collection period, the system generates 2 or 4 boxplots. After the machine learning model is built, the system keeps generating new boxplots to display the probability distribution of new inputs. If the probability distribution area of a newly generated boxplot doesn't overlap with any of the sample boxplots, the system determines that the parameter has changed.

For more information on boxplots, see Probability Boxplots.

The Application Change Sensitivity level determines which areas of the boxplots must fail to overlap before the system triggers a model update.

  • Low—The system triggers model update only when the entire data distribution area (from the maximum value to the minimum value, that is, the entire area containing all the data) of the new boxplot doesn't have any overlapping part with that of the sample boxplots.
  • Medium—The system triggers model update if the notch area (the median rectangular area in the boxplot where most of the data is located) of the new boxplot doesn't have any overlapping part with the entire data distribution areas of the sample boxplots.
  • High—The system triggers model update as long as the notch area of the new boxplot doesn't have any overlapping part with that of the sample boxplots.
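The three sensitivity levels can be read as interval-overlap checks. The sketch below is a hedged illustration, not the product's implementation: the `Boxplot` fields and the `has_changed` function are hypothetical, the "notch area" is modeled as a simple low/high interval, and "changed" is assumed to mean that the new boxplot overlaps none of the sample boxplots.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Boxplot:
    # Hypothetical representation: full data range plus the notch (box) interval.
    minimum: float
    box_low: float
    box_high: float
    maximum: float


def _overlaps(a_low: float, a_high: float, b_low: float, b_high: float) -> bool:
    """Closed intervals [a_low, a_high] and [b_low, b_high] share at least one point."""
    return a_low <= b_high and b_low <= a_high


def has_changed(new: Boxplot, samples: list[Boxplot], sensitivity: str) -> bool:
    """Return True if `new` overlaps none of the sample boxplots at this level."""
    for s in samples:
        if sensitivity == "low":
            # Entire range of the new boxplot vs. entire range of the sample.
            hit = _overlaps(new.minimum, new.maximum, s.minimum, s.maximum)
        elif sensitivity == "medium":
            # Notch area of the new boxplot vs. entire range of the sample.
            hit = _overlaps(new.box_low, new.box_high, s.minimum, s.maximum)
        elif sensitivity == "high":
            # Notch area of the new boxplot vs. notch area of the sample.
            hit = _overlaps(new.box_low, new.box_high, s.box_low, s.box_high)
        else:
            raise ValueError(f"unknown sensitivity: {sensitivity!r}")
        if hit:
            return False
    return True
```

Under these assumptions, a new boxplot whose notch area sits just outside a sample's notch area but inside its full range counts as changed at High and as unchanged at Medium and Low, which matches the ordering of the three levels.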
Update parameter model when number of boxplots do not overlap

This option appears when you enable Dynamically update when parameters change.

The default value is 2, which means if 2 newly generated boxplots don't overlap with any one of the sample boxplots, FortiWeb automatically updates the machine learning model.

You can set a value from 1 to 3.

Anomaly Detection Settings
Anomaly Detection Method

There are two anomaly detection methods: Automatic and Quantile. They determine anomalies with the aid of machine learning models. The machine learning model judges whether a request is normal or not based on its HMM probability and the length of the parameter value.

Both methods use a potential anomaly threshold and a definite anomaly threshold to detect anomalies. You can set different actions (Alert, Alert & Deny, or Period Block) for potential anomalies and definite anomalies.

Automatic

Automatic is the default method. It is more complex than the Quantile method, so it takes longer to detect anomalies, but it is more accurate.

The value for the strictness level can range from 0.1 to 1.0. The higher the value, the more anomalies will be triggered. For example, 0.1 means that 0.1% of all samples with the largest HMM probability and length will be treated as anomalies.

Quantile

Quantile simply uses a threshold, and all probabilities above it are identified as anomalies.

The value for the strictness level can range from 0.1 to 1.0. The higher the value, the more anomalies will be triggered. If you select 0.3 for Potential Anomaly, it means that the 99.7 quantile will be used as the threshold. If a probability exceeds this 99.7 quantile threshold, it will be considered to be an unexpected outlier, so the request would be considered as an attack.
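The relationship between the strictness level and the quantile threshold can be illustrated with a short sketch. It is not FortiWeb's implementation: the function names are hypothetical, the `scores` are placeholder values for whatever the model measures per sample, and the nearest-rank percentile method is an assumption chosen for simplicity.

```python
import math
from typing import Iterable


def quantile_threshold(scores: Iterable[float], strictness: float) -> float:
    """Return the score at the (100 - strictness) quantile, nearest-rank method.

    A strictness of 0.3 selects the 99.7 quantile of the observed scores.
    """
    if not 0 < strictness <= 1.0:
        raise ValueError("strictness must be in (0, 1.0]")
    ordered = sorted(scores)
    if not ordered:
        raise ValueError("at least one score is required")
    q = (100.0 - strictness) / 100.0
    # Round before ceil so floating-point noise cannot shift the rank.
    rank = max(1, math.ceil(round(q * len(ordered), 6)))
    return ordered[rank - 1]


def is_anomaly(score: float, threshold: float) -> bool:
    """A score strictly above the threshold is treated as an outlier."""
    return score > threshold
```

With 1000 placeholder scores numbered 1 to 1000 and a strictness of 0.3, the threshold is the 997th value, so three scores (0.3% of the samples) lie above it.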

Strictness Level for Potential Anomaly

Enter the threshold value or choose the threshold numbers. The default is 0.3, and the valid range is from 0 to 1. The higher the threshold, the more anomalies will be triggered.

Strictness Level for Definite Anomaly

Enter the threshold value or choose threshold numbers. The default is 0.1, and the valid range is from 0 to 0.9. The higher the threshold, the more anomalies will be triggered.

Threat Model
Threat Model

Enable to scan anomalies and verify whether they are attacks. The trained Support Vector Machine models check whether each anomaly is a real attack.

View Threat Models

Click the View Threat Models link to enable or disable threat models for different types of threats, such as cross-site scripting, SQL injection, and code injection. Currently, seven trained Support Vector Machine models are provided, one for each of seven attack types.

Action Settings
Action

All requests are scanned first by HMM and then by the threat model.

Double-click the cells in the Action Settings table to choose the action FortiWeb takes when an attack is verified for each of the following situations:

  • Alert—Accepts the connection and generates an alert email and/or log message.
  • Alert & Deny—Blocks the request (or resets the connection) and generates an alert and/or log message.
  • Period Block—Blocks the request for a certain period of time.
Block Period

Enter the number of seconds that you want to block the requests. The valid range is 1–3,600 seconds. The default value is 60 seconds.

This option only takes effect when you choose Period Block in Action.

Severity

Select the severity level for this anomaly type. The severity level will be displayed in the alert email and/or log message.

Trigger Action

Select a trigger policy that you have set in Log&Report > Log Policy > Trigger Policy. If a potential anomaly, a definite anomaly, or an HTTP Method Violation is detected, the system sends email and/or log messages according to the trigger policy.

URL Replacer Policy

Select the name of the URL Replacer Policy that you have created in Machine Learning Templates.

If web applications have dynamic URLs or unusual parameter styles, you must adapt the URL Replacer Policy to recognize them.

For more information on URL Replacer Policy, see "Configure machine-learning templates".

Allow sample collection for domains

Add domains in this table so that the system will collect samples and generate machine learning models for these domains.

Here's what you can do:

Allow sample collection from IPs

Add IP addresses in this table so that the system will collect traffic data samples only from these IP addresses to build machine learning models.

Here's what you can do:

If you leave the table blank, the system collects traffic data samples from random IP addresses. The maximum number of samples collected from each random IP address is 30. You can change this maximum through the CLI command waf machine-learning-policy.

If you add IP addresses in this table, the sample collection limit will not take effect, which means FortiWeb will collect traffic data samples only from these IP addresses and will not limit the number of samples.
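The interaction between the IP table and the per-IP sample limit can be sketched as follows. This is an illustration of the documented behavior only: the class and method names are hypothetical, and the example addresses are taken from documentation ranges. Only the default limit of 30 comes from the text.

```python
from collections import Counter
from typing import Iterable

DEFAULT_PER_IP_LIMIT = 30  # default stated in the text; adjustable through the CLI


class SampleCollector:
    """Hypothetical model of which requests are accepted as samples."""

    def __init__(self, allowed_ips: Iterable[str] = (), per_ip_limit: int = DEFAULT_PER_IP_LIMIT):
        self.allowed_ips = set(allowed_ips)
        self.per_ip_limit = per_ip_limit
        self.seen = Counter()

    def accept(self, source_ip: str) -> bool:
        """Return True if a request from source_ip should be kept as a sample."""
        if self.allowed_ips:
            # Table populated: only listed IPs are sampled, with no per-IP cap.
            return source_ip in self.allowed_ips
        # Table blank: any IP is sampled, up to the per-IP limit.
        if self.seen[source_ip] >= self.per_ip_limit:
            return False
        self.seen[source_ip] += 1
        return True
```

With a blank table, 40 requests from one address yield 30 accepted samples. With `allowed_ips=["192.0.2.10"]`, every request from that address is accepted and requests from any other address are ignored.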