View domain data

The system provides three dimensions to view the domain data:

To view the collected domain data:

  1. Click Machine Learning > Machine Learning Policy.
  2. Double-click a machine-learning policy to open it.
  3. Scroll down to the bottom of the Edit Machine Learning page.
  4. In the Action column, click the (View Domain) icon.

Overview

The Overview tab provides a summary of data collected for the domain through the use of the machine learning profile. It reports information about the entire domain, including the domain overview, Top 10 URLs by Hit, HMM Learning Progress, Violations Triggered by Anomalies, and Machine Learning Events dashboard.

Parameters Description
Access Frequency

Indicates how frequently this application is accessed.

Start Time

The date and time when the machine-learning module started to learn about the domain.

URL Number

The total number of URLs that the machine-learning module has learned.

Action (Alert/Block)

The total number of alerts, including both Alert and Alert & Deny actions, that have been issued from the start time up to the present moment, as well as the percentage of each in the total number of requests.

Service (HTTP/HTTPS)

The total amount of HTTP and HTTPS traffic from the start time up to now.

Page Charset

The charset of URLs in the domain, such as UTF-8.

Top 10 URLs by Hit

The Top 10 URLs by Hit chart displays the 10 URLs with the highest page hit counts.

HTTP and HTTPS Traffic Trend

This chart displays the trend of HTTP and HTTPS access to the domain over time.

HMM Learning Progress

This chart displays the statistics of HMM learning states of all parameters in the domain.

Parameters Description
Collecting

Indicates that the learning progress of parameters is in the sample collecting stage.

Building

Indicates that, after successfully collecting the samples, the machine-learning module has begun to build the mathematical models needed for the parameters. This is the model-building stage.

Testing

Indicates that the mathematical models have been built successfully and are now being tested. All models must be tested against a certain number of samples until they prove to be stable.

Running

Indicates that the mathematical models of the parameters are stable, and the machine-learning module is running. Requests triggering an anomaly will move into the second machine learning layer to check whether they are actual threats.

Discarded

Indicates that FortiWeb has determined that it cannot build a mathematical model for these parameters, and therefore will not use machine learning to protect them.

Violations Triggered by Anomalies

This chart displays the total number of potential anomalies and definite anomalies detected in the domain.

Allow Method Learning Progress

This chart displays the statistics of HTTP method learning stages of all URLs in the domain.

Parameters Description
Learning

Indicates that the HTTP method learning is still in progress, not yet completed.

Finished

Indicates that the machine-learning module has completed the HTTP method learning, and is able to allow or deny HTTP request methods according to the HTTP method learning models.

Allow Method Violation Trend

This chart displays the HTTP method violation trend over time.

TreeView

The Tree View displays the entire URL directory of the domain as a tree. You can select any URL to view its violation statistics.

Web site directory

The left panel of the Tree View page shows the directory structure of the website. The / (forward slash) indicates the root of the site. Click a URL in the directory tree to display its violation statistics on the right side of the Tree View page. You can also click a directory, and then click Rebuild Directory to rebuild the machine learning models for all the URLs under the selected directory.

URL-specific data

This part of the TreeView page shows the statistics of a specific URL.

Parameters Description
Access Frequency

The frequency at which this URL was accessed in the last 24 hours. The frequency is divided into 7 levels, as defined below:

  • Level 1 (over 500 requests)
  • Level 2 (over 1000 requests)
  • Level 3 (over 1500 requests)
  • Level 4 (over 2000 requests)
  • Level 5 (over 2500 requests)
  • Level 6 (over 3000 requests)
  • Level 7 (over 3500 requests)
Model Initialization Date

The date and time when the mathematical model of this URL was initialized. It shows when FortiWeb began to learn about the data of this URL.

Action (Alert/Block)

The actions taken on all requests for this URL in the last 24 hours, including the number of requests alerted and blocked.

Violation Trend

This chart shows the trend of violations in the last 24 hours, including the number of violations alerted and blocked.

Triggered Violations Based on Anomaly Type

This chart shows the number of violations triggered by anomaly type in the last 24 hours.
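
The seven Access Frequency levels listed earlier in this table can be read as a simple threshold lookup. The following Python sketch is illustrative only and is not FortiWeb code. The function name is hypothetical, "over N requests" is interpreted as strictly greater than N, and returning 0 for counts of 500 or fewer is an assumption, because the table does not say how such counts are reported.

```python
# Lower bounds copied from the Access Frequency level list.
LEVEL_THRESHOLDS = [500, 1000, 1500, 2000, 2500, 3000, 3500]


def access_frequency_level(requests_last_24h: int) -> int:
    """Return the access-frequency level (1-7) for a 24-hour request count.

    Returns 0 when the count is 500 or lower. Level 0 is an assumption of
    this sketch; the documentation does not define it.
    """
    if requests_last_24h < 0:
        raise ValueError("request count cannot be negative")
    # Every threshold that the count exceeds raises the level by one.
    return sum(1 for threshold in LEVEL_THRESHOLDS if requests_last_24h > threshold)
```

Under these assumptions, a URL that received 1200 requests in the last 24 hours falls into Level 2, and any count above 3500 stays at Level 7.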

Control buttons

The TreeView page also provides two control buttons: Rebuild URL and Import.

The bottom of the TreeView page shows two tabs: Parameters and Allowed Method. The former shows the HMM learning states of a URL, and the latter displays the method(s) used in machine learning.

Parameters table

The Parameters tab shows a list of arguments attached to a URL. To put our discussion in context, let's use the following URL as an example:

http://www.demo.com/1.php?user_name=jack

where http://www.demo.com/1.php is the URL, and user_name is the argument attached to the URL. The illustration below shows how this information is presented under the Parameters tab.
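
To make the split between the URL and its arguments concrete, the following Python sketch separates the example request into the two parts described above. It uses only the standard library. The function name is hypothetical, and the sketch does not represent how FortiWeb itself parses requests.

```python
from urllib.parse import parse_qsl, urlsplit


def split_url_and_arguments(request_url: str):
    """Split a request URL into the URL proper and its (name, value) arguments."""
    parts = urlsplit(request_url)
    # The URL is everything before the "?": scheme, host, and path.
    base_url = f"{parts.scheme}://{parts.netloc}{parts.path}"
    # Each query-string pair is one argument attached to the URL.
    arguments = parse_qsl(parts.query, keep_blank_values=True)
    return base_url, arguments


base_url, arguments = split_url_and_arguments("http://www.demo.com/1.php?user_name=jack")
print(base_url)   # http://www.demo.com/1.php
print(arguments)  # [('user_name', 'jack')]
```

In this example, user_name is the value that would appear in the Parameter Name column, and jack is one input observed for that argument.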

Column Description
Parameter Name

The name of the argument attached to a URL.

HMM Learning Stage

The stage that the HMM learning process is currently in. It can be one of the following:

  • Collecting: The system is collecting data samples.
  • Building: Sample collection is complete, and the system is building the mathematical models. Note: This phase lasts only a few seconds.
  • Testing: In this phase, the system collects 500 inputs for this argument, and tests them against the mathematical model. If 5% of the inputs for this argument are recognized as anomalies, this mathematical model is considered invalid. The system will discard the learning results and rebuild the mathematical model.
  • Running: The system enters this stage after the testing has completed successfully. FortiWeb will use this mathematical model to evaluate all new inputs for this argument. If the inputs are anomalies, the system will employ the second machine learning layer to verify whether the anomaly is an attack and take the corresponding action.
  • Discarded: FortiWeb has determined that it cannot build a mathematical model for these parameters, and therefore will not use machine learning to protect them.
HMM Details

Click the (View HMM Details) icon to view the probability boxplots and distribution of anomalies triggered by HMM.

Note: The boxplots and anomaly distribution chart are available only when the parameter is in the Testing or Running stage. See the discussions below.
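
The Testing rule described under HMM Learning Stage can be sketched as a small decision function. The Python below is a minimal illustration and is not FortiWeb's implementation. The names are hypothetical, "5% of the inputs" is interpreted as "5% or more", and returning to the Collecting stage after a failed test is an assumption based on the statement that the system discards the learning results and rebuilds the model.

```python
from enum import Enum


class Stage(Enum):
    COLLECTING = "Collecting"
    BUILDING = "Building"
    TESTING = "Testing"
    RUNNING = "Running"
    DISCARDED = "Discarded"


TEST_SAMPLE_SIZE = 500    # inputs collected during the Testing stage (from the text)
MAX_ANOMALY_PERCENT = 5   # anomaly percentage that invalidates the model (from the text)


def stage_after_testing(anomaly_count: int, tested: int = TEST_SAMPLE_SIZE) -> Stage:
    """Return the stage a parameter moves to once the Testing stage ends."""
    if tested <= 0 or not 0 <= anomaly_count <= tested:
        raise ValueError("anomaly_count must be between 0 and tested")
    # Integer arithmetic avoids floating-point rounding at the 5% boundary.
    if anomaly_count * 100 >= MAX_ANOMALY_PERCENT * tested:
        # Assumption: an invalid model restarts learning with sample collection.
        return Stage.COLLECTING
    return Stage.RUNNING
```

With 500 tested inputs, 24 anomalies (4.8%) would let the parameter move to Running under this reading, while 25 anomalies (5%) would send it back to be re-learned.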

View HMM Details

Applications change frequently as new URLs are added and existing parameters provide new functions. This means the same parameter might behave differently from what FortiWeb originally observed during the collection phase. In this case, FortiWeb needs to re-learn the parameter and then update its mathematical model.

First of all, FortiWeb needs to determine that the parameter has changed. To do that, it uses boxplots to depict numerical data.

FortiWeb builds boxplots, each representing the distribution of a certain number of entries, and then compares them to previous boxplots. If certain areas in the boxplots do not overlap with areas in the previous boxplots, FortiWeb determines that the parameter has changed and then rebuilds (that is, re-learns) the parameter.

The illustration above shows the boxplots generated with samples and with new values of the argument, respectively. FortiWeb uses colors to distinguish the boxplots generated with samples from the boxplot generated with new inputs for the argument: boxes generated with samples are brown, whereas the box generated with new inputs is blue. FortiWeb can build 2 to 4 boxes depending on the number of samples collected for the argument.

From the diagram, you can clearly see how the boxplot generated with new inputs overlaps with the boxplots generated with samples.

The chart above plots the anomalies together with normal requests by probability and length, the two dimensions that HMM uses to evaluate whether a request is an anomaly. Blue dots represent normal requests, whereas red dots represent potential or definite anomalies.
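
The boxplot comparison described earlier in this section can be approximated in a few lines of Python. This is a rough sketch under stated assumptions, not the algorithm FortiWeb uses. The text does not specify which "areas" must overlap or which numeric value is plotted, so the sketch assumes that the compared area is the interquartile box (Q1 to Q3) and that a parameter counts as changed when the box for new inputs overlaps none of the sample boxes. All function names are hypothetical.

```python
from statistics import quantiles


def box_bounds(values):
    """Return (Q1, Q3), the lower and upper edges of the box in a boxplot."""
    q1, _median, q3 = quantiles(values, n=4)
    return q1, q3


def boxes_overlap(box_a, box_b):
    """Return True when two (low, high) intervals share at least one point."""
    return box_a[0] <= box_b[1] and box_b[0] <= box_a[1]


def parameter_changed(sample_batches, new_inputs):
    """Assumed rule: changed when the new box overlaps no sample box."""
    new_box = box_bounds(new_inputs)
    return not any(
        boxes_overlap(box_bounds(batch), new_box) for batch in sample_batches
    )


# Two sample batches (the brown boxes) and one batch of new inputs (the blue box).
samples = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6]]
print(parameter_changed(samples, [2, 3, 4, 5, 6]))       # False
print(parameter_changed(samples, [20, 21, 22, 23, 24]))  # True
```

In the second call, the box for the new inputs lies entirely above both sample boxes, so this sketch would flag the parameter for re-learning.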

Manage anomaly-detecting settings

This section of the page shows the settings the system uses to detect anomalies. You can either click the Inherit global setting tab to use the system's global anomaly detection settings, or click the Custom settings tab to define your own settings. Both tabs use the same two settings to detect anomalies: Strictness Level for Potential Anomaly and Strictness Level for Definite Anomaly.

These two settings control how strictly the system detects anomalies. The value can range from 0.1 to 1.0.

The lower the value, the stricter the detection of anomalies. Changing the strictness value here changes the Distribution of Anomalies triggered by HMM chart.

Definite anomalies are far more serious than potential anomalies. Therefore, the Strictness Level for Definite Anomaly must be lower than the Strictness Level for Potential Anomaly.

For example, a value of 0.3 means that the 0.3% of all samples with the largest HMM probability and length will be treated as anomalies.
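
The constraints on the two strictness levels, and the percentage interpretation in the example above, can be checked with a short Python sketch. It is illustrative only. The function names are hypothetical, and rounding the sample count to the nearest whole number is an assumption, because the text does not say how fractional counts are handled.

```python
MIN_LEVEL = 0.1  # lowest permitted strictness value (from the text)
MAX_LEVEL = 1.0  # highest permitted strictness value (from the text)


def validate_strictness(potential: float, definite: float) -> None:
    """Raise ValueError when the two strictness levels break the stated rules."""
    for name, value in (("potential", potential), ("definite", definite)):
        if not MIN_LEVEL <= value <= MAX_LEVEL:
            raise ValueError(f"{name} strictness must be between 0.1 and 1.0")
    # The level for definite anomalies must be lower than the one for potential anomalies.
    if not definite < potential:
        raise ValueError("definite strictness must be lower than potential strictness")


def anomaly_sample_count(total_samples: int, strictness_level: float) -> int:
    """Number of samples treated as anomalies when the level is read as a percentage."""
    # Assumption: the result is rounded to the nearest whole sample.
    return round(total_samples * strictness_level / 100)
```

For instance, with 10,000 samples and a strictness level of 0.3, this reading treats 30 samples as anomalies, and a combination such as 0.5 for potential anomalies and 0.3 for definite anomalies passes validation.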

To set the anomaly detection settings:

  1. Click either the Inherit global setting or the Custom settings tab.
  2. Set the Strictness Level for Potential Anomaly.
  3. Set the Strictness Level for Definite Anomaly.
  4. Click Apply.

Actions you can take on any argument

There is a configuration button which, when clicked, will open a drop-down menu with three options, as illustrated below.

 

Menu option Description
Rebuild Parameter

Clears the existing mathematical model for the argument, and then begins to collect samples and build the models again. Use this option when the current model does not meet your needs, for example, when it produces false positives or fails to detect some attacks.

Discard

Discards this parameter and does not rebuild it. This disables learning for this argument and bypasses machine learning altogether for this parameter.

Export

Exports the mathematical model for this argument to a file. You can import the model to any URL. See Import under Control buttons.

At the bottom of the page is a list of anomalies. It shows the samples which have been recognized as potential anomalies and definite anomalies. The list may change as new strictness settings are applied.

Allowed Method

This part of the page allows you to set or change the HTTP method(s) allowed to access the URL.

 

As shown above, there are two ways to set the allowed method.

Method Description
By Machine Learning

This approach lets the system set the allowed method based on the result of machine learning.

Customized

This approach allows you to customize the allowed method.

To set a custom allowed method:

  1. Click the Customized tab.
  2. Select any method(s) of interest.
  3. Click Apply.

To switch back to the default allowed method (machine learning):

  1. Click the By Machine Learning tab.
  2. Click Apply.