Event Parser Specification

FortiSIEM uses an XML-based parser framework to parse events. These topics describe the parser syntax and include examples of XML parser specifications.

Custom Parser XML Specification Template

The basic template for a custom parser XML specification includes five sections.

| Section | Description |
| --- | --- |
| Parser Name Specification | Name of the parser file |
| Device Type | The type of device or application associated with the parser |
| Format Recognizer Specification | Patterns that determine whether an event will be parsed by this parser |
| Pattern Definition Specification | Defines the parsing patterns that are iterated over by the parsing instructions |
| Parsing Instructions Specification | Instructions on how to parse events that match the format recognizer patterns |

Custom Parser XML Specification Template

<eventParser name="xxx">
   <deviceType> </deviceType>
   <eventFormatRecognizer> </eventFormatRecognizer>
   <patternDefinitions> </patternDefinitions>
   <parsingInstructions> </parsingInstructions>
</eventParser>

Parser Name Specification

This section specifies the name of the parser, which is used only for readability and identifying the device type associated with the parser.

<eventParser name="CiscoIOSParser">
</eventParser>

Device or Application Type Specification

This section specifies the device or the application to which this parser applies. The device and application definitions enable FortiSIEM to detect the device and application type for a host from the received events. This is called log-based discovery in FortiSIEM. Once a received event is successfully parsed by this file, a CMDB entry is created with the device and application set from this file. FortiSIEM discovery may further refine the device.

There are two separate subsections for device and application. In each section, vendor, model and version can be specified, but version is not typically needed. 

Set Version to Any

In the examples in this topic, <Version> is set to ANY because events are generally not tied to a particular version of a device or software. You could of course set this to a specific version number if you only wanted this parser to apply to a specific version of an application or device.
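
As a minimal sketch of the alternative, the fragment below pins a parser to one software version. The version string 12.4 is a hypothetical value chosen for illustration, and the exact format FortiSIEM expects for version strings is an assumption here.

```xml
<deviceType>
    <Vendor>Cisco</Vendor>
    <Model>IOS</Model>
    <!-- Hypothetical version value; the expected string format is an assumption -->
    <Version>12.4</Version>
</deviceType>
```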

Vendor and Model Must Match the FortiSIEM Version

<Vendor> and <Model> entries must match the spelling and capitalization in the CMDB.

Examples of Specifications for Types of Device and Applications

Hardware Appliances

In this case, the type of event being parsed specifies the device type, for example Cisco IOS or Cisco ASA.

<deviceType>
    <Vendor>Cisco</Vendor>
    <Model>IOS</Model>
    <Version>ANY</Version>
</deviceType>

Software Operating Systems that Specify the Device Type

In this case, the type of events being parsed specifies the device type, for example Microsoft Windows. The device type section looks like this:

<deviceType>
    <Vendor>Microsoft</Vendor>
    <Model>Windows</Model>
    <Version>ANY</Version>
</deviceType>

Applications that Specify Both Device Type and Application

In this case, the events being parsed specify both the device type and the application type, because Microsoft SQL Server can only run on the Microsoft Windows OS.

<deviceType>
    <Vendor>Microsoft</Vendor>
    <Model>Windows</Model>
    <Version>ANY</Version>
</deviceType>
<appType>
    <Vendor>Microsoft</Vendor>
    <Model>SQL Server</Model>
    <Version>ANY</Version>
    <Name>Microsoft SQL Server</Name>
</appType>

Applications that Specify the Application Type but Not the Device Type

Consider the example of an Oracle database server, which can run on both Windows and Linux operating systems. In this case, the device type is set to Generic but the application is specific. FortiSIEM depends on discovery to identify the device type.

<deviceType>
    <Vendor>Generic</Vendor>
    <Model>Generic</Model>
    <Version>ANY</Version>
</deviceType>
<appType>
    <Vendor>Oracle</Vendor>
    <Model>Database Server</Model>
    <Version>ANY</Version>
    <Name>Oracle Database Server</Name>
</appType>

Format Recognizer Specification

In many cases, events associated with a device or application contain a unique pattern. You can enter a regular expression in the Format Recognizer section of the parser XML file to search for this pattern. If the pattern is found, the event is parsed according to the parsing instructions in that file. After the first match, the mapping from event source IP to parser file is cached, and only that parser file is used for all events from that source IP. A notable exception is when events from disparate sources are received via a syslog server, but that case is handled differently.

While not a required part of the parser specification, a format recognizer can speed up event parsing, especially when one parser file must be chosen from among many. A single pattern check determines whether the parser file applies. The less efficient alternative is to examine the patterns in every file. The format recognizer must be chosen carefully: not so broad that it misclassifies events into the wrong files, and not so narrow that it fails to match events that belong to the file.

Order in Which Parsers are Used

The FortiSIEM parser processes the parser files in the order listed in the file parserOrder.csv.

Format Recognizer Syntax

The specification for the format recognizer section is:

<eventFormatRecognizer><![CDATA[regexpattern]]></eventFormatRecognizer>

In the regexpattern block, a pattern can be directly specified using regex or a previously defined pattern (in the pattern definition section in this file or in the GeneralPatternDefinitions.xml file) can be referenced.

Example Format Recognizers

Cisco IOS

All Cisco IOS events have a %module name pattern.

<patternDefinitions>
    <pattern name="patCiscoIOSMod" list="begin"><![CDATA[FW|SEC|SEC_LOGIN|SYS|SNMP|]]></pattern>
    <pattern name="patCiscoIOSMod" list="continue"><![CDATA[LINK|SPANTREE|LINEPROTO|DTP|PARSER|]]></pattern>
    <pattern name="patCiscoIOSMod" list="end"><![CDATA[CDP|DHCPD|CONTROLLER|PORT_SECURITY-SP]]></pattern>
</patternDefinitions>
<eventFormatRecognizer><![CDATA[:%<:patCiscoIOSMod>-<:gPatInt>-<:patStrEndColon>:]]></eventFormatRecognizer>

Cisco ASA

All Cisco ASA events have the pattern ASA-severity-id, for example ASA-5-12345.

<eventFormatRecognizer><![CDATA[ASA-\d-\d+]]></eventFormatRecognizer>

Palo Alto Networks Log Parser

In this case, there is no unique keyword, so the entire message structure from the beginning to a specific point in the log must be considered.

Event
<14>May 6 15:51:04 1,2010/05/06 15:51:04,0006C101167,TRAFFIC,start,1,2010/05/06 15:50:58,192.168.28.21,172.16.255.78,::172.16.255.78,172.16.255.78,rule3,,,icmp,vsys1,untrust,untrust,ethernet1/1,ethernet1/1,syslog-172.16.20.152,2010/05/06 15:51:04,600,2,0,0,0,0,0x40,icmp,allow,196,196,196,2,2010/05/06 15:50:58,0,any,0

<eventFormatRecognizer><![CDATA[<:gPatTime>,\w+,(?:TRAFFIC|THREAT|CONFIG|SYSTEM)]]></eventFormatRecognizer>

Pattern Definition Specification

In this section of the parser XML specification, you set the regular expression patterns that FortiSIEM will iterate through to parse the device logs.

Reusing Pattern Definitions in Multiple Parser Specifications

If you want to use a pattern definition in multiple parser specifications, define it in the file GeneralPatternDefinitions.xml and then refer to it from your parser specifications. The patterns in that file are named with a g prefix. This example shows how such patterns are defined:

<generalPatternDefinitions>
    <pattern name="gPatSyslogPRI"><![CDATA[<\d+>]]></pattern>
    <pattern name="gPatMesgBody"><![CDATA[.*]]></pattern>
    <pattern name="gPatMonNum"><![CDATA[\d{1,2}]]></pattern>
    <pattern name="gPatDay"><![CDATA[\d{1,2}]]></pattern>
    <pattern name="gPatTime"><![CDATA[\d{1,2}:\d{1,2}:\d{1,2}]]></pattern>
    <pattern name="gPatYear"><![CDATA[\d{2,4}]]></pattern>
</generalPatternDefinitions>

Each pattern has a name and a regular expression pattern within the CDATA section. This is the basic syntax:

<pattern name="patternName"><![CDATA[pattern]]></pattern>

This is an example of a pattern definition:

<patternDefinitions>
    <pattern name="patIpV4Dot"><![CDATA[\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}]]></pattern>
    <pattern name="patComm"><![CDATA[[^,]+]]></pattern>
    <pattern name="patUpDown"><![CDATA[up|down]]></pattern>
    <pattern name="patStrEndColon"><![CDATA[[^:]*]]></pattern>
</patternDefinitions>

You can also write a long pattern definition on multiple lines and indicate their order, as shown in this example. The value of the list attribute should be begin on the first line and end on the last line. If there are more than two lines, set the attribute to continue on the lines in between.

<pattern name="patSolarisMod" list="begin"><![CDATA[sshd|login|]]></pattern>
<pattern name="patSolarisMod" list="continue"><![CDATA[inetd|lpstat|]]></pattern>
<pattern name="patSolarisMod" list="end"><![CDATA[su|sudo]]></pattern>
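
As a sketch of what the list attribute achieves, and assuming the fragments are simply concatenated in the order begin, continue, end, the three patSolarisMod lines above would be equivalent to this single-line definition. Note that every fragment except the last ends with | so that the joined alternation has no gaps.

```xml
<patternDefinitions>
    <!-- Assumed equivalent of the three list fragments, joined in order -->
    <pattern name="patSolarisMod"><![CDATA[sshd|login|inetd|lpstat|su|sudo]]></pattern>
</patternDefinitions>
```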

Parsing Instructions Specification

This section is the heart of the parser, which attempts to recognize patterns in a log message and populate parsed event attributes.

In most cases, parsing involves applying a regular expression to the log, picking up values, and setting them to event attributes. Sometimes the processing is more involved, for example when attributes need to be stored as local variables and compared before populating the event attributes. There are three key components that are used in parsing instructions: Event attributes and variables, inbuilt functions that perform operations on event attributes and variables, and switch and choose branching constructs for logical operations. Values can be collected from both unstructured and structured strings in log messages. 

Event Attributes and Variables

The dictionary of event attributes is defined in the FortiSIEM database, and any name that is not in that dictionary is considered a local variable. For readability, local variable names should begin with an underscore (_), although this is not enforced.

Setting an Event Attribute to a Constant

<setEventAttribute attr="eventSeverity">1</setEventAttribute>

Setting an Event Attribute from Another Variable

The $ symbol is used to specify the content of a variable. In the example below, attribute hostMACAddr gets the value stored in the local variable _mac.

<setEventAttribute attr="hostMACAddr">$_mac</setEventAttribute>
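
The two forms can be combined. The sketch below stores a constant in a local variable and then copies it into an event attribute. The variable name _sev is hypothetical, and the enclosing parsingInstructions element is included only to keep the fragment well formed.

```xml
<parsingInstructions>
    <!-- _sev is a hypothetical local variable (leading underscore by convention) -->
    <setEventAttribute attr="_sev">5</setEventAttribute>
    <!-- $_sev reads the content of the local variable -->
    <setEventAttribute attr="eventSeverity">$_sev</setEventAttribute>
</parsingInstructions>
```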

Inbuilt Functions

Combining Two or More Strings to Produce a Final String

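This is accomplished by using the combineMsgId function, which concatenates its arguments in order. In the example below, _evIdPrefix is the prefix and _evIdSuffix is the suffix, so the output is string1 followed by the value of _evIdPrefix, a hyphen, and the value of _evIdSuffix.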

<setEventAttribute attr="eventType">combineMsgId("string1", $_evIdPrefix, "-", $_evIdSuffix)</setEventAttribute>

Normalize MAC Address

This is accomplished by using the normalizeMAC function. The output will be six groups of two nibbles separated by a colon, for example AA:BB:CC:DD:EE:FF.

<setEventAttribute attr="hostMACAddr">normalizeMAC($_mac)</setEventAttribute>

Compare Interface Security Level

This is accomplished by using the compIntfSecVal function. This primarily applies to Cisco ASA and PIX firewalls. The results returned are:

  • LESS if srcIntf has strictly lower security level than destIntf
  • GREATER if srcIntf has strictly higher security level than destIntf
  • EQUAL if srcIntf and destIntf have identical security levels

<setEventAttribute attr="_result">compIntfSecVal($srcIntf, $destIntf)</setEventAttribute>

Convert Hex Number to Decimal Number

This is accomplished by using the convertHexStrToInt function.

<setEventAttribute attr="ipConnId">convertHexStrToInt($_ipConnId)</setEventAttribute>

Convert TCP/UDP Protocol String to Port Number

This is accomplished by using the convertStrToIntIpPort function.

<setEventAttribute attr="destIpPort">convertStrToIntIpPort($_dport)</setEventAttribute>

Convert Protocol String to Number

This is accomplished by using the convertStrToIntIpProto function.

<setEventAttribute attr="ipProto">convertStrToIntIpProto($_proStr)</setEventAttribute>

Convert Decimal IP to String

This is accomplished by using the convertIpDecimalToStr function.

<setEventAttribute attr="srcIpAddr">convertIpDecimalToStr($_srcIpAddr)</setEventAttribute>

Convert Host Name to IP

This is accomplished by using the convertHostNameToIp function.

<setEventAttribute attr="srcIpAddr">convertHostNameToIp($_saddr)</setEventAttribute>

Add Two Numbers

This is accomplished by using the add function.

<setEventAttribute attr="totBytes">add($sentBytes, $recvBytes)</setEventAttribute>

Divide Two Numbers

This is accomplished by using the divide function.

<setEventAttribute attr="memUtil">divide($_usedMem, $_totalMem)</setEventAttribute>

Scale Function

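This is accomplished by using the scale function. In the example below, a duration in seconds is scaled by 1000 to produce a duration in milliseconds.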

<setEventAttribute attr="durationMSec">scale($_durationSec, 1000)</setEventAttribute>

Extract Host from Fully Qualified Domain Name

This is accomplished by using the extractHostFromFQDN function. If _fqdn contains a dot (.), the function returns the string before the first dot. Otherwise, it returns the whole string.

<setEventAttribute attr="hostName">extractHostFromFQDN($_fqdn)</setEventAttribute>

Replace a String Using a Regular Expression

This is accomplished by using the replaceStringByRegex function.

<setEventAttribute attr="eventType">replaceStringByRegex($_eventType, "\s+", "_")</setEventAttribute>

For example, if _eventType is "Event Type", then eventType is set to "Event_Type".

Replace String in String

This is accomplished by using the replaceStrInStr function.

<setEventAttribute attr="computer">replaceStrInStr($_computer, "\\", "")</setEventAttribute>

Resolve DNS Name

This is accomplished by using the resolveDNSName function, which converts a DNS name to an IP address.

<setEventAttribute attr="destIpAddr">resolveDNSName($destName)</setEventAttribute>

Convert to UNIX Time

This is accomplished by using the toDateTime function. The examples below show the function called with and without a year argument.

<setEventAttribute attr="deviceTime">toDateTime($_mon, $_day, $_year, $_time)</setEventAttribute>
<setEventAttribute attr="deviceTime">toDateTime($_mon, $_day, $_time)</setEventAttribute>

Trim Attribute

This is accomplished by using the trimAttribute function. In the example below, it is used to trim the leading and trailing dots in destName.

<setEventAttribute attr="destName">trimAttribute($destName, ".")</setEventAttribute>
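
In practice, several of these functions are usually applied one after another to values collected earlier in the parsing instructions. The sketch below only reuses calls shown in this section. It assumes the local variables _mac, _mon, _day, _year, _time, and _durationSec were populated by an earlier collection step, which is not shown.

```xml
<parsingInstructions>
    <!-- Assumed to be set earlier: _mac, _mon, _day, _year, _time, _durationSec -->
    <setEventAttribute attr="hostMACAddr">normalizeMAC($_mac)</setEventAttribute>
    <setEventAttribute attr="deviceTime">toDateTime($_mon, $_day, $_year, $_time)</setEventAttribute>
    <setEventAttribute attr="durationMSec">scale($_durationSec, 1000)</setEventAttribute>
    <!-- sentBytes and recvBytes are event attributes, so they have no underscore prefix -->
    <setEventAttribute attr="totBytes">add($sentBytes, $recvBytes)</setEventAttribute>
</parsingInstructions>
```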

Branching Constructs

  • Choose

    The format is:

    <choose>
        <when test='$AttributeOrVariable1 operator Value1'>
             ...
        </when>
        <when test='$AttributeOrVariable2 operator Value2'>
             ...
        </when>
        <otherwise>
             ...
        </otherwise>
    </choose>
  • Switch

    The format is:

    <switch>
        <case>
             ...
        </case>
        <case>
             ...
        </case>
    </switch>
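
As a sketch of the choose construct with concrete values, the fragment below branches on the result of the compIntfSecVal function described earlier. The = operator and the quoted comparison value are assumptions about the test syntax, and _direction is a hypothetical local variable introduced only for this illustration.

```xml
<parsingInstructions>
    <setEventAttribute attr="_result">compIntfSecVal($srcIntf, $destIntf)</setEventAttribute>
    <choose>
        <!-- Assumed test syntax: $variable = "value" -->
        <when test='$_result = "LESS"'>
            <setEventAttribute attr="_direction">inbound</setEventAttribute>
        </when>
        <when test='$_result = "GREATER"'>
            <setEventAttribute attr="_direction">outbound</setEventAttribute>
        </when>
        <otherwise>
            <setEventAttribute attr="_direction">same-level</setEventAttribute>
        </otherwise>
    </choose>
</parsingInstructions>
```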

Collecting Values from Unstructured Strings

From a string input source, a regex match is applied and variables are set. The variables can be event attributes or local variables. The input will be a local variable or the default raw message variable. The syntax is:

<collectAndSetAttrByRegex src="$inputString">
     <regex><![CDATA[regexpattern]]></regex>
</collectAndSetAttrByRegex>

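The regexpattern is specified as a list of variables and sub-patterns embedded within a larger pattern. Each variable and sub-pattern pair is enclosed within <>, in the form <variable:subPattern>. As the examples in this topic show, the variable can be left empty, as in <:subPattern>, when a value must be matched but not stored.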

Consider an example in which the local variable _body is set to this string:

list 130 permitted eigrp 172.16.34.4(Serial1) -> 172.16.34.3, 1 packet

From this string, we need to set these values to local variables and event attributes.

| Value | Set To | Type |
| --- | --- | --- |
| 130 | _aclName | Local Variable |
| permitted | _action | Local Variable |
| eigrp | _proto | Local Variable |
| 172.16.34.4 | srcIpAddr | Event Attribute |
| Serial1 | srcIntfName | Event Attribute |
| 172.16.34.3 | destIpAddr | Event Attribute |
| 1 | totPkts | Event Attribute |

This is achieved by using this XML. Note that you can use both the collectAndSetAttrByRegex and collectFieldsByRegex functions to collect values from fields. 

<collectAndSetAttrByRegex src="$_body">
         <regex><![CDATA[list <_aclName:gPatStr> <_action:gPatWord> <_proto:gPatWord> <srcIpAddr:gPatIpV4Dot>\(<srcIntfName:gPatWord>\) -\> <destIpAddr:gPatIpV4Dot>, <totPkts:gPatInt> <:gPatMesgBody>]]></regex>
</collectAndSetAttrByRegex>
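
When messages from one device arrive in more than one layout, regex collection can be combined with the switch construct described earlier. The sketch below assumes that cases are tried in order and that the first case whose regex matches is used, which this topic does not state explicitly. The input message (Interface Serial1, changed state to down) and the local variables _intf, _state, and _msg are hypothetical. The patterns patComm and patUpDown are the ones defined in the pattern definition example.

```xml
<switch>
    <case>
        <!-- Hypothetical input: Interface Serial1, changed state to down -->
        <collectAndSetAttrByRegex src="$_body">
            <regex><![CDATA[Interface <_intf:patComm>, changed state to <_state:patUpDown>]]></regex>
        </collectAndSetAttrByRegex>
    </case>
    <case>
        <!-- Fallback: keep the whole body when the first layout does not match -->
        <collectAndSetAttrByRegex src="$_body">
            <regex><![CDATA[<_msg:gPatMesgBody>]]></regex>
        </collectAndSetAttrByRegex>
    </case>
</switch>
```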

Collecting Fields from Structured Strings

There are usually two types of structured strings in device logs:

  • Key=value pairs
  • Value lists

For each type, a specialized parsing construct that is simpler than a regular expression is provided.

Key=Value Structured Data

Certain logs, such as SNMP traps, are structured as Key1 = value1 <separator> Key2 = value2, and so on. These can be parsed using the collectAndSetAttrByKeyValuePair tag with this syntax:

<collectAndSetAttrByKeyValuePair sep='separatorString' src="$inputString">
   <attrKeyMap attr="variableOrEventAttribute1" key="key1"/>
   <attrKeyMap attr="variableOrEventAttribute2" key="key2"/>
</collectAndSetAttrByKeyValuePair>

When a key1 match is found, then the entire string following key1 up to the separatorString is parsed out and stored in the attribute variableOrEventAttribute1.

As an example, consider this log fragment.

_body =
SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.60 = Hex-STRING: 07 D8 06 0B
13 15 00 00 2D 07 00    SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.11.0
= Hex-STRING: 00 16 B6 DB 12 22 
SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.12.0 = Hex-STRING: 00 21 55
4D 66 B0  SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.13.0 = INTEGER: 36 
SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.1.0 = Hex-STRING: 00 1A 1E C0
60 7A  SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.56.0 = INTEGER: 2   
SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.17.0 = STRING:
"00:1a:1e:c0:60:7a"

The corresponding parser fragment is:

<collectAndSetAttrByKeyValuePair sep='\t\\| SNMP' src="$_body">
     <attrKeyMap attr="srcMACAddr"
          key="SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.11.0 = Hex-STRING: "/>
     <attrKeyMap attr="_destMACAddr"
          key="SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.12.0 = Hex-STRING: "/>
     <attrKeyMap attr="wlanSSID"
          key="SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.6.0 = STRING: "/>
     <attrKeyMap attr="wlanRadioId"
          key="SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.56.0 = INTEGER: "/>
     <attrKeyMap attr="apMac"
          key="SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.17.0 = STRING: "/>
</collectAndSetAttrByKeyValuePair>

After parsing, the attribute values are set:

| Value | Attribute |
| --- | --- |
| 00 16 B6 DB 12 22 | srcMACAddr |
| 00 21 55 4D 66 B0 | _destMACAddr |
| 2 | wlanRadioId |
| 00:1a:1e:c0:60:7a | apMac |
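
The same construct is easier to follow on a shorter input. The sketch below uses a hypothetical log body and a semicolon separator. The local variable _action is also hypothetical. Under the rule described above, each attribute would receive the text between its key and the next separator. The value after the last key is assumed to run to the end of the string.

```xml
<!-- Hypothetical input: _body = user=alice;src=10.1.1.5;action=login -->
<collectAndSetAttrByKeyValuePair sep=';' src="$_body">
    <!-- Each key includes the = sign, so only the value is captured -->
    <attrKeyMap attr="user" key="user="/>
    <attrKeyMap attr="srcIpAddr" key="src="/>
    <attrKeyMap attr="_action" key="action="/>
</collectAndSetAttrByKeyValuePair>
```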

Value List Structured Data

Certain application logs, such as those from Microsoft IIS, are structured as a list of values with a separator. These can be parsed using the collectAndSetAttrByPos XML attribute tag following this syntax.

<collectAndSetAttrByPos sep='separatorString' src="$inputString">
         <attrPosMap attr="variableOrEventAttribute1" pos='offset1'/>
         <attrPosMap attr="variableOrEventAttribute2" pos='offset2'/>
</collectAndSetAttrByPos>

When the position offset1 is reached, the value at that position, up to the next separatorString, is stored in variableOrEventAttribute1.

As an example, consider this log fragment.

_body =
W3SVC1 ADS-PRI 192.168.0.10 GET /Document/ACE/index.htm - 80 - 192.168.20.55 HTTP/1.1 Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+en-US;+rv:1.8.1.11)+Gecko/20071127+Firefox/2.0.0.11 [http://wwwin/Document/] wwwin 200 0 0 5750 445 15

The parser fragment is:

<collectAndSetAttrByPos src="$_body" sep=' '>
       <attrPosMap attr="srvInstName" pos='1'/>
       <attrPosMap attr="destName" pos='2'/>
       <attrPosMap attr="relayDevIpAddr" pos='2'/>
       <attrPosMap attr="destIpAddr" pos='3'/>
       <attrPosMap attr="httpMethod" pos='4'/>
       <attrPosMap attr="uriStem" pos='5'/>
       <attrPosMap attr="uriQuery" pos='6'/>
       <attrPosMap attr="destIpPort" pos='7'/>
       <attrPosMap attr="user" pos='8'/>
       <attrPosMap attr="srcIpAddr" pos='9'/>
       <attrPosMap attr="httpVersion" pos='10'/>
       <attrPosMap attr="httpUserAgent" pos='11'/>
       <attrPosMap attr="httpReferrer" pos='13'/>
       <attrPosMap attr="httpStatusCode" pos='15'/>
       <attrPosMap attr="httpSubStatusCode" pos='16'/>
       <attrPosMap attr="httpWin32Status" pos='17'/>
       <attrPosMap attr="recvBytes" pos='18'/>
       <attrPosMap attr="sentBytes" pos='19'/>
       <attrPosMap attr="durationMSec" pos='20'/>
 </collectAndSetAttrByPos>
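
A shorter sketch of the same construct follows, using a hypothetical comma-separated log body. The local variables _action and _outcome are hypothetical. Positions are counted from 1, as in the IIS example above, where W3SVC1 is at position 1.

```xml
<!-- Hypothetical input: _body = alice,10.1.1.5,login,success -->
<collectAndSetAttrByPos sep=',' src="$_body">
    <attrPosMap attr="user" pos='1'/>
    <attrPosMap attr="srcIpAddr" pos='2'/>
    <attrPosMap attr="_action" pos='3'/>
    <attrPosMap attr="_outcome" pos='4'/>
</collectAndSetAttrByPos>
```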

For structured strings, the techniques in this section are more efficient than the regex-based technique in the previous section, because the expression is simpler and one tag can be used to parse the string regardless of the order in which the keys or values appear.
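
To show how the five sections fit together, here is a minimal sketch of a complete parser that follows the template at the start of this topic. It is an illustration under stated assumptions, not a tested parser. The parser name, the Model value ASA, the local pattern patAsaMsgId, and the local variables _sev, _msgId, and _body are hypothetical. The variable name $_rawmsg is assumed to refer to the default raw message. Whether the regex must match from the start of the message is not covered here.

```xml
<eventParser name="ExampleCiscoASAParser">
    <deviceType>
        <Vendor>Cisco</Vendor>
        <!-- Assumed model name; it must match the spelling in the CMDB -->
        <Model>ASA</Model>
        <Version>ANY</Version>
    </deviceType>
    <eventFormatRecognizer><![CDATA[ASA-\d-\d+]]></eventFormatRecognizer>
    <patternDefinitions>
        <!-- Hypothetical local pattern for the numeric message id -->
        <pattern name="patAsaMsgId"><![CDATA[\d+]]></pattern>
    </patternDefinitions>
    <parsingInstructions>
        <!-- $_rawmsg is an assumed name for the default raw message variable -->
        <collectAndSetAttrByRegex src="$_rawmsg">
            <regex><![CDATA[ASA-<_sev:gPatInt>-<_msgId:patAsaMsgId>: <_body:gPatMesgBody>]]></regex>
        </collectAndSetAttrByRegex>
        <!-- Build an event type such as ASA-5-12345 from the collected parts -->
        <setEventAttribute attr="eventType">combineMsgId("ASA-", $_sev, "-", $_msgId)</setEventAttribute>
    </parsingInstructions>
</eventParser>
```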