Event Parser Specification
FortiSIEM uses an XML-based parser framework to parse events. These topics describe the parser syntax and include examples of XML parser specifications.
- Custom Parser XML Specification Template
- Parser Name Specification
- Device or Application Type Specification
- Format Recognizer Specification
- Pattern Definition Specification
- Parsing Instructions Specification
Custom Parser XML Specification Template
The basic template for a custom parser XML specification includes five sections. Each section is described in its own topic below.
Section | Description |
---|---|
Parser Name Specification | Name of the parser file |
Device or Application Type Specification | The type of device or application associated with the parser |
Format Recognizer Specification | Patterns that determine whether an event will be parsed by this parser |
Pattern Definition Specification | Defines the parsing patterns that are iterated over by the parsing instructions |
Parsing Instructions Specification | Instructions on how to parse events that match the format recognizer patterns |
Custom Parser XML Specification Template
<eventParser name="xxx">
  <deviceType> </deviceType>
  <eventFormatRecognizer> </eventFormatRecognizer>
  <patternDefinitions> </patternDefinitions>
  <parsingInstructions> </parsingInstructions>
</eventParser>
Parser Name Specification
This section specifies the name of the parser, which is used only for readability and identifying the device type associated with the parser.
<eventParser name="CiscoIOSParser"> </eventParser>
Device or Application Type Specification
This section specifies the device or the application to which this parser applies. The device and application definitions enable FortiSIEM to detect the device and application type for a host from the received events. This is called log-based discovery in FortiSIEM. Once a received event is successfully parsed by this file, a CMDB entry is created with the device and application set from this file. FortiSIEM discovery may further refine the device.
There are two separate subsections for device and application. In each section, vendor, model and version can be specified, but version is not typically needed.
Set Version to Any
In the examples in this topic, <Version> is set to ANY because events are generally not tied to a particular version of a device or software. You could of course set this to a specific version number if you only wanted this parser to apply to a specific version of an application or device.
Vendor and Model Must Match the FortiSIEM Version
<Vendor> and <Model> entries must match the spelling and capitalization in the CMDB.
Examples of Specifications for Types of Device and Applications
Hardware Appliances
In this case, the type of event being parsed specifies the device type, for example Cisco IOS, Cisco ASA, etc.
<deviceType>
  <Vendor>Cisco</Vendor>
  <Model>IOS</Model>
  <Version>ANY</Version>
</deviceType>
Software Operating Systems that Specify the Device Type
In this case, the type of events being parsed specifies the device type, for example Microsoft Windows. The device type section looks like:
<deviceType>
  <Vendor>Microsoft</Vendor>
  <Model>Windows</Model>
  <Version>ANY</Version>
</deviceType>
Applications that Specify Both Device Type and Application
In this case, the events being parsed specify the device and application types because Microsoft SQL Server can only run on Microsoft Windows OS.
<deviceType>
  <Vendor>Microsoft</Vendor>
  <Model>Windows</Model>
  <Version>ANY</Version>
</deviceType>
<appType>
  <Vendor>Microsoft</Vendor>
  <Model>SQL Server</Model>
  <Version>ANY</Version>
  <Name>Microsoft SQL Server</Name>
</appType>
Applications that Specify the Application Type but Not the Device Type
Consider the example of an Oracle database server, which can run on both Windows and Linux operating systems. In this case, the device type is set to Generic but the application is specific. FortiSIEM depends on discovery to identify the device type.
<deviceType>
  <Vendor>Generic</Vendor>
  <Model>Generic</Model>
  <Version>ANY</Version>
</deviceType>
<appType>
  <Vendor>Oracle</Vendor>
  <Model>Database Server</Model>
  <Version>ANY</Version>
  <Name>Oracle Database Server</Name>
</appType>
Format Recognizer Specification
In many cases, events associated with a device or application contain a unique pattern. You can enter a regular expression in the Format Recognizer section of the parser XML file to search for this pattern. If the pattern is found, the event is parsed according to the parsing instructions in that file. After the first match, the mapping from event source IP to parser file is cached, and only that parser file is used for all events from that source IP. A notable exception is when events from disparate sources are received via a syslog server, but that case is handled differently.
While not a required part of the parser specification, a format recognizer can speed up event parsing, especially when one parser file must be chosen from among many. A single pattern check determines whether the parser file applies; the less efficient alternative would be to examine the patterns in every file. The format recognizer must be chosen carefully: not so broad that it misclassifies events into the wrong parser file, and not so narrow that it fails to recognize events that belong to the file.
Order in Which Parsers are Used
The FortiSIEM parser processes the parser files in the order listed in the file parserOrder.csv.
Format Recognizer Syntax
The specification for the format recognizer section is:
<eventFormatRecognizer><![CDATA[regexpattern]]></eventFormatRecognizer>
In the regexpattern block, a pattern can be directly specified using regex or a previously defined pattern (in the pattern definition section in this file or in the GeneralPatternDefinitions.xml file) can be referenced.
Example Format Recognizers
Cisco IOS
All Cisco IOS events have a %module name pattern.
<patternDefinitions>
  <pattern name="patCiscoIOSMod" list="begin"><![CDATA[FW|SEC|SEC_LOGIN|SYS|SNMP|]]></pattern>
  <pattern name="patCiscoIOSMod" list="continue"><![CDATA[LINK|SPANTREE|LINEPROTO|DTP|PARSER|]]></pattern>
  <pattern name="patCiscoIOSMod" list="end"><![CDATA[CDP|DHCPD|CONTROLLER|PORT_SECURITY-SP]]></pattern>
</patternDefinitions>
<eventFormatRecognizer><![CDATA[:%<:patCiscoIOSMod>-<:gPatInt>-<:patStrEndColon>:]]></eventFormatRecognizer>
Cisco ASA
All Cisco ASA events have the pattern ASA-severity-id, for example ASA-5-12345.
<eventFormatRecognizer><![CDATA[ASA-\d-\d+]]></eventFormatRecognizer>
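A recognizer regex like this one can be tried against sample strings outside FortiSIEM before it goes into a parser file. The following is a minimal Python sketch, not FortiSIEM code. It assumes the recognizer is applied as a search anywhere in the raw event and that Python's re module treats this simple pattern the same way the FortiSIEM regex engine does. The sample messages are made up for illustration; only the ASA-5-12345 token comes from the text above.

```python
import re

# Recognizer regex copied from the Cisco ASA example above.
ASA_RECOGNIZER = re.compile(r"ASA-\d-\d+")


def is_asa_event(raw_event: str) -> bool:
    """Return True if the recognizer pattern occurs anywhere in the event."""
    return ASA_RECOGNIZER.search(raw_event) is not None


# Hypothetical sample messages, not real device output.
print(is_asa_event("<166>%ASA-5-12345: sample message"))
print(is_asa_event("<166>%SEC-6-IPACCESSLOGP: sample message"))
```

Testing a recognizer against both matching and non-matching samples is a cheap way to check that it is neither too broad nor too narrow.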
Palo Alto Networks Log Parser
In this case, there is no unique keyword, so the entire message structure from the beginning to a specific point in the log must be considered.
Event
<14>May 6 15:51:04 1,2010/05/06 15:51:04,0006C101167,TRAFFIC,start,1,2010/05/06 15:50:58,192.168.28.21,172.16.255.78,::172.16.255.78,172.16.255.78,rule3,,,icmp,vsys1,untrust,untrust,ethernet1/1,ethernet1/1,syslog-172.16.20.152,2010/05/06 15:51:04,600,2,0,0,0,0,0x40,icmp,allow,196,196,196,2,2010/05/06 15:50:58,0,any,0
<eventFormatRecognizer><![CDATA[<:gPatTime>,\w+,(?:TRAFFIC|THREAT|CONFIG|SYSTEM)]]></eventFormatRecognizer>
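To see how a recognizer that references a named pattern behaves, the reference can be expanded by hand and the result tested. The sketch below is an illustration in Python under stated assumptions: it treats <:name> as a plain textual substitution of the named regex (the actual FortiSIEM expansion logic is not documented here), and expand_references is a hypothetical helper. The gPatTime definition is the one shown in the GeneralPatternDefinitions.xml excerpt in this topic, and the event is the beginning of the sample event above.

```python
import re

# Definition copied from the GeneralPatternDefinitions.xml excerpt in this topic.
GENERAL_PATTERNS = {"gPatTime": r"\d{1,2}:\d{1,2}:\d{1,2}"}


def expand_references(recognizer: str, patterns: dict) -> str:
    """Replace each <:name> reference with its regex in a non-capturing group."""
    return re.sub(
        r"<:(\w+)>",
        lambda m: "(?:" + patterns[m.group(1)] + ")",
        recognizer,
    )


recognizer = r"<:gPatTime>,\w+,(?:TRAFFIC|THREAT|CONFIG|SYSTEM)"
# Beginning of the sample Palo Alto event shown above.
event = "<14>May 6 15:51:04 1,2010/05/06 15:51:04,0006C101167,TRAFFIC,start,1"

match = re.search(expand_references(recognizer, GENERAL_PATTERNS), event)
print(match.group(0) if match else "no match")
```

The first timestamp in the event is followed by a space rather than a comma, so the match is anchored on the second timestamp, the serial number, and the log type.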
Pattern Definition Specification
In this section of the parser XML specification, you set the regular expression patterns that FortiSIEM will iterate through to parse the device logs.
Reusing Pattern Definitions in Multiple Parser Specifications
If you want to use a pattern definition in multiple parser specifications, define it in the file GeneralPatternDefinitions.xml and refer to it by name from each parser. The patterns in that file are named with a g prefix, as shown in this example:
<generalPatternDefinitions>
  <pattern name="gPatSyslogPRI"><![CDATA[<\d+>]]></pattern>
  <pattern name="gPatMesgBody"><![CDATA[.*]]></pattern>
  <pattern name="gPatMonNum"><![CDATA[\d{1,2}]]></pattern>
  <pattern name="gPatDay"><![CDATA[\d{1,2}]]></pattern>
  <pattern name="gPatTime"><![CDATA[\d{1,2}:\d{1,2}:\d{1,2}]]></pattern>
  <pattern name="gPatYear"><![CDATA[\d{2,4}]]></pattern>
</generalPatternDefinitions>
Each pattern has a name and a regular expression pattern within the CDATA section. This is the basic syntax:
<pattern name="patternName"><![CDATA[pattern]]></pattern>
This is an example of a pattern definition:
<patternDefinitions>
  <pattern name="patIpV4Dot"><![CDATA[\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}]]></pattern>
  <pattern name="patComm"><![CDATA[[^,]+]]></pattern>
  <pattern name="patUpDown"><![CDATA[up|down]]></pattern>
  <pattern name="patStrEndColon"><![CDATA[[^:]*]]></pattern>
</patternDefinitions>
You can also write a long pattern definition on multiple lines and indicate their order, as shown in this example. The value of the list attribute should be begin in the first line and end in the last line. If there are more than two lines, the attribute should be set to continue for the other lines.
<pattern name="patSolarisMod" list="begin"><![CDATA[sshd|login|]]></pattern>
<pattern name="patSolarisMod" list="continue"><![CDATA[inetd|lpstat|]]></pattern>
<pattern name="patSolarisMod" list="end"><![CDATA[su|sudo]]></pattern>
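The effect of the list attribute can be pictured as joining the fragments in order into one regular expression. The Python sketch below illustrates that reading; it assumes the fragments are simply concatenated, which the text implies but does not state outright. The patternDefinitions wrapper element is added only so the snippet parses as a single XML document, and join_pattern_fragments is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# The three fragments from the example above, wrapped in a root element
# so that they form one well-formed XML document.
FRAGMENTS = """<patternDefinitions>
  <pattern name="patSolarisMod" list="begin"><![CDATA[sshd|login|]]></pattern>
  <pattern name="patSolarisMod" list="continue"><![CDATA[inetd|lpstat|]]></pattern>
  <pattern name="patSolarisMod" list="end"><![CDATA[su|sudo]]></pattern>
</patternDefinitions>"""


def join_pattern_fragments(xml_text: str) -> dict:
    """Concatenate same-named <pattern> fragments in document order."""
    joined = {}
    for pattern in ET.fromstring(xml_text).iter("pattern"):
        name = pattern.get("name")
        joined[name] = joined.get(name, "") + (pattern.text or "")
    return joined


patterns = join_pattern_fragments(FRAGMENTS)
print(patterns["patSolarisMod"])
```

Under this reading, every fragment except the last must end with the | separator, as the example fragments do, or two alternatives would run together.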
Parsing Instructions Specification
This section is the heart of the parser, which attempts to recognize patterns in a log message and populate parsed event attributes.
In most cases, parsing involves applying a regular expression to the log, picking up values, and setting them to event attributes. Sometimes the processing is more involved, for example when attributes need to be stored as local variables and compared before populating the event attributes. There are three key components that are used in parsing instructions: event attributes and variables, inbuilt functions that perform operations on event attributes and variables, and switch and choose branching constructs for logical operations. Values can be collected from both unstructured and structured strings in log messages.
- Event Attributes and Variables
- Inbuilt Functions
- Branching Constructs
- Collecting Values from Unstructured Strings
- Collecting Fields from Structured Strings
Event Attributes and Variables
The dictionary of event attributes is defined in the FortiSIEM database, and any name that is not in that dictionary is treated as a local variable. For readability, local variables should begin with an underscore (_), although this is not enforced.
Setting an Event Attribute to a Constant
<setEventAttribute attr="eventSeverity">1</setEventAttribute>
Setting an Event Attribute from Another Variable
The $ symbol is used to specify the content of a variable. In the example below, attribute hostMACAddr gets the value stored in the local variable _mac.
<setEventAttribute attr="hostMACAddr">$_mac</setEventAttribute>
Inbuilt Functions
Combining Two or More Strings to Produce a Final String
This is accomplished by using the combineMsgId function. Here _evIdPrefix is the prefix, _evIdSuffix is the suffix, and the output will be string1-_evIdPrefix-_evIdSuffix.
<setEventAttribute attr="eventType">combineMsgId("string1",$_evIdPrefix, "-", $_evIdSuffix)</setEventAttribute>
Normalize MAC Address
This is accomplished by using the normalizeMAC function. The output will be six groups of two nibbles separated by a colon, for example AA:BB:CC:DD:EE:FF.
<setEventAttribute attr="hostMACAddr">normalizeMAC($_mac)</setEventAttribute>
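The output format described above can be illustrated with a short Python sketch. This is not the FortiSIEM implementation: normalize_mac is a hypothetical stand-in, and the accepted input formats (any separators between hex digits) and the upper-case output are assumptions based on the AA:BB:CC:DD:EE:FF example.

```python
import re


def normalize_mac(raw: str) -> str:
    """Return a MAC address as six colon-separated, upper-case hex pairs."""
    # Drop every character that is not a hex digit (spaces, dots, dashes, colons).
    digits = re.sub(r"[^0-9A-Fa-f]", "", raw)
    if len(digits) != 12:
        raise ValueError(f"expected 12 hex digits, got {len(digits)}: {raw!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2)).upper()


print(normalize_mac("00 16 B6 DB 12 22"))
print(normalize_mac("aabb.ccdd.eeff"))
```

Normalizing to one format makes MAC addresses from different vendors comparable in queries and rules.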
Compare Interface Security Level
This is accomplished by using the compIntfSecVal function. This primarily applies to Cisco ASA and PIX firewalls. The results returned are:
- LESS if srcIntf has strictly lower security level than destIntf
- GREATER if srcIntf has strictly higher security level than destIntf
- EQUAL if srcIntf and destIntf have identical security levels
<setEventAttribute attr="_result">compIntfSecVal($srcIntf, $destIntf)</setEventAttribute>
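The three return values amount to a three-way comparison of the security levels of two interfaces. The Python sketch below illustrates that logic only. The SECURITY_LEVELS table and its interface names are hypothetical, and how FortiSIEM actually obtains the security level of each interface is not covered in this topic.

```python
# Hypothetical interface security levels, used only for illustration.
SECURITY_LEVELS = {"inside": 100, "dmz": 50, "outside": 0}


def comp_intf_sec_val(src_intf: str, dest_intf: str) -> str:
    """Compare the security levels of the source and destination interfaces."""
    src_level = SECURITY_LEVELS[src_intf]
    dest_level = SECURITY_LEVELS[dest_intf]
    if src_level < dest_level:
        return "LESS"
    if src_level > dest_level:
        return "GREATER"
    return "EQUAL"


print(comp_intf_sec_val("outside", "inside"))
print(comp_intf_sec_val("inside", "dmz"))
```

A result stored in a local variable such as _result can then drive a branching construct, for example to tell inbound traffic from outbound traffic.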
Convert Hex Number to Decimal Number
This is accomplished by using the convertHexStrToInt function.
<setEventAttribute attr="ipConnId">convertHexStrToInt($_ipConnId)</setEventAttribute>
Convert TCP/UDP Protocol String to Port Number
This is accomplished by using the convertStrToIntIpPort function.
<setEventAttribute attr="destIpPort">convertStrToIntIpPort($_dport)</setEventAttribute>
Convert Protocol String to Number
This is accomplished by using the convertStrToIntIpProto function.
<setEventAttribute attr="ipProto">convertStrToIntIpProto($_proStr)</setEventAttribute>
Convert Decimal IP to String
This is accomplished by using the convertIpDecimalToStr function.
<setEventAttribute attr="srcIpAddr">convertIpDecimalToStr($_srcIpAddr)</setEventAttribute>
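The conversion itself is the standard mapping between a 32-bit integer and dotted-quad notation. The Python sketch below shows that mapping with the standard ipaddress module. It assumes the decimal value is an IPv4 address in network (big-endian) byte order, and ip_decimal_to_str is a hypothetical stand-in, not the FortiSIEM function.

```python
import ipaddress


def ip_decimal_to_str(value) -> str:
    """Convert a decimal IPv4 address (int or numeric string) to dotted-quad form."""
    return str(ipaddress.IPv4Address(int(value)))


# 3232235777 is the decimal form of 192.168.1.1.
print(ip_decimal_to_str("3232235777"))
```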
Convert Host Name to IP
This is accomplished by using the convertHostNameToIp function.
<setEventAttribute attr="srcIpAddr">convertHostNameToIp($_saddr)</setEventAttribute>
Add Two Numbers
This is accomplished by using the add function.
<setEventAttribute attr="totBytes">add($sentBytes, $recvBytes)</setEventAttribute>
Divide Two Numbers
This is accomplished by using the divide function.
<setEventAttribute attr="memUtil">divide($_usedMem, $_totalMem)</setEventAttribute>
Scale Function
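This is accomplished by using the scale function. In the example below, a duration in seconds is scaled by 1000 to produce a duration in milliseconds.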
<setEventAttribute attr="durationMSec">scale($_durationSec, 1000)</setEventAttribute>
Extract Host from Fully Qualified Domain Name
This is accomplished by using the extractHostFromFQDN function. If _fqdn contains a period (.), the function returns the string before the first period; otherwise, it returns the whole string.
<setEventAttribute attr="hostName">extractHostFromFQDN($_fqdn)</setEventAttribute>
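The rule described above is small enough to state as code. The following Python sketch mirrors it; extract_host_from_fqdn is a hypothetical stand-in for the FortiSIEM function, and the host names are made-up examples.

```python
def extract_host_from_fqdn(fqdn: str) -> str:
    """Return the text before the first '.', or the whole string if there is none."""
    return fqdn.split(".", 1)[0]


print(extract_host_from_fqdn("host1.example.com"))
print(extract_host_from_fqdn("host1"))
```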
Replace a String Using a Regular Expression
This is accomplished by using the replaceStringByRegex function.
<setEventAttribute attr="eventType">replaceStringByRegex($_eventType, "\s+", "_")</setEventAttribute>
For example, if _eventType is "Event Type", eventType is set to "Event_Type".
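The same substitution can be reproduced with Python's re.sub, which is a convenient way to try out a regex and its replacement before putting them in a parser. This is a sketch under the assumption that the function replaces every match of the regex, as the example suggests; replace_string_by_regex is a hypothetical stand-in.

```python
import re


def replace_string_by_regex(value: str, pattern: str, replacement: str) -> str:
    """Replace every match of pattern in value with replacement."""
    return re.sub(pattern, replacement, value)


print(replace_string_by_regex("Event Type", r"\s+", "_"))
```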
Replace String in String
This is accomplished by using the replaceStrInStr function.
<setEventAttribute attr="computer">replaceStrInStr($_computer, "\\", "")</setEventAttribute>
Resolve DNS Name
This is accomplished by using the resolveDNSName function, which converts a DNS name to an IP address.
<setEventAttribute attr="destIpAddr">resolveDNSName($destName)</setEventAttribute>
Convert to UNIX Time
This is accomplished by using the toDateTime function. As the two examples below show, it can be called with or without the year.
<setEventAttribute attr="deviceTime">toDateTime($_mon, $_day, $_year, $_time)</setEventAttribute>
<setEventAttribute attr="deviceTime">toDateTime($_mon, $_day, $_time)</setEventAttribute>
Trim Attribute
This is accomplished by using the trimAttribute function. In the example below, it is used to trim the leading and trailing dots in destName.
<setEventAttribute attr="destName">trimAttribute($destName, ".")</setEventAttribute>
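In Python terms, the trimming described above behaves like str.strip with the given character. The sketch below assumes that all leading and trailing occurrences are removed and that characters in the middle of the value are left alone; trim_attribute is a hypothetical stand-in, and the host name is a made-up example.

```python
def trim_attribute(value: str, chars: str) -> str:
    """Remove leading and trailing occurrences of the given characters."""
    return value.strip(chars)


print(trim_attribute(".www.example.com.", "."))
```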
Branching Constructs
- Choose
The format is:
<choose>
  <when test='$AttributeOrVariable1 operator Value1'>
    ...
  </when>
  <when test='$AttributeOrVariable2 operator Value2'>
    ...
  </when>
  <otherwise>
    ...
  </otherwise>
</choose>
- Switch
The format is:
<switch>
  <case>
    ...
  </case>
  <case>
    ...
  </case>
</switch>
Collecting Values from Unstructured Strings
From a string input source, a regex match is applied and variables are set. The variables can be event attributes or local variables. The input will be a local variable or the default raw message variable. The syntax is:
<collectAndSetAttrByRegex src="$inputString">
  <regex><![CDATA[regexpattern]]></regex>
</collectAndSetAttrByRegex>
The regexpattern is specified by a list of variables and sub-patterns embedded within a larger pattern. Each variable and sub-pattern pair is enclosed within <>, written as <variable:subPattern>.
Consider an example in which the local variable _body is set to list 130 permitted eigrp 172.16.34.4(Serial1) -> 172.16.34.3, 1 packet. From this string we need to set the following values to local variables and event attributes.
Value | Set To | Type |
---|---|---|
130 | _aclName | Local Variable |
permitted | _action | Local Variable |
eigrp | _proto | Local Variable |
172.16.34.4 | srcIpAddr | Event Attribute |
Serial1 | srcIntfName | Event Attribute |
172.16.34.3 | destIpAddr | Event Attribute |
1 | totPkts | Event Attribute |
This is achieved by using this XML. Note that you can use both the collectAndSetAttrByRegex and collectFieldsByRegex functions to collect values from fields.
<collectAndSetAttrByRegex src="$_body">
  <regex><![CDATA[list <_aclName:gPatStr> <_action:gPatWord> <_proto:gPatWord> <srcIpAddr:gPatIpV4Dot>\(<srcIntfName:gPatWord>\) -\> <destIpAddr:gPatIpV4Dot>, <totPkts:gPatInt> <:gPatMesgBody>]]></regex>
</collectAndSetAttrByRegex>
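One way to understand this construct is to read each <variable:subPattern> pair as a named capture group. The Python sketch below makes that reading concrete by translating the specification into a Python regex and applying it to the _body value from the example. It is an illustration, not FortiSIEM code: to_python_regex is a hypothetical helper, and of the sub-pattern definitions only gPatMesgBody is taken from this topic. The definitions of gPatStr, gPatWord, gPatInt, and gPatIpV4Dot are assumed stand-ins, because the real ones are not shown here.

```python
import re

# gPatMesgBody comes from this topic; the other definitions are assumptions.
PATTERNS = {
    "gPatStr": r"\S+",
    "gPatWord": r"\w+",
    "gPatInt": r"\d+",
    "gPatIpV4Dot": r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}",
    "gPatMesgBody": r".*",
}

# Specification copied from the XML example above.
SPEC = (
    r"list <_aclName:gPatStr> <_action:gPatWord> <_proto:gPatWord> "
    r"<srcIpAddr:gPatIpV4Dot>\(<srcIntfName:gPatWord>\) -\> "
    r"<destIpAddr:gPatIpV4Dot>, <totPkts:gPatInt> <:gPatMesgBody>"
)


def to_python_regex(spec: str) -> str:
    """Turn <name:pattern> into a named group and <:pattern> into a plain group."""
    def expand(m):
        name, sub_pattern = m.group(1), PATTERNS[m.group(2)]
        return f"(?P<{name}>{sub_pattern})" if name else f"(?:{sub_pattern})"
    return re.sub(r"<(\w*):(\w+)>", expand, spec)


body = "list 130 permitted eigrp 172.16.34.4(Serial1) -> 172.16.34.3, 1 packet"
match = re.match(to_python_regex(SPEC), body)
values = match.groupdict() if match else {}
print(values)
```

A pair with an empty variable name, such as <:gPatMesgBody>, matches text without storing it, which is why packet does not appear among the collected values.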
Collecting Fields from Structured Strings
There are usually two types of structured strings in device logs: Key=Value structured data and value list structured data. For each type, a specialized parsing construct is provided that is simpler than regex-based collection.
Key=Value Structured Data
Certain logs, such as SNMP traps, are structured as Key1 = value1 <separator> Key2 = value2, and so on. These can be parsed using the collectAndSetAttrByKeyValuePair XML tag with this syntax.
<collectAndSetAttrByKeyValuePair sep='separatorString' src="$inputString">
  <attrKeyMap attr="variableOrEventAttribute1" key="key1"/>
  <attrKeyMap attr="variableOrEventAttribute2" key="key2"/>
</collectAndSetAttrByKeyValuePair>
When a key1 match is found, then the entire string following key1 up to the separatorString is parsed out and stored in the attribute variableOrEventAttribute1.
As an example, consider this log fragment.
_body = SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.60 = Hex-STRING: 07 D8 06 0B 13 15 00 00 2D 07 00 SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.11.0 = Hex-STRING: 00 16 B6 DB 12 22 SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.12.0 = Hex-STRING: 00 21 55 4D 66 B0 SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.13.0 = INTEGER: 36 SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.1.0 = Hex-STRING: 00 1A 1E C0 60 7A SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.56.0 = INTEGER: 2 SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.17.0 = STRING: "00:1a:1e:c0:60:7a"
The corresponding parser fragment is:
<collectAndSetAttrByKeyValuePair sep='\t\\| SNMP' src="$_body">
  <attrKeyMap attr="srcMACAddr" key="SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.11.0 = Hex-STRING: "/>
  <attrKeyMap attr="_destMACAddr" key="SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.12.0 = Hex-STRING: "/>
  <attrKeyMap attr="wlanSSID" key="SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.6.0 = STRING: "/>
  <attrKeyMap attr="wlanRadioId" key="SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.56.0 = INTEGER: "/>
  <attrKeyMap attr="apMac" key="SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1.17.0 = STRING: "/>
</collectAndSetAttrByKeyValuePair>
After parsing, the attribute values are set:
Value | Attribute |
---|---|
00 16 B6 DB 12 22 | srcMACAddr |
00 21 55 4D 66 B0 | _destMACAddr |
2 | wlanRadioId |
00:1a:1e:c0:60:7a | apMac |
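The key lookup described above (find the key, then take everything up to the next separator) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the FortiSIEM implementation: collect_by_key_value is a hypothetical helper, the body is a shortened version of the log fragment above, and a plain tab is assumed as the separator for simplicity instead of the separator expression used in the parser fragment.

```python
def collect_by_key_value(body: str, sep: str, attr_key_map: dict) -> dict:
    """For each key found in body, store the text after it up to the next separator."""
    result = {}
    for attr, key in attr_key_map.items():
        start = body.find(key)
        if start == -1:
            continue  # key not present in this event, so the attribute stays unset
        start += len(key)
        end = body.find(sep, start)
        result[attr] = body[start:end] if end != -1 else body[start:]
    return result


PREFIX = "SNMPv2-SMI::enterprises.14823.2.3.1.11.1.1."
# Shortened log fragment; the tab separator is an assumption.
body = "\t".join([
    PREFIX + "11.0 = Hex-STRING: 00 16 B6 DB 12 22",
    PREFIX + "12.0 = Hex-STRING: 00 21 55 4D 66 B0",
    PREFIX + "56.0 = INTEGER: 2",
])

attrs = collect_by_key_value(body, "\t", {
    "srcMACAddr": PREFIX + "11.0 = Hex-STRING: ",
    "_destMACAddr": PREFIX + "12.0 = Hex-STRING: ",
    "wlanSSID": PREFIX + "6.0 = STRING: ",
    "wlanRadioId": PREFIX + "56.0 = INTEGER: ",
})
print(attrs)
```

The wlanSSID key does not occur in the sample, so no value is collected for it, which matches the result table above. Because each key is looked up independently, the order of the key/value pairs in the log does not matter.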
Value List Structured Data
Certain application logs, such as those from Microsoft IIS, are structured as a list of values with a separator. These can be parsed using the collectAndSetAttrByPos XML tag following this syntax.
<collectAndSetAttrByPos sep='separatorString' src="$inputString">
  <attrPosMap attr="variableOrEventAttribute1" pos='offset1'/>
  <attrPosMap attr="variableOrEventAttribute2" pos='offset2'/>
</collectAndSetAttrByPos>
When the position offset1 is encountered, the subsequent value up to the separatorString is stored in variableOrEventAttribute1.
As an example, consider this log fragment.
_body = W3SVC1 ADS-PRI 192.168.0.10 GET /Document/ACE/index.htm - 80 - 192.168.20.55 HTTP/1.1 Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+en-US;+rv:1.8.1.11)+Gecko/20071 127+Firefox/2.0.0.11 [http://wwwin/Document/] wwwin 200 0 0 5750 445 15
The parser fragment is:
<collectAndSetAttrByPos src="$_body" sep=' '>
  <attrPosMap attr="srvInstName" pos='1'/>
  <attrPosMap attr="destName" pos='2'/>
  <attrPosMap attr="relayDevIpAddr" pos='2'/>
  <attrPosMap attr="destIpAddr" pos='3'/>
  <attrPosMap attr="httpMethod" pos='4'/>
  <attrPosMap attr="uriStem" pos='5'/>
  <attrPosMap attr="uriQuery" pos='6'/>
  <attrPosMap attr="destIpPort" pos='7'/>
  <attrPosMap attr="user" pos='8'/>
  <attrPosMap attr="srcIpAddr" pos='9'/>
  <attrPosMap attr="httpVersion" pos='10'/>
  <attrPosMap attr="httpUserAgent" pos='11'/>
  <attrPosMap attr="httpReferrer" pos='13'/>
  <attrPosMap attr="httpStatusCode" pos='15'/>
  <attrPosMap attr="httpSubStatusCode" pos='16'/>
  <attrPosMap attr="httpWin32Status" pos='17'/>
  <attrPosMap attr="recvBytes" pos='18'/>
  <attrPosMap attr="sentBytes" pos='19'/>
  <attrPosMap attr="durationMSec" pos='20'/>
</collectAndSetAttrByPos>
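Position-based collection can be pictured as splitting the string on the separator and picking fields by index. The Python sketch below shows this on the first ten fields of the log fragment above. It assumes positions are 1-based, which is consistent with the mapping in the parser fragment (position 1 is the service instance name and position 4 is the HTTP method). collect_by_pos is a hypothetical helper, not FortiSIEM code.

```python
def collect_by_pos(body: str, sep: str, attr_pos_map: dict) -> dict:
    """Split body on sep and pick fields by 1-based position."""
    fields = body.split(sep)
    return {
        attr: fields[pos - 1]
        for attr, pos in attr_pos_map.items()
        if 0 < pos <= len(fields)  # skip positions beyond the end of the list
    }


# First ten fields of the IIS log fragment above.
body = ("W3SVC1 ADS-PRI 192.168.0.10 GET /Document/ACE/index.htm "
        "- 80 - 192.168.20.55 HTTP/1.1")

attrs = collect_by_pos(body, " ", {
    "srvInstName": 1,
    "destName": 2,
    "destIpAddr": 3,
    "httpMethod": 4,
    "uriStem": 5,
    "destIpPort": 7,
    "srcIpAddr": 9,
    "httpVersion": 10,
})
print(attrs)
```

Because the mapping depends only on field positions, a value that itself contains the separator shifts every later field, so this construct suits logs whose fields never contain the separator.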
For structured strings, the techniques in this section are more efficient than those in the previous section, because the expression is simpler and one tag can be used to parse the string regardless of the order in which the keys or values appear.