Anomaly Detection Engine for Linux Logs (ADE)
XML Results from an Analyze request - summary of period (day)
The following code illustrates the XML structure of the output generated by analyze for each period (day) for each Linux system in the model group from the collection of logs processed by analyze. The major element is the systems element, which identifies the specific date and system for which analytical data was requested. The systems element also identifies the number and size of intervals returned in the XML document. The XML also contains one interval element for each analysis snapshot since UTC midnight on the requested date. The interval element provides the interval anomaly score and number of unique message IDs that were generated by analyze.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.ibm.com/zAware/MelodyCorePlexV2"
           xmlns="http://www.ibm.com/zAware/MelodyCorePlexV2"
           elementFormDefault="qualified">
  <xs:element name="systems">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="version" type="xs:int"/>
        <xs:element name="start_time" type="xs:dateTime"/>
        <xs:element name="end_time" type="xs:dateTime"/>
        <xs:element name="gmt_offset" type="xs:string"/>
        <xs:element name="number_intervals">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:int">
                <xs:attribute name="analysis_snapshot_size" type="xs:int" use="required"/>
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
        <xs:element name="interval_size" type="xs:int"/>
        <xs:element name="model_info">
          <xs:complexType>
            <xs:attribute name="model_creation_date" type="xs:dateTime" use="required"/>
            <xs:attribute name="training_period" type="xs:int" use="required"/>
            <xs:attribute name="analysis_group" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
        <xs:element name="system" type="systems_system_type"
                    minOccurs="1" maxOccurs="1"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:complexType name="systems_system_type">
    <xs:sequence>
      <xs:element name="interval" type="systems_interval_type"
                  minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="sys_id" type="xs:string" use="required"/>
    <xs:attribute name="log_type" type="xs:string" use="required"/>
  </xs:complexType>
  <xs:complexType name="systems_interval_type">
    <xs:sequence>
      <xs:element name="num_unique_msg_ids" type="xs:int"/>
      <xs:element name="anomaly_score" type="xs:double"/>
    </xs:sequence>
    <xs:attribute name="num_never_seen_before_messages" type="xs:int" use="required"/>
    <xs:attribute name="num_new_messages" type="xs:int" use="required"/>
    <xs:attribute name="index" type="xs:int" use="required"/>
    <xs:attribute name="missing" type="xs:boolean" use="required"/>
    <xs:attribute name="missing_reason" type="xs:string" use="optional"/>
    <xs:attribute name="limited_model" use="optional">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="Yes"/>
          <xs:enumeration value="No"/>
          <xs:enumeration value="Unknown"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
  </xs:complexType>
</xs:schema>
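For readers who consume this output from a program, the two named complex types in the schema can be mirrored as plain data classes. The following is a minimal sketch in Python, not part of ADE: the class names Interval and System are hypothetical, and only the field names and types are taken from the schema above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Values allowed by the limited_model enumeration in the schema.
LIMITED_MODEL_VALUES = ("Yes", "No", "Unknown")


@dataclass
class Interval:
    """Hypothetical mirror of systems_interval_type."""
    # Required attributes
    index: int
    missing: bool
    num_never_seen_before_messages: int
    num_new_messages: int
    # Child elements
    num_unique_msg_ids: int
    anomaly_score: float
    # Optional attributes
    missing_reason: Optional[str] = None
    limited_model: Optional[str] = None

    def __post_init__(self):
        # Enforce the enumeration restriction on limited_model.
        if self.limited_model is not None and self.limited_model not in LIMITED_MODEL_VALUES:
            raise ValueError(f"limited_model must be one of {LIMITED_MODEL_VALUES}")


@dataclass
class System:
    """Hypothetical mirror of systems_system_type."""
    sys_id: str
    log_type: str
    # The schema allows zero or more interval elements.
    intervals: List[Interval] = field(default_factory=list)
```

The optional attributes default to None so that an interval without missing_reason or limited_model can still be represented.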
XML descriptions for Period output
- version
- An integer that identifies the version of the ADE application programming interface (API).
- start_time
- Indicates the beginning of the first interval for which data is available for the specified system on the requested date. The start time is given in the XML dateTime data type format, in Coordinated Universal Time (UTC):
YYYY-MM-DDThh:mm:ss.tttZ
- end_time
- Indicates the beginning of the first interval after the requested date. The end time is given in the XML dateTime data type format, in Coordinated Universal Time (UTC):
YYYY-MM-DDThh:mm:ss.tttZ
- gmt_offset
- A string that indicates the difference in hours and minutes from Coordinated Universal Time (UTC) for the requested start time, for example GMT+00:00.
- number_intervals
- An integer that indicates the number of intervals for which analytical data is available for the system and the date. The attribute analysis_snapshot_size provides the size of an analysis snapshot in seconds.
- analysis_snapshot_size
- An integer that indicates the length of an analysis snapshot, in seconds.
- interval_size
- An integer that indicates the number of seconds in an interval.
- model_info
- Provides information about the model associated with the specified
system.
- model_creation_date
- An attribute that provides the date and time when ADE successfully built the most recent model of system behavior for this date.
- training_period
- An integer that indicates the number of consecutive calendar days that ADE uses to identify the data to include in training models.
- analysis_group
- An attribute that provides the name of a Linux model group in the IBM zAware topology.
- system
- An element that provides additional
details about intervals for the system.
- interval
- An element that provides additional details about a specific interval. For the system and the date, the XML response contains one interval element for each interval for which analytical data is available.
- num_unique_msg_ids
- An integer that provides the number of unique message IDs that were issued during this analysis interval. If the same message ID was issued more than once during the interval, the message ID is counted only once.
- anomaly_score
- A double value that provides the anomaly score for this interval. The interval anomaly score is the percentile of the sum of the anomaly scores for the individual message IDs within the interval. When ADE uses priming data and current data to create a model of system behavior, a process that is called "training", ADE captures the distribution of interval anomaly scores for all intervals that are represented in the training data. The server uses this distribution to establish the range of values for each percentile. The possible interval anomaly scores are:
- 0 through 99.4
- The analysis interval contains messages and message clusters that match, or differ only insignificantly from, the expected behavior defined in the ADE model. A score of 0 is possible because the server eliminates all expected, in-context messages from its scoring calculation. A score of 0 indicates intervals that exhibit no difference in behavior compared to the system or group model.
Analysis intervals with scores that are greater than 0 but less than 99.5 contain some messages that are unexpected or issued out of context. Scores in this range indicate intervals that do not vary significantly from the system model.
- 99.5
- Analysis intervals with this score contain some rarely seen, unexpected, or out-of-context messages. Generally speaking, this score indicates analysis intervals that show some differences from the system or group model but do not contain messages of much diagnostic value.
- 99.6 - 100
- Analysis intervals with this score contain rarely seen messages (these messages appear in the model only once or twice), or many messages that are unexpected or issued out of context. This score indicates analysis intervals with more differences from the system or group model; these intervals can contain messages that might help you diagnose anomalous system behavior.
- 101
- Analysis intervals with this score exhibit the most significant
differences from the system or group model; these intervals
contain messages that merit investigation. ADE assigns this score
to analysis intervals that contain:
- Unusual or unexpected messages.
- A much higher volume of messages than expected.
- num_never_seen_before_messages
- An integer that indicates the total number of messages in this interval that are considered new because they have never been reported in analysis results.
- num_new_messages
- An integer that indicates the total number of messages in this interval that are considered new because they are not in the current model.
- index
- An integer that indicates the sequence number of this interval within the date specified on the LPAR request.
- missing
- A Boolean value that identifies whether analytical data is available for this interval.
- missing_reason
- An attribute that indicates why analytical data is not available; this attribute has a value only when the value returned for missing is true.
- limited_model
- Indicates whether ADE used a limited model to calculate the anomaly score for the interval. Valid values are Yes, No, and Unknown. Unknown indicates a temporary condition under which ADE cannot determine whether the model is limited.
- sys_id
- An attribute that provides the name of the system.
- log_type
- An attribute that identifies the type of data that ADE used to build the results.
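The anomaly_score ranges described above can be turned into a small lookup when triaging intervals. The following Python sketch is illustrative only: the function name and the labels "expected", "minor", "notable", and "investigate" are hypothetical and are not ADE terminology. It assumes scores are reported to one decimal place, as in the ranges listed above, and rejects values outside the documented ranges.

```python
def describe_anomaly_score(score: float) -> str:
    """Map an interval anomaly score to a hypothetical triage label."""
    # Work in tenths to avoid floating-point comparison surprises.
    tenths = round(score * 10)
    if 0 <= tenths <= 994:
        return "expected"      # 0 through 99.4
    if tenths == 995:
        return "minor"         # 99.5
    if 996 <= tenths <= 1000:
        return "notable"       # 99.6 through 100
    if tenths == 1010:
        return "investigate"   # 101
    raise ValueError(f"score {score} is outside the documented ranges")


if __name__ == "__main__":
    for value in (0, 98.9, 99.5, 100.0, 101):
        print(value, describe_anomaly_score(value))
```

Raising an error for undocumented values such as 100.5 is a design choice of this sketch; a caller could equally map them to a separate label.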
Sample XML Output Summary of Intervals within the Period
- The output:
<?xml version='1.0' encoding='UTF-8' ?>
<?xml-stylesheet href='./xslt/AdeCorePlexV2.xsl' type='text/xsl' ?>
<systems xsi:noNamespaceSchemaLocation="/xml/AdeCorePlexV2.xsd"
         xmlns="http://www.openmainframe.org/ade/AdeCorePlexV2"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <version>2</version>
  <start_time>2015-12-11T19:00:00.000-05:00</start_time>
  <end_time>2015-12-12T19:00:00.000-05:00</end_time>
  <gmt_offset>GMT+00:00</gmt_offset>
  <number_intervals analysis_snapshot_size="600">144</number_intervals>
  <interval_size>3600</interval_size>
  <model_info model_creation_date="2016-02-10T16:02:56.840Z" training_period="7" analysis_group="default"/>
  <system sys_id="sys1.openmainframe.org" log_type="Unix style syslog">
    <interval num_never_seen_before_messages="0" num_new_messages="7" index="0" missing="false" limited_model="Yes">
      <num_unique_msg_ids>32</num_unique_msg_ids>
      <anomaly_score>99.5</anomaly_score>
    </interval>
    <interval num_never_seen_before_messages="0" num_new_messages="6" index="1" missing="false" limited_model="Yes">
      <num_unique_msg_ids>26</num_unique_msg_ids>
      <anomaly_score>98.9</anomaly_score>
    </interval>
    <interval num_never_seen_before_messages="0" num_new_messages="5" index="2" missing="false" limited_model="Yes">
      <num_unique_msg_ids>28</num_unique_msg_ids>
      <anomaly_score>98.9</anomaly_score>
    </interval>
    <interval num_never_seen_before_messages="0" num_new_messages="5" index="3" missing="false" limited_model="Yes">
      <num_unique_msg_ids>27</num_unique_msg_ids>
      <anomaly_score>98.9</anomaly_score>
    </interval>
    <interval num_never_seen_before_messages="0" num_new_messages="8" index="4" missing="false" limited_model="Yes">
      <num_unique_msg_ids>30</num_unique_msg_ids>
      <anomaly_score>99.5</anomaly_score>
    </interval>
    ........
    <interval num_never_seen_before_messages="0" num_new_messages="8" index="143" missing="false" limited_model="Yes">
      <num_unique_msg_ids>30</num_unique_msg_ids>
      <anomaly_score>100.0</anomaly_score>
    </interval>
  </system>
</systems>
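A period summary like the one above can be read with a standard XML parser. The following Python sketch uses only the standard library and is not part of ADE. The function name summarize_period and the embedded SAMPLE document are hypothetical: the first interval reuses values from the sample output, and the second interval, including its missing_reason text and zero values, is made up to show how missing intervals can be skipped. The `{*}` wildcard, which needs Python 3.8 or later, matches elements in any namespace, so the sketch does not depend on the exact namespace URI.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened document for illustration only.
SAMPLE = """
<systems xmlns="http://www.openmainframe.org/ade/AdeCorePlexV2">
  <system sys_id="sys1.openmainframe.org" log_type="Unix style syslog">
    <interval num_never_seen_before_messages="0" num_new_messages="7"
              index="0" missing="false" limited_model="Yes">
      <num_unique_msg_ids>32</num_unique_msg_ids>
      <anomaly_score>99.5</anomaly_score>
    </interval>
    <interval num_never_seen_before_messages="0" num_new_messages="0"
              index="1" missing="true" missing_reason="hypothetical reason">
      <num_unique_msg_ids>0</num_unique_msg_ids>
      <anomaly_score>0</anomaly_score>
    </interval>
  </system>
</systems>
"""


def summarize_period(xml_text):
    """Return (sys_id, index, anomaly_score, num_unique_msg_ids) per available interval."""
    root = ET.fromstring(xml_text)
    rows = []
    for system in root.findall("{*}system"):
        for interval in system.findall("{*}interval"):
            # xs:boolean permits "true" and "1" for a true value.
            if interval.get("missing") in ("true", "1"):
                continue
            rows.append((
                system.get("sys_id"),
                int(interval.get("index")),
                float(interval.findtext("{*}anomaly_score")),
                int(interval.findtext("{*}num_unique_msg_ids")),
            ))
    return rows


rows = summarize_period(SAMPLE)
for row in rows:
    print(row)
```

Skipping missing intervals before reading their child elements avoids treating placeholder values as real scores.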