Configure lograptor’s applications

CONFIGURATION FILES

${confdir}/*.conf

Lograptor defines its applications through configuration files. An application’s configuration filename is the name of the application followed by the suffix .conf. Every file in the configuration directory that has this suffix must be an application configuration file for lograptor.

An application’s configuration file uses Python’s ConfigParser format, which provides a structure similar to Microsoft Windows INI files. A configuration file consists of sections and option entries. A section starts with a “[section]” header. Each section can contain name=value option entries (name: value is also accepted), with continuations in the style of RFC 822 (see section 3.1.1, “LONG HEADER FIELDS”). Note that leading and trailing whitespace is removed from values.
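As an illustration of the format, the following minimal sketch parses a small configuration with Python’s standard configparser module. The file content is hypothetical and is not taken from a shipped lograptor application; it only shows the two delimiter forms, a continuation line and whitespace stripping.

```python
import configparser

# Hypothetical application file; names and values are illustrative only.
SAMPLE = "\n".join([
    "[main]",
    "desc = Example mail application",
    "files: ${logdir}/mail.log,",   # "name: value" form
    "    ${logdir}/mail.log.1",     # indented continuation line
    "enabled =    yes   ",          # surrounding whitespace is stripped
    "",
])

# RawConfigParser performs no interpolation, so "$" and "%" stay literal.
parser = configparser.RawConfigParser()
parser.read_string(SAMPLE)

enabled = parser.get("main", "enabled")
files = [item.strip() for item in parser.get("main", "files").split(",")]
print(enabled)
print(files)
```

Note that configparser lowercases option names by default; a program that needs case-sensitive rule names can set parser.optionxform = str before reading.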

DESCRIPTION

An application configuration file for lograptor must contain two sections:

main
Contains the parameters of the application: log app-tags, log file locations, priority and enabling status.
rules
Contains the pattern rules for the analysis of the application’s logs. These regexp rules are used by the engines of lograptor.

Optional additional sections can be added to define the composition of the report data.

[main] SECTION

desc
A complete description of the application.
files

Log files of the application. You can specify multiple entries separated by commas. Entries can be glob filename patterns, so you can use the wildcard characters ?, *, + in filenames. String interpolation is performed on entries just before processing, to obtain the effective list of files to be included in the run. Typically the string $logdir (or ${logdir}) is used to shorten paths that share a common root. You can also use other variables related to program options, such as $hostname, which is linked to the option --hosts.

Finally you can also use some wildcards related to dates:

%Y
specifies the year
%m
specifies the month as a number with 2 digits (01..12)
%d
specifies the day with 2 digits (01..)

Currently only these formats are supported for specifying dates. Filenames that include date variables are expanded by the program according to the date range provided (options --last or --date).

enabled
Can be either “yes” or “no”. If “no”, the program ignores the app. If the application is invoked explicitly using the option -a/--app, the value of this parameter is ignored. This allows you to schedule reports with a preferred set of applications and still use the program to analyze the logs of all defined applications.
priority
An unsigned integer that indicates the priority of the application, commonly a value from 0 to 10. A lower value indicates a higher priority in the composition of the final report, i.e. report data elements produced by the application appear before those of applications with a higher value. The priority also determines the processing order of the log files.
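The expansion of a files entry can be pictured with the following sketch. It is an assumption about the general mechanism, not lograptor’s actual code: expand_files is a hypothetical helper, the variable values and the date range are made up, and only the three date formats listed for the files option are handled.

```python
from datetime import date, timedelta
from string import Template

def expand_files(entry, variables, first_day, last_day):
    """Expand one 'files' value into a list of glob patterns (hypothetical helper)."""
    patterns = []
    for item in entry.split(","):
        # $logdir / ${logdir} style interpolation; unknown variables are left as-is.
        item = Template(item.strip()).safe_substitute(variables)
        day = first_day
        while day <= last_day:
            # Only %Y, %m and %d are replaced, matching the supported date formats.
            expanded = (item.replace("%Y", day.strftime("%Y"))
                            .replace("%m", day.strftime("%m"))
                            .replace("%d", day.strftime("%d")))
            if expanded not in patterns:
                patterns.append(expanded)
            day += timedelta(days=1)
    return patterns

patterns = expand_files(
    "$logdir/mail-%Y%m%d.log, ${logdir}/messages",
    {"logdir": "/var/log"},
    date(2024, 2, 28),
    date(2024, 3, 1),
)
for pattern in patterns:
    print(pattern)
```

Each resulting pattern could then be passed to glob.glob() to obtain the files that actually exist on disk.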

[rules] SECTION

This section contains pattern rules written as regular expressions, according to the syntax of Python’s re module. The program uses these rules to analyze the application’s log lines and to extract information from matched events. Each rule is identified by its option name, which must be unique within the application. To avoid ambiguities, don’t name a pattern rule after an option already used by the program.

Symbolic Groups

Lograptor uses Python’s regex symbolic groups to extract information from logs. A pattern rule must contain at least one symbolic group to be accepted by the program. For example, if a rule is:

SMTPD_Warning = ": warning: (?P<reason>.+)"

the program extracts information about the group “reason” and can use that information during the reporting stage. You can use several symbolic groups within a rule to detail the structure of the extracted data:

Mail_Resent = ": (?P<thread>[A-Z,0-9]{9,14}): resent-message-id=<(?P<reason>.+)>"

The “thread” symbolic group is used to extract thread information from log lines, in order to perform thread matching (see option -T/--thread).
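To see what the symbolic groups of the Mail_Resent rule capture, the rule can be tried directly with Python’s re module. This is a minimal sketch: the log line is invented for the example, and the check on groupindex only mimics the “at least one symbolic group” requirement rather than reproducing lograptor’s own validation.

```python
import re

# The Mail_Resent rule from the text, compiled as a plain Python regex.
RULE = re.compile(
    r": (?P<thread>[A-Z,0-9]{9,14}): resent-message-id=<(?P<reason>.+)>"
)

# A rule without symbolic groups would have an empty groupindex mapping.
has_symbolic_group = bool(RULE.groupindex)

# Hypothetical log line, made up for this illustration.
line = ("Jan  1 10:00:00 mail postfix/cleanup[123]: 4F9D195432C: "
        "resent-message-id=<abc@example.com>")

match = RULE.search(line)
fields = match.groupdict() if match else {}
print(fields)
```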

Pattern Rules and Filters

An app pattern rule can also contain variables ($VARNAME or ${VARNAME}) related to a lograptor filter. At run time each variable is replaced with the corresponding filter’s pattern. This feature makes sense when you pair a variable with a symbolic group, as in this example:

Mail_Client = ": (?P<thread>[A-Z,0-9]{9,14}): client=(?P<client>${client})"

If you use filter options, the program discards the rules logically excluded by the filters (unused rules).
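The substitution of a filter variable can be sketched with string.Template, which understands the same $VARNAME and ${VARNAME} syntax. This is an assumption about how such a substitution might be done, not lograptor’s implementation; the filter value and the log lines are hypothetical.

```python
import re
from string import Template

# The Mail_Client rule from the text, with its ${client} variable.
RULE = r": (?P<thread>[A-Z,0-9]{9,14}): client=(?P<client>${client})"

# Hypothetical filter value, standing in for a pattern given as a filter option.
filters = {"client": r"mx\.example\.com\[[0-9.]+\]"}

# safe_substitute leaves any other "$" in the regex (e.g. an end anchor) untouched.
regex = re.compile(Template(RULE).safe_substitute(filters))

lines = [
    "postfix/smtpd[101]: 4F9D195432C: client=mx.example.com[192.0.2.10]",
    "postfix/smtpd[102]: 5A1B2C3D4E5: client=other.example.net[198.51.100.7]",
]
matched = [m.group("client") for m in map(regex.search, lines) if m]
print(matched)
```

Only the first line is selected, because the symbolic group “client” now carries the filter’s pattern.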

Dictionary of Results

Each rule produces a table of results as a Python dictionary. This dictionary has tuples as keys and integers as values. The values record the number of events associated with each tuple. For example with the following rule:

Mail_Received = ": (?P<thread>[A-Z,0-9]{9,14}): from=<(?P<from>${from})>, size=(?P<size>\d+)"

the tuple key consists of three elements, positionally related to fields <hostname>, <from> and <size>:

('smtp.example.com', 'postmaster@example.com', '4827')

Of course, adding more symbolic groups increases the complexity of the results and the number of elements in the dictionary. If you don’t need the details, you can simplify the default pattern rules.
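The shape of such a dictionary, and the effect of dropping a symbolic group, can be shown with collections.Counter. The events below are invented for the example, and the sketch says nothing about how lograptor builds the keys internally.

```python
from collections import Counter

# Hypothetical extracted events, one tuple per matched log line.
events = [
    ("smtp.example.com", "postmaster@example.com", "4827"),
    ("smtp.example.com", "postmaster@example.com", "4827"),
    ("smtp.example.com", "postmaster@example.com", "5100"),
    ("smtp.example.com", "alice@example.com", "1200"),
]

# Tuples as keys, number of events as values.
results = Counter(events)

# Without the last field (size) the table has fewer, larger entries.
simplified = Counter(key[:2] for key in events)

print(len(results), len(simplified))
```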

Order of Pattern Rules

The sequence of the rules in the configuration also determines the order of execution during log analysis. The order is important for reducing total execution time. It is generally better to put first the rules that match the most numerous log lines.
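The effect of rule order can be estimated by counting regex attempts. The sketch below assumes that processing of a line stops at the first matching rule, which is an assumption made for the illustration; the rule names and log lines are hypothetical.

```python
import re

def count_attempts(rules, lines):
    """Count regex tests needed when each line stops at its first matching rule."""
    attempts = 0
    for line in lines:
        for _name, regex in rules:
            attempts += 1
            if regex.search(line):
                break
    return attempts

common = ("Common", re.compile(r": status=(?P<status>\w+)"))
rare = ("Rare", re.compile(r": warning: (?P<reason>.+)"))

# Nine frequent lines and one infrequent line.
lines = ["x: status=sent"] * 9 + ["x: warning: disk full"]

frequent_first = count_attempts([common, rare], lines)
rare_first = count_attempts([rare, common], lines)
print(frequent_first, rare_first)
```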

Writing Pattern Rules

A simple method for writing new pattern rules is to use the lograptor unparsed engine for each application to verify which lines are not matched by any pattern rule, e.g.:

# lograptor -a dovecot --unparsed -m 1 /var/log/dovecot.log
...
...

If the search result is not empty, write a new detailed rule and refine it until it matches and the line disappears from the output of the above command. Repeat these steps until lograptor no longer finds any unparsed line in your file.

With this technique you can write all the report rules for an application in a few minutes.
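The same loop can be rehearsed offline on a handful of sample lines before editing the configuration. The helper below is only an illustration of the idea and not lograptor’s unparsed engine; the rule names, patterns and log lines are hypothetical.

```python
import re

def unparsed_lines(lines, rules):
    """Return the lines not matched by any rule (illustrative helper)."""
    compiled = [re.compile(pattern) for pattern in rules.values()]
    return [line for line in lines
            if not any(regex.search(line) for regex in compiled)]

lines = [
    "dovecot: imap-login: Login: user=<alice>, method=PLAIN",
    "dovecot: imap(alice): Disconnected: Logged out",
]

rules = {"Login": r": Login: user=<(?P<user>[^>]+)>"}
before = unparsed_lines(lines, rules)

# Add a rule for the remaining line and check again.
rules["Disconnected"] = r": Disconnected: (?P<reason>.+)"
after = unparsed_lines(lines, rules)
print(before, after)
```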

REPORT DATA SECTIONS

Additional configuration sections define the data elements for composing the report. These sections have some mandatory options and one or more options that define the usage of application’s pattern rules.

Mandatory Options

subreport
Indicates the subreport in which to insert the element. It must match the name of one of the subreports specified in the main configuration file.
title
Header to be included in the report.
color
Color to be used for the header (use the names or the codes defined by the HTML and CSS specifications).
function

Function to apply to the results extracted from the pattern rules of the application. Three functions are available, each leading to a different representation of the results:

total(), total
Creates lists with total values from the results.
top(<num>, <header>)

Creates a ranking of maximum values.

The <num> parameter is a positive integer indicating how many maximum values to take into account. The <header> parameter is a description for the field, which appears in the right column of a two-column table.

table(<header 1>, .. <header K>)

Creates a table from a result set.

The arguments are the descriptions to include in the headers of the table. The number of arguments determines the number of columns of the table. These tables, even when generated from the logs of different applications, are merged into a single table under specific conditions. For this topic see the Report Optimization section.
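One plausible reading of the total and top functions, applied to a dictionary of results, is sketched below. The helper names total_events and top_values are hypothetical, the data is invented, and the sketch does not claim to reproduce the layout that lograptor produces in a report.

```python
from collections import Counter

# Hypothetical results: (user,) tuples as keys, number of events as values.
results = Counter({("alice",): 12, ("bob",): 7, ("carol",): 3, ("dave",): 1})

def total_events(results):
    """Sum of all event counts (one reading of 'total')."""
    return sum(results.values())

def top_values(results, num):
    """The num keys with the highest counts, highest first (one reading of 'top')."""
    return sorted(results.items(), key=lambda item: item[1], reverse=True)[:num]

print(total_events(results))
print(top_values(results, 2))
```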

Report Optimization

The program automatically merges tables produced from the logs of different applications when the tables belong to the same subreport. Tables are merged when their titles and headers match exactly. Headers are compared by name, total number and position. This feature is useful, for example, if you want to produce a single table with all user logins. The resulting reports are smaller and more readable.
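The merging condition can be expressed as a grouping key made of subreport, title and the ordered list of headers. The following sketch uses a hypothetical dictionary layout for tables and is not taken from lograptor’s code.

```python
def merge_tables(tables):
    """Merge tables that share subreport, title and identical ordered headers."""
    merged = {}
    for table in tables:
        # Comparing header tuples checks names, number and position at once.
        key = (table["subreport"], table["title"], tuple(table["headers"]))
        if key not in merged:
            merged[key] = {
                "subreport": table["subreport"],
                "title": table["title"],
                "headers": list(table["headers"]),
                "rows": [],
            }
        merged[key]["rows"].extend(table["rows"])
    return list(merged.values())

# Hypothetical tables produced from the logs of two applications.
tables = [
    {"subreport": "logins", "title": "User logins",
     "headers": ["User", "Host"], "rows": [("alice", "imap.example.com")]},
    {"subreport": "logins", "title": "User logins",
     "headers": ["User", "Host"], "rows": [("bob", "ssh.example.com")]},
    {"subreport": "logins", "title": "User logins",
     "headers": ["Host", "User"], "rows": [("ftp.example.com", "carol")]},
]
result = merge_tables(tables)
print(len(result))
```

The third table stays separate because its headers appear in a different position, even though the names are the same.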

COMMENTS

Lines starting with “#” or “;” are ignored and may be used to provide comments.

AUTHORS

Davide Brunato <brunato@sissa.it>