Logweeder is a log analyzer. It scans UN*X logs (syslog), categorizes common events, and discovers uncommon events automatically.
In most UN*X systems, syslog records one event per line.
Logweeder scans syslog output (e.g. /var/log/messages) and
identifies "uncommon" events. To detect uncommon entries, the program
first learns "common" log patterns from existing log files. It does
this by grouping similar log entries and generalizing each group
into a regular expression pattern. The generated patterns can then
be used to classify log entries that have a known type. A log
entry that is not categorized as any known type is possibly an
uncommon entry.
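The grouping-and-generalizing idea can be illustrated with a small sketch. This is not Logweeder's actual algorithm: the Jaccard similarity on whitespace tokens, the greedy grouping, the `threshold` value, and all function names below are assumptions chosen only to make the idea concrete.

```python
import re

def tokens(line):
    """Split a log entry into whitespace-separated tokens."""
    return line.split()

def similarity(a, b):
    """Jaccard similarity between the token sets of two entries (assumed metric)."""
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

def cluster(lines, threshold=0.4):
    """Greedy grouping: add each entry to the first group whose first
    member is similar enough, otherwise start a new group."""
    groups = []
    for line in lines:
        for group in groups:
            if similarity(group[0], line) >= threshold:
                group.append(line)
                break
        else:
            groups.append([line])
    return groups

def generalize(group):
    """Build one regex for a group: keep tokens shared by all members and
    replace differing positions with a wildcard. This simplified version
    only handles groups whose members have the same number of tokens."""
    rows = [tokens(line) for line in group]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("members differ in token count")
    parts = []
    for column in zip(*rows):
        if len(set(column)) == 1:
            parts.append(re.escape(column[0]))
        else:
            parts.append(r"\S+")
    return "^" + r"\s+".join(parts) + "$"
```

Applied to two "authenticated ... request" entries and one "export request" entry, this sketch would put the first two in one group and produce a regex with wildcards where the host, mount action, and path differ.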
Download: logweeder-0.2.tar.gz (gzipped tar, 6kbytes)
Logweeder consists of two programs:
learn.py - analyzes log files and generates regular expression patterns.
match.py - takes the regexp patterns and applies them to new logs.
First, run learn.py against log files to
learn regexp patterns. The generated regexp patterns are written
to the standard output.
$ ./learn.py /var/log/messages > mypattern
Clustering.++..+.......
The generated file mypattern should look like this:
('type-0', 8, '^mango\\ rpc\\.mountd\\:\\ authenticated\\ [a-zA-Z_]*\\ request\\ from\\ [a-zA-Z_]*\\.xx\\.xxx\\.xxx\\:[0-9]*\\ for\\ .*\\/[a-zA-Z_]*\\ \\(.*\\/[a-zA-Z_]*\\)', ['authenticated', 'for', 'xxx', 'request', 'xx', 'mountd', 'rpc', 'mango', 'from'])
# mango rpc.mountd: authenticated unmount request from kiwi.xx.xxx.xxx:686 for /data (/data)
# mango rpc.mountd: authenticated unmount request from kiwi.xx.xxx.xxx:689 for /home (/home)
# mango rpc.mountd: authenticated mount request from banana.xx.xxx.xxx:613 for /home (/home)
# mango rpc.mountd: authenticated mount request from kiwi.xx.xxx.xxx:697 for /home (/home)
# mango rpc.mountd: authenticated mount request from kiwi.xx.xxx.xxx:708 for /data (/data)
# mango rpc.mountd: authenticated mount request from grape.xx.xxx.xxx:1023 for /usr/local (/usr/local)
# mango rpc.mountd: authenticated unmount request from banana.xx.xxx.xxx:880 for /home (/home)
# mango rpc.mountd: authenticated mount request from grape.xx.xxx.xxx:1023 for /usr/local (/usr/local)
('type-1', 3, '^mango\\ kernel\\:\\ Packet\\ log\\:\\ input\\ REJECT\\ eth1\\ PROTO\\=[0-9]*\\ [0-9]*\\.[0-9]*\\.[0-9]*\\.[0-9]*\\:[0-9]*\\ 192\\.168\\.0\\.61\\:[0-9]*\\ L\\=[0-9]*\\ S\\=0x00\\ I\\=[0-9]*\\ F\\=0x4000\\ T\\=.*\\ \\(\\#[0-9]*\\)', ['kernel', '0x4000', 'eth1', 'log', 'PROTO', '0x00', 'input', 'F', 'I', 'L', 'S', 'T', 'Packet', 'mango', 'REJECT'])
# mango kernel: Packet log: input REJECT eth1 PROTO=6 128.122.80.107:61887 192.168.0.61:113 L=48 S=0x00 I=4036 F=0x4000 T=63 SYN (#10)
# mango kernel: Packet log: input REJECT eth1 PROTO=17 128.105.143.14:41385 192.168.0.61:9618 L=64 S=0x00 I=0 F=0x4000 T=53 (#11)
# mango kernel: Packet log: input REJECT eth1 PROTO=6 133.15.94.103:33271 192.168.0.61:113 L=60 S=0x00 I=44595 F=0x4000 T=48 SYN (#10)
('type-2', 1, '^mango\\ rpc\\.mountd\\:\\ export\\ request\\ from\\ 192\\.168\\.0\\.70', ['from', 'request', 'mountd', 'rpc', 'mango', 'export'])
# mango rpc.mountd: export request from 192.168.0.70
A line that starts with a parenthesis is a pattern line.
Each pattern has a name (like 'type-0'), a frequency, and a
regular expression. The lines that follow it, each starting with '# ',
are the actual log entries that were used to generate the pattern.
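Since each pattern line in the sample looks like a Python tuple literal, it can be read back with the standard library. The sketch below assumes exactly that layout; `parse_patterns` is a hypothetical helper, not part of Logweeder, and the meaning of the fourth field (a list of words) is not documented here, so it is simply ignored.

```python
import ast
import re

def parse_patterns(text):
    """Return a list of (name, frequency, compiled_regex) tuples.

    Lines starting with '(' are treated as pattern lines; everything
    else (such as the '# ' sample entries) is skipped.
    """
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("("):
            continue
        # Assumed layout: (name, frequency, regex, word_list).
        name, freq, regex, _words = ast.literal_eval(line)
        patterns.append((name, freq, re.compile(regex)))
    return patterns

# One pattern and its sample entry, copied from the example file above.
SAMPLE = r"""('type-2', 1, '^mango\\ rpc\\.mountd\\:\\ export\\ request\\ from\\ 192\\.168\\.0\\.70', ['from', 'request', 'mountd', 'rpc', 'mango', 'export'])
# mango rpc.mountd: export request from 192.168.0.70
"""

for name, freq, regex in parse_patterns(SAMPLE):
    print(name, freq, regex.pattern)
```

`ast.literal_eval` is used rather than `eval` so that a pattern file cannot execute arbitrary code.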
Now you can classify each log entry with the obtained patterns:
$ ./match.py mypattern /var/log/messages
type-0: Jan 31 06:17:27 mango rpc.mountd: authenticated unmount request from kiwi.xx.xxx.xxx:686 for /data (/data)
type-1: Jan 31 06:35:34 mango kernel: Packet log: input REJECT eth1 PROTO=6 128.122.80.107:61887 192.168.0.61:113 L=48 S=0x00 I=4036 F=0x4000 T=63 SYN (#10)
type-0: Feb 4 06:18:42 mango rpc.mountd: authenticated unmount request from kiwi.xx.xxx.xxx:689 for /home (/home)
type-0: Feb 4 06:20:02 mango rpc.mountd: authenticated mount request from banana.xx.xxx.xxx:613 for /home (/home)
unknown: Feb 5 06:20:51 mango rpc.mountd: export request from 192.168.0.70
type-0: Feb 5 06:21:01 mango rpc.mountd: authenticated mount request from kiwi.xx.xxx.xxx:697 for /home (/home)
...
The name of each entry is displayed at the beginning of the line. An entry that does not match any known pattern is labeled 'unknown'.
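The classification step can be sketched in a few lines of Python. This is an illustration, not the code of match.py: `classify` and `PATTERNS` are hypothetical names, and the sketch assumes that the first 16 characters of each line (the syslog timestamp and the space after it) are skipped before matching, since the patterns above start at the host name.

```python
import re

# The 'type-0' regular expression from the sample pattern file, copied by hand.
PATTERNS = [
    ("type-0", re.compile(
        r"^mango\ rpc\.mountd\:\ authenticated\ [a-zA-Z_]*\ request\ from\ "
        r"[a-zA-Z_]*\.xx\.xxx\.xxx\:[0-9]*\ for\ .*\/[a-zA-Z_]*\ \(.*\/[a-zA-Z_]*\)"
    )),
]

def classify(line, patterns=PATTERNS, charskip=16):
    """Return the name of the first pattern that matches, or 'unknown'.

    charskip is the assumed number of leading characters to ignore.
    """
    body = line[charskip:]
    for name, regex in patterns:
        if regex.match(body):
            return name
    return "unknown"

entry = ("Jan 31 06:17:27 mango rpc.mountd: authenticated unmount request "
         "from kiwi.xx.xxx.xxx:686 for /data (/data)")
print("%s: %s" % (classify(entry), entry))
```

Here the first matching pattern wins; whether match.py resolves multiple matches the same way is not stated in this document.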
You can pass this output to other postprocessing programs such as grep, awk, etc.
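As an alternative to grep or awk, the labeled output can be summarized with a short Python filter. The sketch assumes only the "label: entry" line format shown in the example output; `summarize` is a hypothetical helper, not something shipped with Logweeder.

```python
from collections import Counter

def summarize(lines):
    """Count entries per label and collect the entries labeled 'unknown'.

    Each line is expected to look like 'label: original log entry';
    lines without a ': ' separator are ignored.
    """
    counts = Counter()
    unknown = []
    for line in lines:
        label, sep, entry = line.rstrip("\n").partition(": ")
        if not sep:
            continue
        counts[label] += 1
        if label == "unknown":
            unknown.append(entry)
    return counts, unknown

output = [
    "type-0: Feb 4 06:20:02 mango rpc.mountd: authenticated mount request from banana.xx.xxx.xxx:613 for /home (/home)",
    "unknown: Feb 5 06:20:51 mango rpc.mountd: export request from 192.168.0.70",
    "type-0: Feb 5 06:21:01 mango rpc.mountd: authenticated mount request from kiwi.xx.xxx.xxx:697 for /home (/home)",
]
counts, unknown = summarize(output)
for label, n in counts.most_common():
    print(label, n)
```

In practice the same function could be fed `sys.stdin` so that the output of match.py is piped straight into it.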
Synopsis:
$ learn.py [options] [logfile1 logfile2 ...]
learn.py takes zero or more log files as arguments.
When no filename is specified, it reads logs from the standard input.
Options:
-c charskip
-n num_samples
NOTICE: Learning takes O(n^2) time, where n is the number of log entries.
-p pattern
-t similarity_threshold
-v
-q
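To get a feel for the O(n^2) cost noted above, the sketch below counts how many pairwise comparisons n entries would need, assuming the quadratic cost comes from comparing every pair of entries. It also shows one generic way to cap the input by random sampling. Both helpers are hypothetical and do not describe how learn.py handles its options internally.

```python
import random

def pair_count(n):
    """Number of distinct pairs among n entries: n * (n - 1) / 2."""
    return n * (n - 1) // 2

def sample_entries(lines, num_samples, seed=0):
    """Return at most num_samples entries, chosen at random without replacement."""
    lines = list(lines)
    if len(lines) <= num_samples:
        return lines
    rng = random.Random(seed)  # fixed seed so repeated runs pick the same sample
    return rng.sample(lines, num_samples)

for n in (1000, 10000):
    print(n, "entries ->", pair_count(n), "pairs")
```

Ten times as many entries means roughly a hundred times as many pairs, which is why learning from a sample of a large log is usually much faster than learning from the whole file.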
Synopsis:
$ match.py [options] pattern_file [logfile1 logfile2 ...]
match.py takes exactly one pattern file and zero or more log files as arguments.
When no log file is specified, it reads logs from the standard input.
Options:
-c charskip
learn.py
.
The default value is 16.
-t freq_threshold
unknown
".
(under construction)
(This is the so-called MIT/X license.)
Copyright (c) 2007 Yusuke Shinyama <yusuke at cs dot nyu dot edu>
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.