Re-worked Simple WATCHdog.


NAME

plw - Perl Log Watcher


SYNOPSIS

plw [--config-file file] [--debug-level integer] [--restart-time time] [--examine] [--dont-get-state] [--help] [--version]


DESCRIPTION

PLW is designed to monitor system activity by watching log files, command output and network ports. It requires a configuration file that contains global settings, the pattern(s) to look for, and the action(s) to perform when each pattern is found.

Below, PLW means the project and plw means the perl script.


REQUIREMENTS

XML::Twig is always required. The other perl modules listed below are needed only if the configuration uses the corresponding feature. For example, Term::ANSIColor is not needed unless the echo_message action is used.

XML::Twig

To parse the configuration file. See http://xmltwig.com/xmltwig/.

Time::HiRes

Used by the ring_bell action to time the delay between rings.

Date::Manip

Used by <throttle>, <threshold> and <when> when --examine is specified on the command line.

Term::ANSIColor

Used by the echo_message action to send output to the console, when a color mode is specified.

File::Tail and Mail::Sendmail

Optionally, these can be used instead of the tail and sendmail binaries.


COMMAND LINE OPTIONS

--config-file=filename or -c filename

Tells plw where to find its configuration file. The default is /etc/plw.xml. This is useful for users who don't have root, and so cannot edit /etc/plw.xml, but still want to monitor a log file. For example, Oracle DBAs may want to monitor TNS listener log files.

--debug-level=integer

Tells plw to log additional information useful for debugging. Higher numbers yield more output. The default is 0.

--restart-time=[+]hh:mm[am|pm] or -r [+]hh:mm[am|pm]

Restart at the specified time, where hh is hours and mm is minutes. If the am/pm indicator is omitted, a 24-hour clock is assumed. If the time is preceded by the "+" character, the restart time is set to the current time plus the specified time, and the am/pm indicator is ignored. The default is not to restart.
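
As an illustration of the time format above, the following Python sketch computes the next restart moment from a [+]hh:mm[am|pm] string. It is not plw's own code (plw is written in perl). The name next_restart, the treatment of 12am and 12pm, and the rule that a time already past today means tomorrow are all assumptions made for the example.

```python
import re
from datetime import datetime, timedelta

# Hypothetical helper for illustration; not taken from the plw source.
TIME_SPEC = re.compile(r"^(\+)?(\d{1,2}):(\d{2})(am|pm)?$", re.IGNORECASE)

def next_restart(spec, now):
    """Return the datetime of the next restart for a [+]hh:mm[am|pm] spec."""
    match = TIME_SPEC.match(spec.strip())
    if match is None:
        raise ValueError("expected [+]hh:mm[am|pm], got %r" % spec)
    relative = match.group(1) is not None
    hours, minutes = int(match.group(2)), int(match.group(3))
    ampm = (match.group(4) or "").lower()
    if relative:
        # "+" form: add the offset to the current time and ignore am/pm.
        return now + timedelta(hours=hours, minutes=minutes)
    if ampm:
        # Assumption: 12am means midnight and 12pm means noon.
        hours = hours % 12 + (12 if ampm == "pm" else 0)
    target = now.replace(hour=hours, minute=minutes, second=0, microsecond=0)
    if target <= now:
        # Assumption: a time that has already passed today means tomorrow.
        target += timedelta(days=1)
    return target
```

For instance, at 10:00 a spec of +12:00 would give 22:00 the same day, while 01:30 would give 01:30 on the following day.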

--examine

This causes plw to run a single pass through the configuration file. Each <file> and <pipe> source is processed in full once, rather than being monitored.

--dont-get-state

Tells plw not to load saved state for threshold values.

--help

Prints usage information and exits.

--version or -V

Prints version information and exits.


THE CONFIGURATION FILE

The configuration file tells plw which patterns to look for and which action(s) to take when a pattern is matched. Global configuration values come first, followed by log sections, each of which defines a file to watch and one or more groups of patterns and actions.

  <plw>
    <global>....</global>
    <log>....</log>
  </plw>

GLOBAL SETTINGS

hostname

Defines the hostname used in email headers. It defaults to the value returned by the hostname() function from Sys::Hostname.

  <hostname>server.example.com</hostname>

pid_file

Defines where to write the process id of the main (or only) plw process. Defaults to no pid file.

  <pid_file>/var/run/plw</pid_file>

clean_up_pid_file

Delete the pid file on exit, if possible.

  <clean_up_pid_file/>

daemon

If this is specified in the global section, the process will run in the background. On Linux systems, plw is best run from inittab via a respawn entry like sw:2345:respawn:/usr/local/sbin/plw >/var/log/plw.log 2>&1, so <daemon> shouldn't be specified in that case. Likewise, don't specify <daemon> when running plw via launchd on Mac OS. If you are running from a command line, <daemon> is useful since plw will disconnect from the terminal and will not exit when the terminal does.

  <daemon/>

debug_level

Determines how much output, if any, should be written to standard error or to <log_file>. Higher numbers yield more output.

  <debug_level>9</debug_level>

log_file

Specify a log file to write to instead of standard error, when debug_level is set.

  <log_file>/var/log/plw.log</log_file>

select, fork or thread

One of these must be specified. Select uses operating system calls to block until a file is ready to be read. Fork spawns a process for each log file being watched. Thread creates a thread for each log file; this requires a perl binary built with threading. For low log file volume, use select.

  <select/>

use_cpan_file_tail, or tail_cmd and tail_args

Either <use_cpan_file_tail> or <tail_cmd> must be given. <use_cpan_file_tail> requires the File::Tail module, which may or may not be adequate for the log volume. To use an external tail program, specify the path to tail in <tail_cmd> and, if necessary, any arguments in <tail_args> (defaults to -n 0 -f).

  <tail_cmd>/usr/bin/tail</tail_cmd>

use_cpan_mail_sendmail, or sendmail_cmd and sendmail_args

Specify either <use_cpan_mail_sendmail> or <sendmail_cmd>; otherwise mail will be sent via /usr/sbin/sendmail. <use_cpan_mail_sendmail> requires the Mail::Sendmail module. To use an external sendmail program other than the default, specify the path to sendmail in <sendmail_cmd> and, if necessary, any arguments in <sendmail_args> (defaults to -oi -t).

  <sendmail_cmd>/usr/bin/sendmail</sendmail_cmd>

nrpe_port

Give a number from 1 to 65535 to use for communication with the included Nagios plugin check_plw.pl. Defaults to port 17 (TCP). plw only listens on localhost.

write_cmd

Give the path to the command used by write_message. Defaults to /usr/bin/write. The command must accept one argument on the command line, the userid to write to, and must read the message from standard input. See write(1).

  <write_cmd>/usr/bin/local/write</write_cmd>

state_file

Location for a database to record the number of matches for thresholds and throttles. Defaults to /var/tmp/plw.db.

  <state_file>/var/db/plw.db</state_file>

restart_time

Sets an alarm for plw, at which plw restarts. The format is either HH:MM, on a 24-hour clock, to restart at the specified time (01:01 to restart at 1:01 AM, for example), or +HH:MM to restart after the given number of hours and minutes (+12:00 to restart every 12 hours).

  <restart_time>01:30</restart_time>

iptables or ipfw or pfctl or ipf

Selects the local firewall to which drop rules are added: iptables on Linux, ipfw on Mac OS X and FreeBSD, pfctl on OpenBSD based systems, or ipf on Solaris based systems. For example, I have iptables -N plw_rejects and iptables -I INPUT -j plw_rejects in a startup script on Linux systems, with failed root logins causing the source IP to get added to plw_rejects. This blocks additional connection attempts, which has reduced the number of brute force login attempts the systems get.

for Linux:

  <iptables cmd="/sbin/iptables" table="plw_rejects"/>

for Mac OS X:

  <ipfw cmd="/sbin/ipfw" set="1" start="30000" stop="60000"/>

for OpenBSD and FreeBSD:

  <pfctl cmd="/sbin/pfctl" table="plw_rejects"/>

for Solaris and NetBSD:

  <ipf cmd="/sbin/ipf" file="/var/local/plw_rejects.ipf"/>

extra_include_dirs

Paths to add to @INC for extra_modules.

  <extra_include_dirs>
    /usr/local/lib/perl5/site_perl
  </extra_include_dirs>

extra_modules

Any additional modules that may be needed by <perl> sections. Note that this currently crashes perl (v5.8.5 built for x86_64-linux-thread-multi) on at least Redhat Enterprise 4, so don't use it.

  <extra_modules>DBD::Oracle</extra_modules>

LOG SECTIONS

Each <log> must begin with a <file>, <pipe>, <port_data> or <port_drop> entry specifying what to watch. A <watchfor> section is then required, optionally followed by an <ignore> section. Each <watchfor> section holds one or more <group>s of <patterns> and <actions>. The <ignore> section suppresses <actions> for lines that match its patterns. This keeps an internal network from causing actions, for example.

  <log>
    <file>/var/log/messages</file>
    <watchfor>
      <group>
        <patterns>....</patterns>
        <actions>....</actions>
      </group>
    </watchfor>
  </log>

file

Follow the contents of the given file (or named pipe), using the method specified in the global settings.

  <file>/path/to/file</file>

pipe

A <pipe> tells plw to run the given command and read input from it. For a named pipe in the filesystem, just use <file>.

  <pipe>/path/to/executable</pipe>

port_data

This allows listening on a port, then taking actions based on the data coming from the port. The configuration for the listener is given in a <listen> section. <listen> must have port and address attributes. The protocol attribute is optional and defaults to tcp. The address can be given as * for all interfaces, or as a hostname or IP address. The port must be an integer from 0 to 65535.

  <port_data><listen port="10000" address="*"/></port_data>

port_drop

This allows listening on a port, then taking actions based on the connection information. For example, this can be used to listen on a port that no one should be connecting to, and then to block any system that does connect via <block_at_firewall>. This makes port scans a bit more difficult, and stops connections from a system that appears to be attempting to exploit a vulnerability. The <listen> section is the same as for port_data.

  <port_drop>
    <listen port="10000" protocol="tcp" address="192.168.2.2"/>
  </port_drop>

PATTERNS

A <patterns> section is made up of one or more <pattern> entries, which are perl regular expressions. When a <pattern> is matched, the <actions> in the <group> are performed.

  <patterns>
    <pattern>: Failed password for root</pattern>
  </patterns>

ACTIONS

threshold

Require a number of matches in a time period before actions are taken. The repeat attribute is a boolean: "no", "off", "0" and "false" all mean no, and any other value means the default of "yes". With "yes", the counter is reset to 0 after the number of events has been met, so actions are not taken for other events within the time period. With "no", actions continue to be taken for other events during the time period. The events and seconds attributes are both required, and must be integers greater than 1.

  <threshold repeat="no" events="2" seconds="60"/>
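
One possible reading of the threshold rules can be sketched in Python. This is only an interpretation of the description above, not plw's perl implementation: the class name Threshold, the sliding time window, and the exact moment at which the counter is reset are assumptions.

```python
from collections import deque

class Threshold:
    """Hypothetical model: act only after `events` matches within `seconds`."""

    def __init__(self, events, seconds, repeat=True):
        self.events = events
        self.seconds = seconds
        self.repeat = repeat
        self.stamps = deque()  # times of recent matches, oldest first

    def hit(self, now):
        """Record a match at time `now` (in seconds); True means run actions."""
        self.stamps.append(now)
        # Forget matches that fall outside the time period.
        while now - self.stamps[0] > self.seconds:
            self.stamps.popleft()
        if len(self.stamps) < self.events:
            return False
        if self.repeat:
            # Default behaviour: reset the counter to 0 once it is met.
            self.stamps.clear()
        return True
```

Under these assumptions, with events="2" and seconds="60", matches at 0, 10, 20 and 30 seconds trigger actions on the second and fourth match when repeat is "yes", and on every match from the second onward when repeat is "no".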

throttle

Use this action to limit the number of times that actions are performed for the matched pattern. The effect is similar to threshold in that it reduces the number of times other actions are done, but with throttle the actions will be done at least once. The events and seconds attributes are both required, and must be integers greater than 1.

  <throttle events="2" seconds="60"/>

send_mail

Send e-mail containing the matched lines, as they appear, to one or more addresses. A <subject> and an <addresses> section must be specified; <hostname_in_subject> is optional. An <addresses> section has one or more <address> entries, which are standard email addresses.

  <send_mail>
    <subject>Failed Root Login</subject>
    <hostname_in_subject/>
    <addresses>
      <address>root@localhost</address>
    </addresses>
  </send_mail>

save_for_nrpe

Save basic information from this event for reporting to Nagios via the included check_plw.pl Nagios plugin. Specify the name to report to Nagios. plw will increment a counter each time this action happens, and check_plw.pl will cause the counter to reset to 0. Keep the label less than 32 characters.

  <save_for_nrpe>FailedLogin</save_for_nrpe>

block_at_firewall

Requires an <iptables>, <ipfw>, <pfctl> or <ipf> entry in the global configuration to indicate which firewall rule to update. Optionally include a remove_in attribute with the number of minutes to wait before automatically removing the IP address from the firewall.

  <block_at_firewall remove_in="30"/>

send_to_pipe

Send the full line that was matched to <command> through a pipe. A keep_open="true" attribute indicates that the program should not be closed after each call. If escape="true" is set in <command>, shell meta characters in the line are escaped with a \, as in exec_command.

  <send_to_pipe keep_open="true">
    <command>/usr/local/sbin/process</command>
  </send_to_pipe>

exec_command

Must have a <command> to execute. The command may contain variables, which are substituted with fields from the matched line. A $N will be replaced by the Nth field in the line. A $0 or $* will be replaced by the entire line. Shell meta characters ([]()<>$) are escaped when escape="true" is added to <command>.

  <exec_command>
    <command escape="true">/usr/local/bin/process</command>
  </exec_command>
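
The substitution and escaping rules can be illustrated with a short Python sketch. It is not plw's perl code: the names expand_command and escape_meta are made up, and splitting fields on whitespace and expanding a missing field to nothing are assumptions.

```python
import re

SHELL_META = "[]()<>$"  # the characters listed in this document

def escape_meta(text):
    """Put a backslash in front of each listed shell meta character."""
    return "".join("\\" + ch if ch in SHELL_META else ch for ch in text)

def expand_command(command, line, escape=False):
    """Replace $N, $0 and $* in command with parts of the matched line."""
    fields = line.split()  # assumption: fields are separated by whitespace

    def substitute(match):
        token = match.group(1)
        if token == "*" or int(token) == 0:
            value = line  # $0 and $* stand for the entire line
        else:
            index = int(token)
            # Assumption: a field number past the end expands to nothing.
            value = fields[index - 1] if index <= len(fields) else ""
        return escape_meta(value) if escape else value

    return re.sub(r"\$(\d+|\*)", substitute, command)
```

With the line "Aug 31 sshd[11314]: Failed", the hypothetical command "/usr/local/bin/process $3" would expand to "/usr/local/bin/process sshd[11314]:", and with escaping turned on the brackets would each gain a leading backslash.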

write_message

Use the command defined by <write_cmd> to send matched lines to <users>, with an optional <message> header. <message> can have $N notation like exec_command. The full line that matched is sent if <message> isn't given. This uses the same routine as <send_to_pipe>, so it will cause any open process held by <send_to_pipe> to exit.

  <write_message>
    <message>alert: $0</message>
    <users>
      <user>root</user>
    </users>
  </write_message>

echo_message [mode="mode"]

Echo the matched line, or an optional <message>. The message can have $N notation like exec_command. The text mode may be normal, bold, underscore, blink, inverse, black, red, green, yellow, blue, magenta, cyan, white, black_h, red_h, green_h, yellow_h, blue_h, magenta_h, cyan_h, and/or white_h. The _h colors specify a highlighting color; the other colors set the color of the text itself. Some modes may not work on some terminals. The default is no highlighting. Don't use this with the <daemon> global option, since there is nothing to output to.

  <echo_message mode="red"><message>Hello World</message></echo_message>

ring_bell

Echo the matched line, and send a bell. The rings attribute sets how many times the bell is sent (default = 1), and the optional delay attribute sets the gap between rings. Like echo_message, this won't do much in daemon mode.

  <ring_bell rings="3" delay="0.5"/>

continue

Use this action to cause plw to continue to try to match other pattern/action groups after it is done with the current pattern/action block.

  <continue/>

quit

Use this action to cause the plw process to clean up and quit immediately. Currently, this causes the exit of a child process in <fork/> mode. The main process in <fork/> mode starts a new child process, which continues monitoring.

  <quit/>

SPECIAL OPTION

The following may be used as an option for any of the above actions except for throttle.

when

Use this option to specify windows of time and days when the action can be performed. The format is day_of_week:hour_of_day. Use * for any, and - to separate a range. For example:

  <ring_bell rings="3" delay="0.5">
    <when>*:8-17</when>
  </ring_bell>

is any day of the week from 8AM to 5PM. Overnight rules can run from PM to AM, like 17-9. For days, 0 is Sunday and 6 is Saturday. For hours, use 24-hour notation, so 0 is midnight and 23 is 11PM.
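
The window format can be checked with a few lines of Python. This sketch is illustrative only and is not plw's perl code: the names when_allows and in_range are made up, and treating both ends of a range as inclusive is an assumption.

```python
from datetime import datetime

def in_range(spec, value):
    """True if value fits spec: '*', a single number, or a range 'a-b'."""
    if spec == "*":
        return True
    if "-" in spec:
        low, high = (int(part) for part in spec.split("-", 1))
        if low <= high:
            return low <= value <= high
        # A wrapped range such as 17-9 covers the overnight hours.
        return value >= low or value <= high
    return value == int(spec)

def when_allows(when, moment):
    """Check a day_of_week:hour_of_day window against a datetime."""
    day_spec, hour_spec = when.split(":", 1)
    # Python counts Monday as 0; this document counts Sunday as 0.
    day = (moment.weekday() + 1) % 7
    return in_range(day_spec, day) and in_range(hour_spec, moment.hour)
```

For example, when_allows("*:17-9", datetime(2007, 8, 5, 23, 0)) is true because 23 falls inside the wrapped range, while the same window at noon is false.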


IGNORE SECTION

ignore

Ignore tells plw not to perform any actions for this log in specific instances. Those instances can be defined by a regular expression, or by perl statements that get eval-ed with the matched line as input.

  <ignore>
    <pattern>192\.168\.1\.\d+</pattern>
  </ignore>
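
How <ignore> and <watchfor> patterns combine can be sketched in Python as follows. The function name should_act is made up, Python's re module stands in for perl regular expressions (the two differ in some details), and checking the ignore patterns first is an assumption about ordering that does not change the result.

```python
import re

def should_act(line, patterns, ignores):
    """True if line matches a watched pattern and no ignore pattern."""
    if any(re.search(pattern, line) for pattern in ignores):
        return False
    return any(re.search(pattern, line) for pattern in patterns)

patterns = [r": Failed password for root"]
ignores = [r"192\.168\.1\.\d+"]
outside = "sshd[11314]: Failed password for root from 209.151.135.103 port 46740 ssh2"
inside = "sshd[11314]: Failed password for root from 192.168.1.20 port 46740 ssh2"
print(should_act(outside, patterns, ignores))
print(should_act(inside, patterns, ignores))
```

Only the first line would lead to actions; the second comes from the ignored internal network.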

CONFIGURATION EXAMPLE

This defines two watched log files. The first looks for failed root logins, then sends an email and adds the source IP address to an iptables rule. The second watches a web server access log and does the same for suspicious requests. It begins with the Document Type Definition that describes the XML structure. This can be left out, or made a separate file by replacing the <!DOCTYPE....> with something like <!DOCTYPE plw SYSTEM "http://your.server.com/path/plw.dtd">. After that, global declarations set up a pid file, indicate a log level, and tell plw how to monitor the file and where iptables is. Then each log file is defined (<log><file>), followed by what to <watchfor> in regular expression <patterns>, and the <actions> to take when a pattern is matched.

  <?xml version='1.0'?>
  <!DOCTYPE plw PUBLIC "-//PLW//DTD PLW V1.0//EN" "examples/plw.dtd">
  <plw>
    <global>
      <pid_file>/var/run/plw</pid_file>
      <debug_level>9</debug_level>
      <select/>
      <use_cpan_file_tail/>
      <iptables cmd="/sbin/iptables" table="plw_rejects"/>
    </global>
    <log>
      <file>/var/log/secure</file>
      <watchfor>
        <!--Aug 31 14:23:33 juicy sshd[11314]: Failed password
         for root from ::ffff:209.151.135.103 port 46740 ssh2-->
        <group>
          <patterns>
            <pattern>: Failed password for root</pattern>
          </patterns>
          <actions>
            <block_at_firewall/>
            <send_mail>
              <subject>Failed Root Login</subject>
              <hostname_in_subject/>
              <addresses>
                <address>root@localhost</address>
              </addresses>
            </send_mail>
          </actions>
        </group>
      </watchfor>
      <ignore>
        <pattern>192\.168\.1\.\d+</pattern>
      </ignore>
    </log>
    <log>
      <file>/var/log/httpd/access_log</file>
      <watchfor>
        <!--83.15.173.115 - - [05/Aug/2007:05:02:39 -0400]
         "POST /test/Members/user/PlonePage HTTP/1.1"
         404 16790 "http://www.test.org/Members/user/PlonePage/sendto_form"
         "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"-->
        <group>
          <patterns>
            <pattern>(?i:\/(_|twiki|ziddlywiki)|mysql|admin|http:|\.personal).*" (4(0[0-9]|1[0-7])|50[0-5])</pattern>
          </patterns>
          <actions>
            <send_mail>
              <subject>Bad web client</subject>
              <hostname_in_subject/>
              <addresses>
                <address>abrenner</address>
              </addresses>
            </send_mail>
            <block_at_firewall>
              <iptable>plw_rejects</iptable>
            </block_at_firewall>
          </actions>
        </group>
      </watchfor>
      <ignore>
        <pattern>(192\.168\.1\.\d+|Googlebot|msnbot|Yahoo\! Slurp)</pattern>
      </ignore>
    </log>
  </plw>
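
To check that a configuration is well formed and see what it watches, a file can be loaded with any XML parser. The Python sketch below uses xml.etree.ElementTree as a stand-in for XML::Twig, on a cut-down configuration embedded as a string. The function name summarize is made up, and the sketch only reads <file> sources.

```python
import xml.etree.ElementTree as ET

CONFIG = """\
<plw>
  <global>
    <select/>
    <use_cpan_file_tail/>
  </global>
  <log>
    <file>/var/log/secure</file>
    <watchfor>
      <group>
        <patterns>
          <pattern>: Failed password for root</pattern>
        </patterns>
        <actions>
          <block_at_firewall/>
        </actions>
      </group>
    </watchfor>
  </log>
</plw>
"""

def summarize(config_text):
    """Return (file, patterns, action names) for each group in each <log>."""
    root = ET.fromstring(config_text)  # raises ParseError if not well formed
    summary = []
    for log in root.findall("log"):
        source = log.find("file")
        path = source.text.strip() if source is not None else None
        for group in log.findall("watchfor/group"):
            patterns = [p.text for p in group.findall("patterns/pattern")]
            actions = group.find("actions")
            names = [a.tag for a in actions] if actions is not None else []
            summary.append((path, patterns, names))
    return summary

for entry in summarize(CONFIG):
    print(entry)
```

A mismatched or unclosed tag makes ET.fromstring raise an error, which is a quick way to catch that kind of mistake before starting plw.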

FOR PERL PROGRAMMERS ONLY

<perl>perl code</perl>

This permits you to insert Perl code into your configuration file to configure the hostname, or ignore values. The code must be eval-able.


NOTES

Upon receiving an ALRM or HUP signal, plw will re-read the configuration file and restart. plw will terminate gracefully when it receives a QUIT, TERM, or INT signal. plw will re-open the log file when it receives a USR1 signal.

The protocol used by check_plw.pl and plw itself to talk to the process (or thread) that manages the event counter is simple. The server view is:

  1. Open the TCP port and wait for connections.

  2. On a new connection read one line, which will be either 'GET' or 'PUT'.

  3. On a GET:

    1. Send one line showing the number of lines that follow.

    2. Send that number of lines, which are formatted as item:number.

    3. After each sent line, reset that item's counter in nrpe_values to 0.

  4. On a PUT:

    1. Read one line containing a value defined in <save_for_nrpe>.

    2. Increment that value in the nrpe_values dictionary.
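
The server steps above can be sketched in Python for a single connection. This is an illustration of the described exchange, not plw's perl code: the function name handle_connection, the use of text streams in place of a socket, newline line endings, and creating a counter on the first PUT are all assumptions.

```python
import io

def handle_connection(rfile, wfile, nrpe_values):
    """Serve one connection; rfile and wfile are text streams."""
    request = rfile.readline().strip()
    if request == "GET":
        items = list(nrpe_values.items())
        wfile.write("%d\n" % len(items))       # number of lines that follow
        for name, count in items:
            wfile.write("%s:%d\n" % (name, count))
            nrpe_values[name] = 0              # reset after each sent line
    elif request == "PUT":
        name = rfile.readline().strip()
        # Assumption: a label seen for the first time starts from 0.
        nrpe_values[name] = nrpe_values.get(name, 0) + 1

counters = {}
handle_connection(io.StringIO("PUT\nFailedLogin\n"), io.StringIO(), counters)
handle_connection(io.StringIO("PUT\nFailedLogin\n"), io.StringIO(), counters)
reply = io.StringIO()
handle_connection(io.StringIO("GET\n"), reply, counters)
print(reply.getvalue(), end="")
```

Two PUTs for the hypothetical label FailedLogin followed by a GET produce a count line of 1 and then FailedLogin:2, after which the counter is back at 0. A real listener would wrap the same logic around a TCP socket bound to localhost.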


AUTHOR

E. Todd Atkins - Todd.Atkins@StanfordAlumni.ORG; original author of SWATCH.

Alan Brenner - alan.brenner@ithaka.org; significant rewrite including a change to XML configuration and multiple log files in a single plw process via select, threads or forks. I've also added Nagios alerting via check_plw.pl.


BUGS

Undoubtedly there are some in here. I (Alan Brenner) have significantly modified Todd Atkins' code, and, although I run this on several systems with different configurations for each, there are parts that I have not tested much.


SEE ALSO

signal(3), perl(1), perlre(1)


AVAILABILITY

PLW's homepage is at http://tid.ithaka.org/software/PLW/.


COPYRIGHT

  Copyright (C) 1993-2004 E. Todd Atkins
  Copyright (C) 2006-2008 Ithaka Harbors, Inc.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.