Kaji 0.2 Notes

Status:

  • Last update: 2014-11-24
  • Last Kaji meta pull: 2014-11-16
  • Last Kaji pull for other projects: 2014-11-16 (kaji)

I am following development of Kaji toward version 0.2. This page includes notes as I install and use the upstream packages.

As of the 'Last update' above, I have suspended trying to get Kaji 0.2 to work. I am unable to get active checks to work reliably. I also see an Adagios issue from the Kaji developers, dated 20 November, about official Shinken support; it says that more work is necessary.

Installation

Packages below are listed in installation order, on an up-to-date Debian jessie (testing) workstation.

nagvis

OK.

As of 2014-11-10, the postinst script failure no longer occurs. I maintain a clone of the nagvis submodule, but it's not necessary after this fix.

nagvis-demos

OK.

grafana

/grafana resource added to Apache. Uses port 80, apparently.

python-influxdb

Installed influxdb deb first. It does not actually start any services.

pynag

OK.

shinken-common

OK.

As of 2014-11-10, the postinst script adds the 'shinken' user to the 'nagios' group.

adagios

Adagios is built on the Django web framework, and it requires a version of Django no greater than v1.6. Presently Debian jessie uses Django v1.7. So, I install from local v1.6.6 debs downloaded from Debian Snapshots.
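The version constraint can be checked before installing Adagios. The following is only a minimal sketch: the function name django_is_supported and the MAX_SUPPORTED constant are my own, and the limit of 1.6 simply restates the requirement described above. It relies on django.VERSION, the version tuple that Django itself exposes.

```python
# Highest Django release series that Adagios accepts, per the note above.
MAX_SUPPORTED = (1, 6)


def django_is_supported(version):
    """Return True if a django.VERSION-style tuple is 1.6 or older.

    Only the major and minor numbers are compared, so 1.6.6 passes
    and 1.7.0 does not.
    """
    return tuple(version[:2]) <= MAX_SUPPORTED


if __name__ == "__main__":
    try:
        import django
    except ImportError:
        print("Django is not installed")
    else:
        verdict = "ok" if django_is_supported(django.VERSION) else "too new"
        print("Django %s: %s" % (django.get_version(), verdict))
```

On a jessie workstation with the stock Django package, I would expect this to report "too new" until the v1.6.6 debs are installed.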

Manual changes while testing:
  • pynag: Parsers/__init__.py, Livestatus.__init__() method. Hardcode livestatus_socket_path to 'localhost:50000', because Adagios is unable to find this setting in the Shinken config file.
  • pynag: Control/Command/__init__.py, send_command() method. Move the find_command_file() call inside the try block. The command file will not be found, so the Livestatus call is used instead. I see the pynag repository now includes a fix for this as well.
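The idea behind the second change can be shown with a self-contained sketch. This is not pynag's code. The exception class and the helper functions are hypothetical stand-ins that only imitate the control flow: because the command file lookup happens inside the try block, its failure leads to the Livestatus fallback instead of an unhandled error.

```python
class CommandFileNotFound(Exception):
    """Raised by the stand-in lookup when no command file is configured."""


def find_command_file():
    # Stand-in for the real lookup. Under Shinken there is no Nagios
    # command file, so this sketch always fails.
    raise CommandFileNotFound("no command_file setting found")


def write_to_command_file(path, command):
    # Stand-in for writing the external command to the command file.
    with open(path, "a") as handle:
        handle.write(command + "\n")
    return "command_file"


def send_via_livestatus(command):
    # Stand-in for the fallback; real code would send the command
    # over the Livestatus socket.
    return "livestatus"


def send_command(command):
    """Try the command file first, then fall back to Livestatus."""
    try:
        # The lookup is inside the try block, so a missing command file
        # is handled by the except clause below.
        path = find_command_file()
        return write_to_command_file(path, command)
    except (CommandFileNotFound, IOError):
        return send_via_livestatus(command)
```

With the lookup outside the try block, the exception would propagate to the caller and the fallback would never run.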

Adagios uses basic auth. Kaji uses credentials: kaji/kaji

As of 2014-11-10, no longer need to manually change owner of /etc/adagios and /var/lib/adagios.

Kaji moves the /adagios HTTP resource to /. This change does not work well on my development workstation, where I use my public_html directory to serve copies of documentation. So, I manually changed the Apache adagios.conf to use a virtual host, so I can use a different virtual host to serve the local documentation.

# 2014-11-02 kbee Use virtual host to allow for coexistence with other
# uses of Apache.
<VirtualHost *:80>
    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log combined
... rest of content ...
</VirtualHost>

I suspect this sort of conflict is why Kaji currently should be installed in a Docker container.

shinken-mod-booster-nrpe

OK.

shinken-mod-pickle-retention-file-generic

OK.

shinken-mod-influxdb

OK.

shinken-mod-simple-log

OK.

shinken-mod-logstore-sqlite

OK.

shinken-mod-livestatus

OK.

shinken

OK.

Setting up shinken-mod-webui (1.0-1) ...
dpkg: warning: version 'shinken-module-broker-webui-cfgpassword' has bad syntax: version number does not start with digit

Looks like the problem is line 27. The issue involves moving a configuration file for a Shinken upgrade from 1.4, so we do not need to take any action to fix.

Reported to Debian BTS #769391.

kaji

As of 2014-11-10, the dependency on nagios-plugins-extra package has been converted to Recommends.

Using kb2ma package 0.1.99.7kb2ma.

After installation, must run /usr/sbin/kaji-finish-install. Output below.

root@verix:/srv/debs# /usr/sbin/kaji-finish-install
Setting up InfluxBD Apache reverse proxy
Module proxy already enabled
Considering dependency proxy for proxy_http:
Module proxy already enabled
Module proxy_http already enabled
Site influxdb already enabled
[ ok ] Reloading web server: apache2.
DONE
Reset Nagvis Authentication file
Reset Done
Restarting all necessary services for Kaji
Restarting InfluxDB
Setting ulimit -n 4096
Setting ulimit -n 4096
influxdb process was stopped [ OK ]
Setting ulimit -n 4096
Starting the process influxdb [ OK ]
influxdb process was started [ OK ]
Restarting Apache
[ ok ] Restarting web server: apache2.
Restarting Shinken
Restarting scheduler
. ok
Restarting poller
. ok
Restarting reactionner
. ok
Restarting broker
. ok
Restarting receiver
. ok
Restarting arbiter
Doing config check
. ok
. ok
Restarting DONE
Create user and databases influxDB
Create Shinken Database
Create Grafana Database
Creatation of user and databases influxDB DONE
Prepare Shinken/Adagiosconfig folder
Initialized empty Git repository in /etc/shinken/.git/
[master (root-commit) 0faf727] Initial Kaji installation commit
 29 files changed, 1355 insertions(+)
 create mode 100644 arbiters/arbiter.cfg
 create mode 100644 brokers/broker.cfg
 create mode 100644 commands.cfg
 create mode 100644 contacts.cfg
 create mode 100644 daemons/brokerd.ini
 create mode 100644 daemons/pollerd.ini
 create mode 100644 daemons/reactionnerd.ini
 create mode 100644 daemons/receiverd.ini
 create mode 100644 daemons/schedulerd.ini
 create mode 100644 hosts/localhost.cfg
 create mode 100644 modules/.placeholder
 create mode 100644 modules/booster_nrpe.cfg
 create mode 100644 modules/influxdb.cfg
 create mode 100644 modules/livestatus.cfg
 create mode 100644 modules/logstore_sqlite.cfg
 create mode 100644 modules/retention-pickle-arbiter.cfg
 create mode 100644 modules/retention-pickle-broker.cfg
 create mode 100644 modules/retention-pickle-scheduler.cfg
 create mode 100644 modules/simple-log.cfg
 create mode 100644 packs/.placeholder
 create mode 100644 pollers/poller.cfg
 create mode 100644 reactionners/reactionner.cfg
 create mode 100644 realms/realms.cfg
 create mode 100644 receivers/receiver.cfg
 create mode 100644 resource.cfg
 create mode 100644 schedulers/scheduler.cfg
 create mode 100644 shinken.cfg
 create mode 100644 templates.cfg
 create mode 100644 timeperiods.cfg
Preparation DONE
Fix nagios plugins rights
DONE

Usage Notes

These notes are for issues discovered after installation, during normal use.

  • Shinken starts at boot time, but something attempts to start it a second time. See the boot console fragment below. A similar problem occurs at shutdown.
  • Adagios does not run external commands, like check_ping. As of 2014-11-13, I expect this capability to work due to a meta project commit. Actually, I have seen external commands work, but only twice.
  • As of 2014-11-24, able to access Shinken configuration from the "Configure" tab.
  • When viewing host status, the graphs tab complains about pnp4nagios. Installed this package, without the recommended icinga packages. Graphs still don't work, with the message that the perfdata directory (/var/lib/pnp4nagios/perfdata) is empty.

Boot console:

Sun Nov  2 03:23:50 2014: [ ok ] Starting system message bus: dbus.
Sun Nov  2 03:23:53 2014: Starting scheduler:
Sun Nov  2 03:23:53 2014: . ok
Sun Nov  2 03:23:53 2014: Starting poller:
Sun Nov  2 03:23:53 2014: . ok
Sun Nov  2 03:23:53 2014: Starting reactionner:
Sun Nov  2 03:23:53 2014: . ok
Sun Nov  2 03:23:53 2014: Starting broker:
Sun Nov  2 03:23:53 2014: . ok
Sun Nov  2 03:23:53 2014: Starting receiver:
Sun Nov  2 03:23:53 2014: . ok
Sun Nov  2 03:23:53 2014: Starting arbiter:
Sun Nov  2 03:23:56 2014: . ok
Sun Nov  2 03:23:56 2014: [....] Starting web server: apache2AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1. Set the 'ServerName' directive globally to suppress this message
Sun Nov  2 03:23:59 2014: . ok
Sun Nov  2 03:23:59 2014: [ ok ] Starting cgroup management proxy daemon: cgproxy.
Sun Nov  2 03:23:59 2014: [ ok ] Starting CUPS Bonjour daemon: cups-browsed.
Sun Nov  2 03:23:59 2014: [info] GNUstep distributed object mapper disabled, see /etc/default/gdomap.
Sun Nov  2 03:23:59 2014: [ ok ] Starting bluetooth: bluetoothd.
Sun Nov  2 03:23:59 2014: Starting arbiter:
Sun Nov  2 03:23:59 2014: [warn] Already running ... (warning).
Sun Nov  2 03:23:59 2014: Starting broker:
Sun Nov  2 03:23:59 2014: [warn] Already running ... (warning).
Sun Nov  2 03:23:59 2014: [ ok ] Starting Light Display Manager: lightdm.
Sun Nov  2 03:24:00 2014: [ ok ] Starting periodic command scheduler: cron.
Sun Nov  2 03:24:00 2014: [ ok ] Starting deferred execution scheduler: atd.
Sun Nov  2 03:24:01 2014: [ ok ] Starting Avahi mDNS/DNS-SD Daemon: avahi-daemon.
Sun Nov  2 03:24:03 2014: [ ok ] Recovering schroot sessions:.
Sun Nov  2 03:24:03 2014: [ ok ] Starting MTA: exim4.
Sun Nov  2 03:24:03 2014: saned disabled; edit /etc/default/saned
Sun Nov  2 03:24:03 2014: Starting poller:
Sun Nov  2 03:24:03 2014: [warn] Already running ... (warning).
Sun Nov  2 03:24:04 2014: Starting receiver:
Sun Nov  2 03:24:04 2014: [warn] Already running ... (warning).
Sun Nov  2 03:24:04 2014: Starting reactionner:
Sun Nov  2 03:24:04 2014: [warn] Already running ... (warning).
Sun Nov  2 03:24:04 2014: Starting scheduler:
Sun Nov  2 03:24:04 2014: [warn] Already running ... (warning).
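The duplicate start attempts in the fragment above can be counted mechanically. The sketch below is a rough illustration, not a tool shipped with Kaji or Shinken. The function names are my own, and it assumes each start attempt is logged as a line ending in "Starting <daemon>:", as in the fragment.

```python
import re
from collections import Counter

# The six Shinken daemons that appear in the boot console fragment.
SHINKEN_DAEMONS = ("scheduler", "poller", "reactionner",
                   "broker", "receiver", "arbiter")

# Matches lines such as "Sun Nov  2 03:23:53 2014: Starting scheduler:".
# Lines like "Starting periodic command scheduler: cron" do not match.
START_LINE = re.compile(r": Starting (%s):\s*$" % "|".join(SHINKEN_DAEMONS))


def count_starts(lines):
    """Count the start attempts logged for each Shinken daemon."""
    counts = Counter()
    for line in lines:
        match = START_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def started_more_than_once(lines):
    """Return the sorted names of daemons with more than one start attempt."""
    return sorted(name for name, n in count_starts(lines).items() if n > 1)
```

Run against the fragment above, each of the six daemons should show two start attempts, which matches the "Already running" warnings.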

Uninstall Notes

I need to uninstall the packages at times to try new versions. This process basically is the reverse of installation. Below is a list of tasks beyond just removing the packages.

Be sure to purge packages so they are removed from the cache.

kaji

  • Remove /var/www/.git.
  • Remove influxdb databases grafana and shinken if not removing influxdb package. It's simple to use the influxdb web tool at localhost:8083. Of course, the influxdb daemon must be running.
  • To be safe/easy: Remove apache2 sites-enabled entries: adagios.conf, grafana.conf, influxdb.conf. Relink to 000-default.conf. Stop apache.
  • Remove /etc/apache2/sites-available/influxdb.conf

adagios

  • Remove shinken user
  • Ensure /etc/shinken is removed

python-influxdb

  • Also remove influxdb package itself

nagvis

  • Ensure /etc/nagvis removed
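
After purging the packages, a quick check can report which of the leftovers listed above are still on disk. This is a minimal sketch that only reports and never deletes. The function name remaining and its root parameter are my own, and the sites-enabled paths assume the standard Debian Apache layout under /etc/apache2.

```python
import os

# Files and directories named in the uninstall notes above.
LEFTOVERS = [
    "/var/www/.git",
    "/etc/apache2/sites-available/influxdb.conf",
    "/etc/apache2/sites-enabled/adagios.conf",
    "/etc/apache2/sites-enabled/grafana.conf",
    "/etc/apache2/sites-enabled/influxdb.conf",
    "/etc/shinken",
    "/etc/nagvis",
]


def remaining(paths, root="/"):
    """Return the paths that still exist beneath root, in the given order.

    lexists() is used so that a dangling symlink in sites-enabled is
    still reported.
    """
    found = []
    for path in paths:
        candidate = os.path.join(root, path.lstrip("/"))
        if os.path.lexists(candidate):
            found.append(path)
    return found


if __name__ == "__main__":
    for path in remaining(LEFTOVERS):
        print("still present: " + path)
```

The root parameter exists only so the check can be tried against a scratch directory instead of the live system.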

Contents © 2014 Ken Bannister