Status:
- Last update: 2014-11-24
- Last Kaji meta pull: 2014-11-16
- Last Kaji pull for other projects: 2014-11-16 (kaji)
I am following development of Kaji toward version 0.2. This page includes notes as I install and use the upstream packages.
As of the 'Last update' above, I have suspended trying to get Kaji 0.2 to work, because I am unable to get active checks to work reliably. I also see an Adagios issue from the Kaji developers, dated 20 November, about official Shinken support. It says that more work is necessary.
Installation
Packages below are listed in installation order, on an up-to-date Debian jessie (testing) workstation.
nagvis
OK.
As of 2014-11-10, the postinst script failure no longer occurs. I maintain a clone of the nagvis submodule, but it's not necessary after this fix.
nagvis-demos
OK.
grafana
/grafana resource added to Apache. Uses port 80, apparently.
python-influxdb
Installed influxdb deb first. It does not actually start any services.
pynag
OK.
shinken-common
OK.
As of 2014-11-10, the postinst script adds the 'shinken' user to the 'nagios' group.
adagios
Adagios is built on the Django web framework, and it requires a version of Django no greater than v1.6. Presently Debian jessie uses Django v1.7. So, I install from local v1.6.6 debs downloaded from Debian Snapshots.
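A minimal sketch of the version constraint described above, assuming "no greater than v1.6" means any release in the 1.6 series or older. The function name django_is_supported is hypothetical and is not part of Adagios or Django. The tuple layout matches django.VERSION.

```python
def django_is_supported(version, newest_supported=(1, 6)):
    """Return True if a Django version tuple is in the 1.6 series or older.

    `version` has the shape of django.VERSION, e.g. (1, 6, 6, 'final', 0).
    Only the (major, minor) pair is compared, so 1.6.6 passes and 1.7 fails.
    """
    return tuple(version[:2]) <= newest_supported


print(django_is_supported((1, 6, 6, "final", 0)))  # True
print(django_is_supported((1, 7, 0, "final", 0)))  # False
```

On a live system the argument would be django.VERSION from the installed package.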
- Manual changes while testing:
- pynag: Parsers/__init__.py, Livestatus.__init__() method. Hardcode livestatus_socket_path to 'localhost:50000'. Adagios is unable to find this setting in the Shinken config file.
- pynag: Control/Command/__init__.py, send_command() method. Move the find_command_file() call within the try block. The command file will not be found, so the livestatus call is used. I see the pynag repository also includes a fix for this now.
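The second change above can be sketched roughly as follows. This is not the pynag source. It only illustrates the pattern of moving the command file lookup inside the try block so that a missing file falls through to livestatus. All function names and the sample command string here are hypothetical stand-ins, passed in as parameters.

```python
def send_command(command, find_command_file, write_command, send_livestatus):
    """Try the command file first, then fall back to livestatus.

    Because the lookup happens inside the try block, a missing command
    file is handled the same way as a failed write.
    """
    try:
        path = find_command_file()
        write_command(path, command)
        return "command_file"
    except Exception:
        send_livestatus(command)
        return "livestatus"


def missing_command_file():
    # Stand-in for the lookup failing, as it does on this Shinken setup.
    raise IOError("command file not found")


sent = []
result = send_command(
    "EXAMPLE_COMMAND;localhost",   # placeholder command text
    missing_command_file,
    lambda path, cmd: None,        # never reached in this example
    sent.append,                   # stand-in for the livestatus call
)
print(result)  # livestatus
```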
Adagios uses basic auth. Kaji uses credentials: kaji/kaji
As of 2014-11-10, no longer need to manually change owner of /etc/adagios and /var/lib/adagios.
Kaji moves the /adagios HTTP resource to /. This change does not work well on my development workstation, where I use my public_html directory to serve copies of documentation. So, I manually changed the Apache adagios.conf to use a virtual host, so I can use a different virtual host to serve the local documentation.
# 2014-11-02 kbee Use virtual host to allow for coexistence with other
# uses of Apache.
<VirtualHost *:80>
    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log combined
    ... rest of content ...
</VirtualHost>
I suspect this sort of conflict is why Kaji currently should be installed in a Docker container.
shinken-mod-booster-nrpe
OK.
shinken-mod-pickle-retention-file-generic
OK.
shinken-mod-influxdb
OK.
shinken-mod-simple-log
OK.
shinken-mod-logstore-sqlite
OK.
shinken-mod-livestatus
OK.
shinken
OK.
Setting up shinken-mod-webui (1.0-1) ...
dpkg: warning: version 'shinken-module-broker-webui-cfgpassword' has bad syntax: version number does not start with digit
Looks like the problem is line 27. The issue involves moving a configuration file for a Shinken upgrade from 1.4, so we do not need to take any action to fix.
Reported to Debian BTS #769391.
kaji
As of 2014-11-10, the dependency on nagios-plugins-extra package has been converted to Recommends.
Using kb2ma package 0.1.99.7kb2ma.
After installation, must run /usr/sbin/kaji-finish-install. Output below.
root@verix:/srv/debs# /usr/sbin/kaji-finish-install
Setting up InfluxBD Apache reverse proxy
Module proxy already enabled
Considering dependency proxy for proxy_http:
Module proxy already enabled
Module proxy_http already enabled
Site influxdb already enabled
[ ok ] Reloading web server: apache2.
DONE
Reset Nagvis Authentication file
Reset Done
Restarting all necessary services for Kaji
Restarting InfluxDB
Setting ulimit -n 4096
Setting ulimit -n 4096
influxdb process was stopped [ OK ]
Setting ulimit -n 4096
Starting the process influxdb [ OK ]
influxdb process was started [ OK ]
Restarting Apache
[ ok ] Restarting web server: apache2.
Restarting Shinken
Restarting scheduler . ok
Restarting poller . ok
Restarting reactionner . ok
Restarting broker . ok
Restarting receiver . ok
Restarting arbiter
Doing config check . ok
. ok
Restarting DONE
Create user and databases influxDB
Create Shinken Database
Create Grafana Database
Creatation of user and databases influxDB DONE
Prepare Shinken/Adagiosconfig folder
Initialized empty Git repository in /etc/shinken/.git/
[master (root-commit) 0faf727] Initial Kaji installation commit
 29 files changed, 1355 insertions(+)
 create mode 100644 arbiters/arbiter.cfg
 create mode 100644 brokers/broker.cfg
 create mode 100644 commands.cfg
 create mode 100644 contacts.cfg
 create mode 100644 daemons/brokerd.ini
 create mode 100644 daemons/pollerd.ini
 create mode 100644 daemons/reactionnerd.ini
 create mode 100644 daemons/receiverd.ini
 create mode 100644 daemons/schedulerd.ini
 create mode 100644 hosts/localhost.cfg
 create mode 100644 modules/.placeholder
 create mode 100644 modules/booster_nrpe.cfg
 create mode 100644 modules/influxdb.cfg
 create mode 100644 modules/livestatus.cfg
 create mode 100644 modules/logstore_sqlite.cfg
 create mode 100644 modules/retention-pickle-arbiter.cfg
 create mode 100644 modules/retention-pickle-broker.cfg
 create mode 100644 modules/retention-pickle-scheduler.cfg
 create mode 100644 modules/simple-log.cfg
 create mode 100644 packs/.placeholder
 create mode 100644 pollers/poller.cfg
 create mode 100644 reactionners/reactionner.cfg
 create mode 100644 realms/realms.cfg
 create mode 100644 receivers/receiver.cfg
 create mode 100644 resource.cfg
 create mode 100644 schedulers/scheduler.cfg
 create mode 100644 shinken.cfg
 create mode 100644 templates.cfg
 create mode 100644 timeperiods.cfg
Preparation DONE
Fix nagios plugins rights
DONE
Usage Notes
These notes are for issues discovered after installation, during normal use.
- Shinken starts at boot time, but something attempts to start it a second time. See the boot console fragment below. A similar problem occurs at shutdown.
- Adagios does not run external commands, like check_ping. As of 2014-11-13, I expect this capability to work due to a meta project commit. Actually, I have seen external commands work, but only twice.
- As of 2014-11-24, able to access Shinken configuration from the "Configure" tab.
- When viewing host status, the graphs tab complains about pnp4nagios. Installed this package, without the recommended icinga packages. Graphs still don't work, with the message that the perfdata directory (/var/lib/pnp4nagios/perfdata) is empty.
Boot console:
Sun Nov 2 03:23:50 2014: [ ok ] Starting system message bus: dbus.
Sun Nov 2 03:23:53 2014: Starting scheduler:
Sun Nov 2 03:23:53 2014: [ ok ].
Sun Nov 2 03:23:53 2014: Starting poller:
Sun Nov 2 03:23:53 2014: [ ok ].
Sun Nov 2 03:23:53 2014: Starting reactionner:
Sun Nov 2 03:23:53 2014: [ ok ].
Sun Nov 2 03:23:53 2014: Starting broker:
Sun Nov 2 03:23:53 2014: [ ok ].
Sun Nov 2 03:23:53 2014: Starting receiver:
Sun Nov 2 03:23:53 2014: [ ok ].
Sun Nov 2 03:23:53 2014: Starting arbiter:
Sun Nov 2 03:23:56 2014: [ ok ].
Sun Nov 2 03:23:56 2014: [....] Starting web server: apache2AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1. Set the 'ServerName' directive globally to suppress this message
Sun Nov 2 03:23:59 2014: [ ok ].
Sun Nov 2 03:23:59 2014: [ ok ] Starting cgroup management proxy daemon: cgproxy.
Sun Nov 2 03:23:59 2014: [ ok ] Starting CUPS Bonjour daemon: cups-browsed.
Sun Nov 2 03:23:59 2014: [info] GNUstep distributed object mapper disabled, see /etc/default/gdomap.
Sun Nov 2 03:23:59 2014: [ ok ] Starting bluetooth: bluetoothd.
Sun Nov 2 03:23:59 2014: Starting arbiter:
Sun Nov 2 03:23:59 2014: [warn] Already running ... (warning).
Sun Nov 2 03:23:59 2014: Starting broker:
Sun Nov 2 03:23:59 2014: [warn] Already running ... (warning).
Sun Nov 2 03:23:59 2014: [ ok ] Starting Light Display Manager: lightdm.
Sun Nov 2 03:24:00 2014: [ ok ] Starting periodic command scheduler: cron.
Sun Nov 2 03:24:00 2014: [ ok ] Starting deferred execution scheduler: atd.
Sun Nov 2 03:24:01 2014: [ ok ] Starting Avahi mDNS/DNS-SD Daemon: avahi-daemon.
Sun Nov 2 03:24:03 2014: [ ok ] Recovering schroot sessions:.
Sun Nov 2 03:24:03 2014: [ ok ] Starting MTA: exim4.
Sun Nov 2 03:24:03 2014: saned disabled; edit /etc/default/saned
Sun Nov 2 03:24:03 2014: Starting poller:
Sun Nov 2 03:24:03 2014: [warn] Already running ... (warning).
Sun Nov 2 03:24:04 2014: Starting receiver:
Sun Nov 2 03:24:04 2014: [warn] Already running ... (warning).
Sun Nov 2 03:24:04 2014: Starting reactionner:
Sun Nov 2 03:24:04 2014: [warn] Already running ... (warning).
Sun Nov 2 03:24:04 2014: Starting scheduler:
Sun Nov 2 03:24:04 2014: [warn] Already running ... (warning).
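To confirm the double start from a saved boot log rather than by eye, a small sketch like the one below can count the "Starting <daemon>:" lines for each Shinken daemon. The daemon names are the ones that appear in the console fragment. The function names and the shortened sample text are my own illustration.

```python
import re
from collections import Counter

# Shinken daemon names as they appear in the boot console messages.
DAEMONS = ("scheduler", "poller", "reactionner", "broker", "receiver", "arbiter")
START_RE = re.compile(r"Starting (%s):" % "|".join(DAEMONS))


def count_daemon_starts(log_text):
    """Count how many times each Shinken daemon start message appears."""
    return Counter(START_RE.findall(log_text))


def started_more_than_once(log_text):
    """Return the sorted names of daemons with more than one start attempt."""
    counts = count_daemon_starts(log_text)
    return sorted(name for name, n in counts.items() if n > 1)


sample = (
    "Starting arbiter: ok. "
    "Starting deferred execution scheduler: atd. "  # not a Shinken daemon
    "Starting broker: ok. "
    "Starting arbiter: Already running ... (warning)."
)
print(started_more_than_once(sample))  # ['arbiter']
```

The pattern requires the daemon name directly after "Starting ", so unrelated services such as cron and atd, whose descriptions contain the word "scheduler", are not counted.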
Uninstall Notes
I need to uninstall the packages at times to try new versions. This process is basically the reverse of installation. Below is a list of tasks beyond just removing the packages.
Be sure to purge packages so they are removed from the cache.
kaji
- Remove /var/www/.git.
- Remove influxdb databases grafana and shinken if not removing influxdb package. It's simple to use the influxdb web tool at localhost:8083. Of course, the influxdb daemon must be running.
- To be safe/easy: Remove apache2 sites-enabled entries: adagios.conf, grafana.conf, influxdb.conf. Relink to 000-default.conf. Stop apache.
- Remove /etc/apache2/sites-available/influxdb.conf
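The Apache cleanup steps above could be scripted roughly as below. This is a sketch under assumptions: the function name reset_apache_sites and the apache_root parameter are hypothetical, the layout is the standard Debian sites-enabled/sites-available pair, and stopping Apache is left as a manual step. On Debian the a2dissite and a2ensite tools are the usual way to manage these links.

```python
import os

# Site files that the Kaji install enables, per the notes above.
KAJI_SITES = ("adagios.conf", "grafana.conf", "influxdb.conf")


def reset_apache_sites(apache_root="/etc/apache2"):
    """Remove the Kaji site links, relink the default site, and delete
    the influxdb site definition. Returns the names of links removed."""
    enabled = os.path.join(apache_root, "sites-enabled")
    available = os.path.join(apache_root, "sites-available")

    removed = []
    for name in KAJI_SITES:
        link = os.path.join(enabled, name)
        if os.path.lexists(link):  # lexists also catches dangling symlinks
            os.remove(link)
            removed.append(name)

    # Relink the default site with a relative symlink if it is missing.
    default_link = os.path.join(enabled, "000-default.conf")
    if not os.path.lexists(default_link):
        os.symlink(
            os.path.join("..", "sites-available", "000-default.conf"),
            default_link,
        )

    # Remove the influxdb site definition itself.
    influx_conf = os.path.join(available, "influxdb.conf")
    if os.path.exists(influx_conf):
        os.remove(influx_conf)

    return removed
```

Running it against a scratch copy of the directory tree first, by passing a different apache_root, is a safe way to check what it would change.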
adagios
- Remove shinken user
- Ensure /etc/shinken is removed
python-influxdb
- Also remove influxdb package itself
nagvis
- Ensure /etc/nagvis removed