gamekeeper is a low-resource application that performs several roles for your RabbitMQ infrastructure:
- It polls either a local or remote RabbitMQ HTTP API for metrics, which are then delivered to Ganglia, Graphite, or stdout.
- It serves as a Nagios NRPE plugin endpoint for monitoring the health of a node or of individual queues.
- It provides node management features such as pruning of idle connections and inactive queues.
gamekeeper has three modes of operation, each corresponding to a different subset of functionality and accessible via the measure, check, and prune subcommands.
The measure subcommand emits a series of metrics from the specified --uri for a number of RabbitMQ/AMQP primitives.
All metrics are prefixed into sinks (Ganglia, Graphite, etc.) with the identifier <node_name>.rabbit.
The <node_name> constant is currently determined by escaping the local hostname, and will be configurable in a future release.
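As a rough illustration of the naming scheme above, the following sketch builds a fully qualified metric name from a hostname. The escaping rule (replacing dots with underscores) and the helper names escape_hostname and metric_name are assumptions made for this example; the text only states that the local hostname is escaped.

```python
import socket
from typing import Optional


def escape_hostname(hostname: str) -> str:
    # Assumption: dots are replaced so the hostname forms a single
    # segment in dot-delimited sinks such as Graphite.
    return hostname.replace(".", "_")


def metric_name(metric: str, hostname: Optional[str] = None) -> str:
    # Prefix the metric with <node_name>.rabbit, falling back to the
    # local hostname when none is given.
    node_name = escape_hostname(hostname or socket.gethostname())
    return f"{node_name}.rabbit.{metric}"


if __name__ == "__main__":
    # "mq1.example.com" is a made-up hostname for illustration.
    print(metric_name("queue.total", "mq1.example.com"))
```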
Overview
- message.total
- message.ready
- message.unacked
- rate.publish
- rate.deliver
- rate.redeliver
- rate.confirm
- rate.ack

Connections
- connection.total
- connection.idle: calculated relative to the specified --days setting

Channels
- channel.total
- channel.publisher: number of publisher/ingress channels
- channel.consumer: number of consumer/egress channels
- channel.duplex: number of channels marked as both publishing and consuming
- channel.inactive

Exchanges
- exchange.rate.<name>: message rate per exchange

Queues
- queue.total
- queue.idle: determined by message residence and flow
- queue.messages.<name>: ready messages per queue
- queue.consumers.<name>: consumers per queue
- queue.memory.<name>: memory usage per queue
- queue.ingress.<name>: average message ingress per queue
- queue.egress.<name>: average message egress per queue

Bindings
- binding.total: overall number of AMQP bindings

The output sink can be configured to emit to Stdout,, or Ganglia,<host>,<port> or Graphite,<host>,<port> using the --sink argument. The underlying network-metrics library also supports writing to Statsd,<host>,<port>, but this is not recommended because the RabbitMQ management plugin already performs pre-aggregation.
By default, metrics are printed to stdout.
At the time of writing, SoundCloud emits all RabbitMQ metrics to Ganglia.
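To make the SINK,HOST,PORT format concrete, here is a minimal sketch of how such a value could be split into its parts. It is not gamekeeper's own parser; the Sink type, the parse_sink name, and the example host and port are assumptions for illustration.

```python
from typing import NamedTuple, Optional


class Sink(NamedTuple):
    kind: str
    host: Optional[str]
    port: Optional[int]


def parse_sink(spec: str) -> Sink:
    # Split "SINK,HOST,PORT" into exactly three fields. Stdout leaves
    # the host and port fields empty, as in "Stdout,,".
    parts = spec.split(",")
    if len(parts) != 3:
        raise ValueError(f"expected SINK,HOST,PORT but got {spec!r}")
    kind, host, port = (part.strip() for part in parts)
    return Sink(kind, host or None, int(port) if port else None)


if __name__ == "__main__":
    print(parse_sink("Stdout,,"))
    # Hypothetical Graphite endpoint, for illustration only.
    print(parse_sink("Graphite,localhost,2003"))
```

Keeping the two trailing commas in Stdout,, means every sink value has the same three-field shape, so a single parser covers all cases.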
The check subcommand is used to perform a high-level inspection of either the general node health or a specific queue's health.
All output is written to stdout in the Nagios NRPE plugin format.
Node
The Distributed Erlang sname of the RabbitMQ node needs to be specified via the --name argument, so gamekeeper can calculate the correct HTTP API URI to request. For example, the node rabbit@localhost would result in HTTP requests to http://localhost:15672/#/nodes/rabbit%40localhost
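The mapping from node name to request URI can be sketched as follows. The path is copied from the example above, the port is assumed to be the 15672 shown there, and node_url is a hypothetical helper name rather than part of gamekeeper.

```python
from urllib.parse import quote


def node_url(node: str, port: int = 15672) -> str:
    # The host is the part of the Erlang node name after "@".
    _, _, host = node.partition("@")
    if not host:
        raise ValueError(f"expected name@host but got {node!r}")
    # Percent-encode the full node name so that "@" becomes "%40".
    return f"http://{host}:{port}/#/nodes/{quote(node, safe='')}"


if __name__ == "__main__":
    print(node_url("rabbit@localhost"))
```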
Warning and critical levels can be specified for both message residence and memory usage. A single check is performed and the output is combined.
A warning or critical for either message residence or memory usage will result in the most severe being used as the NRPE exit code and one-line summary.
The memory usage warning and critical levels are specified in gigabytes.
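The "most severe wins" rule can be sketched as below. The exit codes follow the standard Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL) and the default levels are taken from the flag table in this document; treating a value equal to a threshold as a breach, and the names classify and node_status, are assumptions for illustration.

```python
from enum import IntEnum
from typing import Tuple


class Status(IntEnum):
    # Standard Nagios plugin exit codes.
    OK = 0
    WARNING = 1
    CRITICAL = 2


def classify(value: float, warn: float, crit: float) -> Status:
    # Assumption: reaching a threshold counts as breaching it.
    if value >= crit:
        return Status.CRITICAL
    if value >= warn:
        return Status.WARNING
    return Status.OK


def node_status(
    messages: int,
    memory_gb: float,
    message_levels: Tuple[int, int] = (15_000_000, 30_000_000),
    memory_levels: Tuple[float, float] = (4, 8),
) -> Status:
    # Both measurements are classified, then the most severe result is
    # used as the overall status (and hence the exit code).
    return max(
        classify(messages, *message_levels),
        classify(memory_gb, *memory_levels),
    )


if __name__ == "__main__":
    status = node_status(messages=20_000_000, memory_gb=9)
    print(status.name, int(status))
```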
Queue
Queue checks are the same as the node-level check, but scoped to a specifically named queue.
The memory usage warning and critical levels are specified in megabytes.
The prune subcommand is invoked manually and removes (via HTTP DELETE) idle connections and unused queues.
This is primarily useful if you do not use AMQP heartbeats and have problems with dangling load-balancer connections through something like LVS or HAProxy.
These commands are destructive, please use caution!
At present, it is assumed the user knows some of the Haskell ecosystem, and in particular how to wrangle cabal-dev to obtain dependencies. I plan to offer pre-built binaries for x86_64 OSX and Linux in future.
You will need reasonably new versions of GHC and the Haskell Platform. Once they are installed, run make install in the root directory to compile gamekeeper.
There is also a Chef Cookbook which can be used to manage gamekeeper, if that's how you swing: https://github.com/brendanhay/gamekeeper-cookbook
gamekeeper is configured with command-line flags. Help for the top-level program and each of the subcommands is available via the --help switch.
| Command | Flag | Default | Format | About |
|---|---|---|---|---|
| measure | --uri | guest@localhost:15672 | URI | Address of the RabbitMQ API to poll |
| | --days | 30 | INT | Number of days before a connection is considered stale |
| | --sink | Stdout,, | SINK,HOST,PORT | Sink options describing the type and host/port combination |
| check node | --name | | STR | An Erlang atom representing the RabbitMQ node name |
| | --uri | guest@localhost:15672 | URI | Address of the RabbitMQ API to poll |
| | --messages | 15000000,30000000 | WARN,CRIT | Message residence thresholds |
| | --memory | 4,8 | WARN,CRIT | Memory thresholds, in gigabytes |
| check queue | --name | | STR | The name of the queue to check |
| | --uri | guest@localhost:15672 | URI | Address of the RabbitMQ API to poll |
| | --messages | 125000,250000 | WARN,CRIT | Message residence thresholds |
| | --memory | 250,500 | WARN,CRIT | Memory thresholds, in megabytes |
| prune connections | --uri | guest@localhost:15672 | URI | Address of the RabbitMQ API to poll |
| | --days | 30 | INT | Number of days before a connection is considered idle |
| prune queues | --uri | guest@localhost:15672 | URI | Address of the RabbitMQ API to poll |

There is also a --verbose switch, which is useful when debugging metric emission to stdout.
After a successful compile, the ./gamekeeper symlink will point to the built binary under ./dist
For any problems, comments or feedback please create an issue here on GitHub.
gamekeeper is released under the Mozilla Public License Version 2.0