riak_sysmon is an Erlang/OTP application that manages the event
messages that can be generated by the Erlang virtual machine's
system_monitor BIF (Built-In Function). These messages can notify a
central data-gathering process about the following events:
- Processes that have their private heaps grow beyond a certain size.
- Processes whose private heap garbage collection operations take too long.
- Ports that are busy, e.g., blocking file & socket I/O.
- Network distribution ports that are busy, e.g., lots of communication with a slow peer Erlang node.
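For orientation, here is a minimal sketch of how a process might subscribe to these four event types directly with erlang:system_monitor/2, without riak_sysmon. The module name sysmon_demo and the threshold values are assumptions chosen for illustration, not settings taken from this application.

```erlang
-module(sysmon_demo).          % hypothetical module name
-export([start/0]).

%% Register the calling process as the VM's system_monitor receiver.
%% The thresholds are illustrative assumptions only.
start() ->
    Opts = [{long_gc, 50},                 % a GC that takes 50 ms or more
            {large_heap, 10 * 1024 * 1024}, % heap size threshold, in words
            busy_port,                      % process suspended on a busy port
            busy_dist_port],                % suspended on a busy distribution port
    %% Returns the previous settings: undefined | {MonitorPid, Options}.
    erlang:system_monitor(self(), Opts).
```

After calling sysmon_demo:start(), the calling process receives {monitor, ...} tuples in its mailbox whenever one of the thresholds is crossed.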
The problem with system_monitor events is that there isn't a
mechanism within the Erlang virtual machine that limits the rate at
which the events are generated. A busy VM can easily create many
hundreds of these messages per second. Some kind of rate-limiting
filter is required to avoid further overloading a system that may
already be overloaded.
This app uses two processes for system_monitor message handling.
- A gen_server process to provide a rate-limiting filter.
- A gen_event server to allow flexible, user-defined functions to respond to system_monitor events that pass through the first stage filter.
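The first-stage idea can be sketched as a small per-second counter. This is not the actual riak_sysmon_filter implementation; the module name rate_limit_sketch, its functions, and the map-based state are all assumptions made for illustration.

```erlang
-module(rate_limit_sketch).    % hypothetical module, not part of riak_sysmon
-export([new/1, event/1, tick/1]).

%% Create limiter state that allows Limit events per interval.
new(Limit) when is_integer(Limit), Limit >= 0 ->
    #{limit => Limit, seen => 0, dropped => 0}.

%% Decide whether one incoming event is forwarded or dropped.
event(S = #{limit := L, seen := N}) when N < L ->
    {forward, S#{seen := N + 1}};
event(S = #{dropped := D}) ->
    {drop, S#{dropped := D + 1}}.

%% Call once per interval (e.g. every second): returns the number of
%% events dropped during that interval and resets the counters.
tick(S = #{dropped := D}) ->
    {D, S#{seen := 0, dropped := 0}}.
```

A gen_server built around this would call event/1 for each incoming system_monitor message and tick/1 from a one-second timer, reporting the dropped count downstream when it is non-zero.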
(Silly reference to The Highlander omitted....)
The Erlang/OTP documentation is pretty clear on this point: only one
process can receive system_monitor messages. But using the
riak_sysmon OTP app, if multiple parties are interested in receiving
system_monitor events, each party can add its own event handler to the
riak_sysmon_handler event handler process.
The event handler process in this application uses the registered name
riak_sysmon_handler. To add your handler, use something like:
gen_event:add_sup_handler(riak_sysmon_handler, yourModuleName, YourInitialArgs).
See the gen_event documentation for add_sup_handler/3
for API details. See the example event handler module in the source
repository, src/riak_sysmon_example_handler.erl, for example usage.
The following events can be sent from the riak_sysmon
filtering/rate-limiting process (a.k.a. riak_sysmon_filter) to the
event handler process (a.k.a. riak_sysmon_handler).
- {monitor, pid(), atom(), term()} ... These are system_monitor messages as they are received verbatim by the riak_sysmon_filter process. See the reference documentation for erlang:system_monitor/2 for details.
- {suppressed, proc_events | port_events, Num::integer()} ... These messages inform your event handler that Num events of a certain type (proc_events or port_events) were suppressed in the last second (i.e. their arrival rate exceeded the configured rate limit).
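A handler that consumes both event shapes might look like the sketch below. It is not the repository's riak_sysmon_example_handler; the module name my_sysmon_handler, the get_counts call, and the counter state are assumptions for illustration only.

```erlang
-module(my_sysmon_handler).    % hypothetical module name
-behaviour(gen_event).

-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2, code_change/3]).

%% State: how many events were forwarded and how many were suppressed.
init(_Args) ->
    {ok, #{monitor => 0, suppressed => 0}}.

%% A system_monitor message passed through verbatim by the filter.
handle_event({monitor, _Pid, _Type, _Info}, State = #{monitor := N}) ->
    {ok, State#{monitor := N + 1}};
%% Notice that Num events were dropped by the rate limiter.
handle_event({suppressed, Kind, Num}, State = #{suppressed := S})
  when Kind =:= proc_events; Kind =:= port_events ->
    {ok, State#{suppressed := S + Num}};
handle_event(_Other, State) ->
    {ok, State}.

%% get_counts is an assumed request used only by this sketch.
handle_call(get_counts, State) ->
    {ok, State, State};
handle_call(_Request, State) ->
    {ok, {error, unknown_call}, State}.

handle_info(_Info, State) -> {ok, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.
```

With riak_sysmon running, such a handler would be attached with gen_event:add_sup_handler(riak_sysmon_handler, my_sysmon_handler, []). Because add_sup_handler links the handler to the calling process, that process needs to stay alive for as long as the handler should remain installed.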