
Scaling Analysis Plugins

Monolithic

Pros

  • conceptually easy, i.e. fire up a single plugin to monitor all inputs

Cons

  • usually becomes a bottleneck for things like ingestion monitors, as it has to consume all messages
  • cfgs and data structures become more complicated, as they have to be set up as nested maps (see the sketch after this list)
  • code becomes more complicated, as it needs to prune expired entries
  • alerting requires an ever-increasing number of possible inject_message calls, leading to the limit being set very high or unrestricted. By default such plugins will not be deployable through the Hindsight Admin UI.
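As a rough illustration of the nested-map problem, a monolithic monitor's cfg has to carry per-input settings in nested tables that the plugin must walk and prune. The input names and threshold keys below are hypothetical, not part of Hindsight itself:

```lua
-- Hypothetical monolithic monitor cfg (input names and thresholds are illustrative only)
filename        = "ingestion_monitor.lua"
message_matcher = "Type == 'ingestion'"   -- has to match every input's traffic
ticker_interval = 60

-- every monitored input needs its own nested entry, and the plugin code has
-- to walk and prune these tables as inputs come and go
inputs = {
  input_a = { max_lag_seconds = 300, min_messages_per_minute = 100 },
  input_b = { max_lag_seconds = 600, min_messages_per_minute = 10  },
  -- one entry per input
}
```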

Individual

Pros

  • easy to reason about
    • simplifies the design of the plugin
      • easier configuration for alerting and thresholds (avoids nested look-ups and modifying monolithic cfgs)
      • reduces the need for pruning code
  • more flexible
    • easy to spin up/down as necessary (can be automated with dynamic loading)
    • easier to create a cfg template for a single instance than a monolithic cfg (see the sketch after this list)
  • scales better
    • each plugin only processes a subset of the data
    • load is spread out over different threads
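By contrast, a per-input instance only needs a flat cfg, which makes templating straightforward. A minimal sketch, assuming a hypothetical Fields[input] attribute and illustrative threshold names:

```lua
-- Hypothetical per-input monitor cfg stamped out from a template
-- ('input_a' and the threshold keys are illustrative only)
filename        = "ingestion_monitor.lua"
message_matcher = "Type == 'ingestion' && Fields[input] == 'input_a'"
ticker_interval = 60

-- flat, single-instance settings; no nested maps and no pruning code needed
max_lag_seconds         = 300
min_messages_per_minute = 100
```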

Cons

  • pollutes the plugin list, e.g. 100 inputs == 100 monitors (this could be addressed in the UI presentation)
  • cost of the additional message matchers, as each one will run and reject 99% of the messages
    • In the current message matcher design this will quickly become an issue, so improvements are needed:
      1. Map-based router (create a map keyed on a common attribute, e.g. Type == 'x', and perform a lookup instead of evaluating every matcher; sketched after this list)
      2. Tree-based router (group related matchers together, failing entire branches in a single evaluation)
      3. More cache-friendly matching. The current benchmarks show there is room for up to a 10x improvement.
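A conceptual sketch of the map-based router idea (the real message matcher lives in Hindsight's C core, so this only illustrates the lookup): when many matchers share a simple Type == 'x' predicate, a single table lookup on Type replaces evaluating every matcher. Plugin names and types are hypothetical:

```lua
-- Conceptual Lua sketch of a map-based router; illustrative only.
-- Plugins whose matcher is exactly "Type == '<x>'" are bucketed by that value.
local routes_by_type = {
  ingestion = { "monitor_input_a", "monitor_input_b" },
  error     = { "error_alerter" },
}

local function route(msg)
  -- one hash lookup instead of evaluating every plugin's matcher expression
  return routes_by_type[msg.Type] or {}
end

-- e.g. route({Type = "ingestion"}) returns the two ingestion monitors
```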

Scaling When Individual Plugins Are Still Too Slow

Partitioning

  • Random (uses UUID)
  • Consistent (uses some in-message identifier, e.g. Fields[sampleId]; see the sketch below)
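A minimal sketch of how an individual analysis plugin could do consistent partitioning on Fields[sampleId]. The `partition`/`partitions` cfg keys, the `monitor.partial` message Type, and the byte-sum hash are all illustrative assumptions, not Hindsight conventions:

```lua
-- Hypothetical partitioned analysis plugin: each instance's cfg sets
-- partition = 0..partitions-1 and this plugin only processes its share.
require "string"

local partitions = read_config("partitions") or 4
local partition  = read_config("partition") or 0

local count = 0

function process_message()
    local id = read_message("Fields[sampleId]")
    if type(id) ~= "string" then return 0 end

    -- illustrative consistent hash: sum of byte values mod partition count
    local h = 0
    for i = 1, #id do h = h + string.byte(id, i) end
    if h % partitions ~= partition then return 0 end -- not this instance's share

    count = count + 1  -- the heavier analysis would go here
    return 0
end

function timer_event(ns)
    -- emit a partial result for downstream aggregation
    inject_message({Type = "monitor.partial", Fields = {count = count, partition = partition}})
    count = 0
end
```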

Pros

  • Can allow heavier analysis to be run without back-pressuring the system

Cons

  • Manual process to configure the partitioning and balance the work between threads
  • Requires the extra step of downstream aggregation (see the sketch below)
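A hedged sketch of that aggregation step: a separate analysis plugin whose cfg matches the hypothetical `monitor.partial` messages injected above and combines the per-partition counts before reporting. The Types and field names follow the assumptions of the previous sketch:

```lua
-- Hypothetical aggregator; its cfg would use something like
--   message_matcher = "Type == 'monitor.partial'"
local totals = {}

function process_message()
    local partition = read_message("Fields[partition]")
    local count     = read_message("Fields[count]")
    if partition == nil or count == nil then return -1, "missing fields" end
    totals[partition] = count
    return 0
end

function timer_event(ns)
    local sum = 0
    for _, c in pairs(totals) do sum = sum + c end
    -- a real plugin would alert/report on the combined total here
    inject_message({Type = "monitor.total", Fields = {count = sum}})
end
```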