A very interesting and insightful presentation by Jordan Sissel about why and how Logstash came about. This is from PuppetConf 2012.

  1. Yearly Sysadvent blog.
  2. FPM - Build packages for multiple platforms (deb, rpm, etc) with great ease and sanity
  3. There’s too much data in a log file to read. We need some way of filtering it to make sense of it.

What else sucks? Shitty error messages!

  1. Write better error messages.
  2. Hacks work as one-offs, not every day. They are hard to maintain, and you are asked to write them all the time.
xkcd - regular expressions
  1. People are using you as their computer interface.

Don’t be a human keyboard.

  1. What is a log?

DATA + TIMESTAMP = LOG
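
As a rough illustration of "data plus timestamp", a log entry can be modelled as a two-field record. This is only a sketch: the `LogEntry` type, the `parse_line` helper, and the line layout (an ISO 8601 timestamp, a space, then the message) are assumptions made for the example, not something shown in the talk.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class LogEntry:
    timestamp: datetime  # when the event happened
    data: str            # everything else on the line


def parse_line(line: str) -> LogEntry:
    # Assumed layout: "<ISO 8601 timestamp> <message>"
    stamp, _, data = line.partition(" ")
    return LogEntry(datetime.fromisoformat(stamp), data)


entry = parse_line("2012-09-27T10:15:30+00:00 user login failed")
print(entry.timestamp.isoformat(), "|", entry.data)
```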

  1. Lifecycle of a log entry: record > transmit > analyse > store > delete

  2. Open source tools:

    • transport: flume, fluentd, scribe, rsyslog, syslog-ng
    • search + analytics: hadoop, graylog2, elsa
    • storage: hdfs, cassandra, elasticsearch

  3. Use Grok:

    • named pattern: %{patternName:Name}.
    • reuse matched patterns and transformations.
    • has types: Numbers, Strings etc.
    • patterns are unit tested.
    • multiline matches for Stacktraces etc.
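
To make the named-pattern idea concrete, here is a minimal Python sketch of how a Grok-style expression could be expanded into an ordinary regular expression with named groups. It is not Grok's implementation: the tiny `PATTERNS` table, the `compile_grok` and `grok_match` helpers, and the sample log line are all hypothetical, and the real pattern library is far larger.

```python
import re

# Hypothetical, heavily simplified pattern library (assumption for this sketch).
PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "WORD": r"\w+",
    "NUMBER": r"\d+(?:\.\d+)?",
    "URIPATH": r"/\S*",
}

# Matches %{PATTERN:field} and %{PATTERN:field:type}.
GROK_REF = re.compile(r"%\{(\w+):(\w+)(?::(\w+))?\}")
CASTS = {"int": int, "float": float}


def compile_grok(expr):
    """Expand each %{PATTERN:field} reference into a named regex group."""
    casts = {}

    def expand(match):
        name, field, cast = match.groups()
        if cast:
            casts[field] = CASTS[cast]
        return f"(?P<{field}>{PATTERNS[name]})"

    return re.compile(GROK_REF.sub(expand, expr)), casts


def grok_match(expr, line):
    """Return a dict of named, typed fields, or None if the line does not match."""
    regex, casts = compile_grok(expr)
    match = regex.search(line)
    if match is None:
        return None
    return {k: casts.get(k, str)(v) for k, v in match.groupdict().items()}


result = grok_match(
    "%{IP:client} %{WORD:method} %{URIPATH:path} %{NUMBER:bytes:int}",
    "10.0.0.1 GET /index.html 1024",
)
print(result)
```

The point of the design is reuse: each pattern is written and tested once, then referred to by name instead of being retyped as a raw regular expression in every filter.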

Stop inventing shitty time formats!

  1. Statsd metrics can be visualized with tools like:
    • graphite
    • ganglia
    • circonus
    • boundary
    • librato
    • opentsdb
    • graylog2
  2. Apache uses gettimeofday(), whose value can jump when NTP synchronizes the system clock. This can lead to negative time values.

Does Apache have a Time Machine?
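
The negative-time problem is easy to reproduce in miniature. The sketch below is Python, not Apache's code: `SteppedClock` is a hypothetical stand-in for a wall clock that gets stepped backwards mid-request, and `measure` is an illustrative helper. Python's `time.monotonic()` is used to show the usual alternative, a clock that is guaranteed not to go backwards.

```python
import time


def measure(work, clock=time.monotonic):
    """Return how long work() took according to the given clock."""
    start = clock()
    work()
    return clock() - start


class SteppedClock:
    """Fake wall clock that replays fixed readings (assumption for the demo)."""

    def __init__(self, readings):
        self._readings = iter(readings)

    def __call__(self):
        return next(self._readings)


# The second reading is earlier than the first, as if the clock was
# stepped back by a time sync while the request was in flight.
wall = SteppedClock([1000.0, 998.5])
negative = measure(lambda: None, clock=wall)

# A monotonic clock cannot produce a negative duration.
safe = measure(lambda: None)

print(negative, safe >= 0)
```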

  1. Features:
    • Transport and process logs to and from anywhere.
    • Search and analytics.
  2. Design:
    • Logstash should fit your infrastructure.
    • Logstash is extendable (via plugins).
  3. Community:
    • If a newbie has a hard time it’s a bug (in the code or documentation etc).
    • Contributions are more than code (file bugs, feature requests, ideas, documentation etc).
    • Tools: Kibana, puppet module, logstash cli.
  4. Links: