Testing Logstash configuration
2016-09-07 logstash
You wrote a piece of Logstash configuration which can parse some logs. You tested several corner cases to ensure the output in Elasticsearch was alright. How do you protect this clever configuration file against regressions?
Unit testing to the rescue, of course!
Simple example
For the sake of simplicity, we will take an obvious example: access logs. The input looks like this:
172.17.0.1 - - [05/Sep/2016:20:06:17 +0000] "GET /images/logos/hubpress.png HTTP/1.1" 200 5432 "http://localhost/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/51.0.2704.79 Chrome/51.0.2704.79 Safari/537.36" "-"
The output, once in Elasticsearch, should look like
{ "@version":"1",
"@timestamp":"2016-09-05T20:06:17.000Z",
"type":"nginx",
"host":"nginx-server", "path":"/var/log/nginx/access.log",
"clientip":"172.17.0.1", "ident":"-", "auth":"-",
"verb":"GET","request":"/images/logos/hubpress.png","httpversion":"1.1",
"response":200, "bytes":5432, "referrer":"\"http://localhost/\"",
"agent": "\"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/51.0.2704.79 Chrome/51.0.2704.79 Safari/537.36\""
}
The configuration could look like
input {
  file {
    path => "/var/log/nginx/access*.log"
    type => "nginx"
  }
}
filter {
  if [type] == "nginx" {
    grok {
      match => [ "message" , "%{COMBINEDAPACHELOG}"]
    }
    date {
      match => [ "timestamp" , "dd/MMM/YYYY:HH:mm:ss Z" ]
    }
    mutate {
      convert => ["response", "integer"]
      convert => ["bytes", "integer"]
    }
  }
}
output {
  elasticsearch {
    hosts => [ "es-server"]
    index => "logstash-%{+YYYY.MM.dd}"
    document_type => "%{type}"
  }
}
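To make the intent of the three filters concrete, here is a rough plain-Ruby sketch of the same transformation: extract the fields (grok), parse the timestamp (date), and cast two fields to integers (mutate). The ACCESS_LOG regular expression and the parse_access_line helper are assumptions made for illustration; the regex is a simplified stand-in for the COMBINEDAPACHELOG pattern, not its actual definition.

```ruby
require "time"

# Simplified stand-in for %{COMBINEDAPACHELOG} (assumption, not the real pattern).
# Referrer and agent keep their surrounding quotes, as in the expected output above.
ACCESS_LOG = %r{\A(?<clientip>\S+) (?<ident>\S+) (?<auth>\S+) \[(?<timestamp>[^\]]+)\] "(?<verb>\S+) (?<request>\S+) HTTP/(?<httpversion>[\d.]+)" (?<response>\d{3}) (?<bytes>\d+) (?<referrer>"[^"]*") (?<agent>"[^"]*")}

# Hypothetical helper mimicking the grok, date and mutate steps.
def parse_access_line(line)
  match = ACCESS_LOG.match(line)
  # grok tags the event instead of raising when the pattern does not match
  return { "tags" => ["_grokparsefailure"] } if match.nil?

  event = match.named_captures
  # date: turn the log timestamp into a UTC ISO 8601 @timestamp
  parsed = Time.strptime(event["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
  event["@timestamp"] = parsed.utc.iso8601(3)
  # mutate: convert response and bytes to integers
  event["response"] = Integer(event["response"])
  event["bytes"] = Integer(event["bytes"])
  event
end

line = '172.17.0.1 - - [05/Sep/2016:20:06:17 +0000] "GET /images/logos/hubpress.png HTTP/1.1" 200 5432 "http://localhost/" "Mozilla/5.0 (X11; Linux x86_64)" "-"'
event = parse_access_line(line)
puts event["@timestamp"]
puts event["response"].class
```

These are the same properties the unit test will check later: field extraction, timestamp parsing and integer conversion.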
Split the file
In the above config file, the interesting part, the one containing the logic, is the filter section. In order to test it, the first thing to do is to split this big file into small pieces:
- 01_logstash_input_nginx.conf contains the nginx file input
- 02_logstash_filter_nginx.conf contains the nginx filter section
- 03_logstash_output.conf contains the elasticsearch output
In production, you can load multiple config files as if they were a single one:
logstash agent -f "/etc/logstash.d/*.conf"
At test time, by picking a single configuration file, 02_logstash_filter_nginx.conf, the Nginx log parsing can be tested in isolation.
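The numeric prefixes in the file names are there for a reason: when given a glob, Logstash reads the matching files in lexicographical order and treats them as one configuration. The sketch below imitates that behaviour in plain Ruby; combined_config is a hypothetical helper, not part of Logstash, and the file contents are placeholders.

```ruby
require "tmpdir"

# Hypothetical helper: concatenate every file matching the glob,
# in lexicographical order of the file names.
def combined_config(glob)
  Dir.glob(glob).sort.map { |path| File.read(path) }.join("\n")
end

Dir.mktmpdir do |dir|
  # Written out of order on purpose; the sort restores input, filter, output.
  File.write(File.join(dir, "02_logstash_filter_nginx.conf"), "filter { }")
  File.write(File.join(dir, "03_logstash_output.conf"), "output { }")
  File.write(File.join(dir, "01_logstash_input_nginx.conf"), "input { }")

  puts combined_config(File.join(dir, "*.conf"))
end
```

With this layout, adding another parser later only means dropping a new 02_ prefixed filter file next to the existing one.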
Write the unit test
Now let’s test the 02_logstash_filter_nginx.conf file alone and write a simple Ruby test case. As you may know, Logstash is written in JRuby.
# encoding: utf-8
require "logstash/devutils/rspec/spec_helper"

# Load the configuration file
@@configuration = String.new
@@configuration << File.read("conf/02_logstash_nginx_filter.conf")

describe "Nginx filter" do
  config(@@configuration) # (1)

  # Inject input event/message into the pipeline
  message = "172.17.0.1 - - [05/Sep/2016:20:06:17 +0000] \"GET /images/logos/hubpress.png HTTP/1.1\" 200 5432 \"http://localhost/\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/51.0.2704.79 Chrome/51.0.2704.79 Safari/537.36\" \"-\""

  sample("message" => message, "type" => "nginx") do # (2)
    # Check the output event/message properties
    insist { subject.get("type") } == "nginx" # (3)
    insist { subject.get("@timestamp").to_iso8601 } == "2016-09-05T20:06:17.000Z"
    insist { subject.get("verb") } == "GET"
    insist { subject.get("request") } == "/images/logos/hubpress.png"
    insist { subject.get("response") } == 200
    insist { subject.get("bytes") } == 5432
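    # "tags" is nil when no filter failed, hence to_a; the matcher is
    # called on the reject object so that the check is actually evaluated.
    reject { subject.get("tags").to_a }.include?("_grokparsefailure")
    reject { subject.get("tags").to_a }.include?("_dateparsefailure")
  end
end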
(1) Load the configuration file
(2) Inject the input event/message into the pipeline
(3) Check the output event/message properties
This test uses RSpec, the Ruby testing framework (hence the describe method). The config and sample functions come from the Logstash DevUtils library. The insist and reject functions are part of the Ruby Insist assertion library.
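The insist { ... } == value form may look unusual at first. The toy stand-in below shows the idea, assuming the block is only stored at first and evaluated when a comparison is called on the returned object. TinyInsist and tiny_insist are hypothetical names, and this is not the Insist library's implementation.

```ruby
# Toy stand-in for a block-based assertion (hypothetical, for illustration only).
class TinyInsist
  class Failed < StandardError; end

  def initialize(&block)
    raise ArgumentError, "a block is required" if block.nil?
    @block = block # stored, not yet evaluated
  end

  # The block runs here, when the comparison is made.
  def ==(expected)
    actual = @block.call
    unless actual == expected
      raise Failed, "expected #{expected.inspect}, got #{actual.inspect}"
    end
    true
  end
end

def tiny_insist(&block)
  TinyInsist.new(&block)
end

event = { "verb" => "GET", "response" => 200 }

tiny_insist { event["verb"] } == "GET" # passes silently

begin
  tiny_insist { event["response"] } == 404
rescue TinyInsist::Failed => e
  puts e.message
end
```

One consequence of this design is that a block without any comparison after it checks nothing, so every assertion needs its comparison or matcher call.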
Run the unit tests
First we will need to download and install additional development libraries like those mentioned above.
$ logstash-2.4.0/bin/logstash-plugin install --development
Installing logstash-devutils, logstash-input-generator, logstash-codec-json, logstash-output-null, logstash-filter-mutate, flores, rspec, stud, pry, rspec-wait, childprocess, ftw, logstash-output-elasticsearch, rspec-sequencing, gmetric, gelf, timecop, jdbc-derby, docker-api, logstash-codec-plain, simplecov, coveralls, longshoreman, rumbster, logstash-filter-kv, logstash-filter-ruby, sinatra, webrick, poseidon, logstash-output-lumberjack, webmock, logstash-codec-line, logstash-filter-grok
Installation successful
Now we can run the test. Logstash comes with an rspec command to run these spec files.
$ logstash-2.4.0/bin/rspec 02_logstash_nginx_filter_spec.rb
Using Accessor#strict_set for specs
Run options: exclude {:redis=>true, :socket=>true, :performance=>true, :couchdb=>true, :elasticsearch=>true, :elasticsearch_secure=>true, :export_cypher=>true, :integration=>true, :windows=>true}
.
Finished in 0.115 seconds (files took 0.784 seconds to load)
1 example, 0 failures
Randomized with seed 4384
The rspec command can also run multiple tests at once.
$ logstash-2.4.0/bin/rspec spec -P '**/*_spec.rb'
To keep tests from silently depending on one another, they are run in random order. This is called randomized testing.
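The seed printed at the end of each run ("Randomized with seed 4384") is what makes a random order reproducible: the same seed gives the same order again. Here is a minimal plain-Ruby illustration of that principle. It is not RSpec's actual ordering code, and the example names and the ordered helper are made up.

```ruby
# Hypothetical example names standing in for spec examples.
EXAMPLES = ["parses verb", "parses request", "converts response", "converts bytes"]

# Shuffle with a seeded random number generator: same seed, same order.
def ordered(examples, seed)
  examples.shuffle(random: Random.new(seed))
end

first_run = ordered(EXAMPLES, 4384)
replay    = ordered(EXAMPLES, 4384)

puts first_run.inspect
puts first_run == replay # the order of a failing run can be replayed
```

When a failure only shows up in one particular order, RSpec's --seed option can be given the printed seed to replay it, assuming the bundled rspec wrapper passes its options through.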
Give me the code!
All the code shown in this article is available on GitHub.