<p>JRald blog: Gérald's technical blog about Java, Kafka, Elasticsearch, DevOps… (Gérald Quintana)</p>
<h1>Build your own CA with Ansible</h1>
<p>2020-11-28, /2020/11/28/Build-your-own-CA-with-Ansible</p>
<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>Securing your Kafka, Elasticsearch, Cassandra, or whatever distributed software you run requires configuring SSL (also known as TLS) to encrypt communications:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Node to node communication</p>
</li>
<li>
<p>Client to node communication</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>Setting up SSL means providing SSL certificates for each node.
But generating SSL certificates is a cumbersome task:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>The <a href="https://kafka.apache.org/documentation/#security_ssl">Kafka documentation</a> describes extensively the process.</p>
</li>
<li>
<p>Elasticsearch brings its own <a href="https://www.elastic.co/guide/en/elasticsearch/reference/master/configuring-tls.html#node-certificates">elasticsearch-certutil</a> tool.</p>
</li>
<li>
<p>Datastax also <a href="https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/configuration/secureSSLCertWithCA.html">documents</a> a similar process for Cassandra.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>I will describe here how to generate an SSL certificate for each node using Ansible.
It makes sense as I am also deploying Kafka, Elasticsearch and the like with Ansible.</p>
</div>
<div class="paragraph">
<p>There are several important rules to know when generating certificates:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>The name present in the certificate must match the public name of the host.
We cannot share the same certificate across all nodes unless we use wildcard certificates.
Any TLS client connecting to a node will check that the certificate name and the hostname match, unless hostname verification is disabled.</p>
</li>
<li>
<p>The name present in the certificate should match the reverse DNS name corresponding to the IP of the host.
Java clients connecting to a node will do a reverse DNS lookup to get the public name of the host they are connecting to.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>These two rules are meant to prevent <strong>man-in-the-middle</strong> attacks.
A TLS certificate lets you check that you are talking to the intended target,
not to something in between that could spy on you and steal information.</p>
</div>
<div class="paragraph">
<p>When a machine has multiple names (think DNS aliases, virtual hosts), a certificate can contain multiple names.
The main name is called the CN (Common Name),
while the other names are called SANs (Subject Alternative Names).</p>
</div>
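As a rough illustration of the first rule, hostname verification boils down to comparing the client's target hostname with the names listed in the certificate (CN and SANs). The sketch below is my own and is far simpler than the full RFC 6125 rules a real TLS library implements; it only shows the principle, including one-level wildcard certificates:

```python
def name_matches(pattern, hostname):
    # Compare one certificate name (CN or SAN) with the target hostname.
    # Handles one-level wildcards: "*.example.com" matches "node1.example.com"
    # but not "a.b.example.com" and not "example.com" itself.
    pattern, hostname = pattern.lower(), hostname.lower()
    if pattern.startswith("*."):
        suffix = pattern[1:]  # ".example.com"
        return hostname.endswith(suffix) and "." not in hostname[: -len(suffix)]
    return pattern == hostname

def hostname_verified(cert_names, hostname):
    # The connection is accepted if any certificate name matches.
    return any(name_matches(n, hostname) for n in cert_names)

names = ["kafka1.example.com", "*.kafka.example.com"]
print(hostname_verified(names, "kafka1.example.com"))       # CN match
print(hostname_verified(names, "node2.kafka.example.com"))  # wildcard SAN match
print(hostname_verified(names, "evil.com"))                 # no match
```

Disabling hostname verification in a client amounts to skipping this check entirely, which is exactly what makes man-in-the-middle attacks possible.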
</div>
</div>
<div class="sect1">
<h2 id="the_certificate_authority">The certificate authority</h2>
<div class="sectionbody">
<div class="paragraph">
<p>As Kafka or Elasticsearch clusters should never be publicly exposed,
using a public certificate authority (Thawte, Verisign and the like) is not necessary.
A self-signed certificate authority local to the cluster or the environment (Dev, Q/A) should be enough.</p>
</div>
<div class="paragraph">
<p>So the first step is to create a certificate authority that will be used to sign the certificates of all hosts belonging to our cluster.
As this step will be done only once, I won’t automate it.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="shell">$ mkdir ownca
$ openssl req -new -x509 \
-days 1825 \ <i class="conum" data-value="1"></i><b>(1)</b>
-extensions v3_ca \ <i class="conum" data-value="2"></i><b>(2)</b>
-keyout ownca/root.key -out ownca/root.crt <i class="conum" data-value="3"></i><b>(3)</b>
Generating a RSA private key
......+++++
....+++++
writing new private key to 'ownca/root.key'
Enter PEM pass phrase: <i class="conum" data-value="4"></i><b>(4)</b>
Verifying - Enter PEM pass phrase:
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:FR <i class="conum" data-value="5"></i><b>(5)</b>
State or Province Name (full name) [Some-State]:.
Locality Name (eg, city) []:.
Organization Name (eg, company) [Internet Widgits Pty Ltd]:eNova Conseil
Organizational Unit Name (eg, section) []:.
Common Name (e.g. server FQDN or YOUR name) []:Root
Email Address []:rootca@enova-conseil.com</code></pre>
</div>
</div>
<div class="colist arabic">
<table>
<tr>
<td><i class="conum" data-value="1"></i><b>1</b></td>
<td>The CA root certificate will last 5 years</td>
</tr>
<tr>
<td><i class="conum" data-value="2"></i><b>2</b></td>
<td>This certificate will be used as a CA</td>
</tr>
<tr>
<td><i class="conum" data-value="3"></i><b>3</b></td>
<td>Generate both key and self-signed certificate</td>
</tr>
<tr>
<td><i class="conum" data-value="4"></i><b>4</b></td>
<td>The key is protected with a password</td>
</tr>
<tr>
<td><i class="conum" data-value="5"></i><b>5</b></td>
<td>Information describing the Root certificate</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>For security reasons, the generated key must be kept secret and stored in a secure place:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>It must not be transferred to the target Kafka servers</p>
</li>
<li>
<p>It must not be kept in source control (Git) unless encrypted with Ansible Vault</p>
</li>
</ul>
</div>
</div>
</div>
<div class="sect1">
<h2 id="the_nodes_certificates">The node certificates</h2>
<div class="sectionbody">
<div class="paragraph">
<p>This is where Ansible comes in.
As your cluster might have many nodes, automating certificate generation makes sense.
For each target host, I will repeat the same process:</p>
</div>
<div class="imageblock">
<div class="content">
<img src="/images/2020-11-28-Build-your-own-CA-with-Ansible/process.svg" alt="Process">
</div>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>On the target host, generate a key <code>target.key</code> and a CSR (Certificate signing request) <code>target.csr</code></p>
</li>
<li>
<p>Pull the CSR on the control host.</p>
</li>
<li>
<p>Sign the CSR with the CA key.
This will generate a certificate <code>target.crt</code>.</p>
</li>
<li>
<p>Push the generated certificate <code>target.crt</code> on the target host.
The CA certificate <code>root.crt</code> is also pushed.</p>
</li>
</ol>
</div>
<div class="paragraph">
<p>As the TLS keys (<code>.key</code>) are sensitive, they never travel: they stay where they were generated.
On the contrary, certificates (<code>.crt</code>) and CSRs (<code>.csr</code>) contain only public information.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="yaml"><span class="comment"># Step 1</span>
- <span class="string"><span class="content">name: Generate private key</span></span>
<span class="key">become</span>: <span class="string"><span class="content">true</span></span>
<span class="key">openssl_privatekey</span>:
<span class="key">path</span>: <span class="string"><span class="delimiter">"</span><span class="content">/etc/pki/tls/private/{{ openssl_name }}.key</span><span class="delimiter">"</span></span>
- <span class="string"><span class="content">name: Generate CSR</span></span>
<span class="key">become</span>: <span class="string"><span class="content">true</span></span>
<span class="key">openssl_csr</span>:
<span class="key">path</span>: <span class="string"><span class="delimiter">"</span><span class="content">/etc/pki/tls/private/{{ openssl_name }}.csr</span><span class="delimiter">"</span></span>
<span class="key">privatekey_path</span>: <span class="string"><span class="delimiter">"</span><span class="content">/etc/pki/tls/private/{{ openssl_name }}.key</span><span class="delimiter">"</span></span>
<span class="key">country_name</span>: <span class="string"><span class="content">FR</span></span>
<span class="key">organization_name</span>: <span class="string"><span class="delimiter">"</span><span class="content">eNova Conseil</span><span class="delimiter">"</span></span>
<span class="key">common_name</span>: <span class="string"><span class="delimiter">"</span><span class="content">{{ openssl_name }}</span><span class="delimiter">"</span></span>
<span class="key">subject_alt_name</span>: <span class="string"><span class="delimiter">"</span><span class="content">DNS:{{ ansible_host }},DNS:{{ ansible_fqdn }}</span><span class="delimiter">"</span></span>
<span class="comment"># Step 2</span>
- <span class="string"><span class="content">name: Pull CSR</span></span>
<span class="key">become</span>: <span class="string"><span class="content">true</span></span>
<span class="key">fetch</span>:
<span class="key">src</span>: <span class="string"><span class="delimiter">"</span><span class="content">/etc/pki/tls/private/{{ openssl_name }}.csr</span><span class="delimiter">"</span></span>
<span class="key">dest</span>: <span class="string"><span class="delimiter">"</span><span class="content">{{ openssl_ownca_dir }}/{{ openssl_name }}.csr</span><span class="delimiter">"</span></span>
<span class="key">flat</span>: <span class="string"><span class="content">true</span></span>
<span class="comment"># Step 3</span>
- <span class="string"><span class="content">name: Sign CSR with CA key</span></span>
<span class="key">connection</span>: <span class="string"><span class="content">local</span></span>
<span class="key">delegate_to</span>: <span class="string"><span class="content">localhost</span></span>
<span class="key">openssl_certificate</span>:
<span class="key">path</span>: <span class="string"><span class="delimiter">"</span><span class="content">{{ openssl_ownca_dir }}/{{ openssl_name }}.crt</span><span class="delimiter">"</span></span>
<span class="key">csr_path</span>: <span class="string"><span class="delimiter">"</span><span class="content">{{ openssl_ownca_dir }}/{{ openssl_name }}.csr</span><span class="delimiter">"</span></span>
<span class="key">ownca_path</span>: <span class="string"><span class="delimiter">"</span><span class="content">{{ openssl_ownca_dir }}/root.crt</span><span class="delimiter">"</span></span>
<span class="key">ownca_privatekey_path</span>: <span class="string"><span class="delimiter">"</span><span class="content">{{ openssl_ownca_dir }}/root.key</span><span class="delimiter">"</span></span>
<span class="key">provider</span>: <span class="string"><span class="content">ownca</span></span>
<span class="comment"># Step 4</span>
- <span class="string"><span class="content">name: Push certificate</span></span>
<span class="key">become</span>: <span class="string"><span class="content">true</span></span>
<span class="key">copy</span>:
<span class="key">src</span>: <span class="string"><span class="delimiter">"</span><span class="content">{{ openssl_ownca_dir }}/{{ openssl_name }}.crt</span><span class="delimiter">"</span></span>
<span class="key">dest</span>: <span class="string"><span class="delimiter">"</span><span class="content">/etc/pki/tls/private/{{ openssl_name }}.crt</span><span class="delimiter">"</span></span>
- <span class="string"><span class="content">name: Push CA</span></span>
<span class="key">become</span>: <span class="string"><span class="content">true</span></span>
<span class="key">copy</span>:
<span class="key">src</span>: <span class="string"><span class="delimiter">"</span><span class="content">{{ openssl_ownca_dir }}/root.crt</span><span class="delimiter">"</span></span>
<span class="key">dest</span>: <span class="string"><span class="delimiter">"</span><span class="content">/etc/pki/ca-trust/source/anchors/root.pem</span><span class="delimiter">"</span></span></code></pre>
</div>
</div>
<div class="paragraph">
<p>Once you have the key, the certificate and CA certificate chain on the target host, you can start using them:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="yaml">- <span class="string"><span class="content">name: Update CA Trust</span></span>
<span class="key">become</span>: <span class="string"><span class="content">true</span></span>
<span class="key">command</span>: <span class="string"><span class="delimiter">"</span><span class="content">update-ca-trust extract</span><span class="delimiter">"</span></span>
- <span class="string"><span class="content">name: Build PKCS12 file containing key and cert</span></span>
<span class="key">become</span>: <span class="string"><span class="content">true</span></span>
<span class="key">openssl_pkcs12</span>:
<span class="key">action</span>: <span class="string"><span class="content">export</span></span>
<span class="key">path</span>: <span class="string"><span class="delimiter">"</span><span class="content">/etc/pki/tls/private/{{ openssl_name }}.p12</span><span class="delimiter">"</span></span>
<span class="key">friendly_name</span>: <span class="string"><span class="delimiter">"</span><span class="content">{{ openssl_name }}</span><span class="delimiter">"</span></span>
<span class="key">privatekey_path</span>: <span class="string"><span class="delimiter">"</span><span class="content">/etc/pki/tls/private/{{ openssl_name }}.key</span><span class="delimiter">"</span></span>
<span class="key">certificate_path</span>: <span class="string"><span class="delimiter">"</span><span class="content">/etc/pki/tls/private/{{ openssl_name }}.crt</span><span class="delimiter">"</span></span>
<span class="key">other_certificates</span>: <span class="string"><span class="delimiter">"</span><span class="content">/etc/pki/ca-trust/source/anchors/root.pem</span><span class="delimiter">"</span></span>
<span class="key">state</span>: <span class="string"><span class="content">present</span></span></code></pre>
</div>
</div>
<div class="paragraph">
<p>The produced PKCS12 file can be used as a Java Keystore. The <code>java_keystore</code> Ansible module can be used to create a JKS file instead.</p>
</div>
<div class="paragraph">
<p>The attentive reader has noticed I am using a bunch of <code>openssl_xxx</code> Ansible modules (namely <code>openssl_privatekey</code>, <code>openssl_csr</code>, <code>openssl_certificate</code> and <code>openssl_pkcs12</code>).
These modules require OpenSSL and pyOpenSSL to be installed on each host.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="yaml">- <span class="string"><span class="content">name: Python OpenSSL package</span></span>
<span class="key">become</span>: <span class="string"><span class="content">true</span></span>
<span class="key">yum</span>:
<span class="key">name</span>:
- <span class="string"><span class="content">pyOpenSSL</span></span>
- <span class="string"><span class="content">python2-pip</span></span>
- <span class="string"><span class="content">ca-certificates</span></span>
- <span class="string"><span class="content">name: Upgrade Python OpenSSL</span></span>
<span class="key">become</span>: <span class="string"><span class="content">true</span></span>
<span class="key">pip</span>:
<span class="key">name</span>: <span class="string"><span class="content">pyOpenSSL>=0.15</span></span></code></pre>
</div>
</div>
</div>
</div>
<h1>Retrieving Kafka Lag</h1>
<p>Gérald Quintana, 2020-01-16, /2020/01/16/Retrieving-Kafka-lag</p>
<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>This article shows how to get Kafka lag for a given consumer group using the Java API.
It’s about implementing part of the <code>kafka-consumer-groups</code> command-line tool in pure Java.</p>
</div>
<div class="imageblock">
<div class="content">
<img src="/images/2020-01-16-Retrieving-Kafka-lag/lag.svg" alt="Consumer Lag">
</div>
</div>
<div class="paragraph">
<p>To get consumer lag we will go through several steps:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>Get consumer group current offset, 4 in the above example</p>
</li>
<li>
<p>Get topic end offset: the producer&#8217;s offset, 8 in the above example</p>
</li>
<li>
<p>Compute the lag: the difference between both</p>
</li>
</ol>
</div>
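The three steps can be sketched with plain Python dictionaries keyed by (topic, partition); the topic name and offset values here are made up for the example, the real maps come from the Kafka client API shown in the next sections:

```python
# Step 1: consumer group current offsets, per (topic, partition)
consumer_offsets = {("events", 0): 4, ("events", 1): 7}
# Step 2: topic end offsets (the producer side), per (topic, partition)
end_offsets = {("events", 0): 8, ("events", 1): 7}

# Step 3: lag is the difference, clamped to zero
lag = {tp: max(end_offsets[tp] - offset, 0)
       for tp, offset in consumer_offsets.items()}
print(lag)  # {('events', 0): 4, ('events', 1): 0}
```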
</div>
</div>
<div class="sect1">
<h2 id="getting_consumer_group_offset">Getting consumer group offset</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Kafka 2.0 added to the <code>AdminClient</code> class a very useful
<a href="https://kafka.apache.org/20/javadoc/org/apache/kafka/clients/admin/AdminClient.html#listConsumerGroupOffsets-java.lang.String-org.apache.kafka.clients.admin.ListConsumerGroupOffsetsOptions-">listConsumerGroupOffsets</a> method.
For a given consumer group, it returns a dictionary <em>(topic name, partition) → current offset</em>.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="java"> <span class="keyword">return</span> adminClient
.listConsumerGroupOffsets(groupId)
.partitionsToOffsetAndMetadata().get();</code></pre>
</div>
</div>
<div class="paragraph">
<p>Obviously, this solution expects consumer offsets to be stored in Kafka’s <code>__consumer_offsets</code> topic.
It does not apply, for example, to some Kafka Connect sink implementations which store their offsets in the target data store.</p>
</div>
<div class="paragraph">
<p>The <code>listConsumerGroupOffsets</code> method is asynchronous and returns a <code>KafkaFuture</code> (a kind of promise) which implements Java&#8217;s <code>Future</code>.
My code blocks on the result, so there is room for improvement.</p>
</div>
<div class="paragraph">
<p>To get consumer group ids, there is a <code>listConsumerGroups</code> method in the same <code>AdminClient</code> class:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="java"> <span class="keyword">return</span> adminClient
.listConsumerGroups()
.valid()
.thenApply(r -> r.stream()
.map(ConsumerGroupListing::groupId)
.collect(toList())
).get();</code></pre>
</div>
</div>
<div class="paragraph">
<p>By computing the current offset derivative, we could compute the consumer message rate.</p>
</div>
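Concretely, the rate is just the offset delta divided by the elapsed time between two samples. A hypothetical helper (not part of the Kafka API) could look like:

```python
def message_rate(offset_t0, offset_t1, seconds):
    """Approximate messages/second from two offset samples taken `seconds` apart."""
    return (offset_t1 - offset_t0) / seconds

# The committed offset went from 1000 to 1600 in 60 seconds
print(message_rate(1000, 1600, 60))  # 10.0 messages/second
```

The same computation on the topic end offset gives the producer message rate mentioned below.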
<div class="paragraph">
<p>There is another method to get consumer offsets: it lives in the consumer client and is named <a href="https://kafka.apache.org/24/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#committed-java.util.Set-">committed</a>.
Contrary to the <code>listConsumerGroupOffsets</code> method, it requires knowing the consumed topic partitions beforehand,
so it is useless in our case.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="getting_topic_end_offset">Getting topic end offset</h2>
<div class="sectionbody">
<div class="paragraph">
<p>The <code>KafkaConsumer</code> class contains an
<a href="https://kafka.apache.org/20/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#endOffsets-java.util.Collection-">endOffsets</a> method
to get the end offset of a topic partition.
It returns a dictionary <em>(topic name, partition) → end offset</em>.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="java"> <span class="keyword">return</span> consumer.endOffsets(partitions);</code></pre>
</div>
</div>
<div class="paragraph">
<p>By computing the end offset derivative, we could compute the producer message rate.</p>
</div>
<div class="paragraph">
<p>By getting the topic start offset with the <a href="https://kafka.apache.org/24/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#beginningOffsets-java.util.Collection-">beginningOffsets</a> method,
we could also compute the topic size per partition.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="joining_offsets_and_computing_lag">Joining offsets and computing lag</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Both consumer offsets and topic end offsets are given per partition.
To compute the lag we have to do a join using the topic partition as key.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="java"> <span class="predefined-type">Map</span><TopicPartition, OffsetAndMetadata> consumerGroupOffsets = getConsumerGroupOffsets(groupId);
<span class="predefined-type">Map</span><TopicPartition, <span class="predefined-type">Long</span>> topicEndOffsets = getTopicEndOffsets(groupId, consumerGroupOffsets.keySet());
    <span class="predefined-type">Map</span><TopicPartition, OffsetAndLag> consumerGroupLag = consumerGroupOffsets.entrySet().stream()
        .map(entry -> mapEntry(entry.getKey(), <span class="keyword">new</span> OffsetAndLag(topicEndOffsets.get(entry.getKey()), entry.getValue().offset())))
        .collect(toMap(<span class="predefined-type">Map</span>.Entry::getKey, <span class="predefined-type">Map</span>.Entry::getValue));</code></pre>
</div>
</div>
<div class="paragraph">
<p>As consumer lag is equal to <em>topic end offset - consumer current offset</em>,
computing it is straightforward:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="java"> <span class="type">long</span> lag = endOffset - currentOffset;
<span class="keyword">if</span> (lag < <span class="integer">0</span>) {
lag = <span class="integer">0</span>;
}
<span class="keyword">return</span> lag;</code></pre>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="conclusion">Conclusion</h2>
<div class="sectionbody">
<div class="paragraph">
<p>We managed to get consumer lag using the Java Kafka client API and a few lines of code.</p>
</div>
<div class="paragraph">
<p>However, I regret several things about this API:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>The <code>endOffsets</code> method is not in the <code>AdminClient</code> class.
If it were, instantiating a consumer would be unnecessary.</p>
</li>
<li>
<p>We have to open two connections and repeat the connection settings, such as <code>bootstrap.servers</code>, twice:
once for the admin client and once for the consumer client.
It would be interesting if they could share options and maybe even the TCP connection.</p>
</li>
<li>
<p>The <code>AdminClient</code> class often returns <code>KafkaFuture<Something></code>;
its API design is very different from the <code>Consumer</code> and <code>Producer</code> clients.
I wonder why they created a <code>KafkaFuture</code> class instead of reusing <code>CompletableFuture</code>.</p>
</li>
</ol>
</div>
</div>
</div>
<h1>Home temperature monitoring</h1>
<p>Gérald Quintana, 2020-01-10, /2020/01/10/Home-temperature-monitoring</p>
<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>I had an unused Raspberry Pi 3 and some free time during holidays.
Here is what I built to monitor temperature and humidity.
It’s just a working prototype but it was both simple and fun to do.</p>
</div>
<div class="paragraph">
<p>I used Grafana and InfluxDB because both run on ARM, and thus on the Raspberry Pi,
and are very easy to set up.
Both are written in Go, so no specific runtime environment is needed.</p>
</div>
<div class="imageblock">
<div class="content">
<img src="/images/2020-01-10-Home-temperature-monitoring/grafana-dashboard.png" alt="Grafana Dashboard">
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="hardware">Hardware</h2>
<div class="sectionbody">
<div class="paragraph">
<p>It all started with the <a href="https://www.raspberryweather.com/">Raspberry Weather</a> web site,
which brought me to the DHT22 and DS18B20 sensors.</p>
</div>
<div class="paragraph">
<p>I bought the temperature module from <a href="https://www.az-delivery.com/products/dht22-temperatursensor-modul">AZ Delivery</a>.
It’s a DHT22 (AM2302) temperature and humidity sensor soldered on a tiny board with a small resistor.
As a result, everything you need is provided (even jumper wires): no breadboard, no extra resistor…
AZ Delivery also provides documentation about the product as a PDF <a href="https://www.az-delivery.com/products/dht-22-modul-kostenfreies-e-book">e-book</a>,
which explains how to plug this module into either an Arduino or a Raspberry Pi.</p>
</div>
<div class="imageblock">
<div class="content">
<img src="/images/2020-01-10-Home-temperature-monitoring/raspberrypi-dht22.jpg" alt="Raspberry Pi 3">
</div>
</div>
<div class="paragraph">
<p>I used the Raspberry Pi 3 I had, but a smaller or older board should be enough.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="software">Software</h2>
<div class="sectionbody">
<div class="sect2">
<h3 id="raspbian">Raspbian</h3>
<div class="paragraph">
<p>First I flashed a brand new Raspbian Buster on the micro SD card.
The image can be downloaded <a href="https://www.raspberrypi.org/downloads/raspbian/">on Raspberry Pi web site</a>.
With <a href="https://www.raspberrypi.org/documentation/configuration/raspi-config.md">raspi-config</a> I configured the network (WiFi),
and enabled SSH.</p>
</div>
</div>
<div class="sect2">
<h3 id="influxdb">InfluxDB</h3>
<div class="paragraph">
<p>Then I installed InfluxDB on Raspbian by adding the InfluxData Debian repository:</p>
</div>
<div class="listingblock">
<div class="title">/etc/apt/sources.list.d/influxdb.list</div>
<div class="content">
<pre class="CodeRay highlight"><code data-lang="ini">deb https://repos.influxdata.com/debian buster stable</code></pre>
</div>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="shell">curl -sL https://repos.influxdata.com/influxdb.key | sudo apt-key add -
sudo apt-get update
sudo apt-get install influxdb
sudo systemctl start influxdb</code></pre>
</div>
</div>
<div class="paragraph">
<p>To check that InfluxDB is running, you can curl it:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="shell">curl -s -v http://localhost:8086/ping
...
< HTTP/1.1 204 No Content
< Content-Type: application/json
< Request-Id: c28feb69-30bc-11ea-8628-b827eb3ca438
< X-Influxdb-Build: OSS
< X-Influxdb-Version: 1.7.9</code></pre>
</div>
</div>
</div>
<div class="sect2">
<h3 id="grafana">Grafana</h3>
<div class="paragraph">
<p>As far as Grafana is concerned, I downloaded the .deb file and installed it as described
on the <a href="https://grafana.com/grafana/download?platform=arm">Grafana web site</a>:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="shell">wget https://dl.grafana.com/oss/release/grafana-rpi_6.5.2_armhf.deb
sudo dpkg -i grafana-rpi_6.5.2_armhf.deb</code></pre>
</div>
</div>
<div class="paragraph">
<p>The Grafana service is neither enabled nor started by default.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="shell">sudo systemctl daemon-reload
sudo systemctl enable grafana-server
sudo systemctl start grafana-server</code></pre>
</div>
</div>
<div class="paragraph">
<p>To check that Grafana is running, you can open a browser on <a href="http://myraspberry.local:3000" class="bare">http://myraspberry.local:3000</a>,
and log in.
The default admin/admin user account can be changed in <code>/etc/grafana/grafana.ini</code> config file.</p>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="code">Code</h2>
<div class="sectionbody">
<div class="paragraph">
<p>The code reads temperature and humidity from the sensor every minute or so,
and then writes the result into the InfluxDB database.</p>
</div>
<div class="paragraph">
<p>I hesitated to write it in Go,
but I chose Python because I immediately found simple code examples.</p>
</div>
<div class="sect2">
<h3 id="python">Python</h3>
<div class="paragraph">
<p>To bootstrap my Python environment, I installed several packages:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="shell">sudo apt-get install build-essential python3-dev python3-openssl \
python3-setuptools python3-pip python3-wheel python3-yaml python3-influxdb</code></pre>
</div>
</div>
<div class="paragraph">
<p>To read from my DHT22 sensor, I used the <a href="https://github.com/adafruit/Adafruit_Python_DHT">Adafruit_Python_DHT</a> package even if it’s deprecated.
I’ll try to use the newer <a href="https://github.com/adafruit/Adafruit_CircuitPython_DHT">Adafruit_CircuitPython_DHT</a> package later.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="shell">sudo pip3 install Adafruit_DHT</code></pre>
</div>
</div>
<div class="paragraph">
<p>The code I wrote is based on provided <a href="https://github.com/adafruit/Adafruit_Python_DHT/tree/master/examples">code samples</a>.</p>
</div>
<div class="paragraph">
<p>To write into the InfluxDB database I used the official Python client.
Its documentation can be found <a href="https://influxdb-python.readthedocs.io/en/latest/">here</a>.
The code I wrote is based on provided <a href="https://github.com/influxdata/influxdb-python/tree/master/examples">code samples</a>.</p>
</div>
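For illustration, here is a minimal sketch of how a measurement point could be built for influxdb-python's <code>write_points()</code> method. The measurement name, tag, and field names are my own choices for this sketch, not necessarily those used in the actual dht22.py:

```python
from datetime import datetime, timezone

def make_point(temperature, humidity, sensor="dht22"):
    # JSON body in the format expected by influxdb-python's write_points():
    # a measurement name, optional tags, a timestamp, and numeric fields.
    return {
        "measurement": "climate",          # hypothetical measurement name
        "tags": {"sensor": sensor},
        "time": datetime.now(timezone.utc).isoformat(),
        "fields": {
            "temperature": float(temperature),
            "humidity": float(humidity),
        },
    }

point = make_point(21.5, 45.2)
# With client = InfluxDBClient(database="dht22"), the write would be:
# client.write_points([point])
```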
</div>
</div>
</div>
<div class="sect1">
<h2 id="sources">Sources</h2>
<div class="sectionbody">
<div class="paragraph">
<p>The InfluxDB database can be created with the <code>influx</code> CLI tool:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code>influx -execute 'create database dht22'</code></pre>
</div>
</div>
<div class="paragraph">
<p>The sources are <a href="https://github.com/gquintana/gquintana.github.io/tree/develop/sources/2020-01-10-Home-temperature-monitoring">here</a>;
there are three files:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><code>dht22.py</code>: The Python code to read DHT22 sensor and write the result into InfluxDB</p>
</li>
<li>
<p><code>dht22.yml</code>: The config file telling where the sensor is plugged and which database to use.</p>
</li>
<li>
<p><code>dht22.service</code>: The service unit to place in <code>/etc/systemd/system</code> to automatically start <code>dht22.py</code> when Raspberry Pi boots.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>I ran into two kinds of problems:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>There were holes in the graphs because the polling loop was stuck for more than a minute.
I don&#8217;t know why yet.</p>
</li>
<li>
<p>The sensor sometimes returned out-of-range values (3200% humidity, -10°C inside),
so I added measure validation to avoid weird graphs.</p>
</li>
</ol>
</div>
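Such a validation can be a simple range check. This is my own sketch, based on the DHT22's documented operating range (-40 to 80 °C, 0 to 100% RH), not the exact code from the repository:

```python
def is_valid(temperature, humidity):
    # Reject missing readings and values outside the DHT22 datasheet's
    # operating range (-40..80 degC, 0..100% relative humidity).
    return (temperature is not None and -40.0 <= temperature <= 80.0
            and humidity is not None and 0.0 <= humidity <= 100.0)

print(is_valid(21.5, 45.0))     # plausible reading
print(is_valid(-10.0, 3200.0))  # 3200% humidity is rejected
```

Readings that fail the check are simply skipped instead of being written to InfluxDB.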
</div>
</div>
<h1>Kafka connect plugin install</h1>
<p>Gérald Quintana, 2019-12-10, /2019/12/10/Kafka-connect-plugin-install</p>
<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>If you want to use a Kafka Connect plugin with stock Apache Kafka,
or you cannot use the confluent-hub tool because your server is behind
a firewall, then this blog post is for you.</p>
</div>
<div class="paragraph">
<p>I’ll show how to get Kafka Connect JDBC running without using <code>confluent-hub install</code>.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="download">Download</h2>
<div class="sectionbody">
<div class="paragraph">
<p>First, use the <a href="https://www.confluent.io/hub/">Confluent Hub</a> to find Kafka Connect plugins.
Once you&#8217;ve found the plugin you were looking for, you should check its license.
Most plugins created by Confluent Inc. use the <a href="https://www.confluent.io/confluent-community-license/">Confluent Community License</a>
and are mostly open source.</p>
</div>
<div class="paragraph">
<p>When you click on the Download button, you’ll have to provide an email to get the plugin zip file.</p>
</div>
<div class="paragraph">
<p>I’ll take the Kafka Connect JDBC plugin as an example.
Once you&#8217;ve shown your passport at the Confluent toll booth, you&#8217;ll get <code>confluentinc-kafka-connect-jdbc-5.3.1.zip</code>.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="install">Install</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Unzip the <code>confluentinc-kafka-connect-jdbc-5.3.1.zip</code> and you’ll get a <code>confluentinc-kafka-connect-jdbc-5.3.1</code> folder containing:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><code>lib</code> contains the binary JAR files</p>
</li>
<li>
<p><code>etc</code> contains sample configuration files</p>
</li>
<li>
<p><code>doc</code> contains some documentation and the license file</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>Then in your Kafka folder (<code>/opt/kafka_2.12-2.3.1</code>):</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="shell">$ cd /opt/kafka_2.12-2.3.1
$ mkdir plugins <i class="conum" data-value="1"></i><b>(1)</b>
$ cd plugins
$ ln -s /opt/confluentinc-kafka-connect-jdbc-5.3.1/lib jdbc <i class="conum" data-value="2"></i><b>(2)</b></code></pre>
</div>
</div>
<div class="colist arabic">
<table>
<tr>
<td><i class="conum" data-value="1"></i><b>1</b></td>
<td>Create a <code>plugins</code> folder to contain plugins</td>
</tr>
<tr>
<td><i class="conum" data-value="2"></i><b>2</b></td>
<td>Link the <code>lib</code> folder of the plugin in the <code>plugins</code> folder</td>
</tr>
</table>
</div>
</div>
</div>
<div class="sect1">
<h2 id="configure">Configure</h2>
<div class="sectionbody">
<div class="paragraph">
<p>In the Kafka Connect configuration file <code>connect-standalone.properties</code> (or <code>connect-distributed.properties</code>),
reference the <code>plugins</code> folder:</p>
</div>
<div class="listingblock">
<div class="title">connect-standalone.properties</div>
<div class="content">
<pre class="CodeRay highlight"><code data-lang="ini">bootstrap.servers=localhost:9092
plugin.path=/opt/kafka_2.12-2.3.1/plugins <i class="conum" data-value="1"></i><b>(1)</b></code></pre>
</div>
</div>
<div class="colist arabic">
<table>
<tr>
<td><i class="conum" data-value="1"></i><b>1</b></td>
<td>Path to the <code>plugins</code> folder</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>Finally use the sample config files in <code>confluentinc-kafka-connect-jdbc-5.3.1/etc</code> to create your own:</p>
</div>
<div class="listingblock">
<div class="title">thing-jdbc-sink.properties</div>
<div class="content">
<pre class="CodeRay highlight"><code data-lang="ini">name=thing-jdbc-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
topics=thing <i class="conum" data-value="1"></i><b>(1)</b>
connection.url=jdbc:sqlite:thing.db <i class="conum" data-value="2"></i><b>(2)</b>
auto.create=true</code></pre>
</div>
</div>
<div class="colist arabic">
<table>
<tr>
<td><i class="conum" data-value="1"></i><b>1</b></td>
<td>Input topic containing Avro values. This means you’ll need the Avro Converter plugin and the Confluent Schema Registry as well.</td>
</tr>
<tr>
<td><i class="conum" data-value="2"></i><b>2</b></td>
<td>Output database</td>
</tr>
</table>
</div>
</div>
</div>
<div class="sect1">
<h2 id="run">Run</h2>
<div class="sectionbody">
<div class="paragraph">
<p>To run Kafka Connect in standalone mode, launch it with the above config files:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="shell">$ bin/connect-standalone.sh config/connect-standalone.properties config/thing-jdbc-sink.properties</code></pre>
</div>
</div>
<div class="paragraph">
<p>To check that the connector is properly running, you can query the Kafka Connect REST API with curl:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="shell">$ curl -s http://127.0.1.1:8083/connectors/thing-jdbc-sink/status | jq '.'
{
  "name": "thing-jdbc-sink",
  "connector": {
    "state": "RUNNING",
    "worker_id": "127.0.1.1:8083"
  },
  "tasks": [
    {
      "id": 0,
      "state": "RUNNING",
      "worker_id": "127.0.1.1:8083"
    }
  ],
  "type": "sink"
}</code></pre>
</div>
</div>
</div>
</div>Gérald QuintanaKafka integration tests2019-07-03T00:00:00+00:002019-07-03T00:00:00+00:00/2019/07/03/Kafka-integration-tests<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>You’re developing a Java application plugged into Kafka,
or maybe you’re programming a data processing pipeline based on Kafka Streams. How do you automate tests involving both Java code and Kafka brokers?</p>
</div>
<div class="paragraph">
<p>Such an integration test should be able to:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>Start Zookeeper and then Kafka</p>
</li>
<li>
<p>Send messages into Kafka so as to trigger business code</p>
</li>
<li>
<p>Consume messages from Kafka and check their content</p>
</li>
<li>
<p>Stop Kafka and then Zookeeper</p>
</li>
</ol>
</div>
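<div class="paragraph">
<p>All the test variants below call a <code>sendAndConsume</code> helper whose code is not shown in this post. A minimal sketch with the plain kafka-clients API could look like this (serializers, group id and timeout are my choices, not taken from the original tests):</p>
</div>

```java
import java.time.Duration;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class MessageServiceITSupport {
    static void sendAndConsume(String bootstrapServers, String topic) throws Exception {
        // Produce a message and wait for the broker acknowledgement
        Map<String, Object> producerProps = new HashMap<>();
        producerProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        producerProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        producerProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        try (Producer<String, String> producer = new KafkaProducer<>(producerProps)) {
            producer.send(new ProducerRecord<>(topic, "hello")).get();
        }
        // Consume it back and check something arrived
        Map<String, Object> consumerProps = new HashMap<>();
        consumerProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "it_group");
        consumerProps.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        consumerProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        consumerProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        try (Consumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
            consumer.subscribe(Collections.singleton(topic));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(10));
            if (records.isEmpty()) {
                throw new AssertionError("No message received on topic " + topic);
            }
        }
    }
}
```

<div class="paragraph">
<p>This sketch needs a running broker, so it cannot run on its own; each solution below provides the <code>bootstrapServers</code> address.</p>
</div>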
</div>
</div>
<div class="sect1">
<h2 id="kafka_embedded_in_the_test">Kafka embedded in the test</h2>
<div class="sectionbody">
<div class="paragraph">
<p>As both Kafka and Zookeeper are Java applications, it is possible to control them from Java code (have a look at <a href="https://github.com/apache/camel/blob/master/components/camel-kafka/src/test/java/org/apache/camel/component/kafka/embedded/EmbeddedKafkaBroker.java">camel-kafka</a> or <a href="https://github.com/danielwegener/logback-kafka-appender/blob/master/src/test/java/com/github/danielwegener/logback/kafka/util/EmbeddedKafkaCluster.java">logback-kafka-appender</a>), but it is not easy.</p>
</div>
<div class="paragraph">
<p>There are many libraries to run an embedded Kafka from JUnit without sweating:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><a href="https://github.com/charithe/kafka-junit">Kafka JUnit</a> by Charith Ellawala</p>
</li>
<li>
<p>Another <a href="https://mguenther.github.io/kafka-junit/">Kafka JUnit</a> by Markus Günther</p>
</li>
<li>
<p><a href="https://docs.spring.io/spring-kafka/docs/2.2.6.RELEASE/reference/html/#testing">Spring Kafka Test</a> by the Spring team</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>The drawback of this solution is that the Kafka and Zookeeper servers are started in the same JVM as your test,
so they share its memory and classpath, and unexpected interactions can occur.</p>
</div>
<div class="sect2">
<h3 id="kafka_junit">Kafka JUnit</h3>
<div class="paragraph">
<p>Charith’s Kafka JUnit library is one of the simplest and most efficient.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="xml"> <span class="tag"><dependency></span>
<span class="tag"><groupId></span>com.github.charithe<span class="tag"></groupId></span>
<span class="tag"><artifactId></span>kafka-junit<span class="tag"></artifactId></span>
<span class="tag"><version></span>4.1.5<span class="tag"></version></span>
<span class="tag"></dependency></span></code></pre>
</div>
</div>
<div class="paragraph">
<p>This library supports both JUnit 4 & 5.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="java"><span class="annotation">@ExtendWith</span>(KafkaJunitExtension.class) <i class="conum" data-value="1"></i><b>(1)</b>
<span class="annotation">@KafkaJunitExtensionConfig</span>(startupMode = StartupMode.WAIT_FOR_STARTUP)
<span class="directive">public</span> <span class="type">class</span> <span class="class">CharitheMessageServiceIT</span> {
<span class="directive">private</span> <span class="directive">static</span> <span class="directive">final</span> <span class="predefined-type">String</span> TOPIC = <span class="string"><span class="delimiter">"</span><span class="content">kafka_junit</span><span class="delimiter">"</span></span>;
<span class="annotation">@Test</span>
<span class="type">void</span> testSendAndConsume(KafkaHelper kafkaHelper) <span class="directive">throws</span> <span class="exception">Exception</span> { <i class="conum" data-value="2"></i><b>(2)</b>
<span class="predefined-type">String</span> bootstrapServers = kafkaHelper.producerConfig().get(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG).toString();
sendAndConsume(bootstrapServers, TOPIC);
}</code></pre>
</div>
</div>
<div class="colist arabic">
<table>
<tr>
<td><i class="conum" data-value="1"></i><b>1</b></td>
<td>Load JUnit 5 extension that will start an embedded Kafka</td>
</tr>
<tr>
<td><i class="conum" data-value="2"></i><b>2</b></td>
<td>A <code>kafkaHelper</code> is injected to get embedded Kafka address</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>This <code>KafkaHelper</code> contains several methods to easily produce and consume messages:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="java"> ListenableFuture<<span class="predefined-type">List</span><<span class="predefined-type">String</span>>> futureMessages = kafkaHelper.consumeStrings(TOPIC, <span class="integer">3</span>); <i class="conum" data-value="1"></i><b>(1)</b>
kafkaHelper.produceStrings(TOPIC, <span class="string"><span class="delimiter">"</span><span class="content">one</span><span class="delimiter">"</span></span>, <span class="string"><span class="delimiter">"</span><span class="content">two</span><span class="delimiter">"</span></span>, <span class="string"><span class="delimiter">"</span><span class="content">three</span><span class="delimiter">"</span></span>); <i class="conum" data-value="2"></i><b>(2)</b>
<span class="predefined-type">List</span><<span class="predefined-type">String</span>> messages = futureMessages.get(<span class="integer">5</span>, <span class="predefined-type">TimeUnit</span>.SECONDS);
assertThat(messages).contains(<span class="string"><span class="delimiter">"</span><span class="content">one</span><span class="delimiter">"</span></span>, <span class="string"><span class="delimiter">"</span><span class="content">two</span><span class="delimiter">"</span></span>, <span class="string"><span class="delimiter">"</span><span class="content">three</span><span class="delimiter">"</span></span>);</code></pre>
</div>
</div>
<div class="colist arabic">
<table>
<tr>
<td><i class="conum" data-value="1"></i><b>1</b></td>
<td>Start a non-blocking consumer</td>
</tr>
<tr>
<td><i class="conum" data-value="2"></i><b>2</b></td>
<td>Produce some messages in a Topic</td>
</tr>
</table>
</div>
</div>
<div class="sect2">
<h3 id="spring_kafka_test">Spring Kafka Test</h3>
<div class="paragraph">
<p>Spring Kafka Test is an addition to Spring Kafka library.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="xml"> <span class="tag"><dependency></span>
<span class="tag"><groupId></span>org.springframework.kafka<span class="tag"></groupId></span>
<span class="tag"><artifactId></span>spring-kafka-test<span class="tag"></artifactId></span>
<span class="tag"><version></span>${spring-kafka.version}<span class="tag"></version></span>
<span class="tag"><scope></span>test<span class="tag"></scope></span>
<span class="tag"></dependency></span></code></pre>
</div>
</div>
<div class="paragraph">
<p>This library only supports JUnit 4 at the moment;
as a result, it provides a JUnit rule to handle the embedded Kafka lifecycle.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="java"><span class="directive">public</span> <span class="type">class</span> <span class="class">SpringMessageServiceIT</span> {
<span class="directive">private</span> <span class="directive">static</span> <span class="directive">final</span> <span class="predefined-type">String</span> TOPIC = <span class="string"><span class="delimiter">"</span><span class="content">spring</span><span class="delimiter">"</span></span>;
<span class="annotation">@ClassRule</span> <i class="conum" data-value="1"></i><b>(1)</b>
<span class="directive">public</span> <span class="directive">static</span> EmbeddedKafkaRule kafka = <span class="keyword">new</span> EmbeddedKafkaRule(<span class="integer">1</span>,
<span class="predefined-constant">false</span>, TOPIC);
<span class="annotation">@Test</span>
<span class="directive">public</span> <span class="type">void</span> testSendAndConsume() <span class="directive">throws</span> <span class="exception">Exception</span> {
sendAndConsume(kafka.getEmbeddedKafka().getBrokersAsString(), TOPIC); <i class="conum" data-value="2"></i><b>(2)</b>
}</code></pre>
</div>
</div>
<div class="colist arabic">
<table>
<tr>
<td><i class="conum" data-value="1"></i><b>1</b></td>
<td>JUnit 4 Rule that will start an embedded Kafka and create a topic.</td>
</tr>
<tr>
<td><i class="conum" data-value="2"></i><b>2</b></td>
<td>The <code>kafka</code> rule is used to get the embedded Kafka address.</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>Spring Kafka Test contains a <code>KafkaTestUtils</code> class which is a swiss army knife to write Kafka related tests.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="java"> <span class="keyword">try</span>(Consumer<<span class="predefined-type">Integer</span>, <span class="predefined-type">String</span>> consumer = <span class="keyword">new</span> KafkaConsumer<<span class="predefined-type">Integer</span>, <span class="predefined-type">String</span>>( <i class="conum" data-value="1"></i><b>(1)</b>
KafkaTestUtils.consumerProps(<span class="string"><span class="delimiter">"</span><span class="content">spring_group</span><span class="delimiter">"</span></span>, <span class="string"><span class="delimiter">"</span><span class="content">true</span><span class="delimiter">"</span></span>, kafka.getEmbeddedKafka()))) {
KafkaTemplate<<span class="predefined-type">Integer</span>, <span class="predefined-type">String</span>> template = <span class="keyword">new</span> KafkaTemplate<>( <i class="conum" data-value="2"></i><b>(2)</b>
<span class="keyword">new</span> DefaultKafkaProducerFactory<>(
KafkaTestUtils.producerProps(kafka.getEmbeddedKafka())));
consumer.subscribe(<span class="predefined-type">Collections</span>.singleton(TOPIC));
template.send(TOPIC, <span class="string"><span class="delimiter">"</span><span class="content">one</span><span class="delimiter">"</span></span>);
template.send(TOPIC, <span class="string"><span class="delimiter">"</span><span class="content">two</span><span class="delimiter">"</span></span>);
ConsumerRecords<<span class="predefined-type">Integer</span>, <span class="predefined-type">String</span>> records = KafkaTestUtils.getRecords(consumer); <i class="conum" data-value="3"></i><b>(3)</b>
assertThat(records).are(value(<span class="string"><span class="delimiter">"</span><span class="content">one</span><span class="delimiter">"</span></span>)); <i class="conum" data-value="4"></i><b>(4)</b></code></pre>
</div>
</div>
<div class="colist arabic">
<table>
<tr>
<td><i class="conum" data-value="1"></i><b>1</b></td>
<td>Use <code>KafkaTestUtils</code> to create a consumer.</td>
</tr>
<tr>
<td><i class="conum" data-value="2"></i><b>2</b></td>
<td>Use <code>KafkaTestUtils</code> along with the usual <code>KafkaTemplate</code> to quickly send messages.</td>
</tr>
<tr>
<td><i class="conum" data-value="3"></i><b>3</b></td>
<td>Use <code>KafkaTestUtils</code> to quickly consume messages.</td>
</tr>
<tr>
<td><i class="conum" data-value="4"></i><b>4</b></td>
<td><code>KafkaConditions</code> integrates with AssertJ to make assertions on received messages simpler.</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>Spring Kafka Test is probably the way to go when you’re developing a Spring application.
However, this library lacks some syntactic sugar to make tests more readable.</p>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="kafka_in_docker">Kafka in docker</h2>
<div class="sectionbody">
<div class="paragraph">
<p><a href="https://www.testcontainers.org/">Testcontainers</a>’ purpose is to start Docker containers from JUnit in order to run integration tests against any product: MySQL, Elasticsearch, Kafka…​ There is a base module, a Kafka extension and a JUnit 5 extension.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="xml"> <span class="tag"><dependency></span>
<span class="tag"><groupId></span>org.testcontainers<span class="tag"></groupId></span>
<span class="tag"><artifactId></span>testcontainers<span class="tag"></artifactId></span>
<span class="tag"><version></span>${testcontainers.version}<span class="tag"></version></span>
<span class="tag"><scope></span>test<span class="tag"></scope></span>
<span class="tag"></dependency></span>
<span class="tag"><dependency></span>
<span class="tag"><groupId></span>org.testcontainers<span class="tag"></groupId></span>
<span class="tag"><artifactId></span>kafka<span class="tag"></artifactId></span>
<span class="tag"><version></span>${testcontainers.version}<span class="tag"></version></span>
<span class="tag"><scope></span>test<span class="tag"></scope></span>
<span class="tag"></dependency></span>
<span class="tag"><dependency></span>
<span class="tag"><groupId></span>org.testcontainers<span class="tag"></groupId></span>
<span class="tag"><artifactId></span>junit-jupiter<span class="tag"></artifactId></span>
<span class="tag"><version></span>${testcontainers.version}<span class="tag"></version></span>
<span class="tag"><scope></span>test<span class="tag"></scope></span>
<span class="tag"></dependency></span></code></pre>
</div>
</div>
<div class="paragraph">
<p>The Testcontainers library is strongly integrated with JUnit 5: a single annotation and you’re done. A JUnit 4 rule is also available.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="java"><span class="annotation">@Testcontainers</span> <i class="conum" data-value="1"></i><b>(1)</b>
<span class="directive">public</span> <span class="type">class</span> <span class="class">ContainersMessageServiceIT</span> {
<span class="directive">private</span> <span class="directive">static</span> <span class="directive">final</span> <span class="predefined-type">String</span> TOPIC = <span class="string"><span class="delimiter">"</span><span class="content">containers</span><span class="delimiter">"</span></span>;
<span class="annotation">@Container</span> <i class="conum" data-value="2"></i><b>(2)</b>
<span class="directive">public</span> KafkaContainer kafka = <span class="keyword">new</span> KafkaContainer(<span class="string"><span class="delimiter">"</span><span class="content">5.2.1</span><span class="delimiter">"</span></span>);
<span class="annotation">@Test</span>
<span class="directive">public</span> <span class="type">void</span> testSendAndConsume() <span class="directive">throws</span> <span class="exception">Exception</span> {
sendAndConsume(kafka.getBootstrapServers(), TOPIC);
}</code></pre>
</div>
</div>
<div class="colist arabic">
<table>
<tr>
<td><i class="conum" data-value="1"></i><b>1</b></td>
<td>Trigger Testcontainers start</td>
</tr>
<tr>
<td><i class="conum" data-value="2"></i><b>2</b></td>
<td>Create a Kafka container.
By default the <a href="https://hub.docker.com/r/confluentinc/cp-kafka">cp-kafka Docker image</a> created by Confluent is used.
As a consequence the version number matches the Confluent Platform version, not Apache Kafka.</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>As Testcontainers is a generic library to run containers,
there is no helper class to read/write messages.
Starting a Docker container is slower than starting an embedded Kafka,
but process isolation is stronger.
You are starting the real thing, not a hacked Kafka broker, so you are closer to production.
Note that Testcontainers Kafka actually uses 3 Docker containers.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="kafka_consumer_subscriptions">Kafka Consumer subscriptions</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Dealing with asynchronous code in tests is often painful, and Kafka consumers don’t help.</p>
</div>
<div class="paragraph">
<p>It can take a while for the consumer group coordinator to be located and partitions to be assigned:
between the consumer bootstrap and the first message being received,
a second or more can elapse.</p>
</div>
<div class="paragraph">
<p>Using a <code>ConsumerRebalanceListener</code> to wait for partitions to be assigned and check which ones are assigned can be useful.</p>
</div>
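<div class="paragraph">
<p>For instance, a listener raising a flag once partitions are assigned (a sketch with the kafka-clients API; class and method names are mine):</p>
</div>

```java
import java.time.Duration;
import java.util.Collection;
import java.util.Collections;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.common.TopicPartition;

public class AssignmentWaiter implements ConsumerRebalanceListener {
    private volatile boolean assigned = false;

    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        // Nothing to do before the rebalance
    }

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        assigned = true; // The consumer is now ready to receive messages
    }

    public void subscribeAndWait(Consumer<?, ?> consumer, String topic) {
        consumer.subscribe(Collections.singleton(topic), this);
        // Rebalance callbacks are invoked from within poll()
        while (!assigned) {
            consumer.poll(Duration.ofMillis(100));
        }
    }
}
```

<div class="paragraph">
<p>Once <code>subscribeAndWait</code> returns, messages produced by the test will actually be received.</p>
</div>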
<div class="paragraph">
<p>The <a href="http://www.awaitility.org/">Awaitility</a> library can alleviate the burden of asynchronous testing.</p>
</div>
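<div class="paragraph">
<p>Awaitility basically polls a condition until it becomes true or a timeout expires. If you don’t want the extra dependency, a minimal equivalent is easy to write (a sketch, not the actual Awaitility API):</p>
</div>

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

public class Awaiter {
    /** Poll the condition every 100 ms until it is true, or fail after timeoutMillis. */
    public static void awaitUntil(BooleanSupplier condition, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() > deadline) {
                throw new AssertionError("Condition not met within " + timeoutMillis + " ms");
            }
            try {
                TimeUnit.MILLISECONDS.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new AssertionError("Interrupted while waiting", e);
            }
        }
    }
}
```

<div class="paragraph">
<p>In a Kafka test, the condition is typically the number of received messages, e.g. <code>awaitUntil(() -> messages.size() >= 3, 5000)</code>.</p>
</div>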
</div>
</div>Gérald QuintanaLogging configuration2019-05-17T00:00:00+00:002019-05-17T00:00:00+00:00/2019/05/17/Logging-configuration<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>There is nothing fundamentally new in this article.
It’s just a quick reminder (to myself) about Java logging framework configuration.</p>
</div>
<div class="paragraph">
<p>The configuration files will contain:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Console aka stdout output</p>
</li>
<li>
<p>Rolling file output with both date and size rolling policies.
Rotating the file every day makes it easy to find yesterday’s failure cause.
Rotating when the file reaches a given size protects against disk flooding.</p>
</li>
<li>
<p>Sample log patterns to help format log files.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>Logging libraries support multiple configuration file formats: XML, Properties, YAML…​ I chose XML; it’s a matter of taste.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="logback">Logback</h2>
<div class="sectionbody">
<div class="listingblock">
<div class="title">pom.xml</div>
<div class="content">
<pre class="CodeRay highlight"><code data-lang="xml"> <span class="tag"><dependency></span>
<span class="tag"><groupId></span>ch.qos.logback<span class="tag"></groupId></span>
<span class="tag"><artifactId></span>logback-classic<span class="tag"></artifactId></span>
<span class="tag"><scope></span>runtime<span class="tag"></scope></span>
<span class="tag"></dependency></span></code></pre>
</div>
</div>
<div class="paragraph">
<p>Logback implements SLF4J from the ground up.</p>
</div>
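<div class="paragraph">
<p>As a reminder, application code only depends on the SLF4J API; the configuration below formats what calls like these produce (a minimal example, class and method names are mine):</p>
</div>

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class BlogService {
    private static final Logger LOGGER = LoggerFactory.getLogger(BlogService.class);

    public void doWork(String name) {
        // Parameterized message: the string is only built if DEBUG is enabled
        LOGGER.debug("Doing work for {}", name);
    }
}
```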
<div class="listingblock">
<div class="title">logback.xml</div>
<div class="content">
<pre class="CodeRay highlight"><code data-lang="xml"><span class="preprocessor"><?xml version="1.0" encoding="UTF-8"?></span>
<span class="tag"><configuration</span> <span class="attribute-name">debug</span>=<span class="string"><span class="delimiter">"</span><span class="content">true</span><span class="delimiter">"</span></span><span class="tag">></span><i class="conum" data-value="1"></i><b>(1)</b>
<span class="comment"><!-- Properties --></span>
<span class="tag"><property</span> <span class="attribute-name">name</span>=<span class="string"><span class="delimiter">"</span><span class="content">log.dir</span><span class="delimiter">"</span></span> <span class="attribute-name">value</span>=<span class="string"><span class="delimiter">"</span><span class="content">target/log</span><span class="delimiter">"</span></span> <span class="tag">/></span><i class="conum" data-value="2"></i><b>(2)</b>
<span class="comment"><!-- Appenders --></span>
<span class="tag"><appender</span> <span class="attribute-name">name</span>=<span class="string"><span class="delimiter">"</span><span class="content">CONSOLE</span><span class="delimiter">"</span></span> <span class="attribute-name">class</span>=<span class="string"><span class="delimiter">"</span><span class="content">ch.qos.logback.core.ConsoleAppender</span><span class="delimiter">"</span></span><span class="tag">></span>
<span class="tag"><encoder></span><i class="conum" data-value="3"></i><b>(3)</b>
<span class="tag"><pattern></span>%date{HH:mm:ss.SSS} %-5level [%thread] %logger{1} - %msg%n<span class="tag"></pattern></span>
<span class="tag"></encoder></span>
<span class="tag"></appender></span>
<span class="tag"><appender</span> <span class="attribute-name">name</span>=<span class="string"><span class="delimiter">"</span><span class="content">FILE</span><span class="delimiter">"</span></span> <span class="attribute-name">class</span>=<span class="string"><span class="delimiter">"</span><span class="content">ch.qos.logback.core.rolling.RollingFileAppender</span><span class="delimiter">"</span></span><span class="tag">></span>
<span class="tag"><file></span>blog.log<span class="tag"></file></span>
<span class="tag"><rollingPolicy</span> <span class="attribute-name">class</span>=<span class="string"><span class="delimiter">"</span><span class="content">ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy</span><span class="delimiter">"</span></span><span class="tag">></span>
<span class="tag"><fileNamePattern></span>${log.dir}/blog.%d{yyyy-MM-dd}-%i.log<span class="tag"></fileNamePattern></span>
<span class="tag"><maxFileSize></span>10MB<span class="tag"></maxFileSize></span>
<span class="tag"><maxHistory></span>10<span class="tag"></maxHistory></span>
<span class="tag"><totalSizeCap></span>100MB<span class="tag"></totalSizeCap></span>
<span class="tag"></rollingPolicy></span>
<span class="tag"><encoder></span><i class="conum" data-value="4"></i><b>(4)</b>
<span class="tag"><pattern></span>%date{ISO8601} %-5level [%thread] %logger - %msg%n<span class="tag"></pattern></span>
<span class="tag"></encoder></span>
<span class="tag"></appender></span>
<span class="comment"><!-- Loggers --></span>
<span class="tag"><logger</span> <span class="attribute-name">name</span>=<span class="string"><span class="delimiter">"</span><span class="content">com.github.gquintana.logging</span><span class="delimiter">"</span></span> <span class="attribute-name">level</span>=<span class="string"><span class="delimiter">"</span><span class="content">DEBUG</span><span class="delimiter">"</span></span><span class="tag">/></span>
<span class="tag"><root</span> <span class="attribute-name">level</span>=<span class="string"><span class="delimiter">"</span><span class="content">DEBUG</span><span class="delimiter">"</span></span><span class="tag">></span>
<span class="tag"><appender-ref</span> <span class="attribute-name">ref</span>=<span class="string"><span class="delimiter">"</span><span class="content">CONSOLE</span><span class="delimiter">"</span></span><span class="tag">/></span>
<span class="tag"><appender-ref</span> <span class="attribute-name">ref</span>=<span class="string"><span class="delimiter">"</span><span class="content">FILE</span><span class="delimiter">"</span></span><span class="tag">/></span>
<span class="tag"></root></span>
<span class="tag"></configuration></span></code></pre>
</div>
</div>
<div class="colist arabic">
<table>
<tr>
<td><i class="conum" data-value="1"></i><b>1</b></td>
<td>The <code>debug</code> flag enables Logback startup logs.</td>
</tr>
<tr>
<td><i class="conum" data-value="2"></i><b>2</b></td>
<td>The <code>log.dir</code> property can be overridden at the JVM (<code>-Dlog.dir=…​</code>) or OS level.</td>
</tr>
<tr>
<td><i class="conum" data-value="3"></i><b>3</b></td>
<td>The <code>pattern</code> is documented in the <a href="https://logback.qos.ch/manual/layouts.html#ClassicPatternLayout">layout</a> section.</td>
</tr>
<tr>
<td><i class="conum" data-value="4"></i><b>4</b></td>
<td>Format the <code>date</code> in ISO8601.</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>You can force Logback to use a specific configuration file using a JVM property <code>-Dlogback.configurationFile=/path/to/config.xml</code>.</p>
</div>
<div class="paragraph">
<p>The <a href="https://logback.qos.ch/manual/index.html">Logback Manual</a> contains detailed information.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="log4j2">Log4J2</h2>
<div class="sectionbody">
<div class="listingblock">
<div class="title">pom.xml</div>
<div class="content">
<pre class="CodeRay highlight"><code data-lang="xml"> <span class="tag"><dependency></span>
<span class="tag"><groupId></span>org.apache.logging.log4j<span class="tag"></groupId></span>
<span class="tag"><artifactId></span>log4j-slf4j-impl<span class="tag"></artifactId></span>
<span class="tag"><scope></span>runtime<span class="tag"></scope></span>
<span class="tag"></dependency></span>
<span class="tag"><dependency></span>
<span class="tag"><groupId></span>org.apache.logging.log4j<span class="tag"></groupId></span>
<span class="tag"><artifactId></span>log4j-core<span class="tag"></artifactId></span>
<span class="tag"><scope></span>runtime<span class="tag"></scope></span>
<span class="tag"></dependency></span></code></pre>
</div>
</div>
<div class="paragraph">
<p>The Log4J2-SLF4J adapter is in the Log4J2 group.</p>
</div>
<div class="listingblock">
<div class="title">log4j2.xml</div>
<div class="content">
<pre class="CodeRay highlight"><code data-lang="xml"><span class="preprocessor"><?xml version="1.0" encoding="UTF-8"?></span>
<span class="tag"><Configuration</span> <span class="attribute-name">status</span>=<span class="string"><span class="delimiter">"</span><span class="content">trace</span><span class="delimiter">"</span></span><span class="tag">></span><i class="conum" data-value="1"></i><b>(1)</b>
<span class="comment"><!-- Properties --></span>
<span class="tag"><Properties></span><i class="conum" data-value="2"></i><b>(2)</b>
<span class="tag"><Property</span> <span class="attribute-name">name</span>=<span class="string"><span class="delimiter">"</span><span class="content">logDir</span><span class="delimiter">"</span></span><span class="tag">></span>${sys:log.dir:-target/log}<span class="tag"></Property></span>
<span class="tag"></Properties></span>
<span class="comment"><!-- Appenders --></span>
<span class="tag"><Appenders></span>
<span class="tag"><Console</span> <span class="attribute-name">name</span>=<span class="string"><span class="delimiter">"</span><span class="content">CONSOLE</span><span class="delimiter">"</span></span><span class="tag">></span><i class="conum" data-value="3"></i><b>(3)</b>
<span class="tag"><PatternLayout</span> <span class="attribute-name">pattern</span>=<span class="string"><span class="delimiter">"</span><span class="content">%date{HH:mm:ss.SSS} %-5level [%thread] %logger{1} - %msg%n</span><span class="delimiter">"</span></span><span class="tag">/></span>
<span class="tag"></Console></span>
<span class="tag"><RollingFile</span> <span class="attribute-name">name</span>=<span class="string"><span class="delimiter">"</span><span class="content">FILE</span><span class="delimiter">"</span></span>
<span class="attribute-name">fileName</span>=<span class="string"><span class="delimiter">"</span><span class="content">${logDir}/blog.log</span><span class="delimiter">"</span></span>
<span class="attribute-name">filePattern</span>=<span class="string"><span class="delimiter">"</span><span class="content">${logDir}/blog.%d{yyyy-MM-dd}-%i.log.gz</span><span class="delimiter">"</span></span><span class="tag">></span>
<span class="tag"><PatternLayout></span><i class="conum" data-value="4"></i><b>(4)</b>
<span class="tag"><Pattern></span>%d{ISO8601} %-5level [%thread] %logger %m%n<span class="tag"></Pattern></span>
<span class="tag"></PatternLayout></span>
<span class="tag"><Policies></span>
<span class="tag"><TimeBasedTriggeringPolicy</span><span class="tag">/></span>
<span class="tag"><SizeBasedTriggeringPolicy</span> <span class="attribute-name">size</span>=<span class="string"><span class="delimiter">"</span><span class="content">1m</span><span class="delimiter">"</span></span> <span class="tag">/></span>
<span class="tag"></Policies></span>
<span class="tag"><Strategies></span>
<span class="tag"><DefaultRolloverStrategy</span> <span class="attribute-name">max</span>=<span class="string"><span class="delimiter">"</span><span class="content">10</span><span class="delimiter">"</span></span><span class="tag">/></span>
<span class="tag"></Strategies></span>
<span class="tag"></RollingFile></span>
<span class="tag"></Appenders></span>
<span class="comment"><!-- Loggers --></span>
<span class="tag"><Loggers></span>
<span class="tag"><Logger</span> <span class="attribute-name">name</span>=<span class="string"><span class="delimiter">"</span><span class="content">com.github.gquintana.logging</span><span class="delimiter">"</span></span> <span class="attribute-name">level</span>=<span class="string"><span class="delimiter">"</span><span class="content">debug</span><span class="delimiter">"</span></span><span class="tag">/></span>
<span class="tag"><Root</span> <span class="attribute-name">level</span>=<span class="string"><span class="delimiter">"</span><span class="content">info</span><span class="delimiter">"</span></span><span class="tag">></span>
<span class="tag"><AppenderRef</span> <span class="attribute-name">ref</span>=<span class="string"><span class="delimiter">"</span><span class="content">CONSOLE</span><span class="delimiter">"</span></span><span class="tag">/></span>
<span class="tag"><AppenderRef</span> <span class="attribute-name">ref</span>=<span class="string"><span class="delimiter">"</span><span class="content">FILE</span><span class="delimiter">"</span></span><span class="tag">/></span>
<span class="tag"></Root></span>
<span class="tag"></Loggers></span>
<span class="tag"></Configuration></span></code></pre>
</div>
</div>
<div class="colist arabic">
<table>
<tr>
<td><i class="conum" data-value="1"></i><b>1</b></td>
<td>Setting <code>status</code> to <code>trace</code> or <code>debug</code> shows Log4J2 internal logs.</td>
</tr>
<tr>
<td><i class="conum" data-value="2"></i><b>2</b></td>
<td>The <code>logDir</code> property is set from a JVM property (<code>-Dlog.dir=…​</code>) with a default value.
See <a href="https://logging.apache.org/log4j/2.x/manual/configuration.html#Property_Substitution">Property substitution</a> in documentation.</td>
</tr>
<tr>
<td><i class="conum" data-value="3"></i><b>3</b></td>
<td>The <code>PatternLayout</code> is documented in the <a href="https://logging.apache.org/log4j/2.x/manual/layouts.html#PatternLayout">layout</a> section.</td>
</tr>
<tr>
<td><i class="conum" data-value="4"></i><b>4</b></td>
<td>Format the <code>date</code> in ISO8601.</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>You can tell Log4J2 to load a specific configuration file using a JVM property <code>-Dlog4j.configurationFile=/path/to/config.xml</code>.</p>
</div>
<div class="paragraph">
<p>The <a href="http://logging.apache.org/log4j/2.x/manual/index.html">Log4J2 Manual</a> contains extensive documentation.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="log4j1">Log4J1</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Even though it’s deprecated, let’s end with the venerable Log4J v1 library.</p>
</div>
<div class="listingblock">
<div class="title">pom.xml</div>
<div class="content">
<pre class="CodeRay highlight"><code data-lang="xml"> <span class="tag"><dependency></span>
<span class="tag"><groupId></span>log4j<span class="tag"></groupId></span>
<span class="tag"><artifactId></span>log4j<span class="tag"></artifactId></span>
<span class="tag"><scope></span>runtime<span class="tag"></scope></span>
<span class="tag"></dependency></span>
<span class="tag"><dependency></span>
<span class="tag"><groupId></span>org.slf4j<span class="tag"></groupId></span>
<span class="tag"><artifactId></span>slf4j-log4j12<span class="tag"></artifactId></span>
<span class="tag"><scope></span>runtime<span class="tag"></scope></span>
<span class="tag"></dependency></span></code></pre>
</div>
</div>
<div class="paragraph">
<p>The Log4J1-SLF4J adapter is in the SLF4J group.</p>
</div>
<div class="listingblock">
<div class="title">log4j.xml</div>
<div class="content">
<pre class="CodeRay highlight"><code data-lang="xml"><span class="preprocessor"><?xml version="1.0" encoding="UTF-8" ?></span>
<span class="doctype"><!DOCTYPE log4j:configuration SYSTEM "log4j.dtd"></span>
<span class="tag"><log4j:configuration</span> <span class="attribute-name">xmlns:log4j</span>=<span class="string"><span class="delimiter">"</span><span class="content">http://jakarta.apache.org/log4j/</span><span class="delimiter">"</span></span> <span class="attribute-name">debug</span>=<span class="string"><span class="delimiter">"</span><span class="content">true</span><span class="delimiter">"</span></span><span class="tag">></span><i class="conum" data-value="1"></i><b>(1)</b>
<i class="conum" data-value="2"></i><b>(2)</b>
<span class="comment"><!-- Appenders --></span>
<span class="tag"><appender</span> <span class="attribute-name">name</span>=<span class="string"><span class="delimiter">"</span><span class="content">CONSOLE</span><span class="delimiter">"</span></span> <span class="attribute-name">class</span>=<span class="string"><span class="delimiter">"</span><span class="content">org.apache.log4j.ConsoleAppender</span><span class="delimiter">"</span></span><span class="tag">></span>
<span class="tag"><layout</span> <span class="attribute-name">class</span>=<span class="string"><span class="delimiter">"</span><span class="content">org.apache.log4j.PatternLayout</span><span class="delimiter">"</span></span><span class="tag">></span><i class="conum" data-value="3"></i><b>(3)</b>
<span class="tag"><param</span> <span class="attribute-name">name</span>=<span class="string"><span class="delimiter">"</span><span class="content">ConversionPattern</span><span class="delimiter">"</span></span> <span class="attribute-name">value</span>=<span class="string"><span class="delimiter">"</span><span class="content">%d{HH:mm:ss.SSS} %-5p %c{1} - %m%n</span><span class="delimiter">"</span></span><span class="tag">/></span>
<span class="tag"></layout></span>
<span class="tag"></appender></span>
<span class="tag"><appender</span> <span class="attribute-name">name</span>=<span class="string"><span class="delimiter">"</span><span class="content">FILE</span><span class="delimiter">"</span></span> <span class="attribute-name">class</span>=<span class="string"><span class="delimiter">"</span><span class="content">org.apache.log4j.RollingFileAppender</span><span class="delimiter">"</span></span><span class="tag">></span><i class="conum" data-value="4"></i><b>(4)</b>
<span class="tag"><param</span> <span class="attribute-name">name</span>=<span class="string"><span class="delimiter">"</span><span class="content">File</span><span class="delimiter">"</span></span> <span class="attribute-name">value</span>=<span class="string"><span class="delimiter">"</span><span class="content">${log.dir}/blog.log</span><span class="delimiter">"</span></span><span class="tag">/></span>
<span class="tag"><param</span> <span class="attribute-name">name</span>=<span class="string"><span class="delimiter">"</span><span class="content">MaxFileSize</span><span class="delimiter">"</span></span> <span class="attribute-name">value</span>=<span class="string"><span class="delimiter">"</span><span class="content">10MB</span><span class="delimiter">"</span></span><span class="tag">/></span>
<span class="tag"><param</span> <span class="attribute-name">name</span>=<span class="string"><span class="delimiter">"</span><span class="content">MaxBackupIndex</span><span class="delimiter">"</span></span> <span class="attribute-name">value</span>=<span class="string"><span class="delimiter">"</span><span class="content">10</span><span class="delimiter">"</span></span><span class="tag">/></span>
<span class="tag"><layout</span> <span class="attribute-name">class</span>=<span class="string"><span class="delimiter">"</span><span class="content">org.apache.log4j.PatternLayout</span><span class="delimiter">"</span></span><span class="tag">></span><i class="conum" data-value="5"></i><b>(5)</b>
<span class="tag"><param</span> <span class="attribute-name">name</span>=<span class="string"><span class="delimiter">"</span><span class="content">ConversionPattern</span><span class="delimiter">"</span></span> <span class="attribute-name">value</span>=<span class="string"><span class="delimiter">"</span><span class="content">%d{ISO8601} %-5p [%t] %c - %m%n</span><span class="delimiter">"</span></span><span class="tag">/></span>
<span class="tag"></layout></span>
<span class="tag"></appender></span>
<span class="comment"><!-- Loggers --></span>
<span class="tag"><root></span>
<span class="tag"><priority</span> <span class="attribute-name">value</span>=<span class="string"><span class="delimiter">"</span><span class="content">INFO</span><span class="delimiter">"</span></span><span class="tag">/></span>
<span class="tag"><appender-ref</span> <span class="attribute-name">ref</span>=<span class="string"><span class="delimiter">"</span><span class="content">CONSOLE</span><span class="delimiter">"</span></span><span class="tag">/></span>
<span class="tag"><appender-ref</span> <span class="attribute-name">ref</span>=<span class="string"><span class="delimiter">"</span><span class="content">FILE</span><span class="delimiter">"</span></span><span class="tag">/></span>
<span class="tag"></root></span>
<span class="tag"></log4j:configuration></span></code></pre>
</div>
</div>
<div class="colist arabic">
<table>
<tr>
<td><i class="conum" data-value="1"></i><b>1</b></td>
<td>Like Logback, the <code>debug</code> flag enables Log4J1 internal logs.</td>
</tr>
<tr>
<td><i class="conum" data-value="2"></i><b>2</b></td>
<td>There aren’t any properties in Log4J1.</td>
</tr>
<tr>
<td><i class="conum" data-value="3"></i><b>3</b></td>
<td>The <code>PatternLayout</code> is documented in the <a href="http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/PatternLayout.html">JavaDoc</a>.</td>
</tr>
<tr>
<td><i class="conum" data-value="4"></i><b>4</b></td>
<td>Log4J1’s core <code>RollingFileAppender</code> only rolls by size, and <code>DailyRollingFileAppender</code> only by time; the <code>log4j-extras</code> extension adds more rollover policies.
Even with this extension, one cannot mix size and time rollover.</td>
</tr>
<tr>
<td><i class="conum" data-value="5"></i><b>5</b></td>
<td>Format the <code>date</code> in ISO8601.</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>Don’t forget the <code>file:</code> prefix when using a specific configuration file with the JVM property <code>-Dlog4j.configuration=file:///path/to/config.xml</code>.</p>
</div>
<div class="paragraph">
<p>The <a href="https://logging.apache.org/log4j/1.2/manual.html">Log4J1 Manual</a> is only a quick introduction. Back in the day, there was even a <a href="https://www.amazon.com/Complete-Log4j-Manual-Ceki-Gulcu/dp/2970036908">book</a>.</p>
</div>
</div>
</div>Gérald QuintanaAnsible collection processing2019-04-25T00:00:00+00:002019-04-25T00:00:00+00:00/2019/04/25/Ansible-collection-processing<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>As a Java developer, I sometimes dream of using the Java 8+ Stream API in my Ansible playbooks to process list and dict variables.</p>
</div>
<div class="paragraph">
<p>In this article, I’ll show how you can process a list of users:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="yaml"><span class="key">users</span>:
- <span class="string"><span class="content">id: bouh</span></span>
<span class="key">name</span>: <span class="string"><span class="delimiter">"</span><span class="content">Mary</span><span class="delimiter">"</span></span>
<span class="key">admin</span>: <span class="string"><span class="content">True</span></span>
<span class="key">role</span>: <span class="string"><span class="content">child</span></span>
- <span class="string"><span class="content">id: sulli</span></span>
<span class="key">name</span>: <span class="string"><span class="delimiter">"</span><span class="content">James Sullivan</span><span class="delimiter">"</span></span>
<span class="key">admin</span>: <span class="string"><span class="content">False</span></span>
<span class="key">role</span>: <span class="string"><span class="content">monster</span></span>
- <span class="string"><span class="content">id: bob</span></span>
<span class="key">name</span>: <span class="string"><span class="delimiter">"</span><span class="content">Bob Wazowski</span><span class="delimiter">"</span></span>
<span class="key">admin</span>: <span class="string"><span class="content">False</span></span>
<span class="key">role</span>: <span class="string"><span class="content">assistant</span></span>
- <span class="string"><span class="content">id: celia</span></span>
<span class="key">name</span>: <span class="string"><span class="delimiter">"</span><span class="content">Celia Mae</span><span class="delimiter">"</span></span>
<span class="key">admin</span>: <span class="string"><span class="content">False</span></span>
<span class="key">role</span>: <span class="string"><span class="content">assistant</span></span></code></pre>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="jinja_filters">Jinja Filters</h2>
<div class="sectionbody">
<div class="paragraph">
<p>The main tools for transforming Ansible variables are Jinja filters. There are 2 libraries of filters available:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>The Jinja <a href="http://jinja.pocoo.org/docs/2.10/templates/#list-of-builtin-filters">builtin filters</a>.
This list can also be found in Jinja source code <a href="https://github.com/pallets/jinja/blob/master/jinja2/filters.py">filters.py</a>.</p>
</li>
<li>
<p>The <a href="https://docs.ansible.com/ansible/latest/user_guide/playbooks_filters.html#">Ansible filters</a>.
This list can also be found in the Ansible source code <a href="https://github.com/ansible/ansible/tree/devel/lib/ansible/plugins/filter">filter</a> directory.</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>Filters are similar to Unix or Angular pipes and can be chained.</p>
</div>
<div class="paragraph">
<p>Like in other data processing libraries, there are two kinds of operators:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><strong>Mappers</strong> take a stream of elements and produce a stream of elements: selectattr, rejectattr, map, list</p>
</li>
<li>
<p><strong>Reducers</strong> take a stream of elements and produce a single element: join, first, last, max, min</p>
</li>
</ul>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="yaml"><span class="key">admin_user_ids</span>: <span class="string"><span class="delimiter">|</span><span class="content">
{{ users
|selectattr('admin')
|map(attribute='id')
|join(',') }} </span></span><i class="conum" data-value="1"></i><b>(1)</b>
<span class="key">normal_user_count</span>: <span class="string"><span class="delimiter">|</span><span class="content">
{{ users
|rejectattr('admin')
|list |count }} </span></span><i class="conum" data-value="2"></i><b>(2)</b></code></pre>
</div>
</div>
<div class="colist arabic">
<table>
<tr>
<td><i class="conum" data-value="1"></i><b>1</b></td>
<td>Take the <code>id</code> attribute of <code>users</code> having <code>admin</code> set to true and join them.</td>
</tr>
<tr>
<td><i class="conum" data-value="2"></i><b>2</b></td>
<td>Take the <code>users</code> having <code>admin</code> set to false and count them. As the <code>rejectattr</code> filter returns an iterator but the <code>count</code> filter requires a list, I have to use the <code>list</code> filter to convert it.</td>
</tr>
</table>
</div>
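These filter pipelines are plain Jinja, so they can be prototyped outside Ansible with the <code>jinja2</code> Python package (a quick sketch, assuming <code>jinja2</code> is installed; the data mirrors the article’s users list):

```python
# Prototype the selectattr/map/join pipeline outside Ansible.
from jinja2 import Environment

users = [
    {"id": "bouh", "admin": True},
    {"id": "sulli", "admin": False},
    {"id": "bob", "admin": False},
]

template = Environment().from_string(
    "{{ users|selectattr('admin')|map(attribute='id')|join(',') }}"
)
print(template.render(users=users))  # bouh
```

This makes it easy to iterate on a filter chain without re-running a whole playbook.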
<div class="paragraph">
<p>The <code>selectattr</code>/<code>rejectattr</code> filters can take 3 arguments: the attribute, a boolean operator and an argument.
The operator can be chosen among Jinja’s <a href="http://jinja.pocoo.org/docs/2.10/templates/#list-of-builtin-tests">builtin tests</a>.
This list can also be found in the Jinja source code <a href="https://github.com/pallets/jinja/blob/master/jinja2/tests.py">tests.py</a>.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="yaml"><span class="key">assistant_user_ids</span>: <span class="string"><span class="delimiter">|</span><span class="content">
{{ users
|selectattr('role', 'equalto', 'assistant')
|map(attribute='id')
|join(',') }}</span></span></code></pre>
</div>
</div>
<div class="paragraph">
<p>With Ansible 2.7+, the <code>map</code> filter can take 3 arguments: the attribute, an operator and arguments.
The operator can be chosen among Jinja filters, and will be applied to each element of the list.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="yaml"><span class="key">user_first_names</span>: <span class="string"><span class="delimiter">|</span><span class="content">
{{ users
|map(attribute='name')
|map('regex_replace', '(\\w+)( .*)?', '\\g<1>')
|join(',') }} </span></span><i class="conum" data-value="1"></i><b>(1)</b></code></pre>
</div>
</div>
<div class="colist arabic">
<table>
<tr>
<td><i class="conum" data-value="1"></i><b>1</b></td>
<td>For each user, take its name, apply the replacement when the regular expression matches, then join the results.</td>
</tr>
</table>
</div>
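Ansible’s <code>regex_replace</code> filter is backed by Python regular expressions, so the first-name extraction above can be checked in a plain Python interpreter (a sketch using the article’s user names):

```python
import re

# Mimic the regex_replace filter above: keep the first word of each name.
names = ["Mary", "James Sullivan", "Bob Wazowski", "Celia Mae"]
first_names = [re.sub(r"(\w+)( .*)?", r"\g<1>", name) for name in names]
print(",".join(first_names))  # Mary,James,Bob,Celia
```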
</div>
</div>
<div class="sect1">
<h2 id="json_query">JSON Query</h2>
<div class="sectionbody">
<div class="paragraph">
<p>Another strategy is to use a JSON Path to walk down the YAML tree.
It’s a bit less verbose and a bit more powerful than the previous solution.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="yaml"><span class="key">jq_admin_user_ids</span>: <span class="string"><span class="delimiter">|</span><span class="content">
{{ users
|json_query("[?admin].id")
|join(',') }} </span></span><i class="conum" data-value="1"></i><b>(1)</b>
<span class="key">jq_assistant_user_ids</span>: <span class="string"><span class="delimiter">|</span><span class="content">
{{ users
|json_query("[?role == 'assistant'].id")
|join(',') }} </span></span><i class="conum" data-value="2"></i><b>(2)</b></code></pre>
</div>
</div>
<div class="colist arabic">
<table>
<tr>
<td><i class="conum" data-value="1"></i><b>1</b></td>
<td>Take the <code>id</code> attributes of <code>users</code> having <code>admin</code> set to true and then join them.</td>
</tr>
<tr>
<td><i class="conum" data-value="2"></i><b>2</b></td>
<td>Take the <code>id</code> attributes of <code>users</code> having <code>role</code> set to <code>assistant</code> and then join them.</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>This <code>json_query</code> filter is based on the <code>jmespath</code> Python <a href="https://pypi.org/project/jmespath/">library</a>, which means two things:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>You can use <a href="http://jmespath.org/">jmespath.org</a> web site to cook your JSON path query.</p>
</li>
<li>
<p>You’ll have to add the <code>jmespath</code> library to your Python environment.</p>
</li>
</ol>
</div>
<div class="paragraph">
<p>Sadly, nesting JMESPath expressions inside Jinja template expressions inside YAML files can be tricky.
The following example fails, even though the JMESPath query is correct when run in a Python interpreter.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="yaml"><span class="key">jq_bid_user_ids</span>: <span class="string"><span class="delimiter">|</span><span class="content">
{{ users
|json_query("[?starts_with(id,'b')].id")
|join(',') }}</span></span></code></pre>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="conclusion">Conclusion</h2>
<div class="sectionbody">
<div class="paragraph">
<p>It’s possible to transform a variable containing a list into another list.
However, it’s still painful because neither YAML nor Jinja are programming languages.
I personally regret that I can’t invoke Python code from an Ansible playbook and use comprehensions, imagine something like:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="yaml"><span class="key">py_admin_user_ids</span>: <span class="string"><span class="delimiter">|</span><span class="content">
{{ ','.join([ user.id for user in users if user.admin ]) }}</span></span></code></pre>
</div>
</div>
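For the record, these comprehensions do work in plain Python; a sketch reproducing the article’s transformations with the same users list:

```python
# The article's Jinja/JMESPath transformations, rewritten as plain Python
# comprehensions (the users list mirrors the article's data).
users = [
    {"id": "bouh", "name": "Mary", "admin": True, "role": "child"},
    {"id": "sulli", "name": "James Sullivan", "admin": False, "role": "monster"},
    {"id": "bob", "name": "Bob Wazowski", "admin": False, "role": "assistant"},
    {"id": "celia", "name": "Celia Mae", "admin": False, "role": "assistant"},
]

admin_user_ids = ",".join(u["id"] for u in users if u["admin"])
normal_user_count = sum(1 for u in users if not u["admin"])
assistant_user_ids = ",".join(u["id"] for u in users if u["role"] == "assistant")
user_first_names = ",".join(u["name"].split()[0] for u in users)

print(admin_user_ids)      # bouh
print(normal_user_count)   # 3
print(assistant_user_ids)  # bob,celia
print(user_first_names)    # Mary,James,Bob,Celia
```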
</div>
</div>Gérald QuintanaStructured logging with SLF4J and Logback2017-12-01T00:00:00+00:002017-12-01T00:00:00+00:00/2017/12/01/Structured-logging-with-SL-FJ-and-Logback<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>I don’t know who first coined the term <strong>structured logging</strong>.
There is <a href="https://kartar.net/2015/12/structured-logging/">a 2015 blog post by James Turnbull</a> to get started.</p>
</div>
<div class="paragraph">
<p>Python and .Net developers have libraries dedicated to structured logging: <a href="http://www.structlog.org">structlog</a> and <a href="https://serilog.net/">serilog</a>. In this article I will describe how to do structured logging in Java with the usual logging libraries like SLF4J and Logback.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="structured_logging_with_slf4j">Structured logging with SLF4J</h2>
<div class="sectionbody">
<div class="paragraph">
<p>All Java developers know how to log a message:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="java"><span class="predefined-type">Logger</span> demoLogger = LoggerFactory.getLogger(<span class="string"><span class="delimiter">"</span><span class="content">logodyssey.DemoLogger</span><span class="delimiter">"</span></span>);
demoLogger.info(<span class="string"><span class="delimiter">"</span><span class="content">Hello world!</span><span class="delimiter">"</span></span>);</code></pre>
</div>
</div>
<div class="paragraph">
<p>Properly configured, it produces a log like</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code>21:10:29.178 [Thread-1] INFO logodyssey.DemoLogger - Hello world!</code></pre>
</div>
</div>
<div class="paragraph">
<p>Notice how this "Hello world!" message is qualified with several fields:
a timestamp, a thread Id, a level and a logger/category.</p>
</div>
<div class="paragraph">
<p>This is what the term "structured logging" means:
a log is more than a message string.
The message is associated with contextual information
that tells what was going on when this log was printed.</p>
</div>
<div class="paragraph">
<p>How can we enrich this contextual information provided by default,
and add the user Id for example?
It is the purpose of the MDC (Mapped Diagnostic Context):</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="java">MDC.put(<span class="string"><span class="delimiter">"</span><span class="content">userId</span><span class="delimiter">"</span></span>, <span class="string"><span class="delimiter">"</span><span class="content">gquintana</span><span class="delimiter">"</span></span>);
demoLogger.info(<span class="string"><span class="delimiter">"</span><span class="content">Hello world!</span><span class="delimiter">"</span></span>);
MDC.remove(<span class="string"><span class="delimiter">"</span><span class="content">userId</span><span class="delimiter">"</span></span>);</code></pre>
</div>
</div>
<div class="paragraph">
<p>The MDC is a map-like object filled in the Java code,
and used in the back-end logging library to output custom data.
With the adequate configuration, we can get the user Id in the log:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code>21:10:29.178 [Thread-1] gquintana INFO logodyssey.DemoLogger - Hello world!</code></pre>
</div>
</div>
<div class="paragraph">
<p>The MDC can store any information about the user (user Id, session Id, token Id), about the current request (request Id, transaction Id), about long running threads (batch instance Id, broker client Id).
Later on, this information will be part of the log.</p>
</div>
<div class="paragraph">
<p>Having this kind of information allows grouping logs by user, by request, or by processing run.</p>
Remember that logs may be scattered across different servers, on different time periods.
These additional fields allow correlating logs belonging to the same scenario and finding answers to questions like "what was the user X doing when he met this nasty error?"</p>
</div>
<div class="paragraph">
<p>Let’s get back to the example, we saw the MDC stores extra information about logs.
The MDC is usually based on a thread local variable, this has two drawbacks:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>It must be properly cleaned after being used, or you may experience information leaks if the thread is reused. Think about thread pools in web servers like Tomcat.</p>
</li>
<li>
<p>The information may not be properly transferred from one thread to another. Think about asynchronous calls.</p>
</li>
</ol>
</div>
<div class="paragraph">
<p>As a result, calling <code>MDC.remove</code> (or <code>MDC.clear</code>), as in the above example, is required to clean the MDC after usage.
In order not to forget to do the housework afterwards, we can use a try-with-resource construct:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="java"><span class="keyword">try</span>(MDC.MDCCloseable mdc = MDC.putCloseable(<span class="string"><span class="delimiter">"</span><span class="content">userId</span><span class="delimiter">"</span></span>, <span class="string"><span class="delimiter">"</span><span class="content">gquintana</span><span class="delimiter">"</span></span>)) {
demoLogger.info(<span class="string"><span class="delimiter">"</span><span class="content">Hello world!</span><span class="delimiter">"</span></span>);
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>It’s better but still verbose.
Fortunately, this kind of code won’t make its way into your business code, because it is usually hidden in an interceptor like a Servlet filter, a Spring aspect or a JAX-RS interceptor. In Logback, there is an <code>MDCInsertingServletFilter</code> class which can serve as an example.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="json_logging_with_logback">JSON logging with Logback</h2>
<div class="sectionbody">
<div class="paragraph">
<p>At this point, a log is more than a simple string,
it is qualified with useful information: timestamp, level, thread, user Id…​
How can we write this data structure on disk or send it over the wire to a log collection tool?
We have to serialize it.
For a human being, a simple text format as shown above is readable enough.
However, for a machine, this is just a word soup without any structure.
In short, to send structured logs to a log collection tool
and benefit from this structure (search by user, by thread…​),
we must use a structured format, like JSON for example.</p>
</div>
<div class="paragraph">
<p>Compared to the Syslog format, another popular log format, the JSON format</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Can properly handle multi-line logs like stack traces/call traces or messages containing line separators (wanted or not)</p>
</li>
<li>
<p>Is a versatile format and can have custom fields like user Id, transaction Id</p>
</li>
<li>
<p>Is more verbose, so compression (GZip or the like) may be required to reduce its size</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>Most popular log collection tools like Filebeat, Graylog, or Fluentd already use some kind of compressed JSON format under the hood.
You should too.</p>
</div>
<div class="paragraph">
<p>Generating JSON logs with Logback is very easy.
I’ll show how to use two Logback extensions,
the <a href="https://github.com/logstash/logstash-logback-encoder">Logstash Logback encoder</a>
and the <a href="https://github.com/qos-ch/logback-contrib/wiki">Logback Contrib</a> library.</p>
</div>
<div class="paragraph">
<p>The first one uses a Logback extension point known as <strong>encoder</strong> that you can plug into any appender:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="xml"> <span class="tag"><appender</span> <span class="attribute-name">name</span>=<span class="string"><span class="delimiter">"</span><span class="content">FILE</span><span class="delimiter">"</span></span> <span class="attribute-name">class</span>=<span class="string"><span class="delimiter">"</span><span class="content">ch.qos.logback.core.FileAppender</span><span class="delimiter">"</span></span><span class="tag">></span>
<span class="tag"><file></span>log/log-odyssey.log<span class="tag"></file></span>
<span class="tag"><encoder</span> <span class="attribute-name">class</span>=<span class="string"><span class="delimiter">"</span><span class="content">net.logstash.logback.encoder.LogstashEncoder</span><span class="delimiter">"</span></span><span class="tag">></span>
<span class="tag"><customFields></span>{"application":"log-odyssey"}<span class="tag"></customFields></span>
<span class="tag"></encoder></span>
<span class="tag"></appender></span></code></pre>
</div>
</div>
<div class="paragraph">
<p>It will produce the expected result:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="json">{
<span class="key"><span class="delimiter">"</span><span class="content">@timestamp</span><span class="delimiter">"</span></span>: <span class="string"><span class="delimiter">"</span><span class="content">2017-11-25T21:10:29.178+01:00</span><span class="delimiter">"</span></span>,
<span class="key"><span class="delimiter">"</span><span class="content">@version</span><span class="delimiter">"</span></span>: <span class="integer">1</span>,
<span class="key"><span class="delimiter">"</span><span class="content">message</span><span class="delimiter">"</span></span>: <span class="string"><span class="delimiter">"</span><span class="content">Hello world!</span><span class="delimiter">"</span></span>,
<span class="key"><span class="delimiter">"</span><span class="content">logger_name</span><span class="delimiter">"</span></span>: <span class="string"><span class="delimiter">"</span><span class="content">logodyssey.DemoLogger</span><span class="delimiter">"</span></span>,
<span class="key"><span class="delimiter">"</span><span class="content">thread_name</span><span class="delimiter">"</span></span>: <span class="string"><span class="delimiter">"</span><span class="content">Thread-1</span><span class="delimiter">"</span></span>,
<span class="key"><span class="delimiter">"</span><span class="content">level</span><span class="delimiter">"</span></span>: <span class="string"><span class="delimiter">"</span><span class="content">INFO</span><span class="delimiter">"</span></span>,
<span class="key"><span class="delimiter">"</span><span class="content">level_value</span><span class="delimiter">"</span></span>: <span class="integer">20000</span>,
<span class="key"><span class="delimiter">"</span><span class="content">HOSTNAME</span><span class="delimiter">"</span></span>: <span class="string"><span class="delimiter">"</span><span class="content">my-laptop</span><span class="delimiter">"</span></span>,
<span class="key"><span class="delimiter">"</span><span class="content">userId</span><span class="delimiter">"</span></span>: <span class="string"><span class="delimiter">"</span><span class="content">gquintana</span><span class="delimiter">"</span></span>,
<span class="key"><span class="delimiter">"</span><span class="content">application</span><span class="delimiter">"</span></span>: <span class="string"><span class="delimiter">"</span><span class="content">log-odyssey</span><span class="delimiter">"</span></span>
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>The Maven coordinates for this library are <code>net.logstash.logback:logstash-logback-encoder:4.11</code>.</p>
</div>
<div class="paragraph">
<p>The second one uses a different extension point called <strong>layout</strong>.
In the end, it looks very similar to the first one, a bit more verbose though:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="xml"> <span class="tag"><appender</span> <span class="attribute-name">name</span>=<span class="string"><span class="delimiter">"</span><span class="content">FILE</span><span class="delimiter">"</span></span> <span class="attribute-name">class</span>=<span class="string"><span class="delimiter">"</span><span class="content">ch.qos.logback.core.FileAppender</span><span class="delimiter">"</span></span><span class="tag">></span>
<span class="tag"><file></span>log/log-odyssey.log<span class="tag"></file></span>
<span class="tag"><encoder</span> <span class="attribute-name">class</span>=<span class="string"><span class="delimiter">"</span><span class="content">ch.qos.logback.core.encoder.LayoutWrappingEncoder</span><span class="delimiter">"</span></span><span class="tag">></span>
<span class="tag"><layout</span> <span class="attribute-name">class</span>=<span class="string"><span class="delimiter">"</span><span class="content">ch.qos.logback.contrib.json.classic.JsonLayout</span><span class="delimiter">"</span></span><span class="tag">></span>
<span class="tag"><jsonFormatter</span> <span class="attribute-name">class</span>=<span class="string"><span class="delimiter">"</span><span class="content">ch.qos.logback.contrib.jackson.JacksonJsonFormatter</span><span class="delimiter">"</span></span><span class="tag">/></span>
<span class="tag"><appendLineSeparator></span>true<span class="tag"></appendLineSeparator></span>
<span class="tag"></layout></span>
<span class="tag"></encoder></span>
<span class="tag"></appender></span></code></pre>
</div>
</div>
<div class="paragraph">
<p>The result is very close as well, even though the fields are named differently:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="json">{
<span class="key"><span class="delimiter">"</span><span class="content">timestamp</span><span class="delimiter">"</span></span>:<span class="string"><span class="delimiter">"</span><span class="content">1511814391083</span><span class="delimiter">"</span></span>,
<span class="key"><span class="delimiter">"</span><span class="content">level</span><span class="delimiter">"</span></span>:<span class="string"><span class="delimiter">"</span><span class="content">INFO</span><span class="delimiter">"</span></span>,
<span class="key"><span class="delimiter">"</span><span class="content">thread</span><span class="delimiter">"</span></span>:<span class="string"><span class="delimiter">"</span><span class="content">Thread-1</span><span class="delimiter">"</span></span>,
<span class="key"><span class="delimiter">"</span><span class="content">mdc</span><span class="delimiter">"</span></span>: {
<span class="key"><span class="delimiter">"</span><span class="content">userId</span><span class="delimiter">"</span></span>:<span class="string"><span class="delimiter">"</span><span class="content">gquintana</span><span class="delimiter">"</span></span>
},
<span class="key"><span class="delimiter">"</span><span class="content">logger</span><span class="delimiter">"</span></span>:<span class="string"><span class="delimiter">"</span><span class="content">logodyssey.DemoLogger</span><span class="delimiter">"</span></span>,
<span class="key"><span class="delimiter">"</span><span class="content">message</span><span class="delimiter">"</span></span>:<span class="string"><span class="delimiter">"</span><span class="content">Hello world!</span><span class="delimiter">"</span></span>,
<span class="key"><span class="delimiter">"</span><span class="content">context</span><span class="delimiter">"</span></span>:<span class="string"><span class="delimiter">"</span><span class="content">default</span><span class="delimiter">"</span></span>
}</code></pre>
</div>
</div>
<div class="paragraph">
<p>In order to be on par with the first example, it is possible to subclass the <code>JsonLayout</code> and add custom fields:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="java"><span class="directive">public</span> <span class="type">class</span> <span class="class">CustomJsonLayout</span> <span class="directive">extends</span> JsonLayout {
<span class="annotation">@Override</span>
<span class="directive">protected</span> <span class="type">void</span> addCustomDataToJsonMap(<span class="predefined-type">Map</span><<span class="predefined-type">String</span>, <span class="predefined-type">Object</span>> map, ILoggingEvent event) {
map.put(<span class="string"><span class="delimiter">"</span><span class="content">application</span><span class="delimiter">"</span></span>, <span class="string"><span class="delimiter">"</span><span class="content">log-odyssey</span><span class="delimiter">"</span></span>);
<span class="keyword">try</span> {
map.put(<span class="string"><span class="delimiter">"</span><span class="content">host</span><span class="delimiter">"</span></span>, <span class="predefined-type">InetAddress</span>.getLocalHost().getHostName());
} <span class="keyword">catch</span> (<span class="exception">UnknownHostException</span> e) {
}
}
}</code></pre>
</div>
</div>
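<div class="paragraph">
<p>To use this subclass, reference it from the <code>layout</code> element of the appender shown earlier. The <code>com.example</code> package below is an assumption; adjust it to wherever <code>CustomJsonLayout</code> is packaged:</p>
</div>
<div class="listingblock">
<div class="content">

```xml
<appender name="FILE" class="ch.qos.logback.core.FileAppender">
  <file>log/log-odyssey.log</file>
  <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
    <!-- Hypothetical package: point this at your CustomJsonLayout class -->
    <layout class="com.example.CustomJsonLayout">
      <jsonFormatter class="ch.qos.logback.contrib.jackson.JacksonJsonFormatter"/>
      <appendLineSeparator>true</appendLineSeparator>
    </layout>
  </encoder>
</appender>
```

</div>
</div>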
<div class="paragraph">
<p>Several Maven dependencies are required for this library to work: <code>ch.qos.logback.contrib:logback-json-classic:0.1.5</code>,
<code>ch.qos.logback.contrib:logback-jackson:0.1.5</code>
and <code>com.fasterxml.jackson.core:jackson-databind</code>.</p>
</div>
<div class="paragraph">
<p>In the end, these libraries are similar: both use the Jackson library to generate JSON.
Contrary to the above JSON examples, which have been prettified to be human readable, producing one JSON document per line is better: it is more compact, and each end of line marks the end of a log event, so there are no multi-line logs.
This format is known as <a href="http://ndjson.org/">NDJSON</a> or <a href="http://jsonlines.org/">JSON Lines</a>.
Logstash and Filebeat can easily read this kind of JSON file.</p>
</div>
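<div class="paragraph">
<p>A quick sketch of why this convention is convenient for consumers: since each line is a complete JSON document, recovering individual log events only requires splitting on newlines, no JSON-aware framing. The file content below is made up for illustration:</p>
</div>
<div class="listingblock">
<div class="content">

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

public class NdjsonDemo {
    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("log-odyssey", ".log");
        // One complete JSON document per line: NDJSON / JSON Lines
        Files.write(log, List.of(
                "{\"level\":\"INFO\",\"message\":\"Hello world!\"}",
                "{\"level\":\"WARN\",\"message\":\"Goodbye\"}"),
                StandardCharsets.UTF_8);
        // A consumer only needs to split on newlines to get one event per line
        try (Stream<String> lines = Files.lines(log, StandardCharsets.UTF_8)) {
            System.out.println(lines.count()); // 2
        }
        Files.delete(log);
    }
}
```

</div>
</div>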
</div>
</div>
<div class="sect1">
<h2 id="conclusion">Conclusion</h2>
<div class="sectionbody">
<div class="paragraph">
<p>A log is more than a textual message, it can be enriched with information at different levels:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>Line of code: message, timestamp, level, threadId, appender…​</p>
</li>
<li>
<p>User or transaction: user Id, session Id…​</p>
</li>
<li>
<p>Deployment unit: application Id, container Id, host Id, environment Id (production, staging)…​</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>Once qualified with this contextual information,
the log message becomes a structured piece of information
and should be processed as such.
Producing logs in JSON format keeps that structure
and eases storing these logs in Elasticsearch.
More on that later, if time permits.</p>
</div>
</div>
</div>Gérald QuintanaLog collection in AWS land2017-09-30T00:00:00+00:002017-09-30T00:00:00+00:00/2017/09/30/Log-collection-in-AWS-land<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>In AWS, it is really easy to spin up new machines and scale.
The more machines you have, the more important it is to centralize logs.
In this article, I will describe what I discovered while trying to collect logs from applications deployed on Beanstalk and send them into Elasticsearch.</p>
</div>
<div class="paragraph">
<p>Disclaimer: I am an AWS newbie.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="the_family_picture">The family picture</h2>
<div class="sectionbody">
<div class="imageblock">
<div class="content">
<img src="2017-09-30-Log-collection-in-AWS-land/big-picture.svg" alt="Big picture">
</div>
</div>
<div class="dlist">
<dl>
<dt class="hdlist1">Beanstalk</dt>
<dd>
<p>contains a web server and runs an application (Java, JS, Go…​), both produce logs in a local <code>/var/log/something</code> directory.
There can be multiple instances of the same application for scalability, or it can be different applications in the same environment.</p>
</dd>
<dt class="hdlist1">Cloudwatch</dt>
<dd>
<p>is used to monitor EC2, Beanstalk…​ instances; it is the place where logs and metrics are gathered.
From there you can trigger alerts, schedule tasks…​</p>
</dd>
<dt class="hdlist1">S3</dt>
<dd>
<p>is a file storage which can be used to archive logs for the long term and keep them after instances stop.
However, these logs are not easily searchable because they are compressed files.</p>
</dd>
<dt class="hdlist1">Elasticsearch</dt>
<dd>
<p>can be used to index logs and make them searchable.
A Kibana UI is provided to make searching and dashboard building even easier.</p>
</dd>
<dt class="hdlist1">Lambda</dt>
<dd>
<p>a provided Lambda function is used to bridge logs from Cloudwatch to Elasticsearch.</p>
</dd>
</dl>
</div>
<div class="paragraph">
<p>To ship logs into Cloudwatch, an AWSLogs agent is provided.
To archive logs into S3, a script is cron-ed along with logrotate.</p>
</div>
<div class="paragraph">
<p>This article will skip the security (IAM) settings which are required to allow these components to communicate.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="beanstalk">Beanstalk</h2>
<div class="sectionbody">
<div class="paragraph">
<p>There are several components producing logs in a Beanstalk instance:</p>
</div>
<div class="olist arabic">
<ol class="arabic">
<li>
<p>Beanstalk deployment</p>
</li>
<li>
<p>Proxy server: either Apache or Nginx produce Access logs and Error logs</p>
</li>
<li>
<p>Web server: Tomcat, NodeJS…​</p>
</li>
<li>
<p>Application with its own logging framework</p>
</li>
</ol>
</div>
<div class="paragraph">
<p>Each component produces logs with its own format.
Beanstalk knows how to automatically collect logs for the first three components: deploy logs, access logs, web server logs…​
As it knows where these logs are located, it is in charge of rotating, archiving on S3, and purging log files.</p>
</div>
<div class="paragraph">
<p>The Beanstalk console allows you to <a href="http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features.logging.html">download the last 100</a> lines of each file for a given instance.
It is useful to understand why a deployment fails, but it’s not meant to dig into the logs of a running application cluster.</p>
</div>
<div class="paragraph">
<p>To tell Beanstalk to take care of your application-specific log files, just add some configuration files indicating where they are located:</p>
</div>
<div class="listingblock">
<div class="title">/opt/elasticbeanstalk/tasks/taillogs.d/my-app.conf</div>
<div class="content">
<pre class="CodeRay highlight"><code>/var/log/my-app/my-app.log</code></pre>
</div>
</div>
<div class="listingblock">
<div class="title">/opt/elasticbeanstalk/tasks/bundlelogs.d/my-app.conf</div>
<div class="content">
<pre class="CodeRay highlight"><code>/var/log/my-app/my-app.*.log</code></pre>
</div>
</div>
<div class="paragraph">
<p>The first file makes the content of <code>my-app.log</code> (the current log file) appear in the Beanstalk console.
The second one makes all the old log files get archived.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="cloudwatch_logs">Cloudwatch Logs</h2>
<div class="sectionbody">
<div class="paragraph">
<p>First of all, the log stream from Beanstalk to Cloudwatch must be enabled in Beanstalk configuration file:</p>
</div>
<div class="listingblock">
<div class="title">.ebextensions/cloudwatch.config</div>
<div class="content">
<pre class="CodeRay highlight"><code data-lang="yaml"><span class="key">option_settings</span>:
<span class="error">aws:elasticbeanstalk:cloudwatch:logs</span>:
<span class="key">StreamLogs</span>: <span class="string"><span class="content">true</span></span>
<span class="key">DeleteOnTerminate</span>: <span class="string"><span class="content">false</span></span>
<span class="key">RetentionInDays</span>: <span class="string"><span class="content">30</span></span></code></pre>
</div>
</div>
<div class="paragraph">
<p>This sets up the Cloudwatch log agent.
Provided you’re using an image based on Amazon Linux, you can also install it on any EC2 instance with yum:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="shell">yum update -y
yum install -y awslogs
service awslogs start</code></pre>
</div>
</div>
<div class="paragraph">
<p>Then the Cloudwatch log agent must be configured to watch your custom log files.</p>
</div>
<div class="listingblock">
<div class="title">/etc/awslogs/config/my-app.conf</div>
<div class="content">
<pre class="CodeRay highlight"><code data-lang="toml">[/var/log/my-app/my-app.log]
log_group_name=/aws/elasticbeanstalk/my-app-dev/var/log/my-app/my-app.log
log_stream_name={instance_id}
file=/var/log/my-app/my-app*.log</code></pre>
</div>
</div>
<div class="paragraph">
<p>Log files produced by components (Apache, Tomcat…​) managed by Beanstalk are already configured.
The above config file is only for application-specific log files.
This agent supports multiline logs (stacktraces, call traces…​) provided the continuation lines start with a whitespace (space or tab).
It’s not as powerful as Filebeat or Logstash.</p>
</div>
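<div class="paragraph">
<p>According to the CloudWatch Logs agent reference, an explicit <code>multi_line_start_pattern</code> can replace the leading-whitespace heuristic. A sketch reusing the configuration above; the timestamp format is an assumption to adapt to your log layout:</p>
</div>
<div class="listingblock">
<div class="content">

```toml
[/var/log/my-app/my-app.log]
log_group_name=/aws/elasticbeanstalk/my-app-dev/var/log/my-app/my-app.log
log_stream_name={instance_id}
file=/var/log/my-app/my-app*.log
# A new log event starts with a timestamp; any other line is
# appended to the previous event (stacktraces, call traces...)
datetime_format=%Y-%m-%d %H:%M:%S
multi_line_start_pattern={datetime_format}
```

</div>
</div>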
<div class="paragraph">
<p>At this point, you’ll be able to see logs aggregated from multiple Beanstalk instances in the Cloudwatch console.</p>
</div>
<div class="imageblock">
<div class="content">
<img src="2017-09-30-Log-collection-in-AWS-land/cloudwatch_log_search.png" alt="Cloudwatch log viewer">
</div>
</div>
<div class="paragraph">
<p>It’s better than the Beanstalk console to monitor a running platform.
Yet, log search is still limited because logs are not structured (split into fields) and the full text search is simplistic.</p>
</div>
<div class="paragraph">
<p>Using Cloudwatch, it is also possible:</p>
</div>
<div class="ulist">
<ul>
<li>
<p>to raise alerts when a specific pattern is found in logs</p>
</li>
<li>
<p>to extract metrics from logs (HTTP request time, number of 404 errors in Access logs for example) and draw charts</p>
</li>
</ul>
</div>
<div class="paragraph">
<p>More information can be found <a href="https://aws.amazon.com/fr/blogs/aws/cloudwatch-log-service/">here</a>.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="elasticsearch">Elasticsearch</h2>
<div class="sectionbody">
<div class="paragraph">
<p>To send logs into Elasticsearch and get a better log search experience,
subscribe a log filter to each Cloudwatch log group.
There is a special Lambda function which filters logs and sends them to Elasticsearch.
This log filter can be used to split text logs into fields:</p>
</div>
<div class="imageblock">
<div class="content">
<img src="2017-09-30-Log-collection-in-AWS-land/cloudwatch_log_filter.png" alt="Cloudwatch log filter">
</div>
</div>
<div class="paragraph">
<p>This tool can split space-delimited logs, like access logs, into fields.
But it’s hard to "grok" more complicated logs with such a basic tool.
It supports JSON formatted logs, so a good solution for application logs is to configure your favorite logging framework to produce JSON logs.</p>
</div>
<div class="paragraph">
<p><a href="https://medium.com/wolox-driving-innovation/centralized-logging-in-microservices-using-aws-cloudwatch-elasticsearch-f5db7a57e553">This article</a> is worth reading.</p>
</div>
<div class="paragraph">
<p>At this point, we can open Kibana and configure an index pattern named <code>cwl-*</code>.
The Cloudwatch log filter mimics Logstash and uses a field named <code>@timestamp</code> as the timestamp.</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="conclusion">Conclusion</h2>
<div class="sectionbody">
<div class="paragraph">
<p>AWS provides all the building blocks to centralize logs and monitor your whole infrastructure.
It’s not hard to collect logs and send them to Elasticsearch.
But it’s also far less powerful than the complete Elastic stack.</p>
</div>
</div>
</div>Gérald QuintanaJava File vs Path2017-09-02T00:00:00+00:002017-09-02T00:00:00+00:00/2017/09/02/Java-File-vs-Path<div id="preamble">
<div class="sectionbody">
<div class="paragraph">
<p>I’ve been using <code>java.io.File</code> and <code>java.io.File*Stream</code> since Java 1.1, a long time ago.
Java 7 introduced a new file API named <strong>NIO2</strong> containing, among others, the <code>java.nio.file.Path</code> and <code>java.nio.file.Files</code> classes.
It took me a while to lose my habits and embrace the new API.</p>
</div>
<div class="paragraph">
<p>Spoiler: the funniest part of this article is at the end!</p>
</div>
</div>
</div>
<div class="sect1">
<h2 id="quick_comparison">Quick comparison</h2>
<div class="sectionbody">
<table class="tableblock frame-all grid-all stretch">
<colgroup>
<col style="width: 50%;">
<col style="width: 50%;">
</colgroup>
<thead>
<tr>
<th class="tableblock halign-left valign-top">java.io.File (class)</th>
<th class="tableblock halign-left valign-top">java.nio.file.Path (interface)</th>
</tr>
</thead>
<tbody>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>file = new File("path/to/file.txt")</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>path = Paths.get("path/to/file.txt")</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>file = new File(parentFile, "file.txt")</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>path = parentPath.resolve("file.txt")</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>file.getFileName()</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>path.getFileName().toString()</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>file.getParentFile()</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>path.getParent()</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>file.mkdirs()</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>Files.createDirectories(path)</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>file.length()</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>Files.size(path)</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>file.exists()</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>Files.exists(path)</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>file.delete()</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>Files.delete(path)</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>new FileOutputStream(file)</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>Files.newOutputStream(path)</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>new FileInputStream(file)</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>Files.newInputStream(path)</code></p></td>
</tr>
<tr>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>file.listFiles(filter)</code></p></td>
<td class="tableblock halign-left valign-top"><p class="tableblock"><code>Files.list(path) .filter(filter) .collect(toList())</code></p></td>
</tr>
</tbody>
</table>
<div class="paragraph">
<p>Some additional notes:</p>
</div>
<div class="ulist">
<ul>
<li>
<p><code>Path</code> throws <code>IOException</code> more often than <code>File</code>, and rarely returns a <code>boolean</code> to tell whether something was done (<code>mkdirs()</code>, <code>delete()</code>)</p>
</li>
<li>
<p><code>File</code> is more object oriented than <code>Path</code>: I regret that <code>size()</code>, <code>exists()</code>…​ methods are not on the <code>Path</code> interface. This is probably due to the fact that this API was added in Java 7, but default methods on interfaces were added later in Java 8.</p>
</li>
<li>
<p><code>Path</code>-based <code>InputStream</code>/<code>OutputStream</code>s are less expensive from a GC point of view. Thanks <a href="https://twitter.com/thekittster/status/905326864251670532">@kittster</a> for mentioning <a href="https://www.cloudbees.com/blog/fileinputstream-fileoutputstream-considered-harmful">this article from Cloudbees</a>.</p>
</li>
</ul>
</div>
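<div class="paragraph">
<p>Migrating code incrementally is easy because the two APIs interoperate. A minimal sketch bridging them with <code>File.toPath()</code> and <code>Path.toFile()</code>:</p>
</div>
<div class="listingblock">
<div class="content">

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

public class FilePathInterop {
    public static void main(String[] args) {
        File file = new File("path/to/file.txt");
        // Legacy File to NIO2 Path
        Path path = file.toPath();
        // NIO2 Path back to legacy File, for APIs which only accept File
        File back = path.toFile();
        System.out.println(path.equals(Paths.get("path/to/file.txt"))); // true
        System.out.println(back.equals(file)); // true
    }
}
```

</div>
</div>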
</div>
</div>
<div class="sect1">
<h2 id="one_liners">One liners</h2>
<div class="sectionbody">
<div class="paragraph">
<p><code>java.nio.file.Files</code> allows reading, writing, and copying files in a single line:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="java">Files.write(Paths.get(<span class="string"><span class="delimiter">"</span><span class="content">image.png</span><span class="delimiter">"</span></span>), bytes); <i class="conum" data-value="1"></i><b>(1)</b>
<span class="predefined-type">List</span><<span class="predefined-type">String</span>> lines = Files.readAllLines(Paths.get(<span class="string"><span class="delimiter">"</span><span class="content">letter.txt</span><span class="delimiter">"</span></span>), StandardCharsets.UTF_8); <i class="conum" data-value="2"></i><b>(2)</b>
Files.lines(Paths.get(<span class="string"><span class="delimiter">"</span><span class="content">letter.txt</span><span class="delimiter">"</span></span>), StandardCharsets.UTF_8)
.forEach(<span class="predefined-type">System</span>.out::println);</code></pre>
</div>
</div>
<div class="colist arabic">
<table>
<tr>
<td><i class="conum" data-value="1"></i><b>1</b></td>
<td>Write a binary file</td>
</tr>
<tr>
<td><i class="conum" data-value="2"></i><b>2</b></td>
<td>Read a text file</td>
</tr>
</table>
</div>
<div class="paragraph">
<p>This nearly makes Guava IO and Commons IO useless. I regret that there isn’t any method out of the box to read/write a whole file as a single string.</p>
</div>
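<div class="paragraph">
<p>One workaround is to go through a byte array. A minimal sketch; the file name is arbitrary:</p>
</div>
<div class="listingblock">
<div class="content">

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class WholeFile {
    public static void main(String[] args) throws IOException {
        Path path = Paths.get("letter.txt");
        // Write a whole String in one call...
        Files.write(path, "Dear John,\nGoodbye.".getBytes(StandardCharsets.UTF_8));
        // ...and read it back as a single String through a byte array
        String content = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        System.out.println(content.startsWith("Dear John"));
        Files.delete(path);
    }
}
```

</div>
</div>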
<div class="paragraph">
<p>Many APIs (JAXB and Jackson, to name a few) don’t use <code>Path</code>s to read/write files; the workaround is usually to use an <code>InputStream</code> or an <code>OutputStream</code>.</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="java"><span class="keyword">try</span>(<span class="predefined-type">InputStream</span> inputStream = Files.newInputStream(path)) {
Thing thing = (Thing) unmarshaller.unmarshal(inputStream);
}</code></pre>
</div>
</div>
</div>
</div>
<div class="sect1">
<h2 id="multiple_file_systems">Multiple file systems</h2>
<div class="sectionbody">
<div class="paragraph">
<p>While <code>File</code> is only for local files, <code>Path</code> can also be used to access remote files.
A <code>Path</code> is associated with a <code>FileSystem</code>.</p>
</div>
<div class="paragraph">
<p>To create a new <code>Path</code> instance, there is no constructor (<code>Path</code> is an interface); we need to call a factory method. The following two lines are equivalent:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="java">path = Paths.get(<span class="string"><span class="delimiter">"</span><span class="content">path/to/file.txt</span><span class="delimiter">"</span></span>);
path = FileSystems.getDefault().getPath(<span class="string"><span class="delimiter">"</span><span class="content">path/to/file.txt</span><span class="delimiter">"</span></span>);</code></pre>
</div>
</div>
<div class="paragraph">
<p>As the default file system is the local one, you get a path to a local file.
Depending on the underlying file system, you’ll get a different implementation: <code>sun.nio.fs.UnixPath</code>, <code>sun.nio.fs.WindowsPath</code>…​</p>
</div>
<div class="paragraph">
<p>With this trick in mind, we can read the content of a Zip file, as if we had extracted it:</p>
</div>
<div class="listingblock">
<div class="content">
<pre class="CodeRay highlight"><code data-lang="java"><span class="predefined-type">URI</span> zipUri = <span class="keyword">new</span> <span class="predefined-type">URI</span>(<span class="string"><span class="delimiter">"</span><span class="content">jar:file:/path/to/archive.zip</span><span class="delimiter">"</span></span>);
<span class="keyword">try</span>(FileSystem zipFS = FileSystems.newFileSystem(zipUri, emptyMap())) { <i class="conum" data-value="1"></i><b>(1)</b>
Path zipPath = zipFS.getPath(<span class="string"><span class="delimiter">"</span><span class="content">/archive</span><span class="delimiter">"</span></span>); <i class="conum" data-value="2"></i><b>(2)</b>
Files.list(zipPath)
.map(Path::toString)
.forEach(<span class="predefined-type">System</span>.out::println);
}</code></pre>
</div>
</div>
<div class="colist arabic">
<table>
<tr>
<td><i class="conum" data-value="1"></i><b>1</b></td>
<td>"Mount" the Zip file as a file system</td>
</tr>
<tr>
<td><i class="conum" data-value="2"></i><b>2</b></td>
<td><code>zipPath</code> is of type <code>com.sun.nio.zipfs.ZipPath</code></td>
</tr>
</table>
</div>
<div class="paragraph">
<p>You can even plug additional file systems: <a href="http://docs.oracle.com/javase/7/docs/technotes/guides/io/fsp/zipfilesystemprovider.html">ZIP</a>, <a href="https://github.com/maddingo/nio-fs-provider">SFTP, SMB, WebDAV</a>, <a href="https://github.com/lucastheisen/jsch-nio">SSH/SCP</a>, <a href="https://github.com/Upplication/Amazon-S3-FileSystem-NIO2">Amazon S3</a>, <a href="https://github.com/google/jimfs">In memory</a>, <a href="https://github.com/damiencarol/jsr203-hadoop">HDFS</a>, …​
This means you can read a remote file almost as if it were local.</p>
</div>
</div>
</div>Gérald Quintana