Saturday 17 February 2018

Setup EFK Stack | Elasticsearch, Fluentd & Kibana With High Availability

How To Setup Fluentd With High Availability
Setup Fluentd With Forwarder-Aggregator Architecture

Fluentd is one of the most popular alternatives to Logstash because it offers several features that Logstash lacks. Before setting up Fluentd, let's compare the two:
  • Fluentd has a built-in architecture for high availability (there can be more than one aggregator)
  • Fluentd consumes less memory compared to Logstash
  • Log parsing and tagging are easier
  • Tag-based log routing is possible
Let's start building our centralized logging system using Elasticsearch, Fluentd and Kibana (EFK).
We will follow an architecture consisting of a fluentd-forwarder (td-agent), a fluentd-aggregator, Elasticsearch and Kibana. The fluentd-forwarder (the agent) reads logs from a file and forwards them to the aggregator. The aggregator decides what the index_name should be and which Elasticsearch host to send the logs to. Elasticsearch runs on a separate instance that receives the logs and also hosts Kibana for visualising the Elasticsearch data.

# Architecture:

Following is the architecture for high availability. There are multiple log-forwarders, one on each application node, forwarding logs to the log-aggregators. The architecture below shows two aggregators; if one fails, the forwarders start sending logs to the second one.

# Video Tutorial

A video walkthrough of this setup is available on our YouTube channel: https://www.youtube.com/watch?v=USCSpeQrVZM

# Setup Elasticsearch & Kibana

We have already covered the setup of Elasticsearch and Kibana in one of our earlier tutorials. Please follow that post to install Elasticsearch and Kibana.

# Log Pattern

We are considering the log format shown below:

INFO  [2018-02-17 17:14:55,827 +0530] [pool-5-thread-4] [] com.amazon.sqs.javamessaging.AmazonSQSExtendedClient: S3 object deleted, Bucket name: sqs-bucket, Object key: 63c1a5b8-4ddc-4136-b086-df6a8486414a.
INFO  [2018-02-17 17:14:56,124 +0530] [pool-5-thread-9] [] com.amazon.sqs.javamessaging.AmazonSQSExtendedClient: S3 object read, Bucket name: sqs-bucket, Object key: 2cc06f96-283f-4da7-9402-f08aab2df999.

# Log Regex

This regex is based on the logs above and needs to be specified in the source section of the forwarder's td-agent.conf file.

/^(?<level>[^ ]*)[ \t]+\[(?<time>[^\]]*)\] \[(?<thread>[^\]]*)\] \[(?<request>[^\]]*)\] (?<class>[^ ]*): (?<message>.*)$/
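
Before wiring the regex into td-agent, you can sanity-check it against one of the sample lines. Below is a minimal sketch using the Ruby interpreter bundled with td-agent (the path is assumed from the td-agent install; any Ruby works):

/opt/td-agent/embedded/bin/ruby -e '
line = %q{INFO  [2018-02-17 17:14:55,827 +0530] [pool-5-thread-4] [] com.amazon.sqs.javamessaging.AmazonSQSExtendedClient: S3 object deleted, Bucket name: sqs-bucket, Object key: 63c1a5b8-4ddc-4136-b086-df6a8486414a.}
re = /^(?<level>[^ ]*)[ \t]+\[(?<time>[^\]]*)\] \[(?<thread>[^\]]*)\] \[(?<request>[^\]]*)\] (?<class>[^ ]*): (?<message>.*)$/
m = re.match(line)
m.names.each { |name| puts "#{name} => #{m[name]}" }
'
# Expected captures: level=INFO, time=2018-02-17 17:14:55,827 +0530, thread=pool-5-thread-4,
# request=(empty), class=com.amazon.sqs.javamessaging.AmazonSQSExtendedClient, message=S3 object deleted, ...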

# Setup fluentd-aggregator

We will set up only one aggregator for this tutorial; however, you may set up two aggregators for high availability. On the aggregator instance, run the following commands:

curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-trusty-td-agent2.sh | sh
sudo apt-get install make libcurl4-gnutls-dev --yes
sudo apt-get install build-essential
sudo /opt/td-agent/embedded/bin/fluent-gem install fluent-plugin-elasticsearch
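
# Optionally confirm the Elasticsearch output plugin installed correctly (fluent-gem is td-agent's bundled gem command)
/opt/td-agent/embedded/bin/fluent-gem list | grep fluent-plugin-elasticsearch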

# After setup edit conf file and customise configuration
sudo vi /etc/td-agent/td-agent.conf

Content of /etc/td-agent/td-agent.conf. Replace the host IP with your Elasticsearch instance's IP.

<source>
  @type forward
  port 24224
</source>

<match myorg.**>
  @type copy
  <store>
    @type file
    path /var/log/td-agent/forward.log
  </store>

  <store>
    @type elasticsearch_dynamic
    #elasticsearch host IP/domain
    host 192.168.1.4
    port 9200
    index_name fluentd-${tag_parts[1]+ "-" + Time.at(time).getlocal("+05:30").strftime(@logstash_dateformat)}

    #logstash_format true
    #logstash_prefix fluentd

    time_format %Y-%m-%dT%H:%M:%S
    #timezone +0530
    include_timestamp true

    flush_interval 10s
  </store>
</match>
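
Note that the forwarder configured below tags records as myorg.myapp, so tag_parts[1] evaluates to myapp; combined with the record's timestamp converted to +05:30, the elasticsearch_dynamic output writes each day's logs to an index named like:

fluentd-myapp-2018.02.17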

Restart the fluentd-aggregator process and check the logs with the following commands:

sudo service td-agent restart

# check logs
tail -f /var/log/td-agent/td-agent.log
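
# Once the forwarder (set up below) starts shipping data, you can also confirm that
# the indices are being created in Elasticsearch (192.168.1.4 is the host from the config above):
curl 'http://192.168.1.4:9200/_cat/indices?v'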

# Setup fluentd-forwarder

To set up the forwarder, run the following commands on the application instance.

curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-trusty-td-agent2.sh | sh
# customise config in file td-agent.conf
sudo vi /etc/td-agent/td-agent.conf

Content of /etc/td-agent/td-agent.conf. Replace the path with the path of your application log and the aggregator IP with the IP of your aggregator instance. You may use domains instead of IPs.

<match td.*.*>
  @type tdlog
  apikey YOUR_API_KEY
  auto_create_table
  buffer_type file
  buffer_path /var/log/td-agent/buffer/td

  <secondary>
    @type file
    path /var/log/td-agent/failed_records
  </secondary>
</match>

## match tag=debug.** and dump to console
<match debug.**>
  @type stdout
</match>

## built-in TCP input
## @see http://docs.fluentd.org/articles/in_forward

<source>
  @type forward
  port 24224
</source>

<source>
  @type http
  port 8888
</source>

## live debugging agent

<source>
  @type debug_agent
  bind 127.0.0.1
  port 24230
</source>

<source>
  @type tail
  path /var/log/myapp.log
  pos_file /var/log/td-agent/myorg.log.pos
  tag myorg.myapp
  format /^(?<level>[^ ]*)[ \t]+\[(?<time>[^\]]*)\] \[(?<thread>[^\]]*)\] \[(?<request>[^\]]*)\] (?<class>[^ ]*): (?<message>.*)$/

  time_format %Y-%m-%d %H:%M:%S,%L %z
  timezone +0530
  time_key time
  keep_time_key true
  types time:time
</source>

<match myorg.**>
  @type copy
  <store>
    @type file
    path /var/log/td-agent/forward.log
  </store>

  <store>
    @type forward
    heartbeat_type tcp

    #aggregator IP
    host 192.168.1.86
    flush_interval 30s
  </store>

  # secondary host is optional
  # <secondary>
  #    host 192.168.0.12
  # </secondary>
</match>
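
If you do run a second aggregator for high availability, Fluentd's forward output lets you list both aggregators as <server> entries and mark the backup with standby, so the forwarder fails over automatically when the primary is unreachable. A minimal sketch of the forward <store> section (the second IP 192.168.1.87 is only a placeholder):

  <store>
    @type forward
    heartbeat_type tcp
    flush_interval 30s

    <server>
      name aggregator1
      host 192.168.1.86
      port 24224
    </server>

    <server>
      name aggregator2
      # placeholder IP of the second aggregator
      host 192.168.1.87
      port 24224
      # used only while aggregator1 is down
      standby
    </server>
  </store>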

Restart the fluentd-forwarder process and check the logs with the following commands:

sudo service td-agent restart

# check logs
tail -f /var/log/td-agent/td-agent.log
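
# If no data shows up, first verify that the aggregator's forward port is reachable
# from the forwarder (netcat assumed to be installed):
nc -zv 192.168.1.86 24224

# You can also inject a test event through the forwarder's http input (port 8888 in the
# config above); the debug.** match echoes it into /var/log/td-agent/td-agent.log:
curl -X POST -d 'json={"message":"hello from forwarder"}' http://localhost:8888/debug.test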

After restarting td-agent on both the forwarder and the aggregator, you should see data being stored in Elasticsearch. Once Elasticsearch starts receiving data from the aggregator, you can create an index pattern in Kibana and start visualising the logs.

# Create Index Pattern In Kibana

Once you start getting logs in Elasticsearch, you can create an index pattern in Kibana to visualise them. We have specified the index_name in Fluentd to be of the format fluentd-myapp-2018.02.12, so we will create the index pattern fluentd-*. In Kibana, open Management > Index Patterns, enter fluentd-* and pick the time field to create the pattern.


Finally, after creating the index pattern, logs will start appearing in the Discover tab of the Kibana dashboard.


Hurray! You have successfully set up the EFK stack to centralise your logging.

