How to monitor your virtual environment on PILW.IO

Now that you have a set of virtual machines installed, you will want to see what is going on inside them. Monitoring your virtual environment lets you watch the load and anticipate issues. There are many ways to do it. In this blog article, we will set up a systems monitoring environment built on three major components:

  • Telegraf – metrics collection component
  • InfluxDB – metrics datastore
  • Grafana – visualisation dashboard

We have an Ubuntu system set up for the InfluxDB instance, running a fully upgraded 18.04 LTS release.

Install InfluxDB on Ubuntu 18.04 LTS

InfluxDB is an open source database platform for collecting metrics and events from different devices. It is a time series database and is therefore optimised for storing metrics and events.

To install InfluxDB, we first need to set up the installation repository and add its key to the apt package management tool. The following commands create a reference to the InfluxDB repository:

$ source /etc/lsb-release 
$ echo "deb https://repos.influxdata.com/${DISTRIB_ID,,} ${DISTRIB_CODENAME} stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
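The `${DISTRIB_ID,,}` part of the command above is a bash expansion that lower-cases the distribution name read from /etc/lsb-release. As a minimal sketch of how the resulting line is put together (the function name `influx_repo_line` is made up for illustration, and `tr` is used so that it also works in shells without that expansion):

```shell
# Hypothetical helper: build the apt source line for the InfluxData repository.
# $1 = distribution ID (e.g. Ubuntu), $2 = release codename (e.g. bionic)
influx_repo_line() {
  # Same effect as ${DISTRIB_ID,,} in bash 4+: lower-case the distribution ID
  distro=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  printf 'deb https://repos.influxdata.com/%s %s stable\n' "$distro" "$2"
}

# On Ubuntu 18.04, /etc/lsb-release sets DISTRIB_ID=Ubuntu and DISTRIB_CODENAME=bionic
influx_repo_line Ubuntu bionic
```

This prints the line that ends up in /etc/apt/sources.list.d/influxdb.list, which is handy to compare against if apt-get update later complains about the repository.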

Then we need to import the key:

$ curl -sL https://repos.influxdata.com/influxdb.key | sudo apt-key add -

The key lets apt verify the signed packages. Now we can install InfluxDB from the newly added repository:


$ sudo apt-get update
$ sudo apt-get install influxdb

Let's start it and enable the service so that it also starts on the next boot:


$ sudo systemctl start influxdb
$ sudo systemctl enable influxdb
$ sudo systemctl status influxdb
● influxdb.service - InfluxDB is an open-source, distributed, time series database
   Loaded: loaded (/lib/systemd/system/influxdb.service; enabled; vendor preset: enabled)
   Active: active (running) since Sat 2019-01-05 23:21:31 EET; 43s ago
     Docs: https://docs.influxdata.com/influxdb/
 Main PID: 3184 (influxd)
    Tasks: 11 (limit: 2320)
   CGroup: /system.slice/influxdb.service
           └─3184 /usr/bin/influxd -config /etc/influxdb/influxdb.conf

If you see something like above, you are good.
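Besides the systemd status, InfluxDB can also be probed over HTTP. The following is a minimal sketch, assuming the default port 8086 and the /ping endpoint of InfluxDB 1.x; the function name `influx_ping` is made up for illustration:

```shell
# Hypothetical helper: succeed when InfluxDB answers its /ping endpoint.
# $1 = InfluxDB host (defaults to localhost); 8086 is the default HTTP port.
influx_ping() {
  # Ask curl to print only the HTTP status code of the response
  code=$(curl -s -o /dev/null -w '%{http_code}' "http://${1:-localhost}:8086/ping")
  # InfluxDB 1.x replies to /ping with "204 No Content" when it is up
  [ "$code" = "204" ]
}

# Usage: influx_ping && echo "InfluxDB is answering"
```

The same check run from another host is a quick way to find out later whether the firewall lets Telegraf reach the database.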

Set up and update the firewall

Let's secure the installation against outside attacks by setting up a firewall. These are very important steps to follow. We use ufw, Ubuntu's own firewall. First we need to see the available applications:

$ sudo ufw status
Status: inactive
$ sudo ufw app list
Available applications:
  OpenSSH

Let's allow OpenSSH access in the firewall before we activate it:

$ sudo ufw allow OpenSSH
Rules updated
Rules updated (v6)

Now we are ready to enable the firewall:

$ sudo ufw enable
Command may disrupt existing ssh connections. Proceed with operation (y|n)? y
Firewall is active and enabled on system startup

Let's see what the firewall status command shows now:

$ sudo ufw status
Status: active

To                         Action      From
--                         ------      ----
OpenSSH                    ALLOW       Anywhere
OpenSSH (v6)               ALLOW       Anywhere (v6)

Now let's open the port that Telegraf uses to push metrics to InfluxDB, 8086:

$ sudo ufw allow 8086/tcp
Rule added
Rule added (v6)
$ sudo ufw status
Status: active

To                         Action      From
--                         ------      ----
OpenSSH                    ALLOW       Anywhere
8086/tcp                   ALLOW       Anywhere
OpenSSH (v6)               ALLOW       Anywhere (v6)
8086/tcp (v6)              ALLOW       Anywhere (v6)

Install Grafana

Grafana is an open source visualisation platform, used here to visualise the data collected with Telegraf and stored in InfluxDB.

There are a few ways to install Grafana, but here we use the apt-based installation, which makes future upgrades easier. As with the InfluxDB installation, we need to tell the system where to find the Grafana installation packages.

$ source /etc/lsb-release 
$ echo "deb https://packages.grafana.com/oss/deb stable main" | sudo tee /etc/apt/sources.list.d/grafana.list
 

Add a key:

$ curl https://packages.grafana.com/gpg.key | sudo apt-key add -

Then we need to update the apt indexes:

$ sudo apt-get update

And finally install Grafana:

$ sudo apt-get install grafana

Once done, we need to start Grafana and enable it so that it also starts on each boot of the system:

$ sudo systemctl start grafana-server
$ sudo systemctl enable  grafana-server
Synchronizing state of grafana-server.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable grafana-server
Created symlink /etc/systemd/system/multi-user.target.wants/grafana-server.service → /usr/lib/systemd/system/grafana-server.service.

Let's check the status of the Grafana server now:

$ sudo systemctl status grafana-server
● grafana-server.service - Grafana instance
   Loaded: loaded (/usr/lib/systemd/system/grafana-server.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2019-01-06 08:00:56 EET; 1min 38s ago
     Docs: http://docs.grafana.org
 Main PID: 13355 (grafana-server)
    Tasks: 10 (limit: 2320)
   CGroup: /system.slice/grafana-server.service
           └─13355 /usr/sbin/grafana-server --config=/etc/grafana/grafana.ini --pidfile=/var/run/grafana/grafana-server.pid --packaging=deb cfg:default.paths.logs=/var/log/grafana cfg:default.paths.da

If you see something like above, you are good.

Adjust firewall settings:

We need to open a network port in the firewall to be able to access Grafana, which listens on port 3000. Let's allow access to port 3000 now.

$ sudo ufw allow 3000/tcp
Rule added
Rule added (v6)
$ sudo ufw status
Status: active

To                         Action      From
--                         ------      ----
OpenSSH                    ALLOW       Anywhere
8086/tcp                   ALLOW       Anywhere
3000/tcp                   ALLOW       Anywhere
OpenSSH (v6)               ALLOW       Anywhere (v6)
8086/tcp (v6)              ALLOW       Anywhere (v6)
3000/tcp (v6)              ALLOW       Anywhere (v6)

Bear in mind that we opened the port for access to Grafana from everywhere. ufw can also restrict access to specific IP addresses or subnets.
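As a hedged sketch of such a restriction, the helper below only prints the ufw rule so it can be reviewed before running it. The function name `ufw_allow_from` is made up for illustration, and 203.0.113.0/24 is a documentation range, not a real network:

```shell
# Hypothetical helper: print a ufw rule that allows one TCP port from one source only.
# $1 = source address or subnet, $2 = port
ufw_allow_from() {
  printf 'sudo ufw allow from %s to any port %s proto tcp\n' "$1" "$2"
}

# 203.0.113.0/24 is a placeholder; substitute your own network
ufw_allow_from 203.0.113.0/24 3000
```

A source-restricted rule only helps once the open rule is gone, so the earlier rule would also need to be removed, for example with sudo ufw delete allow 3000/tcp. The same pattern applies to port 8086 if only known hosts should push metrics.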

Once done, we can open a web browser and test whether our Grafana dashboard works by pointing the browser to http://{IP to our server}:3000. We should see the Grafana login page.
The first login is with username admin and password admin. Grafana will ask you to change the password at first login.

Install Telegraf

Telegraf is the agent responsible for collecting and reporting metrics and data. Telegraf needs to be running on all monitored systems. You can of course monitor your InfluxDB/Grafana host, but ideally you’d install Telegraf on the systems which you want to see in the monitoring dashboards.
Telegraf can be installed from the InfluxData repositories (it is part of the same repository as InfluxDB). As we already have the repository configured in apt on this host, we just need to run the following command to get Telegraf installed. On a host where the repository is not configured yet, repeat the repository and key steps from the InfluxDB section first.

$ sudo apt-get install telegraf

And again, we need to start Telegraf and enable it to start after each reboot:

$ sudo systemctl start telegraf
$ sudo systemctl enable telegraf

Let's see the status of the Telegraf service:

$ sudo systemctl status telegraf
● telegraf.service - The plugin-driven server agent for reporting metrics into InfluxDB
   Loaded: loaded (/lib/systemd/system/telegraf.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2019-01-06 08:28:19 EET; 1min 41s ago
     Docs: https://github.com/influxdata/telegraf
 Main PID: 14115 (telegraf)
    Tasks: 12 (limit: 2320)
   CGroup: /system.slice/telegraf.service
           └─14115 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d

You are done! Next step is to start configuring the newly installed monitoring environment.
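With three services installed, a single check saves running systemctl status three times. This is a minimal sketch, assuming all three run on the same monitoring host under the unit names used above; the function name `check_stack` is made up for illustration:

```shell
# Hypothetical helper: report whether each service of the monitoring stack is running.
check_stack() {
  for svc in influxdb grafana-server telegraf; do
    # is-active --quiet prints nothing and returns 0 only when the unit is active
    if systemctl is-active --quiet "$svc"; then
      echo "$svc: running"
    else
      echo "$svc: NOT running"
    fi
  done
}

# Usage: check_stack
```

On hosts that only run Telegraf, the list of unit names in the loop would be reduced to telegraf.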

Basic configuration

There could be several additional components installed, but we can cover these later on. For now, let's stick to the simple next steps.

Once everything is installed, we can do some basic configuration of the monitoring. The file /etc/influxdb/influxdb.conf holds the configuration, with all the default settings after installation. We can also print the configuration with the command:

$ influxd config > influxdb.generated.conf

You can go through the configuration and get some help from the InfluxDB documentation. Once you have made some changes, you can either copy the file to /etc/influxdb/influxdb.conf or just start InfluxDB with the following command:

$ influxd -config influxdb.generated.conf

First we need a database in InfluxDB that collects all the information. Since we added Telegraf to our system, the telegraf database should already have been created in InfluxDB. Let's check by opening the Influx CLI:

$ influx
Connected to http://localhost:8086 version 1.7.2
InfluxDB shell version: 1.7.2
Enter an InfluxQL query
>

With that you are logged in to the Influx CLI. The help command will give you some insight into what you can do there, but let's go through a couple of important commands here:

> show databases
name: databases
name
----
_internal
telegraf
>

The command shows us that we have two databases already configured: _internal and telegraf. The telegraf database is usually created when Telegraf is started for the first time. If you did not install Telegraf on your monitoring server, you might not see the database. In that case it is fine to create it manually with the following command:


> CREATE DATABASE telegraf

The _internal database is used by InfluxDB itself for internal metrics. We are interested in what is going on in the telegraf database:

> use telegraf
Using database telegraf

We are connected to the telegraf database now, and we can issue some commands like:

> show measurements
name: measurements
name
----
cpu
disk
diskio
kernel
mem
processes
swap
system
>

This lists the measurements collected in the database. Since our Telegraf is not configured yet, we should not have any information stored there. We can test it with:

> SELECT * FROM cpu
> 

There should not be any data stored. You can now leave the influx shell with the exit command.
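The same statements can also be run without opening the interactive shell, which is convenient in scripts. This is a minimal sketch, assuming the InfluxDB 1.x influx client with its -database and -execute options; the function name `influx_q` is made up for illustration:

```shell
# Hypothetical helper: run one InfluxQL statement without opening the shell.
# $1 = statement, $2 = database (defaults to telegraf)
influx_q() {
  influx -database "${2:-telegraf}" -execute "$1"
}

# Usage:
#   influx_q 'SHOW MEASUREMENTS'
#   influx_q 'SELECT * FROM cpu ORDER BY time DESC LIMIT 5'
```

The LIMIT clause in the second usage line keeps the output short once Telegraf has been writing for a while.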

Configure Telegraf

When all is done, we can go ahead and configure Telegraf to send data to our InfluxDB and the telegraf database. For that we need to update the /etc/telegraf/telegraf.conf file.

$ sudo nano /etc/telegraf/telegraf.conf

Look for the section [[outputs.influxdb]] and under it uncomment and change the following lines:

  urls = [ "http://{IP address to your InfluxDB host}:8086" ]
  database = "telegraf"

When creating the database, you can also set a username and password for storing data in it. If you did so, you also need to uncomment and change the following lines:

  username = "your influxdb username"
  password = "your influxdb password"

This tells Telegraf where to send the collected data. If Telegraf runs only on the InfluxDB host itself, we can give the IP address as 127.0.0.1. However, if you install Telegraf on different hosts, you need to set the IP address to that of your InfluxDB server. The port stays the same.

Under [agent] you need to provide the name of your monitored host:
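When the same settings have to be applied on several hosts, it can help to generate the fragment instead of typing it each time. This is a minimal sketch that only prints the lines for review; the function name `render_influx_output` is made up for illustration, and 192.0.2.10 is a placeholder address:

```shell
# Hypothetical helper: print the [[outputs.influxdb]] settings for a given InfluxDB host.
# $1 = IP address or name of the InfluxDB host, $2 = database (defaults to telegraf)
render_influx_output() {
  cat <<EOF
[[outputs.influxdb]]
  urls = ["http://$1:8086"]
  database = "${2:-telegraf}"
EOF
}

# 192.0.2.10 is a placeholder; use the address of your InfluxDB server
render_influx_output 192.0.2.10
```

The output is meant to be compared with, or pasted over, the matching lines in /etc/telegraf/telegraf.conf. Adding it as a second [[outputs.influxdb]] section next to the existing one would make Telegraf write to both targets.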

hostname = "{monitored host name}"

Most of the input plugins are already enabled, but there are a few more you can enable yourself. Go through the Telegraf configuration file to see what each [[inputs.*]] section collects. With the default inputs you should be good to start monitoring after a restart of the telegraf service.

$ sudo systemctl restart telegraf
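If the service does not come up, or you want to see what the enabled inputs collect, Telegraf can gather the metrics once and print them instead of sending them. This is a minimal sketch, assuming the --test option of the telegraf binary and the packaged configuration path; the function name `telegraf_dry_run` is made up for illustration:

```shell
# Hypothetical helper: run the configured inputs once and print the metrics to
# stdout, without sending anything to InfluxDB.
# $1 = config file (defaults to the packaged location)
telegraf_dry_run() {
  telegraf --config "${1:-/etc/telegraf/telegraf.conf}" --test
}

# Usage: telegraf_dry_run | head
```

A syntax error in the configuration file shows up here as an error message, which is usually quicker to read than the service log.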

Now run the influx database commands once again and you should see a lot more information there.

If this is the case, then you are done with the basic configuration. Go ahead and install Telegraf on your other hosts as well, and set the InfluxDB IP address in their telegraf.conf files.
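Once several hosts are reporting, it is useful to confirm that each of them actually shows up in the database. Telegraf tags every point with a host tag, so the tag values can be listed. This is a minimal sketch, assuming the InfluxDB 1.x influx client and the telegraf database; the function name `reporting_hosts` is made up for illustration:

```shell
# Hypothetical helper: list the hosts that have written points to a measurement.
# $1 = measurement (defaults to cpu)
reporting_hosts() {
  influx -database telegraf -execute "SHOW TAG VALUES FROM \"${1:-cpu}\" WITH KEY = \"host\""
}

# Usage: reporting_hosts        (hosts seen in the cpu measurement)
#        reporting_hosts mem    (hosts seen in the mem measurement)
```

A host missing from this list usually points to the hostname setting, the urls line in its telegraf.conf, or the firewall rule for port 8086.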

When done, you can open Grafana and start creating dashboards for monitoring your systems. There are also plenty of ready-made dashboards available on the Grafana website, which you can import into your monitoring system and enjoy the graphs. Just make sure that the data source is InfluxDB.
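The data source is normally added in the Grafana web interface, but it can also be created through Grafana's HTTP API. This is a minimal sketch that prints the request body for the POST /api/datasources call. The function name `datasource_json` and the data source name "InfluxDB" are made up for illustration, and the exact field names may differ between Grafana versions:

```shell
# Hypothetical helper: print the JSON body for Grafana's POST /api/datasources call.
# $1 = InfluxDB URL as seen from the Grafana server, $2 = database (defaults to telegraf)
datasource_json() {
  printf '{"name":"InfluxDB","type":"influxdb","access":"proxy","url":"%s","database":"%s"}\n' \
    "$1" "${2:-telegraf}"
}

datasource_json http://localhost:8086

# Usage (replace the password and host with your own):
#   curl -s -u admin:YOUR_PASSWORD -H 'Content-Type: application/json' \
#     -X POST http://localhost:3000/api/datasources \
#     -d "$(datasource_json http://localhost:8086)"
```

Because Grafana and InfluxDB share a host in this setup, the URL can point to localhost; with separate hosts it would be the address of the InfluxDB server.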

Final word

There are many other solutions available for monitoring your environments, and they are as easy to install as the one described here. You just need to understand your needs to pick the right one. Why did we go for the InfluxDB/Telegraf setup here? InfluxDB is a time series database for storing events. As soon as an event is triggered, it is picked up by Telegraf and stored in the InfluxDB database, so it is likely that all events that occurred are stored and visible in the Grafana view. The disadvantage is that if there has not been any change, you might not see any data in Grafana either, or you need to configure the graph to show the last state. If you want to see the current or last state, you might need to use a system that pulls data from the monitored host. However, there is then a risk of missing an important event that never gets stored in the database.

So choose wisely!
