This post is the second (and last) part in this short series about building an Application Performance Monitoring (APM) solution with open source tools.
In part 1, we were able to build a simple solution to collect log statements, business metrics and JVM performance metrics (Logstash, JMX), and to retrieve data out of the store where they are indexed (Elasticsearch).
In part 2, we will configure a rich dashboard to easily visualize the information stored in Elasticsearch.
We will start from what we completed in part 1: the Pet Clinic application is up, writing logs through its logging framework to a text file and publishing JVM metrics through JMX. Logstash is parsing the logs, filtering out the interesting JVM metrics and sending everything to Elasticsearch, which is up and running and answering queries sent from the browser through its REST API.
Step 5 – Running Kibana
The next step is to configure and launch Kibana. Kibana is the visualization platform that will be used to easily consume the information collected from the application (or from many applications). Kibana will allow us to create predefined queries, filters and visualizations (a chart of a given type with a given query to feed it). By composing visualizations, Kibana lets us define dashboards, which can be customized to match each stakeholder’s needs, e.g. JVM performance metrics and debug logs for developers, info logs and business metrics for business analysts, and so on.
Kibana can be downloaded here: https://www.elastic.co/downloads/kibana
Extract the Zip contents to any folder of your choice. We will refer to it as KIBANA_HOME if needed.
In the KIBANA_HOME/config/kibana.yml file, we must configure the correct URL (host:port) where Elasticsearch is running, as well as the IP address or DNS alias to which the Kibana server will bind:
    # Specifies the address to which the Kibana server will bind.
    # IP addresses and host names are both valid values.
    # The default is 'localhost', which usually means remote machines
    # will not be able to connect. To allow connections from remote
    # users, set this parameter to a non-loopback address.
    server.host: "localserver"
    ...
    # The URL of the Elasticsearch instance to use for all your queries.
    elasticsearch.url: "http://localserver:9200"
Next, launch Kibana by running this command from KIBANA_HOME (on Windows, use bin\kibana.bat):

bin/kibana
If everything is ok, you should be able to access the Kibana dashboard by opening this URL in any browser (replace localserver with your server name or IP address; 5601 is Kibana’s default port):

http://localserver:5601
Kibana dashboard loads for the first time.
On first run, Kibana is not connected to any index in Elasticsearch, that is, to any time-series store where data is being indexed. One Elasticsearch instance (or cluster) may hold many different indexes of diverse nature and source. In this case, we will simply connect to the “logstash-*” index – be sure that @timestamp is selected as the field that contains the time part of the event before clicking the Create button:
Initializing Kibana with Elasticsearch index to consume.
Next, Kibana will show a list of fields identified from the events already indexed. It is possible to select individually which fields are searchable and aggregatable, and to set specific formats if needed. At this point, let’s accept the sensible defaults (by doing nothing).
Click the Discover icon in the left menu. The default histogram and event list should load shortly:
Kibana showing the default histogram of events in the “logstash-*” index.
The search box at the top of the page, which initially shows ‘*’ as it is visualizing every event in the index, accepts any criteria we may need to filter the events to be visualized at any given moment. For example, let’s submit this query:
app-name:petclinic AND metric_path:*MemoryUsed
The results in the histogram will be filtered to match the given criteria:
Kibana showing filtered results based on a given query.
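The same filter can be reproduced outside Kibana with the Elasticsearch REST API used in part 1. Below is a sketch in shell; the hostname localserver follows this series’ convention, and the only transformation needed is URL-encoding the spaces in the Lucene query:

```shell
# The Lucene query exactly as typed in the Kibana search box
q='app-name:petclinic AND metric_path:*MemoryUsed'

# URL-encode the spaces so the query survives as a single URL parameter
encoded=$(printf '%s' "$q" | sed 's/ /%20/g')
echo "$encoded"   # → app-name:petclinic%20AND%20metric_path:*MemoryUsed

# Run it against a live cluster (uncomment with Elasticsearch running):
# curl "http://localserver:9200/logstash-*/_search?q=${encoded}&size=5&pretty"
```

The query-string search is handy for quick checks; Kibana builds richer request bodies, but both hit the same `_search` endpoint.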
Step 6 – Create Searches
As it is neither practical nor productive to type queries every time, not to mention that it requires certain skills some stakeholders may not possess, Kibana allows saving and restoring as many search criteria as needed. Each composition of a given query, together with the columns to be shown in the event list, is stored in what is called a Search.
At design time, it is also impractical to create Searches for every single criterion we might need. It is better, therefore, to define carefully what is going to be visualized in an eventual dashboard and to fragment it into individual searches that can be created, tested and saved for future use.
In this example, the following four Searches will be needed:
- All log events, regardless of log level, source app name or source host. Columns to be visualized should include the three aforementioned fields plus the timestamp, the log message and the log class.
- All system and process CPU load events, regardless of source app name or source host. Columns to be visualized should include the two aforementioned fields plus the timestamp, the metric name (metric_path) and the metric value (metric_value_number).
- All heap memory usage events, regardless of source app name or source host. Columns to be visualized should include the two aforementioned fields plus the timestamp, the metric name (metric_path) and the metric value (metric_value_number).
- All call monitor count and average time events, regardless of source app name or source host. Columns to be visualized should include the two aforementioned fields plus the timestamp, the metric name (metric_path) and the metric value (metric_value_number).
6.1 Creating the log events Search
Let’s create, step by step, the log events Search. Let’s start by clicking New in the top menu (to ensure the search is clean of any previous state), and then let’s type this simple query in the search box:

type:log

That simple query filters events by their type. As Logstash is already adding the type field with the value “log” to every log statement read from the log text files, filtering the events is very simple.
Next, before saving the Search, let’s select which fields will be shown as columns in the event list.
As the timestamp field was already identified as the time selector field, it does not need to be specifically added to the field list. Therefore, locate the fields – type, app-name, host, class, loglevel, logmessage – in the left column and click the Add button that appears next to each field name:
Adding fields to the event list.
The result, once all fields are selected, will be like this:
The event list once columns are configured.
And now that the Search is ready, it can be saved by clicking the Save button on the top menu and naming it, for example “app logs”:
Saving a Search.
6.2 Creating the CPU load Search
Next, let’s create the CPU load Search. As the definitions above suggest, all the JMX-based Searches are very similar, so this one will be explained step by step and, for the other two, only the query will be given.
Let’s start by clicking New in the top menu (to clean the state from the previous Search), and then let’s type this simple query in the search box:
type:jmx AND (metric_path:jvm.OperatingSystem.*CpuLoad)
This query is a bit more elaborate: it filters events not only by type “jmx” but also selects certain metrics by their name. As this Search targets all “CpuLoad” metrics, a wildcard can be used to simplify the query. The query above will therefore match events for the metric names “jvm.OperatingSystem.SystemCpuLoad” and “jvm.OperatingSystem.ProcessCpuLoad”.
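To see exactly which names the wildcard captures, shell globbing behaves like the Lucene wildcard for these particular metric names (the third name is included only to show a non-match):

```shell
# Same pattern as in the Search query: '*' matches any run of characters
for m in jvm.OperatingSystem.SystemCpuLoad \
         jvm.OperatingSystem.ProcessCpuLoad \
         jvm.Memory.HeapMemoryUsage.used; do
  case "$m" in
    jvm.OperatingSystem.*CpuLoad) echo "match: $m" ;;
    *)                            echo "no match: $m" ;;
  esac
done
# → match: jvm.OperatingSystem.SystemCpuLoad
# → match: jvm.OperatingSystem.ProcessCpuLoad
# → no match: jvm.Memory.HeapMemoryUsage.used
```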
Next, let’s select which fields will be shown as columns in the event list. For this Search, the fields to add are: type, app-name, host, metric_path and metric_value_number:
Selecting columns for CPU load Search.
And now that the Search is ready, let’s save it with a meaningful name like for example “app jmx cpu load”.
6.3 Heap memory usage and call monitor Searches
Once the previous step is completed, only two Searches remain. As already explained, they are very similar to the previous one. The field selection can be reused (so do not click the New button); just change the query to filter the right metrics and save each Search under a different name.
For the heap memory usage Search, this is the query to be used:
type:jmx AND (metric_path:jvm.Memory.HeapMemoryUsage.*)
A possible name for it would be “app jmx memory heap”.
Finally, for the call monitor Search, this is the query to be used:
type:jmx AND (metric_path:app.CallMonitor.CallCount OR metric_path:app.CallMonitor.CallTime)
And the suggested name would be “app jmx call monitor”.
Step 7 – Create Visualizations
Now that Searches are ready, it is time to create the Visualizations.
Each Visualization is a combination of a Search providing the data, a chart type defining the presentation of the data, and further filters and aggregations controlling how data points are laid out on each chart axis, i.e. which data goes on the X axis and which on the Y axis.
At design time, not only Searches but also Visualizations must be carefully defined, to ensure that only those needed are created.
In this example, the following five Visualizations will be needed:
- A vertical bar chart for log events.
- A line chart for CPU load events, as two overlapping series: one for system CPU and the second for process CPU.
- A line chart for heap memory usage events, as four overlapping series: initial heap, used heap, committed heap and maximum heap.
- A line chart for call monitor call count.
- A line chart for call monitor average time.
7.1 Creating the log events Visualization
To start creating the Visualization, click the Visualize button in the left menu, and then click the Vertical bar chart option at the bottom of the left column:
Selecting the Vertical bar chart.
The next step is to select which Search will be used. In the right column all saved Searches are listed so it is easy to select “app logs” Search.
For the Y-Axis, ensure that aggregation is Count, and use a friendly custom label for the axis, e.g. “Log events”:
Configuring the Y-Axis.
For the buckets, select the X-Axis type. The aggregation will be a Date Histogram on the @timestamp field. For the interval, let’s select Minute, and as a custom label for the axis let’s use “Server time”:
Configuring the X-Axis.
Clicking the ‘play’ (‘triangle’) button at the top of the left column will apply the definition and show the result in the right column. If the application is being used, logs will start to show in the chart as expected:
Validating the Visualization configuration is Ok.
Once validated and ready, click on Save at the top and enter a meaningful name for the Visualization, for example “app logs”.
7.2 Creating the CPU load Visualization
The CPU load Visualization is started by selecting a Line chart in the list of available types. The configuration process is very similar to the log events one: select the Search (“app jmx cpu load”), Y-Axis configuration and buckets configuration.
For the Y-Axis, the most meaningful aggregation to select is Average. With Average, the Visualization will play nicely when metrics are aggregated across multiple apps/modules/services (identified by the app-name field) and instances (identified by the host field), and it will still give meaningful values when any filter is applied on top of the base Search, e.g. when further filtering the metrics to app-name “petclinic” or any other app name.
For the buckets, given that there are two time series overlapping in the same chart, it is required to add a Split Lines bucket, to filter which values will be shown in each series, before adding the X-Axis bucket.
Once the Split Lines bucket is added, use Filters as the aggregation type, and include two filters with the following sub-queries. For CPU system load:

metric_path:jvm.OperatingSystem.SystemCpuLoad

And for CPU process load:

metric_path:jvm.OperatingSystem.ProcessCpuLoad
It is also a good idea to add a label to each series by clicking the ‘label’ button and entering the actual label:
Configuring the filters for multiple time series along the X-Axis.
Once the Split Lines are configured, add the X-Axis following the same steps as with the previous log events Visualization, apply the configuration (‘play’ button) and check whether it is working as expected on the right column preview:
Validating that the CPU load Visualization is working as expected.
Finally, save the Visualization with name “app jmx cpu load”.
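Under the hood, the Visualization just saved corresponds to an Elasticsearch aggregation request along these lines: a date_histogram bucket per minute with an avg sub-aggregation on metric_value_number. This is only a sketch – the exact body Kibana generates depends on its version – with field names as defined by the Logstash pipeline in part 1:

```json
{
  "query": {
    "query_string": { "query": "type:jmx AND (metric_path:jvm.OperatingSystem.*CpuLoad)" }
  },
  "aggs": {
    "per_minute": {
      "date_histogram": { "field": "@timestamp", "interval": "1m" },
      "aggs": {
        "avg_cpu_load": { "avg": { "field": "metric_value_number" } }
      }
    }
  }
}
```

Posting this body to http://localserver:9200/logstash-*/_search?size=0&pretty with curl returns the same averaged series the chart plots.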
7.3 Creating the heap memory usage Visualization
For the heap memory usage Visualization, the process is similar to the one just described for CPU load. Obviously the Search is “app jmx memory heap” and this time there are four time series to be configured in the Split Lines filters.
These are the sub-queries that must be added as filters for each time series. For the initial memory:

metric_path:jvm.Memory.HeapMemoryUsage.init

For the used memory:

metric_path:jvm.Memory.HeapMemoryUsage.used

For the committed memory:

metric_path:jvm.Memory.HeapMemoryUsage.committed

And for the maximum memory:

metric_path:jvm.Memory.HeapMemoryUsage.max
The rest is straightforward, as the steps are the same ones followed in the previous Visualizations:
Finally, check the Visualization and save it under the name “app jmx memory heap”:
Checking the heap memory usage Visualization.
7.4 Creating the call monitor Visualizations
For the two call monitor Visualizations, the process may seem straightforward after having created a chart with four series. However, remember that the Searches were defined on purpose so that one Search feeds two Visualizations.
In this case, each Visualization will have a Split Lines bucket with only one filter.
For the call monitor call count Visualization, the filter will be:
type:jmx AND metric_path:app.CallMonitor.CallCount
And for the call monitor average time Visualization, the filter will be:
type:jmx AND metric_path:app.CallMonitor.CallTime
Configure the Y-Axis and X-Axis as with the other Visualizations, enter friendly labels for both axes, and finally validate and save the Visualizations.
The call monitor call count Visualization.
The call monitor average time Visualization.
Step 8 – Composing the Dashboard
And now it’s finally the time. All the steps have prepared us for this: creating the dashboard to monitor our application.
Let’s start by clicking the Dashboard button in the left menu and then clicking Add at the top menu. A list of available Visualizations will be presented.
Click each of them sequentially and they will be added to the dashboard. For the moment, do not worry about the layout; we will work on that in a minute. Once the five are added, click the Add button again to hide the list of Visualizations. This is how the dashboard looks:
The starting point of the dashboard design.
Let’s work on the layout. First, let’s hide data series legends to maximize the real estate for the charts. This is done by clicking on the grey circled arrow buttons:
Hiding legends to maximize chart visualization area.
Next, let’s expand the log histogram to maximum width and enough height to fill a typical 1080p screen:
Expanding the log histogram.
And that is the last customization needed. Now let’s save the dashboard so it can be used by either opening it or simply referring to it in the URL. Click on the ‘Save’ button at the top menu to do it and provide a meaningful name like “app dashboard”.
Once saved, the monitoring dashboard is finished and ready to be shared with stakeholders: just provide them the dashboard’s URL, as shown in the browser’s address bar once the dashboard is open.
To battle test the dashboard, and the ability to apply filters on top of it by adding queries to the search box at the top, let’s use a simple Apache JMeter script to inject load into the Pet Clinic application (a sample script can be found in the Pet Clinic repository on GitHub by following the link provided in the part 1 article).
After a few minutes, Kibana will be showing enough historic data in the dashboard and we can ‘ask questions’ to find out more about the behavior of the application and its services and components.
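For example, typing a query like this in the dashboard’s search box narrows every Visualization at once to a single application (“petclinic” being the app-name used throughout this series):

```
app-name:petclinic
```

From there, adding clauses on loglevel or host restricts the view further, exactly as in the Discover queries shown earlier.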
You may further customize the dashboard, or make edits and save them under a different name to fulfill the needs of different stakeholders. You might even set up a dark background if you fancy that color scheme:
Dark themed Kibana dashboard.
In this two-part series I have shown how easy it is to create a simple application performance monitoring solution by leveraging open source tools like Elasticsearch, Logstash and Kibana, together with the Java Management Extensions (JMX).
I hope you find it useful and, as always, questions and comments are welcome.