Monitor Cassandra Clusters with Percona PMM - JMX Grafana and Prometheus
Reading this title, you may wonder what this guy is going to do. It's all about the JMX exporter from Prometheus, plus Grafana. Yes, this is already implemented by many companies, and Grafana has some cool dashboards. But as DBAs at Searce, we manage many customers, and they use many types of databases. So from a DBA's perspective, monitoring all databases in one place is always a great thing. If you are a MySQL DBA, you must have heard of PMM. It's an awesome open-source monitoring tool, and it also has dashboards for Linux metrics, MongoDB and PostgreSQL.
I remember that 2 years back (2017) I was trying to set up monitoring for a huge Cassandra cluster with MX4J. It was very hard at that time to understand the metrics. But once I started using PMM, I became a big fan of it. So I wanted to integrate a Cassandra monitoring dashboard with PMM.
Before starting, I looked into how the JMX exporter behaves and whether it would cause any trouble for my cluster. Unfortunately, it does in a few cases. Then I found this amazing custom JMX exporter developed by Criteo. It gives better control over the metrics, so I decided to use it.
Stage 1: Install cassandra_exporter
You don't need to change anything in your Cassandra settings or in cassandra-env.sh.
Download the exporter jar from the project's releases page: https://github.com/criteo/cassandra_exporter/releases
Create the config file, config.yml. Its key settings:
- blacklist - Metrics matching these patterns are never collected.
- maxScrapFrequencyInSec - How often each group of metrics is collected, in seconds.
- Here, all metrics are collected every 10 seconds, and the metrics listed under 300 are collected every 300 seconds.
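As a rough illustration of the settings above, here is a minimal sketch that writes a config.yml with a 10-second and a 300-second group. The key names follow the example config in the exporter's README as far as I know, and the specific metric patterns are only illustrative assumptions, so check them against the README before using this.

```shell
# Where to write the file; ./config.yml is an assumption, adjust as needed.
CONFIG_FILE="${CONFIG_FILE:-./config.yml}"

# Quoted heredoc delimiter so the regex patterns are written literally.
cat > "$CONFIG_FILE" <<'EOF'
# JMX endpoint of the local Cassandra node (7199 is Cassandra's default JMX port)
host: localhost:7199
ssl: False
user:
password:
# Port the exporter listens on; PMM scrapes this port later
listenPort: 8080
blacklist:
  # Illustrative pattern only: metrics matching it are never collected
  - java:lang:memorypool:.*usagethreshold.*
maxScrapFrequencyInSec:
  # Everything is collected at most every 10 seconds...
  10:
    - .*
  # ...except these (illustrative) expensive metrics, every 300 seconds
  300:
    - .*:snapshotssize:.*
    - .*:estimated.*
    - .*:totaldiskspaceused:.*
EOF
```

Keeping the expensive metrics in the slower group is what reduces the load on the cluster.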
Start the exporter:
I ran this using nohup, but you can create a service for it.
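Both options might look roughly like the sketch below. The /opt/cassandra_exporter directory, the jar file name, the cassandra user, and the /usr/bin/java path are all assumptions for illustration; the exporter itself only needs the jar and the path to config.yml.

```shell
# Assumed locations; adjust to wherever you placed the jar and config.yml.
EXPORTER_DIR="${EXPORTER_DIR:-/opt/cassandra_exporter}"
UNIT_FILE="${UNIT_FILE:-./cassandra_exporter.service}"

# Option 1: quick background start with nohup (defined here, not run automatically).
start_exporter_nohup() {
  nohup java -jar "$EXPORTER_DIR/cassandra_exporter.jar" "$EXPORTER_DIR/config.yml" \
    > "$EXPORTER_DIR/exporter.log" 2>&1 &
}

# Option 2: write a systemd unit so the exporter is restarted on failure.
# The file is written locally; installing it is a separate, manual step.
cat > "$UNIT_FILE" <<EOF
[Unit]
Description=Cassandra exporter for Prometheus
After=network.target

[Service]
# Assumed service account; use whichever user can reach the JMX port
User=cassandra
ExecStart=/usr/bin/java -jar $EXPORTER_DIR/cassandra_exporter.jar $EXPORTER_DIR/config.yml
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
```

To use the unit, copy it to /etc/systemd/system/, run systemctl daemon-reload, and then systemctl enable --now cassandra_exporter.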
Stage 2: Install and Configure PMM Server
More customized installation: https://www.percona.com/doc/percona-monitoring-and-management/deploy/server/docker.setting-up.html
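For reference, a basic Docker setup of a PMM 1.x server can be sketched as below. This assumes Docker is already installed, that the percona/pmm-server:1 image tag is the one you want, and that port 80 is free on the host. The function name is just for illustration, and nothing runs until you call it.

```shell
# Assumed image tag for PMM 1.x; pin an exact version in real setups.
PMM_IMAGE="${PMM_IMAGE:-percona/pmm-server:1}"

setup_pmm_server() {
  # Pull the server image.
  docker pull "$PMM_IMAGE" &&
  # Create a data-only container so metrics survive server upgrades.
  docker create \
    -v /opt/prometheus/data \
    -v /opt/consul-data \
    -v /var/lib/mysql \
    -v /var/lib/grafana \
    --name pmm-data "$PMM_IMAGE" /bin/true &&
  # Start the server, reusing the volumes from the data container.
  docker run -d -p 80:80 \
    --volumes-from pmm-data \
    --name pmm-server \
    --restart always "$PMM_IMAGE"
}
```

Call setup_pmm_server on the monitoring host, then open http://<server-ip>/ in a browser to check that the PMM landing page loads.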
Install PMM Client on all Cassandra Nodes:
Add Cassandra Node to PMM Server:
Enable Linux Metrics:
This provides the common Linux monitoring metrics. Port 42000 must be open to the PMM server.
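The client-side steps above can be sketched as one small function. It assumes the pmm-client package (PMM 1.x) is already installed on the node and that you run it as root; the function name and the example IP are hypothetical.

```shell
# Register this node with the PMM server and enable Linux metrics.
# $1 is the PMM server address.
register_node() {
  pmm_server="$1"
  # Point the client at the PMM server.
  pmm-admin config --server "$pmm_server" &&
  # Enable OS-level metrics (served on port 42000).
  pmm-admin add linux:metrics &&
  # Show what is being monitored, as a quick sanity check.
  pmm-admin list
}
```

For example, register_node 10.0.0.5 on each Cassandra node, replacing the address with your PMM server's IP.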
Stage 3: Add Cassandra to PMM
PMM has a feature called external services, which lets PMM capture metrics from your own exporters. Enable the Cassandra metrics as an external service.
In the config.yml file, we set the listen port to 8080, so the external service will use this port to get the metrics. Port 8080 must also be open to the PMM server.
Now PMM is collecting the metrics, but we can't visualize them without a proper dashboard. The Criteo team has already built a dashboard and published it in the Grafana repo, so we can import it from there.
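A minimal sketch of registering the exporter as an external service is shown below. It assumes PMM 1.x's pmm-admin, that curl is available, and that the exporter answers on /metrics; the job name cassandra and the function name are my own choices for illustration.

```shell
# Add the local Cassandra exporter to PMM as an external service.
# $1 is the exporter's listen port (defaults to 8080, as in config.yml).
add_cassandra_service() {
  port="${1:-8080}"
  # First confirm the exporter responds locally; otherwise PMM would scrape nothing.
  if ! curl -sf "http://localhost:$port/metrics" > /dev/null; then
    echo "exporter not reachable on port $port" >&2
    return 1
  fi
  # Register the exporter under the job name "cassandra".
  pmm-admin add external:service cassandra --service-port="$port"
}
```

Run add_cassandra_service 8080 on every Cassandra node, and use pmm-admin list to confirm the service shows up.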
- Go to Grafana -> click the plus (+) button -> Import.
- Paste this dashboard id here:
- Select Prometheus as the data source.
Wait 5 to 10 minutes, and then you'll see the data.
Generally, it's not good practice to scrape the metrics very frequently. For my workload 10 seconds is fine, but test thoroughly in your own infrastructure before going to production. You can also learn more about this custom exporter here: https://github.com/criteo/cassandra_exporter/