Skip to main content

Command Palette

Search for a command to run...

How to create a Grafana Alert Rule.

Updated
2 min read
How to create a Grafana Alert Rule.
T

I am DevOps Engineer from Greece,I also founder of NgCloudOps is providing DevOps and Cloud Consulting Services. I love to work with cutting edge technologies like Kubernetes and of course with other Cloud Native Projects, I love building Continues Delivery Solutions with the use of Flux and ArgoCD.

More information about me can be found in the following url: https://t-velmachos.start.page/

Hello, I would like to show you how easy it is to create a User defined Grafana Alert. In this example I will use a simple PromQL to retrieve the current RAM allocation of a specific Job and evaluate it against a specific set of conditions. For example I would like to know when the specific Job has allocated more than "X" GB of Ram or if these specific Job has stopped.

So, Lets Dive in...

You can access the Official Documentation which include a really helpful video to help you get started.

Use of Kube-state-metrics

Kube-state-metrics listens to the Kubernetes API server and generates metrics about the state of numerous Kubernetes objects, including cron jobs, config maps, pods, and nodes. These metrics are unmodified, unlike kubectl metrics that use the same Kubernetes API but apply some heuristics to display comprehensible and readable messages.

Kube-state-metrics uses the Golang Prometheus client to export metrics in the Prometheus metrics exposition format and expose metrics on an HTTP endpoint. Prometheus can consume the web endpoint.

This tool is not oriented toward performance and health but rather toward cluster-wide, state-based metrics such as the number of desired pod replicas for deployment or the total CPU resources available on a node.

Actual Steps:

Step 1: Access Alerting Section by clicking Alert Icon is located in the Sidebar Section.

Step 2: Create a new Alert Rule as you can see from the following screenshot

Image description

Step 3: Select Prometheus as the preferred data source in the field metric paste the PromQL Query: container_memory_usage_bytes{pod="",container=""}/1073741824 # Convert to GB

Image description

Step 4: Specify the conditions you want trigger the alert when the evaluation of the conditions are violating the defined threshold.

Image description

Step 5: Specify the interval of the evaluation and the time Grafana needs to wait after an alert has breached the defined threshold.

Image description

Finally click the button Save and Exit button is located in the top right corner. Congratulations you have created the your first alert.

Image description

Also you can find some useful PromQL queries from here the official Prometheus Cheetsheet.

I hope you like the tutorial, if you do give a thumps up! and follow me in Twitter, also you can subscribe to my Newsletter in order to avoid missing any of the upcoming tutorials.

Media Attribution

I would like to thank Clark Tibbs for designing the awesome photo I am using in my posts.

More from this blog

T

TVelmachos-DailyDevOps

24 posts

I am a passionate DevOps. I working as a consultant, who can deliver quality services.