Observability
Overview
Observability is a key aspect of running modern applications. Robust Observability helps businesses meet the following goals:
- Gain a holistic view of the application's health.
- Identify and diagnose issues to reach the root cause faster.
- Achieve better operational reliability across the entire system.
Observability comprises three key pillars:
- Metrics - Useful for tracking how often an event occurs, how long an action takes, or the current value of a resource (such as CPU or memory).
- Logs - Useful for recording detailed information about an event, especially errors and warnings.
- Traces - Useful for understanding how a request was processed across different services and how much time was spent in each service.
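To make the distinction between the three pillars concrete, here is a minimal sketch that uses only the Python standard library. It is not how Quark or the Grafana stack collects data; the names `handle_request`, `span`, `request_count`, `durations`, and `spans` are hypothetical and exist only for illustration.

```python
import logging
import time
import uuid
from collections import Counter
from contextlib import contextmanager

# Metrics: numeric values that are aggregated over time.
request_count = Counter()  # how often an event occurred
durations = []             # how long an action took, in seconds

# Logs: detailed records of individual events, especially errors and warnings.
logging.basicConfig(level=logging.INFO)
log = logging.getLogger("checkout")

# Traces: timed spans that share one trace ID per request.
spans = []


@contextmanager
def span(name, trace_id):
    """Record how long the enclosed block took as one span of a trace."""
    start = time.perf_counter()
    try:
        yield
    finally:
        spans.append({
            "trace_id": trace_id,
            "name": name,
            "seconds": time.perf_counter() - start,
        })


def handle_request(order_id):
    """Handle one (simulated) request and emit a metric, a log line, and spans."""
    trace_id = uuid.uuid4().hex
    start = time.perf_counter()
    with span("handle_request", trace_id):
        with span("charge_card", trace_id):
            time.sleep(0.01)  # stand-in for a call to another service
        request_count["checkout"] += 1
        log.warning("order %s was slow to charge (trace %s)", order_id, trace_id)
    durations.append(time.perf_counter() - start)
    return trace_id
```

Calling `handle_request("A-1")` increments one counter, appends one duration, writes one warning, and records two spans that share a trace ID. In a real deployment, an instrumentation library and the agents on the cluster would take the place of these in-memory lists.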
The Quark platform provides Observability for all deployed Workloads so that platform users can understand how their applications are performing.
Customers can choose the Observability tool or stack that best fits their needs. Quark provides the following Observability options:
- Unified Observability - Quark's default Observability solution based on the open source Grafana stack.
- Datadog.
Unified Observability
Quark's Unified Observability is based on the open source Grafana stack. It uses the following open source projects:
- Prometheus + Victoria Metrics - for Metrics.
- Loki - for Logs.
- Tempo - for Traces.
To add Unified Observability to a Cluster:
- Click Clusters in the left menu, then click the Observability tab.
- Click the Add New button at the top right.
- You will be presented with a list of Observability stacks that you can add to Quark. Choose Unified Observability and click Continue.
- Provide a Name and an optional Description.
- Select a Cluster and then select a Node Pool.
- Select a Load Balancer.
Note: The load balancer you choose serves Observability traffic whenever you use the Dashboards or query Observability data in other ways. We recommend creating a dedicated load balancer for the Observability endpoint for better isolation and performance.
- Select the components to enable as part of Unified Observability:
  - Under Logs, enable Loki.
  - Under Tracing, enable Tempo.
  - Under Metrics, enable Victoria Metrics and select the Volume Zone.
  - Under Agents, enable Grafana Agent Operator, Kube State Metrics and Node Exporter. Optionally enable Push Gateway and Coroot Node Agent.
  - Under UI, enable Grafana. Optionally enable Coroot CE.
- Click the Review button to verify the details you have entered. Once everything looks good, click the Create button.
You should be taken back to the Observability list page, where your newly added Observability stack appears. Once the stack is deployed and fully configured in the cluster, its status becomes Available.
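The component selection in the steps above can be written down as a small checklist to compare against the Review screen before clicking Create. The sketch below is only a checklist and does not call any Quark API. It assumes that the components the steps say to enable are required and that the "optionally enable" ones are not. The `missing_components` helper and the shape of the `selection` dictionary are hypothetical.

```python
# Components the steps above say to enable, grouped by section of the form.
# Push Gateway, Coroot Node Agent and Coroot CE are optional and therefore omitted.
# The Volume Zone chosen for Victoria Metrics is a separate field, not covered here.
REQUIRED = {
    "Logs": {"Loki"},
    "Tracing": {"Tempo"},
    "Metrics": {"Victoria Metrics"},
    "Agents": {"Grafana Agent Operator", "Kube State Metrics", "Node Exporter"},
    "UI": {"Grafana"},
}


def missing_components(selection):
    """Return {section: [missing component names]} for a planned selection.

    `selection` maps a section name to the component names enabled in it.
    An empty result means every component listed in REQUIRED is enabled.
    """
    missing = {}
    for section, needed in REQUIRED.items():
        absent = needed - set(selection.get(section, ()))
        if absent:
            missing[section] = sorted(absent)
    return missing


planned = {
    "Logs": ["Loki"],
    "Tracing": ["Tempo"],
    "Metrics": ["Victoria Metrics"],
    "Agents": ["Grafana Agent Operator", "Node Exporter"],
    "UI": ["Grafana", "Coroot CE"],
}
print(missing_components(planned))  # {'Agents': ['Kube State Metrics']}
```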
Datadog
To create a Datadog stack:
- Click Clusters in the left menu, then click the Observability tab.
- Click the Add New button at the top right.
- You will be presented with a list of Observability stacks that you can add to Quark. Choose Datadog and click Continue.
- Provide a Name and an optional Description.
- Select a Cluster and then select a Node Pool.
- Select a Load Balancer.
- Choose a Datadog API key Provider.
Note: To add Datadog credentials, go to the Providers page and add your Datadog Provider details. Then select the Datadog provider here.
- Enable the Datadog Operator values.
- Click the Review button to verify the details you have entered. Once everything looks good, click the Create button.
You should be taken back to the Observability list page, where your newly added Observability stack appears. Once the stack is deployed and fully configured in the cluster, its status becomes Available.
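For either stack type, deployment takes some time before the status reaches Available. If you script around this, the wait can be sketched as a simple polling loop. This page does not describe a Quark API for reading the stack status, so `get_status` below is a hypothetical callable that you would have to supply. The only status name taken from this page is "Available".

```python
import time


def wait_until_available(get_status, timeout_s=600.0, interval_s=10.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns "Available" or the timeout expires.

    get_status is a caller-supplied (hypothetical) function that returns the
    current status string of the Observability stack.
    Returns True if the stack became Available, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if get_status() == "Available":
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The `sleep` and `clock` parameters are there only so that the loop can be exercised in tests without waiting in real time.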