Monitoring
Monitoring is an important part of operating any production system. To provide live insight into how traffic is flowing through the system, Synqly Embedded exports Prometheus-formatted metrics. The Prometheus metrics format is an industry standard for tracking operational data over time. Observability platforms such as New Relic, Datadog, Grafana, and Logz.io all natively support ingesting Prometheus metrics.
Metrics Collection
Synqly Embedded exports metrics on the v1/metrics API endpoint. To ingest the metrics into an observability platform, configure the platform's scraper to pull metrics from the embedded deployment's v1/metrics endpoint.
As an example, when running a New Relic Helm chart to scrape metrics from an entire Kubernetes cluster, the only configuration needed is a set of Prometheus annotations on the embedded pod.
Name:             embedded-78b977b899-tcmzc
Namespace:        synqly-embedded
Priority:         0
Service Account:  default
....
Annotations:      prometheus.io/path: v1/metrics
                  prometheus.io/scrape: true
When deploying embedded via the Synqly Embedded Helm Chart, these annotations can be added by setting the following value in values.yaml:
...
# Configuration that will be applied to every pod
pods:
  ...
  # true - adds "prometheus.io/scrape": true annotation to all Synqly pods
  prometheusScrape: true
Please note that if your metrics ingest tool is configured to pull metrics only from a specific namespace or deployment, it may need to be updated to include embedded.
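To make the annotation-based discovery above more concrete, here is a minimal sketch of how a scraper might turn the two pod annotations into a scrape URL. The pod IP, the port, the `/metrics` fallback path, and the `scrape_url` helper are all assumptions for illustration; your observability platform's scraper has its own discovery logic.

```python
from typing import Mapping, Optional


def scrape_url(pod_ip: str, port: int, annotations: Mapping[str, str]) -> Optional[str]:
    """Return the URL a scraper would pull, or None if the pod has not opted in."""
    # Only pods annotated with prometheus.io/scrape: true are scraped.
    if annotations.get("prometheus.io/scrape", "").lower() != "true":
        return None
    # Assumption: fall back to /metrics when no path annotation is present.
    path = annotations.get("prometheus.io/path", "/metrics")
    # The annotation shown above has no leading slash, so normalize it.
    return f"http://{pod_ip}:{port}/{path.lstrip('/')}"


# Annotations copied from the pod description above; IP and port are hypothetical.
embedded_annotations = {
    "prometheus.io/path": "v1/metrics",
    "prometheus.io/scrape": "true",
}
print(scrape_url("10.0.0.12", 8080, embedded_annotations))
```

A pod without the scrape annotation returns None here, which mirrors why a scraper restricted to other namespaces or deployments would not pick up embedded.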
Key Metrics
Request Durations
The http_durations_ms metric tracks how long calls take to complete. http_durations_ms supports the following dimension labels:
method: The HTTP method of the incoming request
path: The API endpoint of the incoming request
code: The HTTP response code returned by embedded
quantile: The quantile bucket that the given value represents
As an example:
http_durations_ms{code="204",method="POST",path="/v1/siem",quantile="0.99"} 1
This point-in-time value shows that, at the 0.99 quantile, POST calls to the v1/siem endpoint that resulted in a 204 completed in 1ms. In other words, 99% of similar calls took 1ms or less.
For more information on Prometheus Quantiles, please refer to Histograms and Summaries.
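If you want to inspect a sample like the one above outside of an observability platform, the following sketch splits one line of Prometheus text output into its metric name, labels, and value. It is a simplified parser written for this example (it ignores timestamps, comments, and escape sequences inside label values), and `parse_sample` is a hypothetical helper, not part of any Synqly or Prometheus library.

```python
import re
from typing import Dict, Tuple

# metric_name{label="value",...} value
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_sample(line: str) -> Tuple[str, Dict[str, str], float]:
    """Split one sample line into (metric name, labels, value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a metric sample: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


name, labels, value = parse_sample(
    'http_durations_ms{code="204",method="POST",path="/v1/siem",quantile="0.99"} 1'
)
print(name, labels["path"], labels["quantile"], value)
```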
The http_durations_ms metrics can be useful for tracking the performance of calls made to embedded.
embedded also tracks _sum and _count metrics for every code, method, and path combination.
http_durations_ms_sum: The sum of all request durations for the given label set since the last pod restart.
http_durations_ms_count: The total number of requests for the given label set since the last pod restart.
Both of these metrics support the following labels:
method: The HTTP method of the incoming request
path: The API endpoint of the incoming request
code: The HTTP response code returned by embedded
For example:
http_durations_ms_sum{code="200",method="POST",path="/v1/integrations"} 103776
http_durations_ms_count{code="200",method="POST",path="/v1/integrations"} 147
These point-in-time values show that there have been 147 POST calls made to v1/integrations that resulted in a 200 response code since the last embedded restart. The http_durations_ms_sum metric shows that those 147 calls took a combined total of 103776ms.
http_durations_ms_sum and http_durations_ms_count can be useful in combination with a rate function to track average call duration over time. For example, http_durations_ms_sum / http_durations_ms_count gives the average request duration for the given label set since embedded last restarted.
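The arithmetic behind both kinds of average can be sketched as follows. The first function is the since-restart average described above; the second approximates what a rate function does by looking only at the calls made between two scrapes. The second scrape's values are made up for illustration, and both helper names are hypothetical.

```python
from typing import Optional


def average_duration_ms(sum_ms: float, count: float) -> Optional[float]:
    """Average duration since the last restart; None if no calls were recorded."""
    return sum_ms / count if count > 0 else None


def windowed_average_ms(
    prev_sum: float, prev_count: float, curr_sum: float, curr_count: float
) -> Optional[float]:
    """Average duration of only the calls made between two scrapes."""
    # Both counters restart from zero when the pod restarts. If the count
    # went down, treat the current values as the whole window.
    if curr_count < prev_count:
        prev_sum, prev_count = 0.0, 0.0
    calls = curr_count - prev_count
    return (curr_sum - prev_sum) / calls if calls > 0 else None


# Values from the example above: 103776ms across 147 calls.
print(round(average_duration_ms(103776, 147), 2))  # 705.96

# Hypothetical later scrape: 10 more calls that took 1000ms in total.
print(windowed_average_ms(103776, 147, 104776, 157))  # 100.0
```

The windowed figure reacts to recent changes in latency, whereas the since-restart average is increasingly dominated by older calls the longer the pod runs.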
Provider Counts
The provider_count metric represents the number of calls made by a given Synqly Organization to a target Provider since embedded last restarted.
provider_count supports the following labels:
organization: A Synqly Organization in the target embedded instance
type: The target Provider type
For example:
provider_count{organization="sandbox-embedded-e2e",type="defender"} 97
provider_count{organization="sandbox-embedded-e2e",type="elasticsearch"} 159
provider_count{organization="sandbox-embedded-e2e",type="entra_id"} 39
These point-in-time values show how many calls have been made by the sandbox-embedded-e2e Synqly Organization to each target Provider type since embedded last restarted.
When combined with a rate function in your observability tool of choice, provider_count can be used to track Provider usage over time across all the Organizations within your embedded instance.
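As a rough illustration of that kind of query, the sketch below sums provider_count across Organizations per Provider type and then computes how many calls were made between two scrapes. The first scrape reuses the values above; the second scrape and the example-org Organization are hypothetical, as are the helper names.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (organization, provider type, provider_count value)
Sample = Tuple[str, str, float]


def calls_by_provider(samples: Iterable[Sample]) -> Dict[str, float]:
    """Sum provider_count across all Organizations for each Provider type."""
    totals: Dict[str, float] = defaultdict(float)
    for _organization, provider_type, value in samples:
        totals[provider_type] += value
    return dict(totals)


def increase_by_provider(
    previous: Iterable[Sample], current: Iterable[Sample]
) -> Dict[str, float]:
    """Calls per Provider type made between two scrapes."""
    before = calls_by_provider(previous)
    increase = {}
    for provider_type, total in calls_by_provider(current).items():
        delta = total - before.get(provider_type, 0.0)
        # A negative delta means the counters restarted with the pod.
        increase[provider_type] = total if delta < 0 else delta
    return increase


first_scrape = [
    ("sandbox-embedded-e2e", "defender", 97),
    ("sandbox-embedded-e2e", "elasticsearch", 159),
    ("sandbox-embedded-e2e", "entra_id", 39),
]
# Hypothetical later scrape, including a second (made-up) Organization.
second_scrape = [
    ("sandbox-embedded-e2e", "defender", 120),
    ("sandbox-embedded-e2e", "elasticsearch", 159),
    ("sandbox-embedded-e2e", "entra_id", 45),
    ("example-org", "defender", 10),
]
print(increase_by_provider(first_scrape, second_scrape))
```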
Kubernetes Pod Metrics
When running Synqly Embedded via the Synqly Embedded Helm Chart, the embedded Kubernetes Pod metrics provide operational insights into the resource utilization of the Pod.
Kubernetes Pod metrics should be automatically ingested by the Kubernetes data scraper of any major observability platform. For more information on the metrics and what they represent, please refer to Kubernetes Metric Reference.
The following metrics can be used to track the memory and CPU usage of the embedded Kubernetes Pod.
container_memory_working_set_bytes: Represents the amount of memory in use by the embedded container. Although there are multiple metrics for tracking memory pools, container_memory_working_set_bytes is the most useful because it represents memory that cannot be safely evicted. If this metric exceeds the Pod's memory request, the Pod becomes a candidate for eviction when its node runs low on memory; if it reaches the container's memory limit, the container can be terminated with an OOM error.
container_cpu_usage_seconds_total: Represents the cumulative CPU time consumed by the container in core-seconds. This metric can be combined with a rate function to track the per-second CPU usage. If the per-second CPU usage exceeds the CPU request of the embedded Pod, the Pod could experience increased request latency while waiting for CPU time.
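The rate calculation for CPU can be sketched as below: the difference between two readings of container_cpu_usage_seconds_total, divided by the time between them, gives the average number of cores in use. All sample values, the 30-second interval, and the 0.25-core request are hypothetical, and `cpu_cores_used` is an illustrative helper rather than an existing API.

```python
def cpu_cores_used(prev_seconds: float, curr_seconds: float, interval_seconds: float) -> float:
    """Average cores used between two readings of container_cpu_usage_seconds_total."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # The counter starts again from zero when the container restarts.
    if curr_seconds < prev_seconds:
        prev_seconds = 0.0
    return (curr_seconds - prev_seconds) / interval_seconds


# Hypothetical readings taken 30 seconds apart.
cores = cpu_cores_used(1200.0, 1215.0, 30.0)
cpu_request_cores = 0.25  # hypothetical CPU request for the embedded Pod
print(cores, cores > cpu_request_cores)  # 0.5 True
```

In this made-up case the Pod averaged half a core against a quarter-core request, which is the situation where request latency may start to rise.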