Scaling Synqly Embedded
Synqly Embedded deployments process all API calls within the `embedded` container. As the volume of requests increases, the `embedded` container requires more CPU and memory to keep request latency within acceptable bounds. Vertical scaling (i.e. allocating more resources to a single container) is the preferred method for scaling `embedded`. While it is technically possible to run multiple `embedded` replicas, the replicas will compete for database access when updating request metadata, leading to increased request latency.
Vertical Scaling
Vertical scaling is the most effective way to increase `embedded` throughput and reduce request latency. When deployed with the Synqly Embedded Helm chart, `embedded` Pod resources can be configured directly in the chart's `values.yaml` file.
Helm Configuration
```yaml
embedded:
  ...
  # Resource allocations for the `embedded` pod(s)
  resources:
    requests:
      cpu: "0.5"
      memory: "200Mi"
    limits:
      cpu: "1"
      memory: "500Mi"
```
Horizontal Scaling
Synqly Embedded supports running multiple replicas in order to increase availability. When deployed via the Synqly Embedded Helm chart, additional `embedded` Pods automatically skew towards separate Nodes. Additional topology spread constraints can be patched in to ensure the Pods run in separate Availability Zones. Due to database contention, operators should take care not to run too many `embedded` replicas within a single Synqly Deployment. We recommend limiting the number of `embedded` replicas to a maximum of 3.
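If the chart allows extra scheduling constraints to be patched onto the `embedded` Pods, spreading across Availability Zones can be expressed with a standard Kubernetes `topologySpreadConstraints` stanza. The sketch below uses the well-known `topology.kubernetes.io/zone` Node label; the `app: embedded` label selector is an assumption and should be adjusted to match the labels the chart actually applies:

```yaml
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    # Prefer, but do not strictly require, separate zones;
    # use DoNotSchedule to make the spread mandatory.
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app: embedded   # assumed Pod label; match your chart's labels
```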
When running `embedded` in a multi-replica configuration, mock providers such as `notifications_mock_notifications` may return inconsistent results. If you use mock providers to test Synqly APIs, we recommend running a single replica during testing.
Helm Configuration
```yaml
embedded:
  ...
  # Number of `embedded` replicas to run.
  replicas: 1
```