Scaling Synqly Embedded

Synqly Embedded deployments process all API calls within an embedded container. As request volume increases, the embedded container requires more CPU and memory to keep request latency within tolerable bounds. Vertical scaling (allocating more resources to a single container) is the preferred method for scaling embedded. While it is technically possible to run multiple embedded replicas, the replicas compete for database access when updating request metadata, which increases request latency.

Vertical Scaling

Vertical Scaling is the most effective way to increase embedded throughput and reduce request latency. When deployed with the Synqly Embedded Helm chart, embedded Pod resources can be configured directly within the chart's values.yaml file.

Helm Configuration

embedded:
  
  ...

  # Resource allocations for the `embedded` pod(s)
  resources:
    requests:
      cpu: "0.5"
      memory: "200Mi"
    limits:
      cpu: "1"
      memory: "500Mi"

Horizontal Scaling

Synqly Embedded supports running multiple replicas in order to increase availability. When deployed via the Synqly Embedded Helm chart, any additional embedded Pods are automatically spread across separate Nodes. Additional Pod spread constraints can be patched in to ensure the Pods run in separate Availability Zones. Due to database contention, operators should take care not to run too many embedded replicas within a single Synqly Deployment. We recommend limiting the number of embedded replicas to a maximum of 3.
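
The zone-level spread mentioned above can be expressed as a standard Kubernetes topology spread constraint patched onto the embedded Pod spec. This is a minimal illustrative sketch, not chart-provided configuration; the app: embedded label selector is an assumption, so match it to the labels your chart actually applies:

  # Hypothetical patch: spread embedded Pods across Availability Zones.
  # The label selector below is an assumption; use your chart's Pod labels.
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule
      labelSelector:
        matchLabels:
          app: embedded

With maxSkew: 1 and DoNotSchedule, the scheduler refuses to place a new embedded Pod in a zone that would hold more than one Pod beyond the least-populated zone.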

When running embedded in a multi-replica configuration, mock providers such as notifications_mock_notifications may return inconsistent results. If you use mock providers to test Synqly APIs, we recommend running a single replica during testing.

Helm Configuration

embedded:
  
  ...

  # Number of `embedded` replicas to run.
  replicas: 1
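
After raising the replica count, the Node spread can be verified by listing the Pods with their Node assignments. A quick check, assuming the embedded Pods run in a synqly namespace (substitute your own):

  # List embedded Pods along with the Node each is scheduled on.
  # The namespace is an assumption.
  kubectl --namespace synqly get pods -o wide

Each replica should appear on a distinct NODE; if zone spread constraints are in place, the Nodes themselves should span Availability Zones.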