# Scaling Synqly Embedded

Synqly Embedded deployments process all API calls within an `embedded` container. As the volume of requests increases, the `embedded` container will require more CPU and memory in order to keep request latency within tolerable parameters.

Vertical Scaling (i.e. allocating more resources to a single container) is the preferred method for scaling `embedded`. While it is technically possible to run multiple `embedded` replicas, the replicas will compete for database access when updating request metadata, leading to increased request latency.

## Vertical Scaling

Vertical Scaling is the most effective way to increase `embedded` throughput and reduce request latency. When deployed with the Synqly Embedded Helm chart, `embedded` Pod resources can be configured directly within the chart's `values.yaml` file.

### Helm Configuration

```yaml
embedded:
  ...
  # Resource allocations for the `embedded` pod(s)
  resources:
    requests:
      cpu: "0.5"
      memory: "200Mi"
    limits:
      cpu: "1"
      memory: "500Mi"
```

## Horizontal Scaling

Synqly Embedded supports running multiple replicas in order to increase availability. When deployed via the Synqly Embedded Helm chart, any additional `embedded` Pods are automatically spread across separate Nodes. Additional spread constraints can be patched in to ensure the Pods run in separate Availability Zones (see the sketch at the end of this section).

Due to database contention limitations, operators should take care not to run too many `embedded` replicas within a single Synqly Deployment. We recommend limiting the number of `embedded` replicas to a maximum of 3.

When running `embedded` in a multi-replica configuration, mock providers such as `notifications_mock_notifications` may return inconsistent results. If you are using mock providers to test Synqly APIs, we recommend running a single replica during testing.

### Helm Configuration

```yaml
embedded:
  ...
  # Number of `embedded` replicas to run.
  replicas: 1
```
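
As an illustration of the Availability Zone spread mentioned above, the following is a minimal sketch of a `topologySpreadConstraints` patch for the `embedded` Deployment. The Deployment name (`embedded`) and the `app: embedded` label selector are assumptions for this example; check the chart's rendered manifests for the actual object names and labels before applying.

```yaml
# zone-spread.yaml
# Hypothetical strategic-merge patch that spreads `embedded` Pods across
# Availability Zones. The `app: embedded` label is an assumption; adjust it
# to match the labels the Helm chart actually applies.
spec:
  template:
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: embedded
```

A patch like this could be applied with `kubectl patch deployment embedded --patch-file zone-spread.yaml` (assuming the Deployment is named `embedded`). Using `ScheduleAnyway` keeps the constraint best-effort, so scheduling is not blocked in clusters with fewer zones than replicas.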