Scaling
The gNMIc Operator supports horizontal scaling of collector clusters. This page explains how scaling works and outlines best practices for production deployments.
Scaling a Cluster
To scale a cluster, update the replicas field:
# Scale to 5 replicas
kubectl patch cluster my-cluster --type merge -p '{"spec":{"replicas":5}}'
Or edit the Cluster resource:
spec:
  replicas: 5  # Changed from 3
What Happens When You Scale
Scale Up (e.g., 3 → 5 pods)
- Kubernetes creates new pods (gnmic-3, gnmic-4)
- Operator waits for pods to be ready
- Operator redistributes targets using bounded load rendezvous hashing
- Some targets move from existing pods to new pods
- Configuration is applied to all pods
Scale Down (e.g., 5 → 3 pods)
- Operator redistributes targets away from pods being removed
- Configuration is applied to remaining pods
- Kubernetes terminates pods (gnmic-4, gnmic-3)
- Targets from terminated pods are handled by remaining pods
Target Redistribution
The operator uses bounded load rendezvous hashing to distribute targets:
- Stable: The same target tends to stay on the same pod
- Even: Targets are spread evenly (pod counts stay within 1-2 targets of each other)
- Minimal churn: Only ~1/(N+1) of targets move when adding a pod to an N-pod cluster
Example Distribution
# 10 targets, 3 pods
Pod 0: [target1, target5, target8] (3 targets)
Pod 1: [target2, target4, target9] (3 targets)
Pod 2: [target3, target6, target7, target10] (4 targets)
# After scaling to 4 pods
Pod 0: [target1, target5, target8] (3 targets) - unchanged
Pod 1: [target2, target4] (2 targets) - lost target9
Pod 2: [target3, target7, target10] (3 targets) - lost target6
Pod 3: [target6, target9] (2 targets) - new pod
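The following Go sketch shows the general shape of bounded load rendezvous hashing; it is not the operator's actual implementation, and the hash function, load bound, and names are illustrative assumptions:

// Illustrative sketch of bounded load rendezvous hashing; not the operator's
// actual code. Pod and target names are examples only.
package main

import (
    "fmt"
    "hash/fnv"
    "sort"
)

// score ranks a (target, pod) pair; the highest score wins.
func score(target, pod string) uint64 {
    h := fnv.New64a()
    h.Write([]byte(target + "/" + pod))
    return h.Sum64()
}

// distribute assigns each target to its highest-scoring pod that is still
// below the load bound ceil(targets/pods), so no pod exceeds the average
// load rounded up.
func distribute(targets, pods []string) map[string][]string {
    bound := (len(targets) + len(pods) - 1) / len(pods)
    assignment := make(map[string][]string, len(pods))
    for _, t := range targets {
        ranked := append([]string(nil), pods...)
        sort.Slice(ranked, func(i, j int) bool {
            return score(t, ranked[i]) > score(t, ranked[j])
        })
        for _, p := range ranked {
            if len(assignment[p]) < bound {
                assignment[p] = append(assignment[p], t)
                break
            }
        }
    }
    return assignment
}

func main() {
    targets := []string{"target1", "target2", "target3", "target4", "target5",
        "target6", "target7", "target8", "target9", "target10"}
    // Compare the 3-pod and 4-pod assignments to see how few targets move.
    fmt.Println(distribute(targets, []string{"gnmic-0", "gnmic-1", "gnmic-2"}))
    fmt.Println(distribute(targets, []string{"gnmic-0", "gnmic-1", "gnmic-2", "gnmic-3"}))
}

Because each target's pod ranking depends only on the target and pod names, adding a pod only changes the assignments that the new pod now wins or that spill over the bound, which is where the ~1/(N+1) churn figure comes from.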
Best Practices
Start with Appropriate Size
Estimate based on:
- Number of targets
- Subscription frequency
- Data volume per target
Rule of thumb: Start with 1 pod per 50-100 targets for high-frequency subscriptions.
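As a quick worked example of this rule of thumb (the function and numbers below are illustrative only):

// Purely illustrative sizing helper based on the rule of thumb above.
package main

import "fmt"

// suggestedReplicas converts a target count into a pod-count range using the
// "1 pod per 50-100 targets" guideline for high-frequency subscriptions.
func suggestedReplicas(targets int) (lower, upper int) {
    lower = (targets + 99) / 100 // 1 pod per 100 targets
    upper = (targets + 49) / 50  // 1 pod per 50 targets
    if lower < 1 {
        lower = 1
    }
    if upper < lower {
        upper = lower
    }
    return lower, upper
}

func main() {
    lo, hi := suggestedReplicas(400)
    fmt.Printf("400 targets -> %d to %d pods\n", lo, hi) // prints 4 to 8
}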
Use Resource Limits
Ensure pods have appropriate resources:
spec:
  resources:
    requests:
      memory: "256Mi"
      cpu: "200m"
    limits:
      memory: "1Gi"
      cpu: "2"
Monitor Before Scaling
Check metrics before scaling:
# CPU usage per pod
rate(container_cpu_usage_seconds_total{pod=~"gnmic-.*"}[5m])
# Memory usage per pod
container_memory_usage_bytes{pod=~"gnmic-.*"}
# Targets per pod (from gNMIc metrics)
gnmic_target_status{cluster="my-cluster"}
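To check the same numbers from code instead of the Prometheus UI, a minimal sketch using the Prometheus Go client could look like this; the Prometheus address and the pod label on gnmic_target_status are assumptions for your environment:

// Minimal sketch: count targets per pod before deciding to scale.
// The Prometheus address and label names are placeholders.
package main

import (
    "context"
    "fmt"
    "time"

    "github.com/prometheus/client_golang/api"
    v1 "github.com/prometheus/client_golang/api/prometheus/v1"
)

func main() {
    client, err := api.NewClient(api.Config{Address: "http://prometheus.monitoring:9090"})
    if err != nil {
        panic(err)
    }
    promAPI := v1.NewAPI(client)

    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()

    // Targets per pod, derived from the gNMIc target status metric.
    result, warnings, err := promAPI.Query(ctx,
        `count by (pod) (gnmic_target_status{cluster="my-cluster"})`, time.Now())
    if err != nil {
        panic(err)
    }
    if len(warnings) > 0 {
        fmt.Println("warnings:", warnings)
    }
    fmt.Println(result)
}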
Scale Gradually
For large changes, scale gradually:
# Instead of 3 → 10
kubectl patch cluster my-cluster --type merge -p '{"spec":{"replicas":5}}'
# Wait for stabilization
kubectl patch cluster my-cluster --type merge -p '{"spec":{"replicas":7}}'
# Wait for stabilization
kubectl patch cluster my-cluster --type merge -p '{"spec":{"replicas":10}}'
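The same stepped approach can be scripted. The sketch below uses the Kubernetes dynamic client; the Cluster CRD's group/version/resource and the fixed stabilization wait are assumptions, not values documented by the operator:

// Hypothetical sketch of gradual scaling from Go via the dynamic client.
// Replace the GroupVersionResource with the operator's actual CRD values.
package main

import (
    "context"
    "fmt"
    "time"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/runtime/schema"
    "k8s.io/apimachinery/pkg/types"
    "k8s.io/client-go/dynamic"
    "k8s.io/client-go/tools/clientcmd"
)

func main() {
    cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
    if err != nil {
        panic(err)
    }
    client, err := dynamic.NewForConfig(cfg)
    if err != nil {
        panic(err)
    }
    // Assumed GVR for the Cluster resource; check the installed CRD.
    gvr := schema.GroupVersionResource{Group: "gnmic.example.com", Version: "v1alpha1", Resource: "clusters"}

    for _, replicas := range []int{5, 7, 10} {
        patch := []byte(fmt.Sprintf(`{"spec":{"replicas":%d}}`, replicas))
        _, err := client.Resource(gvr).Namespace("default").
            Patch(context.TODO(), "my-cluster", types.MergePatchType, patch, metav1.PatchOptions{})
        if err != nil {
            panic(err)
        }
        // Crude stabilization wait; in practice watch pod readiness and
        // target redistribution metrics instead of sleeping.
        time.Sleep(2 * time.Minute)
    }
}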
Horizontal Pod Autoscaler
You can use HPA for automatic scaling:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: gnmic-cluster-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: gnmic-my-cluster
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
Note: HPA scales the StatefulSet directly. The operator will detect the change and redistribute targets accordingly.
Considerations
Subscription Duplication
All pods receive all subscriptions. Only targets are distributed. This means:
- Each pod maintains subscription definitions
- Each pod connects only to its assigned targets
- Network overhead per pod scales with the number of targets assigned to it, not with the number of subscription definitions
Output Connections
All pods connect to all outputs. For outputs like Kafka or Prometheus:
- Each pod writes to the same destination
- Data is naturally partitioned by target
- No deduplication needed
Stateless Operation
gNMIc pods are stateless by design:
- No persistent volumes required
- Configuration comes from operator via REST API
- Targets can move between pods without data loss
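To see which targets a particular pod currently holds, you can query that pod's gNMIc REST API directly; the pod address, port, and endpoint path below are assumptions based on gNMIc defaults, so verify them against your deployment:

// Hypothetical sketch: list the targets assigned to a single pod through
// gNMIc's REST API. The pod address, port, and path are assumptions.
package main

import (
    "fmt"
    "io"
    "net/http"
    "time"
)

func main() {
    client := &http.Client{Timeout: 5 * time.Second}
    // Per-pod address via the headless service; names are hypothetical.
    resp, err := client.Get("http://gnmic-my-cluster-0.gnmic-my-cluster:7890/api/v1/config/targets")
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        panic(err)
    }
    fmt.Println(string(body)) // JSON describing this pod's assigned targets
}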