Scaling
The gNMIc Operator supports horizontal scaling of collector clusters. This page explains how scaling works and outlines best practices for production deployments.
Scaling a Cluster
To scale a cluster, update the replicas field:
# Scale to 5 replicas
kubectl patch cluster my-cluster --type merge -p '{"spec":{"replicas":5}}'
Or edit the Cluster resource:
spec:
  replicas: 5  # Changed from 3
What Happens When You Scale
Scale Up (e.g., 3 → 5 pods)
- Kubernetes creates new pods (gnmic-3, gnmic-4)
- Operator waits for pods to be ready
- Operator redistributes targets using bounded load rendezvous hashing
- Some targets move from existing pods to new pods
- Configuration is applied to all pods
Scale Down (e.g., 5 → 3 pods)
- Operator redistributes targets away from pods being removed
- Configuration is applied to remaining pods
- Kubernetes terminates pods (gnmic-4, gnmic-3)
- Targets from terminated pods are handled by remaining pods
Target Redistribution
The operator uses bounded load rendezvous hashing to distribute targets:
- Stable: The same target tends to stay on the same pod
- Even: Targets are spread evenly (pod counts stay within 1-2 targets of each other)
- Minimal churn: Only ~1/(N+1) of targets move when adding a pod to an N-pod cluster
Example Distribution
# 10 targets, 3 pods
Pod 0: [target1, target5, target8] (3 targets)
Pod 1: [target2, target4, target9] (3 targets)
Pod 2: [target3, target6, target7, target10] (4 targets)
# After scaling to 4 pods
Pod 0: [target1, target5, target8] (3 targets) - unchanged
Pod 1: [target2, target4] (2 targets) - lost target9
Pod 2: [target3, target7, target10] (3 targets) - lost target6
Pod 3: [target6, target9] (2 targets) - new pod
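The following Go sketch shows the general shape of bounded load rendezvous hashing; it is not the operator's actual implementation, and the hash function, load bound, and names are illustrative assumptions:

// Illustrative sketch of bounded load rendezvous hashing; not the operator's
// actual code. Pod and target names are examples only.
package main

import (
    "fmt"
    "hash/fnv"
    "sort"
)

// score ranks a (target, pod) pair; the highest score wins.
func score(target, pod string) uint64 {
    h := fnv.New64a()
    h.Write([]byte(target + "/" + pod))
    return h.Sum64()
}

// distribute assigns each target to its highest-scoring pod that is still
// below the load bound ceil(targets/pods), so no pod exceeds the average
// load rounded up.
func distribute(targets, pods []string) map[string][]string {
    bound := (len(targets) + len(pods) - 1) / len(pods)
    assignment := make(map[string][]string, len(pods))
    for _, t := range targets {
        ranked := append([]string(nil), pods...)
        sort.Slice(ranked, func(i, j int) bool {
            return score(t, ranked[i]) > score(t, ranked[j])
        })
        for _, p := range ranked {
            if len(assignment[p]) < bound {
                assignment[p] = append(assignment[p], t)
                break
            }
        }
    }
    return assignment
}

func main() {
    targets := []string{"target1", "target2", "target3", "target4", "target5",
        "target6", "target7", "target8", "target9", "target10"}
    // Compare the 3-pod and 4-pod assignments to see how few targets move.
    fmt.Println(distribute(targets, []string{"gnmic-0", "gnmic-1", "gnmic-2"}))
    fmt.Println(distribute(targets, []string{"gnmic-0", "gnmic-1", "gnmic-2", "gnmic-3"}))
}

Because each target's pod ranking depends only on the target and pod names, adding a pod only changes the assignments that the new pod now wins or that spill over the bound, which is where the ~1/(N+1) churn figure comes from.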
Best Practices
Start with Appropriate Size
Estimate based on:
- Number of targets
- Subscription frequency
- Data volume per target
Rule of thumb: Start with 1 pod per 50-100 targets for high-frequency subscriptions.
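As a quick worked example of this rule of thumb (the function and numbers below are illustrative only):

// Purely illustrative sizing helper based on the rule of thumb above.
package main

import "fmt"

// suggestedReplicas converts a target count into a pod-count range using the
// "1 pod per 50-100 targets" guideline for high-frequency subscriptions.
func suggestedReplicas(targets int) (lower, upper int) {
    lower = (targets + 99) / 100 // 1 pod per 100 targets
    upper = (targets + 49) / 50  // 1 pod per 50 targets
    if lower < 1 {
        lower = 1
    }
    if upper < lower {
        upper = lower
    }
    return lower, upper
}

func main() {
    lo, hi := suggestedReplicas(400)
    fmt.Printf("400 targets -> %d to %d pods\n", lo, hi) // prints 4 to 8
}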
Use Resource Limits
Ensure pods have appropriate resources:
spec:
  resources:
    requests:
      memory: "256Mi"
      cpu: "200m"
    limits:
      memory: "1Gi"
      cpu: "2"
Monitor Before Scaling
Check metrics before scaling:
# CPU usage per pod
rate(container_cpu_usage_seconds_total{pod=~"gnmic-.*"}[5m])
# Memory usage per pod
container_memory_usage_bytes{pod=~"gnmic-.*"}
# Targets per pod (from gNMIc metrics)
gnmic_target_status{cluster="my-cluster"}
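To check the same numbers from code instead of the Prometheus UI, a minimal sketch using the Prometheus Go client could look like this; the Prometheus address and the pod label on gnmic_target_status are assumptions for your environment:

// Minimal sketch: count targets per pod before deciding to scale.
// The Prometheus address and label names are placeholders.
package main

import (
    "context"
    "fmt"
    "time"

    "github.com/prometheus/client_golang/api"
    v1 "github.com/prometheus/client_golang/api/prometheus/v1"
)

func main() {
    client, err := api.NewClient(api.Config{Address: "http://prometheus.monitoring:9090"})
    if err != nil {
        panic(err)
    }
    promAPI := v1.NewAPI(client)

    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()

    // Targets per pod, derived from the gNMIc target status metric.
    result, warnings, err := promAPI.Query(ctx,
        `count by (pod) (gnmic_target_status{cluster="my-cluster"})`, time.Now())
    if err != nil {
        panic(err)
    }
    if len(warnings) > 0 {
        fmt.Println("warnings:", warnings)
    }
    fmt.Println(result)
}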
Scale Gradually
For large changes, scale gradually:
# Instead of 3 → 10
kubectl patch cluster my-cluster --type merge -p '{"spec":{"replicas":5}}'
# Wait for stabilization
kubectl patch cluster my-cluster --type merge -p '{"spec":{"replicas":7}}'
# Wait for stabilization
kubectl patch cluster my-cluster --type merge -p '{"spec":{"replicas":10}}'
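The same stepped approach can be scripted. The sketch below uses the Kubernetes dynamic client; the Cluster CRD's group/version/resource and the fixed stabilization wait are assumptions, not values documented by the operator:

// Hypothetical sketch of gradual scaling from Go via the dynamic client.
// Replace the GroupVersionResource with the operator's actual CRD values.
package main

import (
    "context"
    "fmt"
    "time"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/runtime/schema"
    "k8s.io/apimachinery/pkg/types"
    "k8s.io/client-go/dynamic"
    "k8s.io/client-go/tools/clientcmd"
)

func main() {
    cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
    if err != nil {
        panic(err)
    }
    client, err := dynamic.NewForConfig(cfg)
    if err != nil {
        panic(err)
    }
    // Assumed GVR for the Cluster resource; check the installed CRD.
    gvr := schema.GroupVersionResource{Group: "gnmic.example.com", Version: "v1alpha1", Resource: "clusters"}

    for _, replicas := range []int{5, 7, 10} {
        patch := []byte(fmt.Sprintf(`{"spec":{"replicas":%d}}`, replicas))
        _, err := client.Resource(gvr).Namespace("default").
            Patch(context.TODO(), "my-cluster", types.MergePatchType, patch, metav1.PatchOptions{})
        if err != nil {
            panic(err)
        }
        // Crude stabilization wait; in practice watch pod readiness and
        // target redistribution metrics instead of sleeping.
        time.Sleep(2 * time.Minute)
    }
}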
Horizontal Pod Autoscaler
You can use HPA for automatic scaling:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: gnmic-cluster-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: gnmic-my-cluster
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
Note: HPA scales the StatefulSet directly. The operator will detect the change and redistribute targets accordingly.
Considerations
Subscription Duplication
All pods receive all subscriptions. Only targets are distributed. This means:
- Each pod maintains subscription definitions
- Each pod connects only to its assigned targets
- Network overhead per pod scales with the number of targets assigned to it, not with the number of subscription definitions
Output Connections
All pods connect to all outputs. For outputs like Kafka or Prometheus:
- Each pod writes to the same destination
- Data is naturally partitioned by target
- No deduplication needed
Stateless Operation
gNMIc pods are stateless by design:
- No persistent volumes required
- Configuration comes from operator via REST API
- Targets can move between pods without data loss
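To see which targets a particular pod currently holds, you can query that pod's gNMIc REST API directly; the pod address, port, and endpoint path below are assumptions based on gNMIc defaults, so verify them against your deployment:

// Hypothetical sketch: list the targets assigned to a single pod through
// gNMIc's REST API. The pod address, port, and path are assumptions.
package main

import (
    "fmt"
    "io"
    "net/http"
    "time"
)

func main() {
    client := &http.Client{Timeout: 5 * time.Second}
    // Per-pod address via the headless service; names are hypothetical.
    resp, err := client.Get("http://gnmic-my-cluster-0.gnmic-my-cluster:7890/api/v1/config/targets")
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        panic(err)
    }
    fmt.Println(string(body)) // JSON describing this pod's assigned targets
}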