
Sizing Guide

The consumer service is lightweight and does not require significant system resources under steady-state conditions. This guide helps you plan resource allocation for different scenarios.

Consumer Resource Requirements

Normal Operation

When the consumer is processing messages in real time and not significantly behind the upstream broker:

Resource   Minimum   Recommended
vCPUs      2         2-4
Memory     2 GB      4 GB

This configuration is sufficient for ongoing consumption workloads.

Initial Load or Backlog Processing

When the consumer needs to process a large backlog (for example, over 20 million messages):

Resource   Minimum   Recommended
vCPUs      4         Up to 16
Memory     8 GB      Up to 32 GB

This allows multiple consumers to run in parallel for efficient catch-up.

Resource Reduction

Once the backlog has been processed and the consumer has caught up to the upstream, resources can be reduced to the normal operating configuration.

Database Considerations

Database as Bottleneck

While the consumer can be scaled horizontally to process messages in parallel, the database is typically the largest bottleneck in the ingestion pipeline.

The database must persist all consumed records, and its performance is limited by:

  • Transaction management
  • Indexing
  • Checkpointing
  • Disk I/O

Key Insight

Adding more consumers beyond a certain point does not improve throughput and may instead reduce efficiency. The maximum sustainable throughput is determined by the database's ability to perform inserts in parallel, not necessarily by the number of consumers.
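
The plateau described above can be illustrated with a deliberately simple model. This is only a sketch: the function name, the per-consumer rate, and the database ceiling are hypothetical values chosen for illustration, not measurements from this product.

```python
def pipeline_throughput(consumers: int,
                        per_consumer_rate: float,
                        db_max_rate: float) -> float:
    """Sustained ingest rate (messages/second) in a capped model.

    Throughput grows linearly with the number of consumers until it
    reaches the rate at which the database can persist records.
    """
    if consumers < 0:
        raise ValueError("consumers must be non-negative")
    return min(consumers * per_consumer_rate, db_max_rate)


# Hypothetical numbers, chosen only to show the plateau.
PER_CONSUMER_RATE = 1_000.0  # messages/second delivered by one consumer
DB_MAX_RATE = 6_000.0        # messages/second the database can persist

if __name__ == "__main__":
    for n in (2, 4, 6, 8, 12):
        rate = pipeline_throughput(n, PER_CONSUMER_RATE, DB_MAX_RATE)
        print(f"{n:>2} consumers -> {rate:,.0f} msg/s")
```

In this model, throughput stops increasing once the database ceiling is reached. The model does not capture the efficiency loss that extra consumers may cause through contention, so a real deployment can perform worse than the plateau suggests.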

Database Resource Recommendations

Scenario                   vCPUs   Memory    Storage
Small (< 100 topics)       4       8 GB      SSD
Medium (100-500 topics)    8       16 GB     SSD
Large (> 500 topics)       16+     32 GB+    SSD
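
For capacity-planning scripts, the table above can be expressed as a small lookup. This is a minimal sketch; the names `DbSizing` and `recommend_db_sizing` are hypothetical, and the boundaries follow the table (exactly 500 topics is treated as Medium).

```python
from typing import NamedTuple


class DbSizing(NamedTuple):
    tier: str
    vcpus: str
    memory: str
    storage: str


def recommend_db_sizing(topic_count: int) -> DbSizing:
    """Map a topic count to the database tier from the sizing table."""
    if topic_count < 0:
        raise ValueError("topic_count must be non-negative")
    if topic_count < 100:
        return DbSizing("Small", "4", "8 GB", "SSD")
    if topic_count <= 500:
        return DbSizing("Medium", "8", "16 GB", "SSD")
    return DbSizing("Large", "16+", "32 GB+", "SSD")


if __name__ == "__main__":
    for topics in (50, 200, 800):
        print(topics, recommend_db_sizing(topics))
```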

Storage Planning

Main Data Tables

Estimate storage based on:

  • Number of topics
  • Average message size
  • Data retention period

Audit Tables

If auditing is enabled:

  • Plan for at least 2x the storage used by the main data tables
  • Audit tables grow continuously
  • Implement retention policies

Formula

Estimated Storage =
(Number of Topics × Avg Records per Topic × Avg Record Size) +
(Audit Factor × Main Data Size) +
20% Buffer

Network Requirements

Requirement   Specification
Bandwidth     100+ Mbps recommended
Latency       < 100 ms to Kafka brokers
Firewall      Allow outbound to Kafka and Schema Registry

Sample Sizing Scenarios

Scenario 1: Small Deployment

  • Topics: 50
  • Messages/day: 100,000
  • Audit: Enabled

Recommended:

  • Consumer: 2 vCPU, 4 GB RAM
  • Database: 4 vCPU, 8 GB RAM, 100 GB SSD

Scenario 2: Medium Deployment

  • Topics: 200
  • Messages/day: 1,000,000
  • Audit: Enabled

Recommended:

  • Consumer: 4 vCPU, 8 GB RAM
  • Database: 8 vCPU, 16 GB RAM, 500 GB SSD

Scenario 3: Large Deployment

  • Topics: 500+
  • Messages/day: 10,000,000+
  • Audit: Enabled

Recommended:

  • Consumers: Multiple instances, each 4 vCPU, 8 GB RAM
  • Database: 16 vCPU, 32 GB RAM, 1+ TB SSD

Initial Load Sizing

For initial load scenarios with large data volumes, the following internal test results provide a reference point.

Sample Workload (Internal Testing)

Parameter                      Value
Total topics                   144
Average partitions per topic   6
Largest partition              1 million messages
Total messages                 46 million

Configurations Used

  • Consumer instances: 8 (each with 4 CPUs and 4 GB memory)
  • Database instance: 8 CPUs, 16 GB memory
  • Network: Local, no firewalls or packet inspection

Observed Duration

Workload                 Duration
Total messages (46M)     2 hours
With Audit (92M)         3 hours 47 minutes
With Full Audit (95M)    4 hours 17 minutes

JVM Tuning

For production deployments, consider setting JVM options such as:

java \
-Xms2g \
-Xmx4g \
-XX:MaxDirectMemorySize=64m \
-XX:+UseG1GC \
-jar solifi-consumer-<version>.jar

JVM Settings Reference

Setting                    Purpose             Recommendation
-Xms                       Initial heap        50% of container memory
-Xmx                       Maximum heap        75% of container memory
-XX:MaxDirectMemorySize    Direct memory       64m-128m
-XX:+UseG1GC               Garbage collector   Recommended for large heaps
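
The percentages in the table above can be applied to a given container size with a small helper. This is a sketch; the function name `jvm_heap_flags` is hypothetical, sizes are expressed in megabytes to avoid fractional gigabyte values, and the default direct memory of 64m is the lower end of the recommended range.

```python
from typing import List


def jvm_heap_flags(container_memory_mb: int,
                   direct_memory_mb: int = 64) -> List[str]:
    """Derive JVM flags from container memory using the 50% / 75% rule."""
    if container_memory_mb <= 0:
        raise ValueError("container_memory_mb must be positive")
    xms_mb = container_memory_mb // 2       # 50% of container memory
    xmx_mb = container_memory_mb * 3 // 4   # 75% of container memory
    return [
        f"-Xms{xms_mb}m",
        f"-Xmx{xmx_mb}m",
        f"-XX:MaxDirectMemorySize={direct_memory_mb}m",
        "-XX:+UseG1GC",
    ]


if __name__ == "__main__":
    # A 4 GB container (4096 MB)
    print(" ".join(jvm_heap_flags(4096)))
```

For a 4 GB container this yields an initial heap of 2048m and a maximum heap of 3072m.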

Monitoring for Sizing

Key metrics to monitor for right-sizing:

Metric            Healthy Range   Action if Exceeded
Heap usage        < 80%           Increase -Xmx
CPU usage         < 70%           Increase vCPUs
Consumer lag      Decreasing      Add consumers
DB connections    < pool max      Increase pool size
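
The thresholds above can be encoded as a simple check that maps a metrics snapshot to the suggested actions. This is a sketch: the `Metrics` fields and the function name are hypothetical, how the values are collected is out of scope, and "lag not decreasing" is interpreted as the current lag being non-zero and no lower than the previous sample.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Metrics:
    heap_used_pct: float   # heap usage as a percentage of -Xmx
    cpu_used_pct: float    # CPU usage as a percentage of allocated vCPUs
    lag_previous: int      # consumer lag at the previous sample
    lag_current: int       # consumer lag at the current sample
    db_connections: int    # database connections currently in use
    db_pool_max: int       # configured connection pool maximum


def sizing_actions(m: Metrics) -> List[str]:
    """Return the actions from the monitoring table that apply."""
    actions: List[str] = []
    if m.heap_used_pct >= 80:
        actions.append("Increase -Xmx")
    if m.cpu_used_pct >= 70:
        actions.append("Increase vCPUs")
    # Assumption: zero lag means the consumer has caught up.
    if m.lag_current > 0 and m.lag_current >= m.lag_previous:
        actions.append("Add consumers")
    if m.db_connections >= m.db_pool_max:
        actions.append("Increase pool size")
    return actions


if __name__ == "__main__":
    snapshot = Metrics(85.0, 40.0, 10_000, 12_000, 5, 20)
    print(sizing_actions(snapshot))
```

Because the database is often the limiting factor, it is worth checking database load before acting on an "Add consumers" result.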

Next Steps