The Cost Spectrum
To illustrate the impact of pricing model selection, consider a standard 4 vCPU, 16 GB RAM instance running 24/7 for one year on AWS (m7i.xlarge in us-east-1):
- On-Demand: ~$1,752/year (baseline, 0% savings)
- 1-Year Reserved (No Upfront): ~$1,138/year (35% savings)
- 1-Year Reserved (All Upfront): ~$1,042/year (41% savings)
- 3-Year Reserved (All Upfront): ~$666/year amortized over the term (62% savings)
- Spot Instance (average): ~$526/year (70% savings, but interruptible)
The gap between the most and least expensive options is over $1,200/year for a single instance. At fleet scale that gap compounds: for organizations running hundreds of instances, the choice of pricing model can mean the difference between roughly $500K and $2M in annual cloud spend.
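The savings percentages in the list follow directly from the annual figures. A minimal sketch of that arithmetic, using the approximate prices quoted above; the dictionary keys and function name are illustrative and do not come from any AWS SDK:

```python
# Approximate annual costs quoted above for one m7i.xlarge in us-east-1 (USD).
ANNUAL_COST = {
    "on_demand": 1752,
    "reserved_1y_no_upfront": 1138,
    "reserved_1y_all_upfront": 1042,
    "reserved_3y_all_upfront": 666,
    "spot_average": 526,
}


def savings_vs_on_demand(costs: dict) -> dict:
    """Return the percent savings of each pricing model relative to on-demand."""
    baseline = costs["on_demand"]
    return {name: round(100 * (1 - cost / baseline), 1) for name, cost in costs.items()}


if __name__ == "__main__":
    for name, pct in savings_vs_on_demand(ANNUAL_COST).items():
        print(f"{name:26s} ${ANNUAL_COST[name]:>5,}/year  {pct:5.1f}% savings")
    # Gap between the most and least expensive option for one instance.
    spread = max(ANNUAL_COST.values()) - min(ANNUAL_COST.values())
    print(f"Spread per instance: ${spread:,}/year")
```

Swapping in current prices for your own instance type and region gives the same comparison for any workload.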
On-Demand: Maximum Flexibility
On-demand pricing is the most straightforward model: pay for what you use, when you use it, with no commitment. Despite carrying the highest hourly rate, on-demand is the right choice in many scenarios.
Use On-Demand When:
- Workload age < 3 months: You need usage data before committing
- Duration < 6 months: Too short for commitment break-even
- Highly variable demand: Auto-scaling groups serving spiky traffic
- Testing and development: Resources that may change frequently
- Disaster recovery: Standby capacity that rarely runs
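The commitment break-even mentioned in the list can be estimated with simple arithmetic. The sketch below assumes a commitment that bills for every month of the term whether or not the instance runs, and uses the approximate discounts from the cost spectrum as inputs; the function name is illustrative.

```python
def break_even_months(term_months: int, discount: float) -> float:
    """Months of on-demand usage that cost the same as the whole commitment.

    A commitment costs term_months * (1 - discount) on-demand-months in
    total, so a workload that runs for fewer months is cheaper on-demand.
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return term_months * (1 - discount)


if __name__ == "__main__":
    # 1-year commitment at ~35% off
    print(round(break_even_months(12, 0.35), 1))  # 7.8
    # 3-year commitment at ~62% off
    print(round(break_even_months(36, 0.62), 1))  # 13.7
```

Under these assumptions a workload expected to run for less than about eight months costs less on-demand than on a 1-year commitment, so the six-month guideline leaves some margin for uncertainty.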
Pro tip: On-demand doesn't mean "no optimization." You can still right-size, choose cheaper regions, use ARM instances, and schedule shutdowns on non-production resources.
Reserved/Committed: Predictable Savings
Commitment-based discounts trade flexibility for lower prices. The optimal strategy depends on your confidence level and planning horizon.
When to Commit:
- Production databases: These run 24/7 with predictable sizing — ideal for 3-year commitments
- Core application tier: Always-on API servers and web servers — 1-year minimum
- Baseline capacity: The minimum number of instances always running behind your auto-scaler
- Kubernetes cluster nodes: Core/system node pools that always need to be up
Commitment Strategies by Risk Tolerance
Conservative
1-year, No Upfront commitments. Covers only 60-70% of stable baseline. Easy to adjust annually. ~30% savings.
Balanced
Mix of 1-year and 3-year. All Upfront for 3-year (maximum discount). Covers 80% of baseline. ~45% average savings.
Aggressive
3-year All Upfront for all stable workloads (90%+ of baseline). Maximum savings (~55-60%) but least flexibility.
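The three strategies trade coverage against discount depth. One way to compare them is to estimate savings across the whole stable baseline, not just the committed part. The sketch below assumes the quoted percentages apply only to the committed share, that the uncovered share pays on-demand rates, and that the midpoint of each quoted range is representative; these are illustrative assumptions, not figures from a provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    name: str
    coverage: float  # share of the stable baseline under commitment
    discount: float  # discount earned on the committed share


# Midpoints of the ranges quoted above (assumed inputs).
STRATEGIES = [
    Strategy("conservative", 0.65, 0.30),
    Strategy("balanced", 0.80, 0.45),
    Strategy("aggressive", 0.90, 0.575),
]


def effective_savings(s: Strategy) -> float:
    """Savings across the whole baseline; the uncovered share saves nothing."""
    return s.coverage * s.discount


if __name__ == "__main__":
    for s in STRATEGIES:
        print(f"{s.name:12s} {effective_savings(s):.1%} of baseline spend")
```

Read this way, the gap between the conservative and aggressive strategies is wider than the headline discounts suggest, because higher coverage and deeper discounts multiply.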
Spot/Preemptible: Maximum Savings
Spot instances deliver the deepest discounts but require architectural considerations to handle interruptions gracefully. The key is designing for failure — which is also a best practice for cloud-native architecture regardless of pricing model.
Spot Architecture Patterns
- Checkpointing: Save progress at regular intervals. When interrupted, resume from the last checkpoint on a new instance. Essential for long-running batch jobs.
- Queue-based processing: Workers pull jobs from a queue (SQS, Pub/Sub, Service Bus). If a worker is interrupted, the job returns to the queue and another worker picks it up.
- Mixed instance groups: Auto-scaling groups that combine on-demand and spot instances. Base capacity runs on-demand; burst capacity runs on spot.
- Multi-instance-type diversification: Spread spot requests across multiple instance types and availability zones to reduce interruption probability.
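The checkpointing pattern from the list can be sketched in a few lines. This is a simplified local illustration, not a production implementation: it stores state in a JSON file on local disk (a real job would write to durable storage such as an object store), and the `interrupt_at` parameter and `SimulatedInterruption` exception are hypothetical stand-ins for a spot reclaim.

```python
import json
import os
import tempfile


class SimulatedInterruption(Exception):
    """Stands in for the instance being reclaimed mid-job."""


def load_checkpoint(path: str) -> dict:
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_index": 0, "total": 0}


def save_checkpoint(path: str, state: dict) -> None:
    # Write to a temp file and rename it, so an interruption mid-write
    # cannot leave a half-written checkpoint behind.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run_job(items, path, checkpoint_every=10, interrupt_at=None):
    """Sum items, saving progress every `checkpoint_every` items."""
    state = load_checkpoint(path)
    for i in range(state["next_index"], len(items)):
        if interrupt_at is not None and i == interrupt_at:
            raise SimulatedInterruption(f"interrupted at item {i}")
        state["total"] += items[i]
        state["next_index"] = i + 1
        if state["next_index"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]


if __name__ == "__main__":
    items = list(range(100))
    path = os.path.join(tempfile.mkdtemp(), "job.ckpt.json")
    try:
        run_job(items, path, interrupt_at=57)
    except SimulatedInterruption:
        pass  # last checkpoint covers items 0-49; items 50-56 are redone
    print(run_job(items, path))  # resumes at item 50 and prints 4950
```

The checkpoint interval sets the trade-off: shorter intervals lose less work per interruption but spend more time writing state.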
Spot Interruption Rates
Not all spot capacity is equal. Interruption rates vary widely by instance type, region, and time. General-purpose instances (m5, m6i) in popular regions such as us-east-1 tend to see higher interruption rates (5-15%) than less common instance types in less popular regions (<5%). AWS publishes the Spot Instance Advisor, which shows the historical interruption frequency for each instance type.
The Decision Framework
For each workload in your infrastructure, walk through these four questions:
1. Is this workload interruptible?
   Yes → Consider spot/preemptible (60-90% savings)
   No → Continue to question 2
2. Will this workload run for 1+ years?
   Yes with high confidence → Continue to question 3
   No or uncertain → Use on-demand
3. Is the resource requirement stable and predictable?
   Yes → Reserved/committed pricing is optimal
   Variable → Use Savings Plans (AWS) or on-demand with sustained use discounts (GCP)
4. What is your commitment horizon?
   3+ years of certainty → 3-year commitment (maximum savings)
   1-2 years → 1-year commitment (balanced savings/flexibility)
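The four questions map naturally onto a small function, which can be handy when classifying an inventory of workloads in bulk. A minimal sketch; the `Workload` fields and the returned labels are illustrative names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    interruptible: bool
    runs_over_one_year: bool  # True only with high confidence
    stable_sizing: bool
    certainty_years: float    # how far ahead the requirement is certain


def recommend(w: Workload) -> str:
    """Walk the four questions in order and return a pricing model."""
    if w.interruptible:                      # question 1
        return "spot/preemptible"
    if not w.runs_over_one_year:             # question 2
        return "on-demand"
    if not w.stable_sizing:                  # question 3
        return "savings plan or sustained use discounts"
    if w.certainty_years >= 3:               # question 4
        return "3-year commitment"
    return "1-year commitment"


if __name__ == "__main__":
    nightly_etl = Workload(True, True, False, 1)
    prod_database = Workload(False, True, True, 3)
    new_service = Workload(False, False, False, 0)
    for w in (nightly_etl, prod_database, new_service):
        print(recommend(w))
```

The order of the checks matters: interruptibility is tested first because spot offers the deepest discount, and commitment length is considered only once the workload is known to be long-lived and stable.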
Real-World Scenarios
Scenario 1: SaaS Production Stack
A SaaS company running 20 web servers, 5 API servers, and 3 database servers:
- Database servers: 3-year reserved (always needed, predictable size) → 60% savings
- 10 baseline web servers: 1-year Savings Plan → 35% savings
- 10 burst web servers: On-demand via auto-scaling → pay only when needed
- 5 API servers: 1-year reserved → 35% savings
- Estimated total savings: 35-40% vs all on-demand
Scenario 2: Data Processing Pipeline
A data team running nightly ETL jobs on 50-instance Spark clusters:
- Driver nodes (2): On-demand (must complete the job)
- Worker nodes (48): Spot instances with checkpointing → 75% savings
- Instance diversification: Spread across 6 instance types → reduces interruption risk
- Estimated total savings: 70% vs all on-demand
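The total for this scenario is a node-weighted average of the per-group discounts. A quick sketch, assuming all 50 nodes use the same instance size and ignoring the cost of any work repeated after an interruption:

```python
def blended_savings(groups: list) -> float:
    """groups: list of (node_count, discount) pairs with equal node prices."""
    total_nodes = sum(count for count, _ in groups)
    saved = sum(count * discount for count, discount in groups)
    return saved / total_nodes


# 2 on-demand driver nodes, 48 spot worker nodes at ~75% off.
cluster = [(2, 0.0), (48, 0.75)]
print(f"{blended_savings(cluster):.0%}")  # 72%
```

The result sits slightly above the rounded 70% estimate, which leaves room for re-run overhead.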
Scenario 3: Development Team
A team of 15 developers each with a development VM:
- Schedule: Auto-stop at 7 PM, auto-start at 9 AM weekdays → ~70% runtime reduction (50 of 168 hours per week)
- Instance type: Burstable (T3/B-series) → 30-40% cheaper than general purpose
- Combined savings: ~80% vs always-on general purpose instances
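A sketch of the arithmetic behind this scenario, assuming each VM runs exactly 10 hours on each of 5 weekdays and that the burstable discount applies to every running hour; the function names are illustrative.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def runtime_fraction(hours_per_day: float, days_per_week: int) -> float:
    """Share of the week the VM is running."""
    return hours_per_day * days_per_week / HOURS_PER_WEEK


def combined_savings(runtime: float, instance_discount: float) -> float:
    """Savings vs. an always-on instance at the full general-purpose price."""
    return 1 - runtime * (1 - instance_discount)


running = runtime_fraction(10, 5)  # 9 AM to 7 PM, Monday to Friday
print(f"runtime reduction: {1 - running:.0%}")
for discount in (0.30, 0.40):
    saved = combined_savings(running, discount)
    print(f"burstable {discount:.0%} cheaper -> {saved:.0%} combined")
```

Real savings tend to land somewhat below these idealized figures, since disks and snapshots keep billing while a VM is stopped and developers occasionally work outside the scheduled window.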
Compare Pricing Models Side by Side
Use CloudMetrics to compare on-demand, reserved, and spot pricing across AWS, Azure, and GCP for your specific instance requirements.