System Design Cheat Sheets: Quick Reference for Every Key Concept
This is your one-stop quick reference for system design interviews. Every formula, number, pattern, and decision framework you need — condensed into scannable tables and lists. Bookmark this page and review it before your interview. For deeper coverage, see our System Design Interview Guide and Common Questions.
Numbers Every Engineer Should Know
| Operation | Latency | Notes |
|---|---|---|
| L1 cache reference | 0.5 ns | Fastest memory access |
| L2 cache reference | 7 ns | 14x L1 |
| Main memory (RAM) | 100 ns | 200x L1 |
| SSD random read | 150 μs | ~1,500x RAM |
| HDD seek | 10 ms | ~100,000x RAM |
| Network round trip (same DC) | 500 μs | 0.5 ms |
| Network round trip (cross-continent) | 150 ms | Use CDN to reduce |
| Read 1 MB from SSD | 1 ms | ~1 GB/s throughput |
| Read 1 MB from network (1 Gbps) | 10 ms | ~100 MB/s |
| Redis GET | 0.1-0.2 ms | In-memory, very fast |
| Simple DB query (indexed) | 1-5 ms | With warm cache |
```text
// Time conversions
1 day = 86,400 seconds ≈ 100,000 seconds (for estimation)
1 month ≈ 2.5 million seconds
1 year ≈ 31.5 million seconds

// QPS (Queries Per Second)
QPS = DAU × (avg queries per user per day) / 86,400
Peak QPS = QPS × 2-3 (peak factor)
Write QPS = QPS × write_ratio

// Storage
Storage per year = daily_new_records × 365 × avg_record_size
Total storage (5 years) = Storage per year × 5

// Bandwidth
Incoming BW = write_QPS × avg_request_size
Outgoing BW = read_QPS × avg_response_size

// Cache (80/20 rule)
Cache size = daily_read_requests × 0.2 × avg_response_size
```
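The formulas above can be applied end to end. A minimal sketch with illustrative inputs (100M DAU, 10 reads per user per day, a 10% write ratio, and the record/response sizes below are assumptions, not numbers from any real system):

```python
# Back-of-envelope capacity estimation using the formulas above.
# All input numbers are illustrative assumptions.

DAU = 100_000_000          # daily active users
QUERIES_PER_USER = 10      # avg reads per user per day
WRITE_RATIO = 0.1          # 1 write per 10 reads
AVG_RECORD_SIZE = 1_000    # bytes per new record (assumed)
AVG_RESPONSE_SIZE = 5_000  # bytes per read response (assumed)
SECONDS_PER_DAY = 86_400

qps = DAU * QUERIES_PER_USER / SECONDS_PER_DAY
peak_qps = qps * 3                                  # peak factor 2-3x
write_qps = qps * WRITE_RATIO

daily_new_records = DAU * QUERIES_PER_USER * WRITE_RATIO
storage_per_year = daily_new_records * 365 * AVG_RECORD_SIZE
storage_5_years = storage_per_year * 5

# 80/20 rule: cache enough for 20% of daily reads
cache_size = DAU * QUERIES_PER_USER * 0.2 * AVG_RESPONSE_SIZE

print(f"QPS:            {qps:,.0f}")
print(f"Peak QPS:       {peak_qps:,.0f}")
print(f"Write QPS:      {write_qps:,.0f}")
print(f"Storage/year:   {storage_per_year / 1e12:.1f} TB")
print(f"5-year storage: {storage_5_years / 1e12:.1f} TB")
print(f"Cache size:     {cache_size / 1e9:.0f} GB")
```

Rounding aggressively (1 day ≈ 100,000 s) is fine in an interview; the goal is order of magnitude, not precision.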
Power of 2 Quick Reference
| Power | Exact | Approx | Name |
|---|---|---|---|
| 2^10 | 1,024 | 1 Thousand | 1 KB |
| 2^20 | 1,048,576 | 1 Million | 1 MB |
| 2^30 | 1,073,741,824 | 1 Billion | 1 GB |
| 2^40 | 1,099,511,627,776 | 1 Trillion | 1 TB |
| 2^50 | 1,125,899,906,842,624 | 1 Quadrillion | 1 PB |
CAP Theorem Quick Reference
| Property | Meaning | Example |
|---|---|---|
| Consistency | All nodes see the same data at the same time | Bank transactions |
| Availability | Every request receives a response | Social media feed |
| Partition Tolerance | System works despite network partitions | Required in distributed systems |

| Choose | Trade-off | Databases |
|---|---|---|
| CP | May be unavailable during partition | MongoDB, HBase, Redis |
| AP | May return stale data during partition | Cassandra, DynamoDB, CouchDB |
| CA | Not partition tolerant (single node) | Traditional RDBMS (single-node PostgreSQL) |
Consistency Patterns
| Pattern | Guarantee | Use Case |
|---|---|---|
| Strong Consistency | Reads always return latest write | Banking, inventory |
| Eventual Consistency | Reads eventually return latest write | Social feeds, analytics |
| Read-your-writes | User sees their own writes immediately | User profile updates |
| Causal Consistency | Causally related writes seen in order | Comment threads |
Caching Strategies
| Strategy | How It Works | Best For |
|---|---|---|
| Cache-Aside (Lazy) | App reads cache first; on miss, reads DB and populates cache | Read-heavy, general purpose |
| Write-Through | Write to cache and DB simultaneously | When data freshness is critical |
| Write-Behind (Back) | Write to cache; async write to DB later | Write-heavy with eventual consistency OK |
| Write-Around | Write directly to DB; cache populated on read | Data rarely re-read after write |
| Read-Through | Cache loads from DB on miss transparently | Simplified application code |
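Cache-aside is the safest default, so it is worth being able to sketch. A minimal version, assuming a plain dict with TTLs stands in for Redis and `db_read` is any function that fetches from the source of truth (both names are hypothetical):

```python
import time
from typing import Any, Callable

class CacheAside:
    """Cache-aside (lazy loading): check the cache first, fall back to the
    DB on a miss, then populate the cache. A dict with per-key expiry
    stands in for Redis here."""

    def __init__(self, db_read: Callable[[str], Any], ttl_seconds: float = 300):
        self._cache: dict[str, tuple[Any, float]] = {}
        self._db_read = db_read
        self._ttl = ttl_seconds

    def get(self, key: str) -> Any:
        entry = self._cache.get(key)
        if entry is not None:
            value, expires_at = entry
            if time.monotonic() < expires_at:
                return value                      # cache hit
        value = self._db_read(key)                # cache miss: read DB
        self._cache[key] = (value, time.monotonic() + self._ttl)
        return value

    def invalidate(self, key: str) -> None:
        """Call after a write so the next read repopulates from the DB."""
        self._cache.pop(key, None)

# Usage: fake_db stands in for a real database lookup.
fake_db = {"user:1": {"name": "Ada"}}
cache = CacheAside(db_read=lambda k: fake_db[k])
print(cache.get("user:1"))   # miss -> reads DB, populates cache
print(cache.get("user:1"))   # hit -> served from cache
```

Note the invalidation hook: deleting on write (rather than updating the cache in place) avoids races between concurrent writers, at the cost of one extra miss.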
Load Balancing Algorithms
| Algorithm | How It Works | Best For |
|---|---|---|
| Round Robin | Cycles through servers sequentially | Equal-capacity servers, stateless |
| Weighted Round Robin | More traffic to higher-capacity servers | Mixed-capacity servers |
| Least Connections | Route to server with fewest active connections | Long-lived connections, varying request times |
| IP Hash | Hash client IP to determine server | Session affinity without cookies |
| Consistent Hashing | Minimize redistribution when servers change | Caches, distributed databases |
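Consistent hashing is the one algorithm above that interviewers routinely ask candidates to sketch. A minimal ring with virtual nodes (MD5 as the hash and server names like `cache-a` are illustrative choices, not requirements):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Consistent hashing with virtual nodes: adding or removing a server
    remaps only ~1/N of keys instead of reshuffling everything."""

    def __init__(self, servers=(), vnodes: int = 100):
        self.vnodes = vnodes
        self._ring: list[tuple[int, str]] = []   # sorted (hash, server) pairs
        for s in servers:
            self.add(s)

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, server: str) -> None:
        for i in range(self.vnodes):             # one entry per virtual node
            bisect.insort(self._ring, (self._hash(f"{server}#{i}"), server))

    def remove(self, server: str) -> None:
        self._ring = [(h, s) for h, s in self._ring if s != server]

    def get(self, key: str) -> str:
        """Walk clockwise to the first vnode at or after the key's hash."""
        h = self._hash(key)
        idx = bisect.bisect(self._ring, (h, "")) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["cache-a", "cache-b", "cache-c"])
before = {f"user:{i}": ring.get(f"user:{i}") for i in range(1000)}
ring.remove("cache-b")
moved = sum(1 for k, v in before.items() if ring.get(k) != v)
print(f"keys remapped after removing 1 of 3 servers: {moved}/1000")
```

With naive `hash(key) % N` routing, removing a server would remap roughly all keys; here only the keys that lived on the removed server move.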
Database Selection Guide
See our Database Cheatsheet for detailed comparison.
| Use Case | Database Type | Examples |
|---|---|---|
| Structured data, ACID transactions | Relational (SQL) | PostgreSQL, MySQL |
| Flexible schema, rapid iteration | Document | MongoDB, CouchDB |
| High write throughput, horizontal scale | Wide Column | Cassandra, HBase |
| Caching, sessions, leaderboards | Key-Value | Redis, Memcached, DynamoDB |
| Relationships, social graphs | Graph | Neo4j, Amazon Neptune |
| Full-text search | Search Engine | Elasticsearch, Solr |
| Metrics, monitoring, IoT | Time Series | InfluxDB, TimescaleDB |
Message Queue Comparison
| Feature | Kafka | RabbitMQ | SQS |
|---|---|---|---|
| Model | Log-based (pull) | Queue (push) | Queue (pull) |
| Throughput | Very high (millions/sec) | High (100K/sec) | High (managed) |
| Ordering | Per-partition | Per-queue | FIFO option |
| Retention | Configurable (days/weeks) | Until consumed | 14 days max |
| Best for | Event streaming, log aggregation | Task queues, RPC | Serverless, simple decoupling |
Microservices Patterns
| Pattern | Purpose |
|---|---|
| API Gateway | Single entry point, routing, auth, rate limiting |
| Service Discovery | Services find each other dynamically |
| Circuit Breaker | Prevent cascading failures |
| Saga Pattern | Distributed transactions via compensating actions |
| CQRS | Separate read and write models |
| Event Sourcing | Store state as sequence of events |
| Sidecar | Attach helper process alongside main service |
| Strangler Fig | Incrementally migrate from monolith |
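Of the patterns above, the circuit breaker is the most common to sketch in an interview. A minimal version (threshold and timeout values are arbitrary defaults; production libraries add a proper half-open state with multiple probes):

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: after max_failures consecutive failures the
    circuit opens and calls fail fast; after reset_timeout seconds one
    trial call is allowed through to probe recovery (half-open)."""

    def __init__(self, max_failures: int = 5, reset_timeout: float = 30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at: float | None = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None          # half-open: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0                  # success closes the circuit
        return result
```

Failing fast matters because a hung downstream service would otherwise tie up threads and connections in every caller, cascading the outage upstream.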
API Design Checklist
API Design:

- [ ] RESTful resource naming (nouns, not verbs)
- [ ] Consistent HTTP methods (GET=read, POST=create, PUT=update, DELETE=delete)
- [ ] Proper status codes (200, 201, 400, 401, 403, 404, 429, 500)
- [ ] Pagination (cursor-based for real-time data, offset for static)
- [ ] Versioning strategy (URL path: /v1/ recommended)
- [ ] Rate limiting headers (X-RateLimit-*)
- [ ] Authentication (OAuth 2.0, JWT, API keys)
- [ ] Input validation and sanitization
- [ ] Error response format (consistent JSON structure)
- [ ] HATEOAS links (optional, for discoverability)
Security Quick Reference
For detailed security guides, see Authentication vs Authorization, OAuth 2.0, JWT, and Encryption.
| Topic | Key Points |
|---|---|
| Authentication | OAuth 2.0 + OIDC for user-facing; mTLS for service-to-service |
| Authorization | RBAC for most apps; ABAC for fine-grained policies |
| Encryption | AES-256-GCM at rest; TLS 1.3 in transit |
| Passwords | bcrypt or Argon2id, never SHA-256 or MD5 |
| API Security | Rate limiting, input validation, CORS, security headers |
Practice with Security Crypto Tools and API Network Tools. Visit swehelper.com/tools for all interactive tools.
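Rate limiting comes up in both the API checklist and the security table, and the token bucket is the standard answer. A minimal single-process sketch (the rate and capacity values are arbitrary; a real deployment would keep the bucket state in Redis so all gateway instances share it):

```python
import time

class TokenBucket:
    """Token-bucket rate limiter: tokens refill at `rate` per second up to
    `capacity`; each request spends one token or is rejected (HTTP 429)."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5, capacity=10)     # sustained 5 req/s, bursts of 10
results = [bucket.allow() for _ in range(12)]
print(results.count(True))                    # burst of 10 allowed, 2 rejected
```

Capacity controls burst tolerance while rate controls the sustained limit, which is why the two are configured separately.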
Monitoring Checklist
The Four Golden Signals (Google SRE):
1. Latency — Time to process a request (p50, p95, p99)
2. Traffic — Requests per second
3. Errors — Rate of failed requests (5xx)
4. Saturation — How full the system is (CPU, memory, disk, connections)
RED Method (for request-driven services):
- Rate: Requests per second
- Errors: Number of failed requests
- Duration: Distribution of request durations
USE Method (for resources):
- Utilization: % of time resource is busy
- Saturation: Amount of work queued
- Errors: Count of error events
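The latency percentiles named in the golden signals (p50, p95, p99) are easy to compute over a window of samples. A sketch using the nearest-rank method (the mock durations are invented; real systems use streaming sketches like t-digest rather than sorting full windows):

```python
# p50/p95/p99 over a window of request durations, nearest-rank method.

def percentile(samples: list[float], pct: float) -> float:
    ordered = sorted(samples)
    rank = max(1, round(pct / 100 * len(ordered)))  # nearest-rank position
    return ordered[rank - 1]

# Mock window: mostly fast requests with a slow tail.
durations_ms = [12, 15, 11, 14, 250, 13, 16, 12, 900, 14] * 10
for p in (50, 95, 99):
    print(f"p{p}: {percentile(durations_ms, p)} ms")
```

The output illustrates why averages mislead: the mean here is dragged up by the tail, while p50 shows the typical request and p99 exposes the worst-case experience your slowest users actually see.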
Availability SLA Reference
| SLA | Downtime/Year | Downtime/Month | Downtime/Day |
|---|---|---|---|
| 99% (two 9s) | 3.65 days | 7.3 hours | 14.4 minutes |
| 99.9% (three 9s) | 8.76 hours | 43.8 minutes | 1.44 minutes |
| 99.99% (four 9s) | 52.6 minutes | 4.38 minutes | 8.6 seconds |
| 99.999% (five 9s) | 5.26 minutes | 26.3 seconds | 0.86 seconds |
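The table values all come from one formula: allowed downtime = period length × (1 − availability). A quick sketch that reproduces them (using 730 hours as the average month):

```python
# Allowed downtime for a given availability SLA, per period.
SECONDS = {"year": 365 * 24 * 3600, "month": 730 * 3600, "day": 24 * 3600}

def downtime(sla_percent: float, period: str) -> float:
    """Allowed downtime in seconds per period for an availability SLA."""
    return SECONDS[period] * (1 - sla_percent / 100)

for sla in (99.0, 99.9, 99.99, 99.999):
    print(f"{sla}%: "
          f"{downtime(sla, 'year') / 3600:.2f} h/year, "
          f"{downtime(sla, 'month') / 60:.1f} min/month, "
          f"{downtime(sla, 'day'):.1f} s/day")
```

Notice that each extra nine cuts the budget by 10x: five nines leaves about five minutes per year, which effectively rules out any manual incident response.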
Frequently Asked Questions
What is the most important concept for system design interviews?
Trade-offs. Every design decision involves trade-offs between consistency and availability, latency and throughput, simplicity and scalability, cost and performance. The ability to articulate why you chose one approach over another is the single most important skill. See our Interview Guide for the full framework.
How do I decide between SQL and NoSQL?
Default to SQL (PostgreSQL) unless you have a specific reason for NoSQL. Use NoSQL when you need: horizontal write scaling (Cassandra), flexible schemas (MongoDB), extreme read speed (Redis), or graph queries (Neo4j). See our Database Cheatsheet for detailed decision criteria.
When should I introduce caching in my design?
Introduce caching when: reads significantly outnumber writes (10:1+), data changes infrequently, latency requirements are strict, or you need to reduce database load. Cache-aside with Redis is the safest default. Always discuss cache invalidation strategy — it is one of the hardest problems in computer science.
What is the difference between horizontal and vertical scaling?
Vertical scaling (scale up) means adding more CPU/RAM to a single machine. It is simpler but has a hard ceiling. Horizontal scaling (scale out) means adding more machines. It requires distributed systems thinking (load balancing, sharding, consistency) but scales nearly infinitely. Most interview designs should plan for horizontal scaling. See our Scalability Cheatsheet for patterns.