System Design for Beginners

A comprehensive guide to system design concepts, from database selection to caching strategies.

Feb 28, 2024 · 15 min read
System Design · Architecture · Scalability

System design interviews frighten engineers because they're open-ended. But behind every "design Twitter" question lies a small set of recurring building blocks. Master those blocks and any system becomes approachable. This guide covers the ones that appear most often, drawn from my experience designing systems at MuslimPro that serve 3M+ daily active users.

Start With Requirements, Not Solutions

The biggest mistake in system design is proposing a solution before understanding the constraints. Always clarify first.

  • Scale: How many users? Requests per second? Data volume?
  • Latency: Is this a real-time system (chat, gaming), or can we tolerate seconds (batch reports)?
  • Consistency: Must all users see the same data immediately, or is eventual consistency acceptable?
  • Availability: What's the acceptable downtime? 99.9% allows about 8.8 hours/year. 99.99% allows about 53 minutes/year.
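The availability figures above are plain arithmetic, and it helps to be able to reproduce them in an interview. Here is a minimal sketch; the function name is my own, and it assumes a 365-day year.

```typescript
// Minutes in a non-leap year: 365 days * 24 hours * 60 minutes = 525,600.
const MINUTES_PER_YEAR = 365 * 24 * 60;

// Convert an availability target (e.g. 99.9) into the downtime it allows per year.
function allowedDowntimeMinutesPerYear(availabilityPercent: number): number {
  if (availabilityPercent < 0 || availabilityPercent > 100) {
    throw new RangeError('availability must be between 0 and 100');
  }
  return MINUTES_PER_YEAR * (1 - availabilityPercent / 100);
}
```

For 99.9% this gives roughly 525.6 minutes (about 8.8 hours); for 99.99% it gives roughly 52.6 minutes. Each extra nine cuts the budget by a factor of ten.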

The CAP Theorem in Plain English

A distributed system can guarantee at most two of three properties simultaneously: Consistency (all nodes see the same data), Availability (every request gets a response), and Partition Tolerance (the system keeps working despite network splits). Since network partitions are inevitable, the real choice is between CP and AP: when a partition happens, either reject some requests to stay consistent or keep answering with possibly stale data.

Note: Prayer times data at MuslimPro is AP; we'd rather show slightly stale times than block requests during a network partition. User account data is CP; a user must never see a stale subscription state that lets them access paid content they haven't purchased.
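To make the AP versus CP trade-off concrete, here is a rough sketch of the two read paths. Everything in it is hypothetical: the function names are mine, the primary store is modeled as a plain synchronous function that throws when it is unreachable, and a real client would be asynchronous.

```typescript
interface ReadResult<T> {
  value: T;
  stale: boolean; // true when the value came from the fallback copy
}

// AP-style read: if the primary is unreachable, answer with the last known value.
function readPreferAvailability<T>(
  readPrimary: () => T,
  lastKnownGood: T | undefined,
): ReadResult<T> {
  try {
    return { value: readPrimary(), stale: false };
  } catch (error) {
    // Nothing to fall back to, so the failure has to surface.
    if (lastKnownGood === undefined) throw error;
    return { value: lastKnownGood, stale: true };
  }
}

// CP-style read: never serve a possibly stale value; let the failure propagate.
function readPreferConsistency<T>(readPrimary: () => T): T {
  return readPrimary();
}
```

Prayer times would go through something like the first function, subscription state through the second.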

Horizontal vs Vertical Scaling

Vertical scaling (a bigger machine) is simple but has a hard ceiling and leaves a single point of failure. Horizontal scaling (more machines) extends much further but introduces distributed complexity: you need a load balancer, stateless services, and external session storage.

Tip: Design services to be stateless from day one. Store sessions in Redis, not in process memory. This makes horizontal scaling a configuration change, not a refactor.
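Here is a minimal sketch of what "stateless" means in practice. The names (`SessionStore`, `InMemorySessionStore`, `AppInstance`) are hypothetical, and the in-memory store is only a stand-in for an external store such as Redis, whose client would be asynchronous.

```typescript
interface Session {
  userId: string;
}

// The only place session state lives. In production this would wrap an external store.
interface SessionStore {
  get(sessionId: string): Session | undefined;
  set(sessionId: string, session: Session): void;
}

// Stand-in for the external store so the sketch runs on its own.
class InMemorySessionStore implements SessionStore {
  private readonly sessions = new Map<string, Session>();

  get(sessionId: string): Session | undefined {
    return this.sessions.get(sessionId);
  }

  set(sessionId: string, session: Session): void {
    this.sessions.set(sessionId, session);
  }
}

// An application instance keeps no session data of its own.
class AppInstance {
  constructor(
    readonly name: string,
    private readonly sessions: SessionStore,
  ) {}

  login(sessionId: string, userId: string): void {
    this.sessions.set(sessionId, { userId });
  }

  whoAmI(sessionId: string): string {
    const session = this.sessions.get(sessionId);
    return session ? `${session.userId} (served by ${this.name})` : 'anonymous';
  }
}
```

Because both instances read from the same store, a user who logs in through one instance is recognized by any other, so the load balancer is free to route each request anywhere.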

Caching Strategies

Caching is often the highest-leverage performance tool. There are five common strategies, each suited to different data patterns.

  • Cache-aside (lazy loading): The application checks the cache first, reads from the DB on a miss, and populates the cache. Best for read-heavy workloads with irregular access patterns.
  • Write-through: Every write goes to the cache and the DB synchronously. Keeps cached entries consistent with the DB but adds write latency.
  • Write-behind (write-back): Write to the cache immediately and flush to the DB asynchronously. High write throughput, but unflushed writes are lost if the cache crashes.
  • Read-through: The cache sits in front of the DB; on a miss, the cache itself fetches and stores the value. Good for uniform access.
  • Refresh-ahead: Refresh cache entries before they expire, for predictable access patterns (e.g., prayer times at dawn).
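```typescript
// Cache-aside pattern with Redis in NestJS
import { Injectable } from '@nestjs/common';
// RedisService, PrayerTimesRepository, and PrayerTimes are app-specific;
// the import paths below are assumed for illustration.
import { RedisService } from './redis.service';
import { PrayerTimesRepository } from './prayer-times.repository';
import { PrayerTimes } from './prayer-times.model';

@Injectable()
export class PrayerTimesService {
  constructor(
    private readonly redis: RedisService,
    private readonly db: PrayerTimesRepository,
  ) {}

  async getByCity(city: string): Promise<PrayerTimes> {
    const key = `prayer:${city}`;

    // 1. Check the cache first.
    const cached = await this.redis.get(key);
    if (cached) return JSON.parse(cached) as PrayerTimes;

    // 2. On a miss, read from the database and populate the cache.
    const data = await this.db.findByCity(city);
    await this.redis.set(key, JSON.stringify(data), 'EX', 21600); // 6h TTL
    return data;
  }
}
```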
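The write-side strategies from the list are easier to compare side by side. The sketch below uses plain `Map` objects as stand-ins for the cache and the database; the class names are hypothetical, and a real write-behind cache would flush on a timer or in batches rather than on demand.

```typescript
// Write-through: each write hits the database and the cache before returning.
class WriteThroughStore {
  constructor(
    private readonly cache: Map<string, string>,
    private readonly db: Map<string, string>,
  ) {}

  write(key: string, value: string): void {
    this.db.set(key, value); // durable first
    this.cache.set(key, value); // then keep the cache in step
  }
}

// Write-behind: writes land in the cache only and reach the database on flush().
class WriteBehindStore {
  private readonly dirtyKeys = new Set<string>();

  constructor(
    private readonly cache: Map<string, string>,
    private readonly db: Map<string, string>,
  ) {}

  write(key: string, value: string): void {
    this.cache.set(key, value);
    this.dirtyKeys.add(key); // remember what still needs persisting
  }

  // Persist every pending key and return how many were written.
  flush(): number {
    let flushed = 0;
    this.dirtyKeys.forEach((key) => {
      const value = this.cache.get(key);
      if (value !== undefined) {
        this.db.set(key, value);
        flushed++;
      }
    });
    this.dirtyKeys.clear();
    return flushed;
  }
}
```

Anything written between two flushes exists only in the cache, which is exactly the data-loss window that write-behind trades for throughput. Repeated writes to the same key are also collapsed into a single database write.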

Database Selection

No database is universally best. Choose based on your access patterns.

  • PostgreSQL: ACID compliance, complex queries, relationships. Default choice for most transactional data.
  • MongoDB: Flexible schema, nested documents, fast iteration. Good for content management and catalogs.
  • Redis: In-memory key/value. Sessions, caching, rate limiting, pub/sub.
  • Elasticsearch: Full-text search, log aggregation, faceted filtering.
  • ClickHouse / TimescaleDB: Time-series analytics, high-throughput event ingestion.

Message Queues for Async Decoupling

When a request triggers work that can happen asynchronously (sending emails, processing payments, generating thumbnails), don't do it in the request handler. Put a message on a queue and process it in a worker. This keeps response times fast and makes the system resilient to spikes.

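```typescript
// NestJS Bull queue: enqueue the job and return without waiting for the email
import { Injectable } from '@nestjs/common';
import { InjectQueue } from '@nestjs/bull';
import { Queue } from 'bull';
import { User } from './user.entity'; // app-specific type; path assumed for illustration

@Injectable()
export class EmailService {
  constructor(@InjectQueue('email') private readonly emailQueue: Queue) {}

  async sendWelcome(user: User): Promise<void> {
    // Up to 3 attempts in total, with exponential backoff starting from 2 seconds.
    await this.emailQueue.add('welcome', { userId: user.id }, {
      attempts: 3,
      backoff: { type: 'exponential', delay: 2000 },
    });
    // Returns once the job is enqueued; a worker sends the email asynchronously.
  }
}
```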
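The `attempts` and `backoff` options are what make the worker side resilient. The sketch below is not Bull's implementation; it is a library-free illustration of the general idea, using a simple doubling schedule that may differ from the exact delays Bull computes. The function names are mine, and instead of sleeping it records the delay it would wait before each retry.

```typescript
interface RetryOptions {
  attempts: number; // total tries, including the first one
  baseDelayMs: number;
}

interface RetryOutcome {
  succeeded: boolean;
  attemptsMade: number;
  scheduledDelaysMs: number[]; // wait before each retry
}

// Doubling schedule: base, 2x base, 4x base, ...
function backoffDelayMs(failedAttempt: number, baseDelayMs: number): number {
  return baseDelayMs * Math.pow(2, failedAttempt - 1);
}

// Run the handler until it succeeds or the attempt budget is used up.
function runWithRetries(handler: () => void, options: RetryOptions): RetryOutcome {
  const scheduledDelaysMs: number[] = [];
  for (let attempt = 1; attempt <= options.attempts; attempt++) {
    try {
      handler();
      return { succeeded: true, attemptsMade: attempt, scheduledDelaysMs };
    } catch (error) {
      // Schedule a retry only if another attempt is still allowed.
      if (attempt < options.attempts) {
        scheduledDelaysMs.push(backoffDelayMs(attempt, options.baseDelayMs));
      }
    }
  }
  return { succeeded: false, attemptsMade: options.attempts, scheduledDelaysMs };
}
```

With `attempts: 3` and a 2000 ms base, a handler that keeps failing is tried three times, with waits of 2000 ms and 4000 ms in between, and is then reported as failed. The growing delay gives a struggling downstream service (an SMTP provider, say) time to recover instead of hammering it.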

Written by

Md. Saniuzzaman Robin

Full-Stack Software Engineer
