Building Scalable Microservices: Lessons from Production

Real-world insights from building and scaling microservices in production environments, including common pitfalls and solutions.

WH Studio
2025-11-03T05:34:32.178495+00:00

After building microservices architectures for multiple high-traffic applications, I've learned some hard lessons about what works and what doesn't at scale.

The Reality Check

Microservices aren't a silver bullet. They introduce complexity that you need to be prepared to handle:

  • Distributed tracing becomes essential
  • Network calls are unreliable
  • Data consistency is harder
  • Deployment coordination gets complex

But when done right, they provide incredible benefits: independent scaling, technology flexibility, and team autonomy.

Pattern 1: API Gateway

Always use an API gateway as the single entry point:

// api-gateway/src/index.ts
import express from "express";
import { createProxyMiddleware } from "http-proxy-middleware";

const app = express();

// Route to user service
app.use("/api/users", createProxyMiddleware({
  target: process.env.USER_SERVICE_URL,
  changeOrigin: true,
  pathRewrite: { "^/api/users": "" }
}));

// Route to order service
app.use("/api/orders", createProxyMiddleware({
  target: process.env.ORDER_SERVICE_URL,
  changeOrigin: true,
  pathRewrite: { "^/api/orders": "" }
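}));

// Start the gateway; the PORT variable and the 3000 default are assumptions
app.listen(Number(process.env.PORT ?? 3000));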

Pattern 2: Circuit Breaker

Protect your services from cascading failures:

import CircuitBreaker from "opossum";

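const options = {
  timeout: 3000, // treat calls that take longer than 3s as failures
  errorThresholdPercentage: 50, // open the circuit when 50% of calls fail
  resetTimeout: 30000 // after 30s, let a trial call through (half-open)
};

// fetchUserData is the async function that calls the user service (defined elsewhere)
const breaker = new CircuitBreaker(fetchUserData, options);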

breaker.on("open", () => {
  console.log("Circuit breaker opened!");
});

breaker.fallback(() => ({ 
  error: "Service temporarily unavailable" 
}));
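
To make the closed, open, and half-open states concrete, here is a minimal hand-rolled sketch of the state machine. It is not how opossum is implemented: it counts consecutive failures rather than an error percentage, it is synchronous for brevity, and the names `SimpleBreaker`, `failureThreshold`, and the injectable `now` clock are assumptions made for illustration.

```typescript
// Illustrative circuit breaker state machine (hypothetical, not opossum's code).
type BreakerState = "closed" | "open" | "half-open";

class SimpleBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;

  constructor(
    private readonly failureThreshold: number, // consecutive failures before opening
    private readonly resetTimeoutMs: number, // how long to stay open before a trial call
    private readonly now: () => number = () => Date.now() // injectable clock
  ) {}

  getState(): BreakerState {
    // An open breaker moves to half-open once the reset timeout has elapsed.
    if (this.state === "open" && this.now() - this.openedAt >= this.resetTimeoutMs) {
      this.state = "half-open";
    }
    return this.state;
  }

  // Runs `action` unless the breaker is open; returns `fallback` when the
  // call is rejected by the breaker or when `action` throws.
  call<T>(action: () => T, fallback: T): T {
    if (this.getState() === "open") return fallback; // fail fast, no downstream call
    try {
      const result = action();
      this.failures = 0;
      this.state = "closed"; // a success (including a half-open trial) closes the circuit
      return result;
    } catch {
      this.failures += 1;
      // A failed trial call, or too many consecutive failures, opens the circuit.
      if (this.state === "half-open" || this.failures >= this.failureThreshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      return fallback;
    }
  }
}
```

The point of the open state is that the failing dependency gets no traffic at all for `resetTimeoutMs`, which gives it room to recover instead of being hammered by retries.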

Pattern 3: Event-Driven Communication

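Use events for async communication between services. The example below uses Node's in-process EventEmitter to show the shape of the pattern; between separately deployed services the same events would travel over a message broker: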

// publisher.ts
import { EventEmitter } from "events";

class OrderService extends EventEmitter {
  createOrder(orderData: OrderData) {
    const order = this.saveOrder(orderData);
    
    // Emit event instead of direct service call
    this.emit("order.created", {
      orderId: order.id,
      userId: order.userId,
      total: order.total
    });
    
    return order;
  }
}

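// subscriber.ts
class InventoryService {
  constructor(orderService: OrderService) {
    // Arrow function keeps `this` bound to the InventoryService instance
    orderService.on("order.created", (event: OrderCreatedEvent) => {
      // Catch rejections so a failed handler does not become an unhandled rejection
      this.reserveInventory(event).catch((err) => {
        console.error("Failed to reserve inventory", err);
      });
    });
  }
  
  async reserveInventory(event: OrderCreatedEvent) {
    // Handle inventory reservation
  }
}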

Pattern 4: Database Per Service

Each service should own its data:

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   User      │     │   Order     │     │  Inventory  │
│   Service   │     │   Service   │     │   Service   │
└──────┬──────┘     └──────┬──────┘     └──────┬──────┘
       │                   │                    │
       ▼                   ▼                    ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Users     │     │   Orders    │     │  Inventory  │
│     DB      │     │     DB      │     │     DB      │
└─────────────┘     └─────────────┘     └─────────────┘

Pattern 5: Health Checks

Implement proper health checks:

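app.get("/health", async (req, res) => {
  // Run the checks in parallel; a check that rejects is reported as "error"
  // instead of crashing the handler
  const [database, redis, externalApi] = await Promise.all(
    [checkDatabase(), checkRedis(), checkExternalApi()]
      .map(check => check.catch(() => "error"))
  );

  const health = {
    uptime: process.uptime(),
    timestamp: Date.now(),
    checks: { database, redis, externalApi }
  };
  
  const isHealthy = Object.values(health.checks)
    .every(check => check === "ok");
  
  res.status(isHealthy ? 200 : 503).json(health);
});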

Common Pitfalls to Avoid

1. Too Many Microservices

Start with a monolith and extract services once you have clear boundaries.

2. Synchronous Communication

Prefer async communication via message queues when possible.

3. Shared Databases

Never share databases between services—it creates tight coupling.

4. No Monitoring

Invest in observability from day one: logging, metrics, tracing.

Production Checklist

✅ API Gateway configured
✅ Service discovery implemented
✅ Circuit breakers in place
✅ Distributed tracing set up
✅ Centralized logging
✅ Health checks on all services
✅ Automated deployment pipeline
✅ Database backups automated
✅ Secrets management configured
✅ Rate limiting implemented

Conclusion

Microservices are powerful but complex. Make sure you need them before committing to the architecture. If you do go with microservices, invest heavily in observability, automation, and developer tooling.

Your future self will thank you.

Service boundaries: the decision that determines everything

90% of failed microservices migrations get this single decision wrong. Service boundaries should follow business capabilities, not technical layers. "Orders," "Inventory," and "Pricing" are services. "Database," "API gateway," and "Cache" are not.

The bounded-context heuristic from domain-driven design holds up: if two teams need to coordinate on every release, they should probably own one service, not two. If a single team owns three services that always deploy together, they should probably be one service.

The smell tests for a wrong boundary:

  • Cross-service joins that pull the same data five different ways
  • A "common" library that every service depends on and breaks every service when it changes
  • A single endpoint that requires synchronous calls to 4+ services to respond
  • Deploys that require coordinating across teams to land safely

Any one of these is a sign you've sliced the system the wrong way. Two or more is a refactor.

The distributed systems failure modes

Microservices replace a single shared-memory failure mode (the monolith crashed) with seven new failure modes you now have to design for:

  1. Network partitions. A call between services can hang indefinitely. Default every client to aggressive timeouts (1–3s for sync calls).
  2. Cascading failures. Service A retries against degraded service B and amplifies the load. Circuit breakers (Hystrix-style or per-language equivalents) are non-negotiable.
  3. Thundering herds. A cache miss causes every replica to call the upstream simultaneously. Single-flight or request coalescing is the fix.
  4. Idempotency drift. Network retries cause duplicate writes. Every mutating endpoint needs an idempotency key — design it on day one, not after the first incident.
  5. Schema evolution. A producer ships a field rename; three consumers break. Use a schema registry (Confluent, Apicurio) and enforce backward-compatible changes in CI.
  6. Distributed transactions. Two-phase commit doesn't work at scale. Use the Saga pattern with explicit compensating actions, or design around eventual consistency.
  7. Clock skew. Don't rely on cross-service timestamps for ordering. Use monotonic IDs or vector clocks where ordering matters.
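
Item 4 in the list above (idempotency keys) can be sketched as follows. This is an illustration under stated assumptions, not a production design: the `PaymentHandler` class, the request shape, and the in-memory `Map` are all hypothetical, and a real deployment would keep the key-to-result mapping in a store shared by every replica and written atomically.

```typescript
// Sketch of idempotency-key handling for a mutating endpoint (hypothetical names).
interface PaymentRequest {
  idempotencyKey: string; // chosen by the client, reused on every retry
  userId: string;
  amountCents: number;
}

interface PaymentResult {
  paymentId: string;
  amountCents: number;
}

class PaymentHandler {
  // Stand-in for a shared, durable store keyed by idempotency key.
  private readonly completed = new Map<string, PaymentResult>();
  private nextId = 1;
  chargeCount = 0; // how many times the side effect actually ran

  handle(request: PaymentRequest): PaymentResult {
    const previous = this.completed.get(request.idempotencyKey);
    if (previous !== undefined) {
      return previous; // a retry: replay the stored result, do not charge again
    }
    this.chargeCount += 1; // the side effect happens once per key
    const result: PaymentResult = {
      paymentId: `pay_${this.nextId++}`,
      amountCents: request.amountCents,
    };
    this.completed.set(request.idempotencyKey, result);
    return result;
  }
}
```

A client that times out and retries with the same key gets the original `paymentId` back, so a network retry cannot turn into a duplicate write.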

Each of these is well-documented and well-solved. The teams that struggle are the ones discovering them at 2am instead of designing for them at week zero.
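
As one example of designing for these up front, the Saga pattern from item 6 can be sketched as a list of steps, each paired with a compensating action. The sketch below is synchronous to keep it short (real steps would be asynchronous calls to other services), and `SagaStep` and `runSaga` are hypothetical names, not part of any library.

```typescript
// Minimal saga runner: on failure, undo the completed steps in reverse order.
interface SagaStep {
  name: string;
  run: () => void; // performs the step; throws on failure
  compensate: () => void; // undoes the step if a later one fails
}

interface SagaOutcome {
  ok: boolean;
  failedStep?: string;
}

function runSaga(steps: SagaStep[]): SagaOutcome {
  const completed: SagaStep[] = [];
  for (const step of steps) {
    try {
      step.run();
      completed.push(step);
    } catch {
      // The failed step made no change we know of, so only earlier steps are
      // compensated, most recent first.
      for (const done of completed.reverse()) {
        done.compensate();
      }
      return { ok: false, failedStep: step.name };
    }
  }
  return { ok: true };
}
```

In practice a compensation can itself fail, so real saga implementations persist their progress and retry compensations until they succeed. That bookkeeping is left out here.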

Observability is not optional

A monolith you can attach a debugger to. A microservices system you cannot. Observability is what replaces the debugger — and "observability" specifically means three pillars, not just one:

  • Distributed tracing (OpenTelemetry, Jaeger, Honeycomb). Every request gets a trace ID, every service propagates it, and you can see the full call graph for any user-facing latency.
  • Structured logs with shared trace IDs. Plaintext logs are write-only at this scale. JSON logs with trace IDs cost the same to emit and are 10x more valuable to query.
  • Metrics with high cardinality. Prometheus is the floor; Honeycomb-style wide events are the ceiling. The difference shows up when you need to ask "why is p99 latency high for users in this specific tier on this specific endpoint."

Skip any one of the three and incident response time triples.
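
The structured-logging pillar is the cheapest to start with. A minimal sketch, assuming a hypothetical `formatLog` helper and field names of our own choosing (in a real system the trace ID would come from your tracing library's context rather than being passed by hand):

```typescript
// Emit one JSON object per log line, always carrying the trace ID.
interface LogFields {
  [key: string]: string | number | boolean;
}

function formatLog(
  level: "info" | "warn" | "error",
  message: string,
  traceId: string,
  fields: LogFields = {},
  now: Date = new Date() // injectable so output is reproducible
): string {
  return JSON.stringify({
    ...fields, // caller-supplied context first...
    timestamp: now.toISOString(),
    level,
    message,
    traceId, // ...so the reserved keys cannot be overwritten by it
  });
}
```

Writing `formatLog("info", "order created", traceId, { orderId })` in every service means a single query on `traceId` returns the log lines for one request across all of them.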

Deployment and platform requirements

Microservices increase the deploy frequency you need to support, the orchestration surface, and the security perimeter. Practical requirements:

  • A container platform. Kubernetes is the default; ECS/Cloud Run are valid for smaller surface areas.
  • A service mesh (Istio, Linkerd) once you cross ~20 services. Below that, library-level retries and mTLS via cert-manager are simpler.
  • Centralized secret management (Vault, AWS Secrets Manager, Doppler). Per-service .env files do not scale.
  • A real CI/CD platform with parallel pipelines, environment promotion, and rollback. GitHub Actions works; Argo CD or Flux for GitOps once you're past 50 deploys/day.

See our cloud solutions and DevOps services for how we structure these platforms in production engagements.

When microservices are wrong

We've helped more teams migrate off premature microservices than onto them. The pattern: a series-A team adopted microservices because their last company had them, ended up with 11 services and 4 engineers, spent 60% of engineering time on platform work, and lost a year.

Microservices are usually wrong when:

  • Your team is under 15 engineers
  • Your traffic is under 1M requests/day
  • Your data model has heavy cross-entity joins
  • You don't have dedicated platform/infra capacity
  • You're pre-PMF

A modular monolith — clear module boundaries inside a single deployable — gives you 80% of the architectural discipline of microservices with 20% of the operational cost. We default to modular monoliths for MVP and SaaS engagements and only extract services when scaling pressure makes the trade-off worth it.
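
What "clear module boundaries inside a single deployable" can look like in code: each module exposes a narrow interface and keeps its state private, and other modules depend only on that interface. The module names and factory functions below are hypothetical, a sketch of the idea rather than a prescribed layout.

```typescript
// Inventory module: other modules only ever see InventoryApi.
interface InventoryApi {
  reserve(sku: string, quantity: number): boolean;
}

function createInventoryModule(initialStock: Record<string, number>): InventoryApi {
  // Private state: nothing outside this function can read or write it directly.
  const stock = new Map<string, number>(Object.entries(initialStock));
  return {
    reserve(sku, quantity) {
      const available = stock.get(sku) ?? 0;
      if (quantity <= 0 || available < quantity) return false;
      stock.set(sku, available - quantity);
      return true;
    },
  };
}

// Orders module: depends on the interface, not on inventory's data.
interface OrdersApi {
  placeOrder(sku: string, quantity: number): "confirmed" | "rejected";
}

function createOrdersModule(inventory: InventoryApi): OrdersApi {
  return {
    placeOrder(sku, quantity) {
      return inventory.reserve(sku, quantity) ? "confirmed" : "rejected";
    },
  };
}
```

If inventory later needs to become its own service, `InventoryApi` is the seam: the in-process implementation is swapped for a client that makes a network call, and the orders module does not change.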

When microservices are right

Microservices are right when:

  • Independent team autonomy is more valuable than deployment coordination
  • Different services have radically different scaling profiles (e.g. realtime + batch)
  • Different services have different compliance requirements (e.g. PCI scope isolation)
  • You have a platform team that owns the substrate

If two of these apply, start planning the extraction. If three or more apply, the migration is overdue.

Want a second opinion on your architecture?

WH Studio runs architecture reviews as a 1–2 week engagement: current-state diagram, failure-mode analysis, prioritized recommendations, and a realistic migration sequence. Start a conversation or browse our IT consulting and API development practices.

Microservices FAQ

At what team size do microservices start to make sense? Roughly 15+ engineers, organized into 3+ teams that need independent release cadences. Below that, a modular monolith ships faster and breaks less often.

Can we mix monolith and microservices? Yes — and most healthy systems do. A modular monolith with 1–3 extracted services (typically the highest-traffic or most-isolated capabilities) is a stable end state, not an awkward middle.

What's the right size for a single service? Big enough that a single team owns it, small enough that one engineer can hold the whole thing in their head. Typically 10K–50K lines of code; below 5K usually means you over-split.

Should every service have its own database? Yes, conceptually — services should not share tables. Physically co-locating multiple service schemas on one Postgres instance for cost reasons is fine in early stages, as long as the access boundary is enforced in code.
