System Design · March 5, 2025 · 7 min read

Microservices vs Monolith: When to Split and When Not To

After building both, here's my honest take: microservices are not always better. This post covers the decision framework I use, the hidden costs nobody talks about, and when a modular monolith wins.

Tags: System Design, Microservices, Architecture, Spring Boot, Backend

I've built systems as monoliths and as microservices. The honest truth is that most teams jump to microservices too early and pay a massive operational tax for years. Here's the decision framework I actually use.

The Hidden Costs of Microservices

  • Network latency: replacing an in-process function call with an HTTP call adds roughly 1-50 ms per hop. With 5 services in a chain, the accumulated delay becomes noticeable to users.
  • Distributed transactions: updating two services atomically requires sagas or two-phase commit. Both are painful.
  • Observability: tracing a request across 8 services needs distributed tracing (Jaeger/Zipkin) from day one.
  • Deployment complexity: 1 monolith deployment vs 12 independent service deployments per release.
  • Testing: integration tests across services are slow and flaky without a solid test environment strategy.
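To make the distributed-transaction cost concrete, here is a minimal sketch of an orchestrated saga in plain Java. It is not a production implementation: the `SagaSketch` class, the `Step` record, and the step names are all hypothetical, and a real saga also has to persist its progress and cope with compensations that fail. The idea it shows is only the core one: each step carries an undo action, and when a step fails the completed steps are undone in reverse order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Orchestrated saga sketch: run steps in order, undo completed ones on failure. */
final class SagaSketch {

    /** One local transaction plus the compensation that undoes it (hypothetical type). */
    record Step(String name, Runnable action, Runnable compensation) {}

    /** Returns true if every step committed, false if the saga was rolled back. */
    static boolean run(List<Step> steps) {
        Deque<Step> completed = new ArrayDeque<>();
        for (Step step : steps) {
            try {
                step.action().run();
                completed.push(step); // most recently completed step sits on top
            } catch (RuntimeException e) {
                // Compensate in reverse order of completion.
                while (!completed.isEmpty()) {
                    completed.pop().compensation().run();
                }
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        boolean ok = run(List.of(
            new Step("reserve-stock",
                () -> log.add("stock reserved"),
                () -> log.add("stock released")),
            new Step("charge-card",
                () -> { throw new IllegalStateException("card declined"); },
                () -> log.add("charge refunded"))));
        System.out.println(ok + " " + log); // prints: false [stock reserved, stock released]
    }
}
```

In a monolith the same two updates would sit inside one database transaction, which is exactly the complexity gap this bullet list is pointing at.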

When Microservices Make Sense

Split on team boundaries, not on data boundaries. If two features are owned by different teams who need to deploy independently, split them. If one feature has dramatically different scaling needs (e.g., a video encoder vs a CRUD API), split it. Otherwise, don't.
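The rule above is small enough to write down as a predicate. This is only a sketch of how I would encode it; the `Feature` record and its field names are made up for illustration, and real decisions involve more judgment than three booleans.

```java
/** Encodes the split rule as a predicate (illustrative names, not a real API). */
final class SplitDecision {

    /** Hypothetical description of a feature being considered for extraction. */
    record Feature(boolean separateTeam, boolean needsIndependentDeploys, boolean scalingDiffers) {}

    static boolean shouldSplit(Feature f) {
        // Team boundary: a different team that must ship on its own schedule.
        boolean teamBoundary = f.separateTeam() && f.needsIndependentDeploys();
        // Scaling boundary: e.g. a video encoder next to a CRUD API.
        return teamBoundary || f.scalingDiffers();
    }

    public static void main(String[] args) {
        System.out.println(shouldSplit(new Feature(true, true, false)));   // true
        System.out.println(shouldSplit(new Feature(false, false, true)));  // true
        System.out.println(shouldSplit(new Feature(true, false, false)));  // false
    }
}
```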

Start with a well-structured monolith with clear package/module boundaries. You can always split a clean monolith into services later. Merging two microservices that should have been one module is a nightmare.

The Modular Monolith

My default recommendation for teams under 20 engineers is a modular monolith: strict package boundaries, no direct cross-module calls (use internal events via Spring ApplicationEvent instead), and a separate data schema per module. As a rule of thumb, you get about 80% of the benefits with 20% of the operational complexity.
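Here is a rough sketch of what "internal events instead of direct calls" looks like. To keep it runnable without a framework, it uses a tiny hand-written `EventBus` as a stand-in for Spring's event mechanism; in a Spring application the same roles are played by `ApplicationEventPublisher.publishEvent(...)` and `@EventListener` methods. The module and event names (`OrderModule`, `BillingModule`, `OrderPlaced`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

final class ModuleEvents {

    /** Minimal synchronous in-process bus; a stand-in for Spring's event publisher. */
    static final class EventBus {
        private final Map<Class<?>, List<Consumer<Object>>> listeners = new ConcurrentHashMap<>();

        <E> void subscribe(Class<E> type, Consumer<E> listener) {
            listeners.computeIfAbsent(type, t -> new CopyOnWriteArrayList<>())
                     .add(event -> listener.accept(type.cast(event)));
        }

        void publish(Object event) {
            // Dispatches by exact event class only, to keep the sketch short.
            listeners.getOrDefault(event.getClass(), List.of())
                     .forEach(l -> l.accept(event));
        }
    }

    /** Event owned by the orders module; the only thing billing needs to know about. */
    record OrderPlaced(String orderId, long amountCents) {}

    /** Orders module: publishes an event and has no reference to billing. */
    static final class OrderModule {
        private final EventBus bus;
        OrderModule(EventBus bus) { this.bus = bus; }
        void placeOrder(String orderId, long amountCents) {
            bus.publish(new OrderPlaced(orderId, amountCents));
        }
    }

    /** Billing module: reacts to the event and has no reference to orders. */
    static final class BillingModule {
        final List<String> invoices = new ArrayList<>();
        BillingModule(EventBus bus) {
            bus.subscribe(OrderPlaced.class, e -> invoices.add("invoice:" + e.orderId()));
        }
    }

    public static void main(String[] args) {
        EventBus bus = new EventBus();
        BillingModule billing = new BillingModule(bus);
        new OrderModule(bus).placeOrder("o-1", 1999);
        System.out.println(billing.invoices); // prints: [invoice:o-1]
    }
}
```

Because the two modules only share the event type, either one can later be moved behind a message broker without the other changing its code.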

Service Communication Patterns

  • Synchronous REST: simple but creates tight coupling and cascading failures. Use circuit breakers (Resilience4j) to protect against downstream failures.
  • Asynchronous messaging (Kafka/RabbitMQ): decoupled, but introduces eventual consistency. Best for operations that don't need an immediate response.
  • gRPC: typically faster than JSON-over-REST for internal service calls, with strong contracts defined in .proto files. The overhead of maintaining .proto files is worth it at 10+ services.
  • GraphQL Federation: multiple services contribute to a single schema. Complex to operate but excellent for frontend-heavy architectures.
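The eventual-consistency trade-off of asynchronous messaging is easier to see in code. The sketch below uses an in-memory `BlockingQueue` as a stand-in for a Kafka topic or RabbitMQ queue, and runs the consumer by hand so the behaviour is deterministic; the class, the `EmailRequested` message, and the status strings are all assumptions for illustration.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

final class AsyncMessagingSketch {

    /** Hypothetical message type. */
    record EmailRequested(String orderId) {}

    // In-memory stand-in for a broker topic/queue.
    private final BlockingQueue<EmailRequested> topic = new LinkedBlockingQueue<>();
    final Map<String, String> emailStatus = new ConcurrentHashMap<>();

    /** Producer side: record intent, enqueue, and return without waiting for the work. */
    void requestEmail(String orderId) {
        emailStatus.put(orderId, "PENDING");
        topic.add(new EmailRequested(orderId));
    }

    /** Consumer side: handles whatever is queued right now; a real consumer would loop on its own thread. */
    int drain() {
        int handled = 0;
        EmailRequested msg;
        while ((msg = topic.poll()) != null) {
            emailStatus.put(msg.orderId(), "SENT");
            handled++;
        }
        return handled;
    }

    public static void main(String[] args) {
        AsyncMessagingSketch s = new AsyncMessagingSketch();
        s.requestEmail("o-1");
        System.out.println(s.emailStatus.get("o-1")); // prints: PENDING
        s.drain();
        System.out.println(s.emailStatus.get("o-1")); // prints: SENT
    }
}
```

The window between `PENDING` and `SENT` is the eventual consistency: callers get an answer immediately, but anything reading the status has to tolerate the intermediate state.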

Circuit Breaker with Resilience4j

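PaymentService.java

```java
import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;
import io.github.resilience4j.retry.annotation.Retry;
import io.github.resilience4j.timelimiter.annotation.TimeLimiter;
import java.util.concurrent.CompletableFuture;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Service;

@Service
public class PaymentService {
    private static final Logger log = LoggerFactory.getLogger(PaymentService.class);
    // Collaborator type names are assumed here; use whatever your application defines.
    private final ExternalPaymentGateway externalPaymentGateway;
    private final PaymentRetryQueue paymentRetryQueue;

    public PaymentService(ExternalPaymentGateway gateway, PaymentRetryQueue queue) {
        this.externalPaymentGateway = gateway;
        this.paymentRetryQueue = queue;
    }

    @CircuitBreaker(name = "payment", fallbackMethod = "paymentFallback")
    @TimeLimiter(name = "payment")
    @Retry(name = "payment")
    public CompletableFuture<PaymentResult> processPayment(PaymentRequest req) {
        return CompletableFuture.supplyAsync(() -> externalPaymentGateway.process(req));
    }

    // Fallback: same parameters as the guarded method, plus the exception that triggered it.
    public CompletableFuture<PaymentResult> paymentFallback(PaymentRequest req, Exception ex) {
        log.error("Payment gateway unavailable, queuing for retry", ex);
        paymentRetryQueue.enqueue(req);
        return CompletableFuture.completedFuture(PaymentResult.pending(req.getId()));
    }
}
```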
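If the annotations feel like magic, it may help to see the state machine a circuit breaker runs. The following is a deliberately simplified sketch in plain Java, not Resilience4j's implementation: Resilience4j tracks failure rates over a sliding window, while this version just counts consecutive failures. The class name, constructor parameters, and the explicit `nowMillis` argument (passed in so the behaviour is easy to test) are all my own assumptions.

```java
import java.util.function.Supplier;

/** Simplified circuit breaker: opens after N consecutive failures. */
final class SimpleCircuitBreaker {

    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openMillis;
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0;

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    synchronized State state() { return state; }

    synchronized <T> T call(Supplier<T> action, Supplier<T> fallback, long nowMillis) {
        if (state == State.OPEN) {
            if (nowMillis - openedAt < openMillis) {
                return fallback.get();   // fail fast: the downstream is not called at all
            }
            state = State.HALF_OPEN;     // wait period elapsed: allow one trial call
        }
        try {
            T result = action.get();
            consecutiveFailures = 0;
            state = State.CLOSED;        // a success closes the circuit again
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;      // trip (or re-trip) and restart the wait period
                openedAt = nowMillis;
            }
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker cb = new SimpleCircuitBreaker(2, 1000);
        Supplier<String> down = () -> { throw new IllegalStateException("gateway down"); };
        cb.call(down, () -> "pending", 0);
        cb.call(down, () -> "pending", 10);
        System.out.println(cb.state());                                  // prints: OPEN
        System.out.println(cb.call(() -> "paid", () -> "pending", 20));   // prints: pending
        System.out.println(cb.call(() -> "paid", () -> "pending", 1500)); // prints: paid
    }
}
```

The third call is the important one: even though the gateway would have succeeded, the open circuit short-circuits to the fallback, which is what protects a struggling downstream service from more traffic.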
