Handling 1 million requests in a Spring Boot application

Handling 1 million requests in a Spring Boot application requires optimizing concurrency, scalability, caching, and database performance. Here’s a detailed breakdown:


1. Optimize Spring Boot for High-Performance Requests

1.1 Use Reactive Programming (WebFlux) Instead of Blocking I/O

  • Spring WebFlux (running on Netty) is non-blocking and event-driven, which makes it well suited to very large numbers of concurrent requests.
  • Keep @RestController, but return Mono and Flux instead of plain objects so no request ties up a thread while waiting on I/O.

Example using WebFlux:

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;
import reactor.core.publisher.Mono;

@RestController
@RequestMapping("/users")
public class UserController {
    @Autowired
    private UserService userService;

    @GetMapping("/{id}")
    public Mono<ResponseEntity<User>> getUser(@PathVariable Long id) {
        // getUserById must return Mono<User>, e.g. from a ReactiveCrudRepository
        return userService.getUserById(id)
                .map(ResponseEntity::ok)
                .defaultIfEmpty(ResponseEntity.notFound().build());
    }
}

Why? ✅ Handles very high concurrency with a small, fixed number of event-loop threads ✅ Non-blocking, event-driven model

🚨 When to use WebFlux?

  • If your app is highly concurrent and needs non-blocking I/O.
  • If you are using databases with reactive drivers (MongoDB, Cassandra, Redis) or spend most of your time calling remote APIs.

1.2 Increase Thread Pool Size (For Blocking API)

If you’re using traditional Spring MVC (Tomcat/Jetty), increase the thread pool size.

Modify application.properties:

# Spring Boot 2.3+ property names (older versions used server.tomcat.max-threads)
server.tomcat.threads.max=500
server.tomcat.threads.min-spare=100

Why? ✅ Prevents thread starvation under high load. ✅ Increases the number of requests served concurrently.
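How large should the pool be? Little's Law (concurrent requests ≈ arrival rate × average latency) gives a rough estimate. A quick illustrative calculation, with hypothetical numbers:

```java
// Rough thread-pool sizing via Little's Law: threads busy = rate * latency.
public class PoolSizing {
    /** Threads needed to sustain requestsPerSecond when each request blocks for latencySeconds. */
    static int threadsNeeded(double requestsPerSecond, double latencySeconds) {
        return (int) Math.ceil(requestsPerSecond * latencySeconds);
    }

    public static void main(String[] args) {
        // 10,000 req/s at 50 ms average latency keeps ~500 threads busy,
        // which lines up with server.tomcat.threads.max=500 above.
        System.out.println(threadsNeeded(10_000, 0.05)); // 500
    }
}
```

If the estimate comes out far above a few hundred threads, that is usually the signal to move to the non-blocking WebFlux model instead of growing the pool further.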


1.3 Use Asynchronous Processing for Heavy Operations

For tasks like email sending, report generation, or background jobs, use @Async (it only takes effect if a configuration class is annotated with @EnableAsync).

Example:

@Service
public class EmailService {
    @Async
    public CompletableFuture<String> sendEmail(String email) {
        // Simulate time-consuming task
        return CompletableFuture.completedFuture("Email sent to: " + email);
    }
}

Why? ✅ Keeps the main request thread free. ✅ Improves throughput by handling tasks in parallel.
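Under the hood, @Async simply hands the task to a separate executor so the request thread returns immediately. The same idea in plain Java, with no Spring required (the method and values are illustrative):

```java
import java.util.concurrent.CompletableFuture;

// Plain-Java sketch of what @Async does: run the task on a background pool
// (here the common ForkJoinPool) so the calling request thread is free at once.
// @Async additionally lets you point the method at a named custom executor.
public class AsyncSketch {
    static CompletableFuture<String> sendEmail(String email) {
        return CompletableFuture.supplyAsync(() -> {
            // time-consuming work (SMTP call, templating, ...) would happen here
            return "Email sent to: " + email;
        });
    }

    public static void main(String[] args) {
        CompletableFuture<String> f = sendEmail("user@example.com");
        System.out.println("request thread is free while the email sends");
        System.out.println(f.join());
    }
}
```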


1.4 Enable Connection Pooling (HikariCP for Databases)

HikariCP, Spring Boot's default connection pool, is among the fastest JDBC pools and copes well with heavy database traffic.

Modify application.properties:

spring.datasource.hikari.maximum-pool-size=50
spring.datasource.hikari.minimum-idle=10
spring.datasource.hikari.idle-timeout=30000
spring.datasource.hikari.connection-timeout=20000

Why? ✅ Reduces database connection overhead. ✅ Handles concurrent database queries efficiently.
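Note that the database pool should usually be much smaller than the request thread pool. HikariCP's own pool-sizing guidance suggests roughly cores × 2 plus the effective number of disk spindles as a starting point. A small illustrative calculation (the inputs are hypothetical):

```java
// Rule of thumb from HikariCP's pool-sizing guidance:
// connections = (core count * 2) + effective spindle count.
public class HikariSizing {
    static int suggestedPoolSize(int cores, int spindles) {
        return cores * 2 + spindles;
    }

    public static void main(String[] args) {
        // An 8-core box with one effective spindle: a pool of ~17 connections
        // is often enough even under heavy load.
        System.out.println(suggestedPoolSize(8, 1)); // 17
    }
}
```

Oversized pools mostly add contention on the database side, so treat maximum-pool-size=50 above as an upper bound to tune downward, not a target.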


2. Improve Database Performance

2.1 Use Indexing for Faster Queries

Make sure your queries use indexes efficiently.

Example: Create indexes on frequently used columns.

CREATE INDEX idx_user_email ON users(email);
CREATE INDEX idx_order_date ON orders(order_date);

Why? ✅ Speeds up SELECT queries. ✅ Reduces Full Table Scans, improving performance.
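To verify the index is actually used, inspect the query plan (PostgreSQL syntax, matching the tables above):

```sql
EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'alice@example.com';
-- Look for "Index Scan using idx_user_email" rather than "Seq Scan" in the output.
```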


2.2 Use Read Replicas & Load Balancing

For high read traffic, add read replicas (sharding is a separate technique for very large datasets).

Example: point writes at the primary and route reads to a replica. Spring Boot has no built-in replica property, so the replica URL below is a custom property consumed by your own routing configuration (e.g. an AbstractRoutingDataSource):

spring.datasource.url=jdbc:postgresql://master-db:5432/mydb
app.datasource.replica-url=jdbc:postgresql://replica-db:5432/mydb

Why? ✅ Distributes the read workload among multiple servers. ✅ Improves query response time.
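Whatever mechanism does the routing, spreading reads across replicas is often just round-robin selection. A plain-Java sketch of that idea (the class and URLs are illustrative, not a Spring API):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative round-robin choice of a replica JDBC URL for read queries.
// In a real app this logic lives inside an AbstractRoutingDataSource
// or a database proxy, not in application code.
public class ReplicaSelector {
    private final List<String> replicaUrls;
    private final AtomicInteger next = new AtomicInteger();

    public ReplicaSelector(List<String> replicaUrls) {
        this.replicaUrls = replicaUrls;
    }

    /** Pick the next replica in round-robin order; thread-safe. */
    public String nextReplica() {
        int i = Math.floorMod(next.getAndIncrement(), replicaUrls.size());
        return replicaUrls.get(i);
    }

    public static void main(String[] args) {
        ReplicaSelector sel = new ReplicaSelector(List.of(
                "jdbc:postgresql://replica-1:5432/mydb",
                "jdbc:postgresql://replica-2:5432/mydb"));
        System.out.println(sel.nextReplica()); // replica-1
        System.out.println(sel.nextReplica()); // replica-2
        System.out.println(sel.nextReplica()); // replica-1 again
    }
}
```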


2.3 Use Redis Caching to Reduce DB Calls

Instead of hitting the database for every request, use Redis cache.

Example: Spring Boot with Redis:

@Service
public class UserService {
    @Autowired
    private UserRepository userRepository;

    @Autowired
    private RedisTemplate<String, User> redisTemplate;

    public User getUserById(Long id) {
        String key = "USER_" + id;
        User user = redisTemplate.opsForValue().get(key);

        if (user == null) {
            user = userRepository.findById(id).orElse(null);
            if (user != null) { // never cache null values; set() rejects them
                redisTemplate.opsForValue().set(key, user, 10, TimeUnit.MINUTES);
            }
        }
        return user;
    }
}

Why? ✅ Can eliminate the large majority of repeated DB reads for hot data ✅ Serves cached reads in well under a millisecond
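The service above is an instance of the cache-aside pattern: check the cache, fall back to the database on a miss, then populate the cache with a TTL. Stripped of Redis, the pattern looks like this in plain Java (an in-memory map stands in for Redis; all names are illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Cache-aside in plain Java: consult the cache first, call the "database"
// loader on a miss, then store the result with an expiry timestamp.
public class CacheAside<K, V> {
    private record Entry<V>(V value, long expiresAtMillis) {}

    private final Map<K, Entry<V>> cache = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public CacheAside(long ttlMillis) { this.ttlMillis = ttlMillis; }

    public V get(K key, Function<K, V> loader) {
        Entry<V> e = cache.get(key);
        if (e != null && e.expiresAtMillis > System.currentTimeMillis()) {
            return e.value(); // cache hit: no database call
        }
        V value = loader.apply(key); // cache miss: hit the database
        if (value != null) { // never cache misses as nulls
            cache.put(key, new Entry<>(value, System.currentTimeMillis() + ttlMillis));
        }
        return value;
    }

    public static void main(String[] args) {
        int[] dbCalls = {0};
        CacheAside<Long, String> users = new CacheAside<>(600_000); // 10-minute TTL
        Function<Long, String> db = id -> { dbCalls[0]++; return "user-" + id; };
        users.get(1L, db);
        users.get(1L, db); // served from the cache
        System.out.println(dbCalls[0]); // 1
    }
}
```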


3. API Gateway & Load Balancing for Scaling

3.1 Use API Gateway (e.g., Nginx, Spring Cloud Gateway)

Distribute incoming requests across multiple application instances.

Nginx Load Balancer Example:

upstream backend {
    server app1:8080;
    server app2:8080;
    server app3:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
    }
}

Why? ✅ Spreads traffic across multiple instances. ✅ Prevents server overload.


3.2 Use Rate Limiting to Prevent Overload

To mitigate traffic spikes and API abuse (and as one layer of DDoS protection), apply rate limiting.

Example using Guava's RateLimiter (registering the bean alone does nothing — it must be checked on each request, e.g. in a filter or interceptor):

@Bean
public RateLimiter rateLimiter() {
    return RateLimiter.create(1000.0); // Guava: 1000 permits per second
}

// In a filter: if (!rateLimiter.tryAcquire()) respond with HTTP 429 Too Many Requests

Why? ✅ Protects API from excessive load. ✅ Ensures fair usage among clients.
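Conceptually, most rate limiters are token buckets: a bucket of permits refilled at a fixed rate, with each request consuming one permit or being rejected. A minimal plain-Java sketch (illustrative, not production code):

```java
// Minimal token bucket: holds up to `capacity` tokens, refilled at
// `refillPerSecond`; each request consumes one token or is rejected.
public class TokenBucket {
    private final long capacity;
    private final double refillPerSecond;
    private double tokens;
    private long lastRefillNanos;

    public TokenBucket(long capacity, double refillPerSecond) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity;
        this.lastRefillNanos = System.nanoTime();
    }

    public synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        tokens = Math.min(capacity, tokens + (now - lastRefillNanos) / 1e9 * refillPerSecond);
        lastRefillNanos = now;
        if (tokens >= 1) {
            tokens -= 1;
            return true;  // request allowed
        }
        return false;     // over the limit: a real filter would send HTTP 429
    }

    public static void main(String[] args) {
        // Refill rate 0 makes the demo deterministic: only the initial burst passes.
        TokenBucket limiter = new TokenBucket(2, 0);
        System.out.println(limiter.tryAcquire()); // true
        System.out.println(limiter.tryAcquire()); // true
        System.out.println(limiter.tryAcquire()); // false
    }
}
```

The `capacity` gives clients a small allowed burst while `refillPerSecond` enforces the sustained rate, which is why token buckets are the usual choice over a fixed per-second counter.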


4. Scale with Kubernetes & Autoscaling

4.1 Run Multiple Spring Boot Instances

Run multiple Spring Boot instances using Docker and Kubernetes.

Kubernetes Auto-Scaling (HPA)

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: spring-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: spring-app
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Why? ✅ Auto-scales instances based on CPU/memory usage. ✅ Handles sudden spikes in traffic.


Final Architecture for Handling 1M Requests

  1. Use WebFlux for non-blocking performance.
  2. Enable HikariCP for DB connection pooling.
  3. Optimize queries with indexes and read replicas.
  4. Cache data with Redis to reduce DB load.
  5. Use API Gateway & Load Balancer (Nginx or Spring Cloud Gateway).
  6. Deploy in Kubernetes with Horizontal Pod Autoscaling.
  7. Enable rate limiting to prevent abuse.

Conclusion

By implementing the above techniques, you can efficiently handle 1M+ requests in Spring Boot while keeping the application responsive and scalable. 🚀
