System DesignDecember 8, 20249 min read

System Design: Search Autocomplete at Scale

Designing Google-style search autocomplete — Trie data structure, prefix frequency ranking, caching layers, and how to serve 10,000 suggestions/sec with sub-50ms latency.

System DesignTrieRedisElasticsearchScalability

Search autocomplete is a deceptively complex system. The data structure is a Trie, but serving 10K RPS with sub-50ms latency requires a carefully layered architecture around it.

Trie Implementation

Trie.java
java
1class TrieNode {
2    Map<Character, TrieNode> children = new HashMap<>();
3    List<String> topSuggestions = new ArrayList<>(); // pre-computed top 5
4}
5
6class AutocompleTrie {
7    private final TrieNode root = new TrieNode();
8
9    public void insert(String word, int frequency) {
10        TrieNode node = root;
11        for (char c : word.toCharArray()) {
12            node.children.putIfAbsent(c, new TrieNode());
13            node = node.children.get(c);
14            updateTopSuggestions(node, word, frequency);
15        }
16    }
17
18    public List<String> search(String prefix) {
19        TrieNode node = root;
20        for (char c : prefix.toCharArray()) {
21            if (!node.children.containsKey(c)) return Collections.emptyList();
22            node = node.children.get(c);
23        }
24        return node.topSuggestions; // O(prefix_length) lookup
25    }
26}

Architecture at Scale

  • Trie is built offline from search logs — top 1M queries ranked by frequency
  • Serialized Trie stored in Redis — each prefix maps to top 5 suggestions (key: 'ac:java', value: [java tutorial, java interview, ...])
  • CDN edge cache for top 10K prefixes — these serve 80% of traffic
  • Fallback to Redis if CDN misses, fallback to Elasticsearch for tail queries

Store top suggestions at every Trie node during build time, not at query time. This pre-computation converts an O(n) DFS into O(prefix_length) lookup — critical for sub-50ms response at scale.

Spring Boot Autocomplete API

AutocompleteController.java
java
1@RestController
2@RequestMapping("/api/search")
3@RequiredArgsConstructor
4public class AutocompleteController {
5
6    private final AutocompleteService autocomplete;
7
8    @GetMapping("/suggest")
9    public ResponseEntity<List<String>> suggest(
10            @RequestParam String q,
11            @RequestParam(defaultValue = "5") int limit) {
12
13        if (q.length() < 2) return ResponseEntity.ok(Collections.emptyList());
14
15        List<String> suggestions = autocomplete.getSuggestions(
16            q.toLowerCase().trim(), limit
17        );
18        return ResponseEntity.ok(suggestions);
19    }
20}
21
22@Service
23@RequiredArgsConstructor
24public class AutocompleteService {
25
26    private final RedisTemplate<String, String> redis;
27    private static final String PREFIX = "ac:";
28
29    public List<String> getSuggestions(String prefix, int limit) {
30        // Check Redis first
31        List<String> cached = redis.opsForList().range(PREFIX + prefix, 0, limit - 1);
32        if (cached != null && !cached.isEmpty()) return cached;
33
34        // Fall back to Elasticsearch for tail queries
35        return elasticSearch.prefixQuery("title", prefix, limit);
36    }
37}

Updating Suggestions with Search Logs

  • Log every search query with a timestamp into Kafka topic 'search-events'.
  • Spark/Flink batch job runs nightly — counts query frequency over rolling 30-day window.
  • Top 1M queries by frequency are used to rebuild the Trie and Redis cache.
  • Newly trending queries appear in suggestions within 24 hours without manual intervention.
  • Offensive/spam queries filtered via blocklist before being inserted into the Trie.

More in System Design