Presequites:
Functional Requirements:
-
Cache Operations:
- The system must support basic cache operations, including inserting, retrieving, and removing items based on search queries.
-
Cache Eviction:
- Implement an eviction policy (e.g., LRU cache) to manage the cache size efficiently, ensuring that the cache doesn’t exceed its maximum size.
-
Cache Expiration:
- Optionally, provide a mechanism to expire cached items after a certain time, ensuring freshness of search results.
-
Concurrency Control:
- Implement mechanisms to handleconcurrent access to the cache to maintain data consistency.
Non-Functional Requirements:
-
Performance:
- The cache system must provide lowlatency access to cached items, ensuring that cache hits are handled quickly.
-
Scalability:
- The system should be designed forhorizontal_scaling to accommodate an increasing number of users and queries.
-
Reliability:
- The cache system must be reliable and available, with minimal downtime or service interruptions.
-
Security:
- Ensure data security and access control to prevent unauthorized access to cached items.
-
Monitoring and Logging:
- Implement monitoring and logging for tracking cache performance and user interactions.
Capacity Estimation:
- Bandwidth: The required bandwidth depends on the size of cached data and the number of cache hits. A significant portion of data can be cached in-memory to reduce the need for bandwidth.
- Traffic Estimate: Estimate the number ofQueries_Per_Second to determine traffic patterns and distribution.
- Storage Estimate: Estimate the storage required for cached items, considering the cache’s maximum size and the average size of search results and snippets.
System API:
- Cache API:
get(query): Retrieves search results and snippet for a given query.set(query, results, snippet): Stores search results and snippet for a query.remove(query): Removes a query from the cache.clear(): Clears the entire cache.
Database Design:
- The primary data store is thein_memory_cache , which uses a hash table or a similar data structure to store key-value pairs efficiently.
- Optionally, a separate database can be used for persistent storage or for handling cache synchronization across distributed cache servers.
Data Structures and Algorithms Used:
-
Data Structure:
- Hash Table: Used for efficient retrieval of cached items based on search queries.
- Doubly Linked List: Used for maintaining access history and implementing the LRU eviction policy.
-
Q. Why doubly Linked List here and not single Linked List?
-
Algorithms:
- LRU (Least Recently Used) Eviction: Determines which item to evict when the cache is full based on the least recently accessed items.
- Cache Expiration Algorithm: Optionally, an algorithm to check and remove expired items from the cache based on timestamps.
- Concurrency Control Mechanisms: Employ locking orthread-safe_data_structures to ensure thread safety in a multi-threaded or multi-process environment.
This format breaks down the key-value cache system design into functional and non-functional requirements, capacity estimates, API specifications, database considerations, and the use of data structures and algorithms. It provides a clear overview of the system’s design and requirements.
[[www.educative.io]] article learning
Notes
Questions
- Q. What dofault_tolerance mean in this case?
- Q. How can Merkle Tree be applied here?
Links
- Intro: https://www.educative.io/courses/grokking-modern-system-design-interview-for-engineers-managers/system-design-the-key-value-store
- Design akey_value store: https://www.educative.io/courses/grokking-modern-system-design-interview-for-engineers-managers/design-of-a-key-value-store
- Ensurescalability andreplication:https://www.educative.io/collection/page/10370001/4941429335392256/5256346714243072
- Versioning data and achieving configurability: https://www.educative.io/collection/page/10370001/4941429335392256/5979802812547072
- Enablefault_tolerance andfailure_detection: https://www.educative.io/collection/page/10370001/4941429335392256/5842597234343936