What are key-value databases and what do you need to know about them for your system design interview?
This post covers what they are, why we use them, some common use cases, and briefly introduces two popular key-value stores.
What is a key-value store?
The concept is similar to that of a hashmap; the database is made up of key-value pairs. It’s just a production-scale hashmap.
There are no schemas, tables, or relationships between data. Just keys and values.
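To make the hashmap analogy concrete, here is a minimal sketch of a key-value store's core interface in Python. The class name `ToyKeyValueStore` and its method names are hypothetical, chosen only for illustration; a real store layers networking, replication, and eviction on top of this idea.

```python
class ToyKeyValueStore:
    """A toy in-memory key-value store backed by a Python dict (a hash map)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        # No schema: any value can be stored under any key.
        self._data[key] = value

    def get(self, key, default=None):
        # Average-case constant-time lookup by key.
        return self._data.get(key, default)

    def delete(self, key):
        # Returns True if the key existed, False otherwise.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = ToyKeyValueStore()
store.set("user:1:name", "Ada")                 # a primitive value
store.set("user:1:cart", ["apple", "bread"])    # a more complex value
```

Note that there is no way to ask "which users have bread in their cart?" without scanning every value; the only efficient access path is the key.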
In Redis, an in-memory key-value database, the key can be any binary sequence, from a string like "foo" to the contents of a JPEG file. The value can be a primitive value, a complex object, or anything in between.
Why do we use key-value stores?
Key-value stores are very efficient at quickly and reliably locating a value by its key. They are ideal for systems that need to find and retrieve data in constant time on average, and they can serve high volumes of simultaneous transactions very well.
What are some use cases?
Here are some use cases:
- Caching layer. Saving results of popular queries and caching them to improve performance.
- Session store. From when a user logs in to when they log out, you can use a key-value store to save session-related data.
- User preferences and profiles. Storing each user's preferences and profile data under that user's key.
- Product recommendations. Quickly access and present new items or ads as a user navigates throughout a site.
- E-commerce store shopping cart. Handling huge volumes of shopping-cart traffic, with redundancy to survive node failure.
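The caching and session use cases above usually pair a key lookup with an expiry time. The following is a rough sketch of that pattern (often called cache-aside) using a plain Python dict in place of a real key-value server. `TTLCache`, `fetch_with_cache`, and `run_popular_query` are hypothetical names invented for this example, not part of any library.

```python
import time


class TTLCache:
    """Toy cache where every entry expires after a time-to-live (TTL)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expires_at)
        self._clock = clock  # injectable so expiry can be tested

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily drop expired entries on read
            return None
        return value


def fetch_with_cache(cache, key, loader, ttl_seconds=60):
    """Return the cached value, or call loader() on a miss and cache the result."""
    value = cache.get(key)
    if value is None:
        value = loader()
        cache.set(key, value, ttl_seconds)
    return value


def run_popular_query():
    # Hypothetical stand-in for an expensive database query.
    return ["item-1", "item-2", "item-3"]


cache = TTLCache()
first = fetch_with_cache(cache, "popular-items", run_popular_query)   # miss: runs the query
second = fetch_with_cache(cache, "popular-items", run_popular_query)  # hit: served from cache
```

A session store follows the same shape: the key is the session ID, the value is the session data, and the TTL is the session timeout.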
What are some popular key-value databases?
Redis and Memcached are both popular, and both are used at Shopify for their "top-notch performance".
The biggest difference between them is that Redis supports data structures as values. Lists, sets, sorted sets, and hashes can all be stored as a value (lists are commonly used as queues). This lets you perform structure-specific operations, like changing the 3rd element of a list with LSET, or a specific field in a hash with HSET.
With Memcached, by contrast, everything is a blob of data. If you want to operate on a blob, you have to read the blob, update it, and write back the entire blob.
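The difference can be illustrated without either server. The sketch below emulates both styles with plain Python dicts; it does not use a real Redis or Memcached client, and the helper names `blob_update_item` and `list_set` are made up for this example.

```python
import json

# Memcached-style: the store only sees opaque bytes.
blob_store = {"cart:42": json.dumps(["apple", "bread", "milk"]).encode()}


def blob_update_item(store, key, index, item):
    """Read the whole blob, modify it client-side, write the whole blob back."""
    items = json.loads(store[key].decode())  # 1. read and deserialize everything
    items[index] = item                      # 2. change one element
    store[key] = json.dumps(items).encode()  # 3. serialize and rewrite everything


# Redis-style: the store understands that the value is a list.
list_store = {"cart:42": ["apple", "bread", "milk"]}


def list_set(store, key, index, item):
    """Update one element in place, similar in spirit to Redis's LSET."""
    store[key][index] = item


# Change the 3rd element (index 2) in both stores.
blob_update_item(blob_store, "cart:42", 2, "oat milk")
list_set(list_store, "cart:42", 2, "oat milk")
```

Both stores end up holding the same cart, but the blob version moves the entire value across the wire twice, and two clients doing this at once can overwrite each other's changes unless they coordinate.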
Also, Redis can persist data to disk, which makes its data more durable in case of node failure. Memcached has no built-in persistence, so it is limited to caching.
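As a rough illustration of why persistence matters, the sketch below writes an in-memory dict to disk and reloads it after a simulated restart. This is only a toy using JSON files, not how Redis implements its own persistence; `save_snapshot` and `load_snapshot` are hypothetical helpers.

```python
import json
import os
import tempfile


def save_snapshot(data, path):
    """Write the whole dataset to a temp file, then atomically swap it into place."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(data, f)
    os.replace(tmp_path, path)  # avoids leaving a half-written snapshot behind


def load_snapshot(path):
    """Rebuild the in-memory dataset from disk, or start empty if no snapshot exists."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as workdir:
    snapshot_path = os.path.join(workdir, "dump.json")

    data = {"session:abc": {"user_id": 1}, "counter": 7}
    save_snapshot(data, snapshot_path)

    data = {}  # simulate a node failure: everything in memory is gone
    restored = load_snapshot(snapshot_path)
    without_snapshot = load_snapshot(os.path.join(workdir, "missing.json"))
```

A store without any snapshot comes back empty after a restart, which is acceptable for a cache but not for data you cannot recompute.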
That’s all. Hope you learned something.