Don't understand Redis? After reading this story, you will understand!

created at: 09-17-2021

I am Redis

Hello, I am Redis, and a man named Antirez brought me into this world.


The story of my birth is tied to the relational database MySQL.

Before I came into this world, MySQL was having a very hard time. The Internet was growing faster and faster and holding more and more data, user requests had skyrocketed, and every user request turned into one read or write operation after another. MySQL was miserable, and the days when everyone went on shopping sprees were the days MySQL suffered most.

MySQL later told me that most user requests are actually reads, and that users often query the same things over and over, wasting a lot of time on disk I/O.

Then someone wondered: could we learn from the CPU and put a cache in front of the database? And so I was born!

Soon after I was born, I became good friends with MySQL, and we often appeared in the back-end server hand in hand.

Whenever an application queries data from MySQL, it registers a copy with me. The next time it needs that data, it asks me first; if I have it, there is no need to bother MySQL at all.

web server → Redis → MySQL
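This read path is usually called cache-aside. Here is a minimal sketch of it, using plain Python dicts to stand in for Redis and MySQL (the names `get_user`, `cache`, and `database` are made up for illustration):

```python
# Minimal cache-aside sketch: dicts stand in for Redis and MySQL.
cache = {}                    # plays the role of Redis (fast, in-memory)
database = {42: "Alice"}      # plays the role of MySQL (slow, on disk)

def get_user(user_id):
    """Read path: ask the cache first, fall back to the database."""
    if user_id in cache:
        return cache[user_id]         # cache hit: no database I/O at all
    value = database.get(user_id)     # cache miss: query "MySQL"
    if value is not None:
        cache[user_id] = value        # register the result with "Redis"
    return value

print(get_user(42))  # first call: fetched from the database, then cached
print(get_user(42))  # second call: served straight from the cache
```

The first lookup pays the database cost; every repeat of the same query is answered from memory.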

For ease of use, I support the storage of several data structures:

  • String
  • Hash
  • List
  • Set
  • SortedSet
  • Bitmap
  • ···

Because I keep all the registered data in memory, I never have to perform snail-slow disk I/O, so querying me takes far less time than querying MySQL.

Don't underestimate this simple change; it takes a huge burden off MySQL! As the program runs, I cache more and more data, and for a considerable share of the time I intercept user requests myself, so MySQL can finally relax!

With me on board, the performance of the service improved a lot, thanks to the many hits I absorbed for the database.

Cache Expiration && Cache Eviction

But I soon discovered that things were not so rosy. Everything I cached lived in memory, and even on a server, memory is a limited resource; I couldn't keep hoarding data without restraint. I had to think of a way, or sooner or later I would be doomed.

Soon, I thought of one: set a timeout on cached content, and leave the specific value to the applications. All I have to do is delete expired content in time to free up space.
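A minimal sketch of this idea, storing a deadline next to each value (the helper names `set_with_ttl` and `is_expired` are invented for illustration; real Redis tracks expirations in its own internal dict):

```python
import time

# Sketch: each cached entry carries a deadline chosen by the application.
cache = {}  # key -> (value, expires_at)

def set_with_ttl(key, value, ttl_seconds):
    """The application picks the timeout; the cache records the deadline."""
    cache[key] = (value, time.monotonic() + ttl_seconds)

def is_expired(key):
    """An entry is expired once the clock passes its deadline."""
    _, expires_at = cache[key]
    return time.monotonic() >= expires_at

set_with_ttl("session:1", "some data", ttl_seconds=0.05)
time.sleep(0.1)
print(is_expired("session:1"))  # True: past its deadline, ready for cleanup
```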


But when exactly should I do this cleanup?

The simplest approach is periodic deletion: I decided to run a cleanup pass every 100 ms, ten times a second!

But I can't delete every expired key in a single pass. I hold a lot of data, and scanning all of it would take who knows how long, seriously delaying my handling of new client requests!


Time is tight and the task is heavy, so I randomly sample a portion of the keys and clean those up, which relieves the memory pressure.
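The sampling idea can be sketched like this (a toy version; `active_expire_cycle` and the sample size of 20 are illustrative choices, not Redis's exact parameters):

```python
import random
import time

cache = {}  # key -> (value, expires_at)

def active_expire_cycle(sample_size=20):
    """One cleanup pass: inspect a random sample instead of every key."""
    if not cache:
        return 0
    now = time.monotonic()
    sampled = random.sample(list(cache), min(sample_size, len(cache)))
    removed = 0
    for key in sampled:
        if cache[key][1] <= now:       # deadline already passed
            del cache[key]
            removed += 1
    return removed

# Fill the cache with 100 entries that are all already expired.
for i in range(100):
    cache[f"k{i}"] = (i, time.monotonic() - 1)

removed = active_expire_cycle()
print(removed)  # 20: every sampled key was expired, and only 20 were sampled
```

Each pass touches at most `sample_size` keys, so a single cycle stays cheap no matter how large the cache grows.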

After a while, I noticed that some keys were just lucky: my random sampling never picked them, and they escaped every round. That's no good; data that expired long ago was still squatting on my memory!

I couldn't let that slide! So, on top of the periodic deletion, I added another trick:

for any key that escapes my random sampling, the moment a query request hits it and I find it has expired, I show no mercy and delete it on the spot.

Because this method is triggered passively and never fires unless the key is queried, it is called lazy deletion!
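Lazy deletion fits naturally into the read path. A minimal sketch, reusing the `(value, expires_at)` entry shape from before (the `get` helper is invented for illustration):

```python
import time

# One entry that is already past its deadline.
cache = {"stale": ("old data", time.monotonic() - 1)}

def get(key):
    """Lazy deletion: the expiry check happens only when the key is queried."""
    entry = cache.get(key)
    if entry is None:
        return None
    value, expires_at = entry
    if time.monotonic() >= expires_at:
        del cache[key]        # delete on access, not on a timer
        return None           # behave as if the key never existed
    return value

print(get("stale"))        # None: found expired, deleted on the spot
print("stale" in cache)    # False: the entry is gone after the lookup
```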

However, some keys not only escaped my random sampling but were never queried either, leaving them at large! Meanwhile, my usable memory kept shrinking.


And even supposing I could delete every expired entry in time: if the expiration times are set far in the future, memory can fill up before anything expires, and I would be doomed all the same. I had to think of yet another way.

I thought for a long time and finally came up with a big move: memory eviction strategies. This time I wanted to solve the problem for good!

I provide eight strategies that applications can choose from to decide what I do when memory runs short:

  • noeviction: return an error and do not delete any keys
  • allkeys-lru: use the LRU algorithm to delete the least recently used keys, across all keys
  • volatile-lru: use the LRU algorithm to delete the least recently used keys, but only among keys with an expiration time set
  • allkeys-random: delete randomly from all keys
  • volatile-random: delete randomly, but only among keys with an expiration time set
  • volatile-ttl: among keys with an expiration time set, delete those with the shortest remaining time
  • volatile-lfu: among keys with an expiration time set, delete the least frequently used
  • allkeys-lfu: delete the least frequently used keys, across all keys

With this combination of moves, I no longer worry about expired data filling up my space~
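To make the LRU idea concrete, here is a toy cache in the spirit of allkeys-lru, built on `collections.OrderedDict` (this is a simplification; real Redis uses an approximated LRU based on sampling, not an exact ordering like this):

```python
from collections import OrderedDict

class LRUCache:
    """Toy allkeys-lru: when full, evict the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()   # insertion order == recency order

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        elif len(self.data) >= self.capacity:
            self.data.popitem(last=False)  # evict the least recently used
        self.data[key] = value

lru = LRUCache(capacity=2)
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")           # touch "a", so "b" becomes least recently used
lru.put("c", 3)        # capacity exceeded: "b" is evicted
print(list(lru.data))  # ['a', 'c']
```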

Cache penetration && Bloom filter

My life was quite comfortable, but MySQL's was not. Sometimes annoying requests arrive for data that simply doesn't exist, and MySQL does all that work for nothing! Worse, because the data doesn't exist, I have nothing to cache, so every time the same request comes back, MySQL gets bothered all over again, and my value as a cache goes unrealized. This is what people call cache penetration.


After this happened a few times, big brother MySQL couldn't take it anymore: "Hey, brother, can you think of a way to block those queries that we already know will come back empty?"

That's when I thought of another good friend of mine: the Bloom filter.


This friend has only one skill, but he's great at it: quickly telling you whether the data you're looking for exists in a huge data set. (Just between us, he's slightly unreliable: if he says something exists, you can't fully trust him, because it might not actually exist; but if he says something does not exist, then it definitely does not.)


I introduced this friend to the application, so requests for non-existent data no longer bother MySQL, which neatly solves the cache penetration problem.
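A toy Bloom filter can be sketched in a few lines (this uses SHA-256 from the stdlib as a stand-in hash family and a plain integer as the bit array; production filters such as Redis's RedisBloom module use faster hashes and tuned sizes):

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a big integer used as a bit array

    def _positions(self, item):
        # Derive k bit positions by salting the item with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # True only if every one of the item's bits is set.
        return all(self.bits & (1 << pos) for pos in self._positions(item))

bf = BloomFilter()
bf.add("user:1001")
print(bf.might_contain("user:1001"))  # True: anything added is always found
print(bf.might_contain("user:9999"))  # almost certainly False (rare false positives)
```

Before hitting the cache or the database, the application asks the filter; a "does not exist" answer lets it reject the request immediately.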

Cache breakdown && cache avalanche

After that, a period of peace passed until that day...

One day, the MySQL guy was leisurely slacking off when a flood of requests suddenly hit him, catching him completely off guard.

After a frantic stretch of work, MySQL came to me angrily: "Brother, what happened? Why did it hit so hard all at once?"


I checked my logs and quickly explained: "Brother, I'm really sorry. A piece of hot data just reached its expiration time and I deleted it, and unfortunately a huge number of queries for that exact data arrived right after. Since I no longer had it, they all went straight to you."

"What are you doing? Be more careful next time," big brother MySQL said, and left with an unhappy look.

I didn't think much of this little incident and soon put it behind me, never imagining I would cause an even bigger mess a few days later.

On that day, a flood of requests hit MySQL, far larger than the previous one, and big brother MySQL crashed several times in short order!

It took a long time for that wave of traffic to pass, and even longer for MySQL to recover.

"Brother, what was the reason this time?" big brother MySQL asked, exhausted.

"This time was even worse than the last. A large amount of data passed its expiration time almost simultaneously, and then a crowd of requests arrived for exactly that data, so the scale was much bigger."


Brother MySQL frowned upon hearing this: "Then you have to think of something. Who can stand being tormented every couple of days?"

"Actually, I'm helpless here too; I don't set the expiration times. How about we go to the application together and ask it to spread the cache expiration times out? At the very least, it shouldn't let a huge batch of data expire all at once."

"Come on, let's go together."

Later, we talked it over with the application: it randomized the keys' expiration times and set the hot data to never expire, which alleviated the problem a great deal. Oh, and by the way, we also gave the two problems names: cache breakdown and cache avalanche.
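Both fixes fit in a few lines. A sketch of how the application side might pick expiration times (the names `ttl_for` and `jittered_ttl`, and the base/jitter values, are hypothetical choices for illustration):

```python
import random

def jittered_ttl(base=3600, jitter=300):
    """Spread expirations out so a batch of keys never dies at the same moment."""
    return base + random.randint(0, jitter)

def ttl_for(key, hot_keys):
    """Hot data never expires; everything else gets a randomized TTL."""
    if key in hot_keys:
        return None            # no expiration at all for hot data
    return jittered_ttl()

hot = {"homepage:banner"}
print(ttl_for("homepage:banner", hot))  # None: hot key never expires
print(ttl_for("user:42", hot))          # somewhere in 3600..3900 seconds
```

Randomizing the TTL turns one simultaneous mass expiration (the avalanche) into a gentle trickle, and exempting hot keys removes the breakdown trigger entirely.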

We finally have a comfortable life again...
