A few precautions in the use of Redis

created at 07-28-2021

  • Set a TTL unless the data really has to persist. Unless the business requires a key to be stored permanently, be sure to give it a TTL; otherwise, as time goes by, Redis fills up with garbage. Also pay attention to the framework you use and check whether it sets a TTL for you. For example, a pitfall I ran into recently is that Python RQ does not set a TTL for jobs by default. A few years on, Redis was running out of memory, and only after analysis did I find a lot of garbage in it: unused business data, job data from a long time ago, and so on, all piled up in Redis as persistent garbage.
  • Don't use keys that are too long. For example, the Spring framework creates keys such as spring:session:sessions:1c88a003-63a4-48a0-979d-9b3be4ed9c0c. A large part of such a key carries no useful information and only takes up memory.
  • Clients should use a connection pool, so that connections are reused and performance improves.
  • Use a pipeline to send multiple commands together, which reduces the overhead of multiple network round trips.
  • If you use Lua, make sure your scripts do not run for too long: Redis executes a script atomically, so other clients are blocked until it finishes.
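
The TTL, key-length, connection-pool and pipeline points above can be combined in one small helper. This is only a minimal sketch with redis-py, not a definitive implementation: the names cache_many and make_client, the "c:" prefix, the 7-day TTL and the pool size are all hypothetical choices made for illustration.

```python
DEFAULT_TTL_SECONDS = 7 * 24 * 3600  # assumed retention period, adjust per use case


def cache_many(client, items, ttl=DEFAULT_TTL_SECONDS, prefix="c:"):
    """Write every item in a single round trip, each with an expiry.

    `client` is any redis-py compatible client. The prefix is kept short
    on purpose, because Redis stores the full key name for every key.
    """
    pipe = client.pipeline(transaction=False)  # batching only, no MULTI/EXEC
    for name, value in items.items():
        pipe.set(f"{prefix}{name}", value, ex=ttl)  # ex= sets the TTL in seconds
    return pipe.execute()  # sends all queued commands at once


def make_client(host="127.0.0.1", port=6379, db=0):
    """Build a client backed by a shared, bounded connection pool."""
    import redis  # redis-py, imported here so cache_many stays dependency-free

    pool = redis.ConnectionPool(host=host, port=port, db=db, max_connections=10)
    return redis.Redis(connection_pool=pool)
```

With a Redis server running locally, cache_many(make_client(), {"user:1": "alice", "user:2": "bob"}) would write both keys in one round trip, and both would expire after the default TTL.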

Here is the script I recently used to analyze memory usage in Redis:

import logging
import sys

import redis


logging.basicConfig(level=logging.INFO)


def get_type_and_subcount(client, key):
    _type = client.type(key).decode()
    sub_count = 0

    if _type == "set":
        sub_count = client.scard(key)
    elif _type == "list":
        sub_count = client.llen(key)
    elif _type == "hash":
        sub_count = client.hlen(key)
    elif _type == "string":
        sub_count = client.strlen(key)
    elif _type == "zset":
        sub_count = client.zcard(key)
    else:
        logging.error("bad key %s with type %s", key, _type)

    return _type, sub_count


BYTES_TO_GB = 1024 * 1024 * 1024  # number of bytes in one GB


def analytic_db(db):
    logging.info("now parsing db %s", db)
    redis_client = redis.Redis(host="127.0.0.1", db=db)

    total_count = 0  # number of keys
    key_bytes_count = 0  # bytes used by all keys
    big_key_count = 0  # number of keys > 1KB
    big_key_bytes_count = 0  # bytes used by keys > 1KB
    big_big_key_count = 0  # number of keys > 100KB
    big_big_key_bytes_count = 0  # bytes used by keys > 100KB
    no_ttl_big_key_count = 0  # number of keys > 1KB without a TTL
    no_ttl_big_key_bytes_count = 0  # bytes used by keys > 1KB without a TTL

    for key in redis_client.scan_iter():
        bytes_num = redis_client.memory_usage(key)
        if bytes_num is None:  # key expired or was deleted during the scan
            continue
        total_count += 1
        key_bytes_count += bytes_num

        ttl = redis_client.ttl(key)

        if bytes_num > 1024:  # 1 KB
            big_key_count += 1
            big_key_bytes_count += bytes_num

            if ttl == -1:  # -1 means the key exists but has no expiry
                no_ttl_big_key_count += 1
                no_ttl_big_key_bytes_count += bytes_num

            if bytes_num > 102400:  # 100 KB
                big_big_key_count += 1
                big_big_key_bytes_count += bytes_num
                # Look up type and element count only for keys that get logged.
                key_type, sub_count = get_type_and_subcount(redis_client, key)
                logging.warning(
                    "big key found %s, bytes: %s, type is %s, sub_count %s, ttl is %s",
                    key, bytes_num, key_type, sub_count, ttl,
                )

    logging.info(
        "db %s, %s keys (%s GB), %s keys are > 1KB (%s GB), %s keys are > 100KB (%s GB), %s keys > 1KB have no ttl (%s GB)",
        db, total_count, "{:.2f}".format(key_bytes_count / BYTES_TO_GB),
        big_key_count, "{:.2f}".format(big_key_bytes_count / BYTES_TO_GB),
        big_big_key_count, "{:.2f}".format(big_big_key_bytes_count / BYTES_TO_GB),
        no_ttl_big_key_count, "{:.2f}".format(no_ttl_big_key_bytes_count / BYTES_TO_GB),
    )


if __name__ == "__main__":
    analytic_db(sys.argv[1])
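
To see which framework or business module the garbage belongs to, it can also help to group memory usage by key prefix. The following is a rough sketch under that assumption and is not part of the script above: key_prefix, summarize_by_prefix and scan_key_sizes are hypothetical helper names, and a depth of 2 assumes keys are namespaced with ":" the way the Spring session key shown earlier is.

```python
from collections import defaultdict


def key_prefix(key, depth=2, sep=":"):
    """Return the first `depth` separator-delimited parts of a key name."""
    if isinstance(key, bytes):  # redis-py returns bytes unless decode_responses=True
        key = key.decode("utf-8", errors="replace")
    return sep.join(key.split(sep)[:depth])


def summarize_by_prefix(key_sizes, depth=2):
    """Aggregate (key, bytes) pairs into {prefix: [key_count, total_bytes]}."""
    summary = defaultdict(lambda: [0, 0])
    for key, size in key_sizes:
        entry = summary[key_prefix(key, depth)]
        entry[0] += 1
        entry[1] += size
    return dict(summary)


def scan_key_sizes(client):
    """Yield (key, bytes) for every key that still exists when it is measured."""
    for key in client.scan_iter():
        size = client.memory_usage(key)
        if size is not None:  # None means the key disappeared during the scan
            yield key, size
```

Calling summarize_by_prefix(scan_key_sizes(redis_client)) on a client like the one in the script would then give one entry per prefix, which makes it easier to spot the namespaces that take up the most memory.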

edited at: 07-28-2021