optimization/solution for problem that django pagination extraction speed is too slow

created at 07-17-2021 views: 1

Introduction

Fast paging can be achieved in django, but when the data reaches several million, it must be optimized to be effective and reasonable paging, otherwise the data extraction speed for each paging is too slow.

Django pagination code

User.objects.all()[:3]

it is similar to MySQL:

select * from User limit 3,10000000,

start to fetch data from item 3, take 10000000 (10000000 is greater than the number of data in the table)

It can be known from the above code that Django paging actually uses MySQL's limit statement. When the number of such statements is small, the calculation is very fast, but the calculation speed increases linearly if the number exceeds a certain number.

Limit 100000,20 means to scan 100020 rows that meet the conditions, instead of the previous 100000 rows, return the last 20 rows of data, then the problem appears here.

If the limit is 1000000,30, 1 million rows of data need to be scanned. So it will block here when paging data.

summary

The query time of the limit statement is proportional to the position of the starting record

The limit statement of mysql is very convenient, but it is not suitable for direct use for tables with a lot of records.

Optimization

Use the covering index of the table to speed up paging queries. Everyone knows that if an index column (covering index) is included in the query, the query speed will be very fast.

Such as

# takes 34s
SELECT * FROM `format_demo` limit 1000000,20; 

# It takes 5 seconds and the speed is increased by 7 times
SELECT id FROM `format_demo` limit 1000000,20; 
Please log in to leave a comment.