Why should we use count() instead of len() for a QuerySet in Django

created at 07-17-2021

Introduction

In Django, suppose I want to iterate over a QuerySet and print its results. What is the best way to count the objects: len(qs) or qs.count()? Which of the two is faster?

The choice between len() and count() depends on the situation. This article explains in depth how to use len() and count() correctly:

condition 1

(Core) If you only want to know the number of elements and do not plan to process the QuerySet in any other way, count() is the first choice.

  1. queryset.count() runs a SELECT COUNT(*) FROM some_table query. The counting is done by the RDBMS, and Python only receives a single number, so the cost on the Python side is O(1).
  2. len(queryset) runs a SELECT * FROM some_table query, fetches every row in O(N) time, and needs O(N) memory to hold the results, which is wasteful when only the count is needed.
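The difference between the two queries can be sketched without Django, using Python's built-in sqlite3 module. This is only an illustration under assumptions: the table name some_table comes from the text above, while the columns and the 1000 sample rows are made up for the example.

```python
import sqlite3

# Hypothetical in-memory table standing in for a Django model's table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO some_table (name) VALUES (?)",
    [(f"row {i}",) for i in range(1000)],
)

# Roughly what count() asks for: the database counts, one row comes back.
count_rows = conn.execute("SELECT COUNT(*) FROM some_table").fetchall()
count_via_sql = count_rows[0][0]

# Roughly what len() on an unevaluated QuerySet implies: every row is fetched.
all_rows = conn.execute("SELECT * FROM some_table").fetchall()
count_via_len = len(all_rows)

# Same answer, but very different amounts of data sent to Python.
print(count_via_sql, count_via_len)    # 1000 1000
print(len(count_rows), len(all_rows))  # 1 1000
```

Both approaches report the same number, but the first transfers a single row while the second transfers the whole table.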

condition 2

When you also need the contents of the QuerySet, it is best to use len(). It fetches and caches the rows you are going to use anyway, whereas count() would add a separate database query.

# Fetches all rows. No extra cost, because the for loop would fetch them anyway
len(queryset)

for obj in queryset:  # len() already fetched the data, so this uses the cache
    pass

With count() instead:

queryset.count()  # Runs an additional database query; len() would not

for obj in queryset:  # Fetches the data
    pass

condition 3

When the QuerySet has already been evaluated (a variation of the second case), len() and count() both read from the cache:

for obj in queryset:  # Iterating fetches the data and fills the cache
    len(queryset)  # Uses cached data, O(1), no additional overhead
    queryset.count()  # Uses the cache, O(1), no additional database query

len(queryset)  # Also O(1)
queryset.count()  # Also O(1), no query

QuerySet source code:

class QuerySet(object):

    def __init__(self, model=None, query=None, using=None, hints=None):
        # (...)
        self._result_cache = None

    def __len__(self):
        self._fetch_all()
        return len(self._result_cache)

    def _fetch_all(self):
        if self._result_cache is None:
            self._result_cache = list(self.iterator())
        if self._prefetch_related_lookups and not self._prefetch_done:
            self._prefetch_related_objects()

    def count(self):
        if self._result_cache is not None:
            return len(self._result_cache)

        return self.query.get_count(using=self.db)
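
To make the caching logic above easier to follow, here is a minimal toy model of it. ToyQuerySet, its queries counter, and the queries_used helper are hypothetical names invented for this sketch. It is not Django code, and it only imitates the _result_cache behaviour shown in the excerpt, counting how many simulated database hits each ordering of calls would cause.

```python
class ToyQuerySet:
    """Toy stand-in for a QuerySet (hypothetical, not Django code)."""

    def __init__(self, rows):
        self._rows = rows          # pretend this is the database table
        self._result_cache = None  # filled on first full fetch
        self.queries = 0           # number of simulated database hits

    def _fetch_all(self):
        if self._result_cache is None:
            self.queries += 1      # simulated SELECT *
            self._result_cache = list(self._rows)

    def __len__(self):
        self._fetch_all()
        return len(self._result_cache)

    def __iter__(self):
        self._fetch_all()
        return iter(self._result_cache)

    def count(self):
        if self._result_cache is not None:
            return len(self._result_cache)  # cache hit, no query
        self.queries += 1          # simulated SELECT COUNT(*)
        return len(self._rows)


def queries_used(steps):
    """Run each step on a fresh ToyQuerySet and report the simulated hits."""
    qs = ToyQuerySet(range(5))
    for step in steps:
        step(qs)
    return qs.queries


# list(qs) stands in for "for obj in qs: pass".
len_then_loop = queries_used([len, list])
count_then_loop = queries_used([ToyQuerySet.count, list])
loop_then_count = queries_used([list, ToyQuerySet.count, len])

print(len_then_loop, count_then_loop, loop_then_count)  # 1 2 1
```

In this model, calling count() before iterating costs two hits, while calling len() first, or calling either method after iterating, costs only one.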