Fixed #10045 -- Corrected docs about .annotate()/.filter() ordering. (91a431f4) · Commits · Dom Sekotill / django

docs/topics/db/aggregation.txt

+63 −23

		@@ -184,6 +184,8 @@ of the ``annotate()`` clause is a ``QuerySet``; this ``QuerySet`` can be
		modified using any other ``QuerySet`` operation, including ``filter()``,
		``order_by()``, or even additional calls to ``annotate()``.

		.. _combining-multiple-aggregations:

		Combining multiple aggregations
		-------------------------------

		@@ -340,29 +342,67 @@ Order of ``annotate()`` and ``filter()`` clauses
		~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

		When developing a complex query that involves both ``annotate()`` and
		``filter()`` clauses, particular attention should be paid to the order
		in which the clauses are applied to the ``QuerySet``.

		When an ``annotate()`` clause is applied to a query, the annotation is
		computed over the state of the query up to the point where the annotation
		is requested. The practical implication of this is that ``filter()`` and
		``annotate()`` are not commutative operations -- that is, there is a
		difference between the query::

		>>> Publisher.objects.annotate(num_books=Count('book')).filter(book__rating__gt=3.0)

		and the query::

		>>> Publisher.objects.filter(book__rating__gt=3.0).annotate(num_books=Count('book'))

		Both queries will return a list of publishers that have at least one good
		book (i.e., a book with a rating exceeding 3.0). However, the annotation in
		the first query will provide the total number of all books published by the
		publisher; the second query will only include good books in the annotated
		count. In the first query, the annotation precedes the filter, so the
		filter has no effect on the annotation. In the second query, the filter
		precedes the annotation, and as a result, the filter constrains the objects
		considered when calculating the annotation.
		``filter()`` clauses, pay particular attention to the order in which the
		clauses are applied to the ``QuerySet``.

		When an ``annotate()`` clause is applied to a query, the annotation is computed
		over the state of the query up to the point where the annotation is requested.
		The practical implication of this is that ``filter()`` and ``annotate()`` are
		not commutative operations.

		Given:

		* Publisher A has two books with ratings 4 and 5.
		* Publisher B has two books with ratings 1 and 4.
		* Publisher C has one book with rating 1.

		Here's an example with the ``Count`` aggregate::

		>>> a, b = Publisher.objects.annotate(num_books=Count('book', distinct=True)).filter(book__rating__gt=3.0)
		>>> a, a.num_books
		(<Publisher: A>, 2)
		>>> b, b.num_books
		(<Publisher: B>, 2)

		>>> a, b = Publisher.objects.filter(book__rating__gt=3.0).annotate(num_books=Count('book'))
		>>> a, a.num_books
		(<Publisher: A>, 2)
		>>> b, b.num_books
		(<Publisher: B>, 1)

		Both queries return a list of publishers that have at least one book with a
		rating exceeding 3.0, hence publisher C is excluded.

		In the first query, the annotation precedes the filter, so the filter has no
		effect on the annotation. ``distinct=True`` is required to avoid a
		:ref:`cross-join bug <combining-multiple-aggregations>`.

		The second query counts the number of books that have a rating exceeding 3.0
		for each publisher. The filter precedes the annotation, so the filter
		constrains the objects considered when calculating the annotation.
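
As a rough illustration of why the two ``Count`` results differ, the sketch below runs hand-written SQL through Python's ``sqlite3`` module. The table layout, the aliases ``b`` and ``f``, and the SQL itself are assumptions made for this example; they approximate the shape of the queries and are not output captured from the ORM. The sketch assumes that a filter applied after the annotation is evaluated against its own join on the book table, which is what makes ``distinct=True`` necessary.

```python
import sqlite3

# Illustrative schema and data only; the names are assumptions, not Django's tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE publisher (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        publisher_id INTEGER REFERENCES publisher(id),
        rating REAL
    );
    INSERT INTO publisher VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO book (publisher_id, rating) VALUES
        (1, 4), (1, 5), (2, 1), (2, 4), (3, 1);
""")


def count_annotate_then_filter(distinct=True):
    # The count uses join "b"; the filter uses a second join "f".
    # Each counted book is repeated once per matching row in "f".
    sql = """
        SELECT p.name, COUNT({distinct} b.id)
        FROM publisher p
        LEFT OUTER JOIN book b ON b.publisher_id = p.id
        INNER JOIN book f ON f.publisher_id = p.id
        WHERE f.rating > 3.0
        GROUP BY p.id, p.name
        ORDER BY p.name
    """.format(distinct="DISTINCT" if distinct else "")
    return conn.execute(sql).fetchall()


def count_filter_then_annotate():
    # A single join: the filter removes low-rated books before counting.
    sql = """
        SELECT p.name, COUNT(b.id)
        FROM publisher p
        INNER JOIN book b ON b.publisher_id = p.id
        WHERE b.rating > 3.0
        GROUP BY p.id, p.name
        ORDER BY p.name
    """
    return conn.execute(sql).fetchall()


print(count_annotate_then_filter(distinct=True))
print(count_annotate_then_filter(distinct=False))
print(count_filter_then_annotate())
```

With the data listed above, the first call reports two books for both A and B, the second call over-counts publisher A because each of its books is paired with both of its matching rows in ``f``, and the third call counts only the books rated above 3.0.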

		Here's another example with the ``Avg`` aggregate::

		>>> a, b = Publisher.objects.annotate(avg_rating=Avg('book__rating')).filter(book__rating__gt=3.0)
		>>> a, a.avg_rating
		(<Publisher: A>, 4.5) # (5+4)/2
		>>> b, b.avg_rating
		(<Publisher: B>, 2.5) # (1+4)/2

		>>> a, b = Publisher.objects.filter(book__rating__gt=3.0).annotate(avg_rating=Avg('book__rating'))
		>>> a, a.avg_rating
		(<Publisher: A>, 4.5) # (5+4)/2
		>>> b, b.avg_rating
		(<Publisher: B>, 4.0) # 4/1 (book with rating 1 excluded)

		The first query asks for the average rating of all of a publisher's books, for
		publishers that have at least one book with a rating exceeding 3.0. The second
		query asks for the average of a publisher's books' ratings, counting only those
		ratings exceeding 3.0.
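
The same contrast can be sketched for ``Avg`` with hand-written SQL run through Python's ``sqlite3`` module. As before, the schema, the aliases ``b`` and ``f``, and the SQL are assumptions chosen for illustration and only approximate what the ORM might generate; the sketch assumes the filter applied after the annotation uses a separate join on the book table.

```python
import sqlite3

# Illustrative schema and data only; the names are assumptions, not Django's tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE publisher (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        publisher_id INTEGER REFERENCES publisher(id),
        rating REAL
    );
    INSERT INTO publisher VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO book (publisher_id, rating) VALUES
        (1, 4), (1, 5), (2, 1), (2, 4), (3, 1);
""")


def avg_annotate_then_filter():
    # The average is taken over join "b" (all books); join "f" only
    # decides which publishers appear in the result.
    sql = """
        SELECT p.name, AVG(b.rating)
        FROM publisher p
        LEFT OUTER JOIN book b ON b.publisher_id = p.id
        INNER JOIN book f ON f.publisher_id = p.id
        WHERE f.rating > 3.0
        GROUP BY p.id, p.name
        ORDER BY p.name
    """
    return conn.execute(sql).fetchall()


def avg_filter_then_annotate():
    # A single join: low-rated books are removed before averaging.
    sql = """
        SELECT p.name, AVG(b.rating)
        FROM publisher p
        INNER JOIN book b ON b.publisher_id = p.id
        WHERE b.rating > 3.0
        GROUP BY p.id, p.name
        ORDER BY p.name
    """
    return conn.execute(sql).fetchall()


print(avg_annotate_then_filter())
print(avg_filter_then_annotate())
```

In this sketch the extra join repeats publisher A's rows evenly, so its average is unaffected. That is not guaranteed in general: when a publisher has more than one matching book and they are not the whole set, the repetition can still matter for aggregates such as ``Sum``.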

		It's difficult to intuit how the ORM will translate complex querysets into SQL
		queries, so when in doubt, inspect the SQL with ``str(queryset.query)`` and
		write plenty of tests.

		``order_by()``
		--------------