164

Django's QuerySet has two methods, annotate and aggregate. The documentation says that:

Unlike aggregate(), annotate() is not a terminal clause. The output of the annotate() clause is a QuerySet.

Is there any other difference between them? If not, then why does aggregate exist?

Sнаđошƒаӽ
  • 15,289
  • 12
  • 72
  • 86
Alexander Artemenko
  • 18,838
  • 8
  • 38
  • 36

3 Answers3

257

I would focus on the example queries rather than your quote from the documentation. Aggregate calculates values for the entire queryset. Annotate calculates summary values for each item in the queryset.

Aggregation

>>> Book.objects.aggregate(average_price=Avg('price'))
{'average_price': 34.35}

Returns a dictionary containing the average price of all books in the queryset.

Annotation

>>> q = Book.objects.annotate(num_authors=Count('authors'))
>>> q[0].num_authors
2
>>> q[1].num_authors
1

q is the queryset of books, but each book has been annotated with the number of authors.

Alasdair
  • 278,338
  • 51
  • 534
  • 489
  • Am I correct that `.annotate()` on a qs alone does not hit the db, but calling `q[0].num_authors` does? I assume `aggregate` must always hit the db as it is a terminal clause? – alias51 Jul 30 '20 at 23:05
  • 1
    @alias51 that's really related to the original question, so I don't think the comments on an eight-year-old question is the best place to ask. If you want to check when the queries run, then you can [check `connection.queries`](https://stackoverflow.com/questions/1074212/how-can-i-see-the-raw-sql-queries-django-is-running). Hint: check whether it's the `book = q[0]` or ` book.num_authors` that causes the query. – Alasdair Jul 31 '20 at 12:27
33

Aggregate Aggregate generate result (summary) values over an entire QuerySet. Aggregate operate over the rowset to get a single value from the rowset.(For example sum of all prices in the rowset). Aggregate is applied on entire QuerySet and it generate result (summary) values over an entire QuerySet.

In Model:

class Books(models.Model):
    name = models.CharField(max_length=100)
    pages = models.IntegerField()
    price = models.DecimalField(max_digits=5, decimal_places=3)

In Shell:

>>> Books.objects.all().aggregate(Avg('price'))
# Above code will give the Average of the price Column 
>>> {'price__avg': 34.35}

Annotate Annotate generate an independent summary for each object in a QuerySet.(We can say it iterate each object in a QuerySet and apply operation)

In Model:

class Video(models.Model):
    name = models.CharField(max_length=52, verbose_name='Name')
    video = models.FileField(upload_to=document_path, verbose_name='Upload 
               video')
    created_by = models.ForeignKey(User, verbose_name='Created by', 
                       related_name="create_%(class)s")
    user_likes = models.ManyToManyField(UserProfile, null=True, 
                  blank=True, help_text='User can like once', 
                         verbose_name='Like by')

In View:

videos = Video.objects.values('id', 'name','video').annotate(Count('user_likes',distinct=True)

In view it will count the likes for each video

Mike Covington
  • 2,067
  • 1
  • 16
  • 25
Vinay Kumar
  • 999
  • 11
  • 15
  • 3
    why `distinct=True` is required in last example? – Yuriy Leonov Apr 08 '20 at 13:34
  • @YuriyLeonov distinct=True used for that the operation perform on distinct value. It's not related to the current question asked. Sorry for that Actually I have used on my code. – Vinay Kumar Apr 08 '20 at 13:48
23

That's the main difference, but aggregates also work on a grander scale than annotations. Annotations are inherently related to individual items in a queryset. If you run an Count annotation on a something like a many-to-many field, you'll get a separate count for each member of the queryset (as an added attribute). If you were to do the same with an aggregation, however, it would attempt to count every relationship on every member of the queryset, even duplicates, and return that as just one value.

Chris Pratt
  • 221,046
  • 31
  • 356
  • 415
  • Am I correct that `.annotate()` on a qs alone does not hit the db, but calling the result of an annotation such as `q[0].num_authors` does? I assume `aggregate` must always hit the db as it is a terminal clause? – alias51 Jul 30 '20 at 23:07