Commit e4cf8c84 authored by Andriy Sokolovskiy, committed by Tim Graham

Fixed #24301 -- Added PostgreSQL-specific aggregate functions

parent 931a340f
+1 −1
@@ -49,7 +49,7 @@ answer newbie questions, and generally made Django that much better:
    Andrew Godwin <andrew@aeracode.org>
    Andrew Pinkham <http://AndrewsForge.com>
    Andrews Medina <andrewsmedina@gmail.com>
-    Andriy Sokolovskiy <sokandpal@yandex.ru>
+    Andriy Sokolovskiy <me@asokolovskiy.com>
    Andy Dustman <farcepest@gmail.com>
    Andy Gayton <andy-django@thecablelounge.com>
    andy@jadedplanet.net
+2 −0
from .general import *  # NOQA
from .statistics import *  # NOQA
+43 −0
from django.db.models.aggregates import Aggregate

__all__ = [
    'ArrayAgg', 'BitAnd', 'BitOr', 'BoolAnd', 'BoolOr', 'StringAgg',
]


class ArrayAgg(Aggregate):
    function = 'ARRAY_AGG'

    def convert_value(self, value, expression, connection, context):
        if not value:
            return []
        return value


class BitAnd(Aggregate):
    function = 'BIT_AND'


class BitOr(Aggregate):
    function = 'BIT_OR'


class BoolAnd(Aggregate):
    function = 'BOOL_AND'


class BoolOr(Aggregate):
    function = 'BOOL_OR'


class StringAgg(Aggregate):
    function = 'STRING_AGG'
    template = "%(function)s(%(expressions)s, '%(delimiter)s')"

    def __init__(self, expression, delimiter, **extra):
        super(StringAgg, self).__init__(expression, delimiter=delimiter, **extra)

    def convert_value(self, value, expression, connection, context):
        if not value:
            return ''
        return value
+80 −0
from django.db.models import FloatField, IntegerField
from django.db.models.aggregates import Aggregate

__all__ = [
    'CovarPop', 'Corr', 'RegrAvgX', 'RegrAvgY', 'RegrCount', 'RegrIntercept',
    'RegrR2', 'RegrSlope', 'RegrSXX', 'RegrSXY', 'RegrSYY', 'StatAggregate',
]


class StatAggregate(Aggregate):
    def __init__(self, y, x, output_field=FloatField()):
        if not x or not y:
            raise ValueError('Both y and x must be provided.')
        super(StatAggregate, self).__init__(y=y, x=x, output_field=output_field)
        self.x = x
        self.y = y
        self.source_expressions = self._parse_expressions(self.y, self.x)

    def get_source_expressions(self):
        return self.y, self.x

    def set_source_expressions(self, exprs):
        self.y, self.x = exprs

    def resolve_expression(self, query=None, allow_joins=True, reuse=None, summarize=False, for_save=False):
        return super(Aggregate, self).resolve_expression(query, allow_joins, reuse, summarize)


class Corr(StatAggregate):
    function = 'CORR'


class CovarPop(StatAggregate):
    def __init__(self, y, x, sample=False):
        self.function = 'COVAR_SAMP' if sample else 'COVAR_POP'
        super(CovarPop, self).__init__(y, x)


class RegrAvgX(StatAggregate):
    function = 'REGR_AVGX'


class RegrAvgY(StatAggregate):
    function = 'REGR_AVGY'


class RegrCount(StatAggregate):
    function = 'REGR_COUNT'

    def __init__(self, y, x):
        super(RegrCount, self).__init__(y=y, x=x, output_field=IntegerField())

    def convert_value(self, value, expression, connection, context):
        if value is None:
            return 0
        return int(value)


class RegrIntercept(StatAggregate):
    function = 'REGR_INTERCEPT'


class RegrR2(StatAggregate):
    function = 'REGR_R2'


class RegrSlope(StatAggregate):
    function = 'REGR_SLOPE'


class RegrSXX(StatAggregate):
    function = 'REGR_SXX'


class RegrSXY(StatAggregate):
    function = 'REGR_SXY'


class RegrSYY(StatAggregate):
    function = 'REGR_SYY'
+212 −0
=========================================
PostgreSQL specific aggregation functions
=========================================

.. module:: django.contrib.postgres.aggregates
   :synopsis: PostgreSQL specific aggregation functions

.. versionadded:: 1.9

These functions are described in more detail in the `PostgreSQL docs
<http://www.postgresql.org/docs/current/static/functions-aggregate.html>`_.

.. note::

    All functions come without default aliases, so you must explicitly provide
    one. For example::

        >>> SomeModel.objects.aggregate(arr=ArrayAgg('somefield'))
        {'arr': [0, 1, 2]}

General-purpose aggregation functions
-------------------------------------

ArrayAgg
~~~~~~~~

.. class:: ArrayAgg(expression, **extra)

    Returns a list of values, including nulls, concatenated into an array.
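
As a rough illustration of the ``convert_value`` hook shown in the diff above,
the following plain-Python sketch mirrors how a null result (which PostgreSQL's
``ARRAY_AGG`` produces when there are no input rows) is turned into an empty
list. The function name ``convert_array_agg`` is hypothetical and is not part
of Django.

```python
def convert_array_agg(value):
    """Mirror of ArrayAgg.convert_value: falsy database results become []."""
    if not value:
        return []
    return value


# No rows matched, so the database returned NULL (None in Python).
empty_result = convert_array_agg(None)

# Nulls inside a non-empty array are preserved as None entries.
mixed_result = convert_array_agg([13, None, 13])
```

In other words, callers should always receive a list, although individual
entries may still be ``None``.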

BitAnd
~~~~~~

.. class:: BitAnd(expression, **extra)

    Returns an ``int`` of the bitwise ``AND`` of all non-null input values, or
    ``None`` if all values are null.

BitOr
~~~~~

.. class:: BitOr(expression, **extra)

    Returns an ``int`` of the bitwise ``OR`` of all non-null input values, or
    ``None`` if all values are null.
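
The null handling described for ``BitAnd`` and ``BitOr`` can be sketched in
plain Python. This is only an approximation of the documented semantics, not
the database implementation; the helper names ``bit_and`` and ``bit_or`` are
hypothetical.

```python
from functools import reduce
import operator


def _non_null(values):
    """Drop None entries, which stand in for SQL NULL."""
    return [v for v in values if v is not None]


def bit_and(values):
    """Bitwise AND of all non-null values, or None if nothing is left."""
    kept = _non_null(values)
    return reduce(operator.and_, kept) if kept else None


def bit_or(values):
    """Bitwise OR of all non-null values, or None if nothing is left."""
    kept = _non_null(values)
    return reduce(operator.or_, kept) if kept else None


and_result = bit_and([6, 3, None])  # 0b110 & 0b011
or_result = bit_or([1, 2, None])    # 0b01 | 0b10
```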

BoolAnd
~~~~~~~

.. class:: BoolAnd(expression, **extra)

    Returns ``True`` if all input values are true, ``None`` if all values are
    null or if there are no values, otherwise ``False``.

BoolOr
~~~~~~

.. class:: BoolOr(expression, **extra)

    Returns ``True`` if at least one input value is true, ``None`` if all
    values are null or if there are no values, otherwise ``False``.
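
The three-way result (``True``, ``False`` or ``None``) documented for
``BoolAnd`` and ``BoolOr`` can be sketched in plain Python as follows. The
helpers ``bool_and`` and ``bool_or`` are hypothetical names that only model the
behavior described above.

```python
def bool_and(values):
    """True if every non-null value is true; None if there are no non-null values."""
    kept = [v for v in values if v is not None]
    if not kept:
        return None
    return all(kept)


def bool_or(values):
    """True if any non-null value is true; None if there are no non-null values."""
    kept = [v for v in values if v is not None]
    if not kept:
        return None
    return any(kept)
```

For instance, ``bool_and([True, None, True])`` gives ``True`` in this sketch,
while ``bool_or([None])`` and ``bool_or([])`` both give ``None``.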

StringAgg
~~~~~~~~~

.. class:: StringAgg(expression, delimiter)

    Returns the input values concatenated into a string, separated by
    the ``delimiter`` string.

    .. attribute:: delimiter

        Required argument. Needs to be a string.
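
To make the generated SQL more concrete, this sketch expands the ``template``
string from the ``StringAgg`` class in the diff above using ordinary ``%``
formatting. The ``render_string_agg`` helper and the ``"field1"`` column
fragment are assumptions made for illustration; the SQL that Django actually
compiles for the expression will typically include a table-qualified column.

```python
# Template copied from the StringAgg class in this commit.
TEMPLATE = "%(function)s(%(expressions)s, '%(delimiter)s')"


def render_string_agg(column_sql, delimiter):
    """Fill the template the way a %-formatted SQL template is expanded."""
    return TEMPLATE % {
        'function': 'STRING_AGG',
        'expressions': column_sql,  # assumed, already-compiled column SQL
        'delimiter': delimiter,
    }


sql = render_string_agg('"field1"', ';')
print(sql)
```

Note that the delimiter ends up inside single quotes in the SQL text, which is
why it needs to be a plain string.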

Aggregate functions for statistics
----------------------------------

``y`` and ``x``
~~~~~~~~~~~~~~~

The arguments ``y`` and ``x`` for all these functions can be the name of a
field or an expression returning numeric data. Both are required.

Corr
~~~~

.. class:: Corr(y, x)

    Returns the correlation coefficient as a ``float``, or ``None`` if there
    aren't any matching rows.

CovarPop
~~~~~~~~

.. class:: CovarPop(y, x, sample=False)

    Returns the population covariance as a ``float``, or ``None`` if there
    aren't any matching rows.

    Has one optional argument:

    .. attribute:: sample

        By default ``CovarPop`` returns the population covariance. However,
        if ``sample=True``, the return value will be the sample covariance.
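
The difference between the two modes comes down to the divisor. The following
plain-Python sketch (Python 3 division assumed) models it; the ``covariance``
helper is a hypothetical stand-in, not the database implementation, and it
treats rows with a null in either column as skipped.

```python
def covariance(pairs, sample=False):
    """Covariance of (y, x) pairs; divides by N, or by N - 1 when sample=True."""
    rows = [(y, x) for y, x in pairs if y is not None and x is not None]
    n = len(rows)
    divisor = n - 1 if sample else n
    if divisor <= 0:
        # Not enough rows to compute a value.
        return None
    mean_y = sum(y for y, _ in rows) / n
    mean_x = sum(x for _, x in rows) / n
    total = sum((y - mean_y) * (x - mean_x) for y, x in rows)
    return total / divisor


pairs = [(2, 1), (4, 2), (6, 3)]
population = covariance(pairs)           # total of 4.0 divided by 3
sample = covariance(pairs, sample=True)  # total of 4.0 divided by 2
```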

RegrAvgX
~~~~~~~~

.. class:: RegrAvgX(y, x)

    Returns the average of the independent variable (``sum(x)/N``) as a
    ``float``, or ``None`` if there aren't any matching rows.

RegrAvgY
~~~~~~~~

.. class:: RegrAvgY(y, x)

    Returns the average of the dependent variable (``sum(y)/N``) as a
    ``float``, or ``None`` if there aren't any matching rows.

RegrCount
~~~~~~~~~

.. class:: RegrCount(y, x)

    Returns an ``int`` of the number of input rows in which both expressions
    are not null.

RegrIntercept
~~~~~~~~~~~~~

.. class:: RegrIntercept(y, x)

    Returns the y-intercept of the least-squares-fit linear equation determined
    by the ``(x, y)`` pairs as a ``float``, or ``None`` if there aren't any
    matching rows.

RegrR2
~~~~~~

.. class:: RegrR2(y, x)

    Returns the square of the correlation coefficient as a ``float``, or
    ``None`` if there aren't any matching rows.

RegrSlope
~~~~~~~~~

.. class:: RegrSlope(y, x)

    Returns the slope of the least-squares-fit linear equation determined
    by the ``(x, y)`` pairs as a ``float``, or ``None`` if there aren't any
    matching rows.

RegrSXX
~~~~~~~

.. class:: RegrSXX(y, x)

    Returns ``sum(x^2) - sum(x)^2/N`` ("sum of squares" of the independent
    variable) as a ``float``, or ``None`` if there aren't any matching rows.

RegrSXY
~~~~~~~

.. class:: RegrSXY(y, x)

    Returns ``sum(x*y) - sum(x) * sum(y)/N`` ("sum of products" of independent
    times dependent variable) as a ``float``, or ``None`` if there aren't any
    matching rows.

RegrSYY
~~~~~~~

.. class:: RegrSYY(y, x)

    Returns ``sum(y^2) - sum(y)^2/N`` ("sum of squares" of the dependent
    variable) as a ``float``, or ``None`` if there aren't any matching rows.
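
The formulas quoted for the ``Regr*`` functions can be checked by hand with a
small plain-Python sketch (Python 3 division assumed). The
``regression_stats`` helper below is hypothetical; it follows the documented
expressions, skips rows where either value is null, and, as a simplification,
reports slope, intercept and R2 as ``None`` whenever ``sxx`` or ``syy`` is
zero.

```python
def regression_stats(pairs):
    """Compute REGR_*-style quantities for (y, x) pairs, skipping nulls."""
    rows = [(y, x) for y, x in pairs if y is not None and x is not None]
    n = len(rows)
    if n == 0:
        return None
    sum_x = sum(x for _, x in rows)
    sum_y = sum(y for y, _ in rows)
    stats = {
        'count': n,
        'avgx': sum_x / n,
        'avgy': sum_y / n,
        'sxx': sum(x * x for _, x in rows) - sum_x ** 2 / n,
        'syy': sum(y * y for y, _ in rows) - sum_y ** 2 / n,
        'sxy': sum(x * y for y, x in rows) - sum_x * sum_y / n,
        'slope': None,
        'intercept': None,
        'r2': None,
    }
    if stats['sxx'] != 0 and stats['syy'] != 0:
        stats['slope'] = stats['sxy'] / stats['sxx']
        stats['intercept'] = stats['avgy'] - stats['slope'] * stats['avgx']
        stats['r2'] = stats['sxy'] ** 2 / (stats['sxx'] * stats['syy'])
    return stats


# The last pair has a null y and is ignored, leaving three usable rows.
result = regression_stats([(2, 1), (3, 2), (7, 3), (None, 4)])
```

With these inputs the sketch yields ``sxx = 2.0``, ``sxy = 5.0`` and
``syy = 14.0``, so the fitted line has slope ``2.5`` and intercept ``-1.0``.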

Usage examples
--------------

We will use this example table::

    | FIELD1 | FIELD2 | FIELD3 |
    |--------|--------|--------|
    |    foo |      1 |     13 |
    |    bar |      2 | (null) |
    |   test |      3 |     13 |


Here are some examples of the general-purpose aggregation functions::

    >>> TestModel.objects.aggregate(result=StringAgg('field1', delimiter=';'))
    {'result': 'foo;bar;test'}
    >>> TestModel.objects.aggregate(result=ArrayAgg('field2'))
    {'result': [1, 2, 3]}
    >>> TestModel.objects.aggregate(result=ArrayAgg('field1'))
    {'result': ['foo', 'bar', 'test']}

The next example shows the usage of statistical aggregate functions. The
underlying math is not described here (you can read about it, for example, on
`Wikipedia <http://en.wikipedia.org/wiki/Regression_analysis>`_)::

    >>> TestModel.objects.aggregate(count=RegrCount(y='field3', x='field2'))
    {'count': 2}
    >>> TestModel.objects.aggregate(avgx=RegrAvgX(y='field3', x='field2'),
    ...                             avgy=RegrAvgY(y='field3', x='field2'))
    {'avgx': 2, 'avgy': 13}
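
To see where the numbers above come from without a database, the example table
can be modelled as a list of dictionaries. This is only a plain-Python sketch
(Python 3 division assumed); the ``ROWS`` variable is hypothetical and the
arithmetic imitates the documented behavior rather than calling PostgreSQL.

```python
# The example table, with None standing in for (null).
ROWS = [
    {'field1': 'foo', 'field2': 1, 'field3': 13},
    {'field1': 'bar', 'field2': 2, 'field3': None},
    {'field1': 'test', 'field2': 3, 'field3': 13},
]

# StringAgg('field1', delimiter=';') joins the values in row order.
joined = ';'.join(row['field1'] for row in ROWS)

# The statistical functions only consider rows where both y and x are set,
# so the 'bar' row is dropped here.
pairs = [
    (row['field3'], row['field2'])  # (y, x)
    for row in ROWS
    if row['field3'] is not None and row['field2'] is not None
]
count = len(pairs)
avgx = sum(x for _, x in pairs) / count
avgy = sum(y for y, _ in pairs) / count
```

Only two rows survive the null check, which is why ``RegrCount`` reports ``2``
and the averages are taken over ``field2`` values ``1`` and ``3``.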