Django & Postgres - percentile (median) and group by

Problem description

I need to calculate period medians per seller ID (see the simplified model below). The problem is that I am unable to construct the ORM query.

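Model

from django.contrib.postgres.fields import ArrayField, JSONField
from django.db import models


class MyModel(models.Model):
    period = models.IntegerField(null=True, default=None)
    seller_ids = ArrayField(models.IntegerField(), default=list)
    aux = JSONField(default=dict)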

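Query

from django.contrib.postgres.fields.jsonb import KeyTextTransform
from django.db.models import F, Func, IntegerField
from django.db.models.functions import Cast

queryset = (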
    MyModel.objects.filter(period=25)
    .annotate(seller_id=Func(F("seller_ids"), function="unnest"))
    .values("seller_id")
    .annotate(
        duration=Cast(KeyTextTransform("duration", "aux"), IntegerField()),
        median=Func(
            F("duration"),
            function="percentile_cont",
            template="%(function)s(0.5) WITHIN GROUP (ORDER BY %(expressions)s)",
        ),
    )
    .values("median", "seller_id")
)

ArrayField aggregation (seller_id) source

I think what I need to do is something along the lines of the SQL below:

select t.*, p_25, p_75
from t join
     (select district,
             percentile_cont(0.25) within group (order by sales) as p_25,
             percentile_cont(0.75) within group (order by sales) as p_75
      from t
      group by district
     ) td
     on t.district = td.district

Source of the example above
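
To make the intent of that SQL concrete, here is a minimal plain-Python sketch of the same shape: compute per-group percentiles, then join them back onto every row. The sample rows and the helper percentile_cont are hypothetical stand-ins, and the interpolation rule is an assumption based on how Postgres describes percentile_cont (linear interpolation between adjacent ordered values).

```python
import math
from collections import defaultdict


def percentile_cont(values, fraction):
    """Linearly interpolated percentile over the sorted values."""
    ordered = sorted(values)
    if not ordered:
        return None  # no input rows: mirror SQL NULL
    position = fraction * (len(ordered) - 1)  # zero-based fractional index
    lower = math.floor(position)
    upper = math.ceil(position)
    weight = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


# Hypothetical rows standing in for table t.
rows = [
    {"district": "north", "sales": 10},
    {"district": "north", "sales": 20},
    {"district": "north", "sales": 30},
    {"district": "north", "sales": 40},
    {"district": "south", "sales": 5},
    {"district": "south", "sales": 15},
]

# Subquery td: GROUP BY district, one percentile pair per group.
sales_by_district = defaultdict(list)
for row in rows:
    sales_by_district[row["district"]].append(row["sales"])

td = {
    district: {
        "p_25": percentile_cont(sales, 0.25),
        "p_75": percentile_cont(sales, 0.75),
    }
    for district, sales in sales_by_district.items()
}

# JOIN ... ON t.district = td.district: attach the group values to each row.
joined = [{**row, **td[row["district"]]} for row in rows]
```

Each entry in joined keeps the original row and gains the p_25 and p_75 of its district, which is the result the join in the SQL is after.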


Python 3.7.5, Django 2.2.8, Postgres 11.1

Recommended answer

You can create a Median subclass of the Aggregate class, as Ryan Murphy did (https://gist.github.com/rdmurphy/3f73c7b1826cacee34f6c2a855b12e2e). Median then works just like Avg:

    from django.db.models import Aggregate, FloatField


    class Median(Aggregate):
        function = 'PERCENTILE_CONT'
        name = 'median'
        output_field = FloatField()
        template = '%(function)s(0.5) WITHIN GROUP (ORDER BY %(expressions)s)'
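
To see roughly what SQL fragment this template yields, it can be filled in by hand with ordinary %-formatting. This is only a sketch: the column reference "duration" is a hypothetical placeholder, and the SQL Django actually emits will carry its own quoting and table qualification.

```python
# Same template string as in the Median class.
template = "%(function)s(0.5) WITHIN GROUP (ORDER BY %(expressions)s)"

# '"duration"' is a hypothetical, already-quoted column reference.
rendered = template % {
    "function": "PERCENTILE_CONT",
    "expressions": '"duration"',
}
print(rendered)
# PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY "duration")
```

The fixed 0.5 in the template is what makes this aggregate a median; a different fraction would give a different percentile.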

Then, to find the median of a field, use:

    my_model_aggregate = MyModel.objects.all().aggregate(Median('period'))

The result is then available as my_model_aggregate['period__median'].
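
The aggregate call covers the whole table, while the question asks for one median per seller. As a reference for checking whatever grouped query you end up with, the expected per-seller values can be computed in plain Python. This is a sketch under assumptions: the records are hypothetical sample data, and it relies on percentile_cont(0.5) agreeing with statistics.median (both average the two middle values when the count is even).

```python
from collections import defaultdict
from statistics import median

# Hypothetical (seller_ids, duration) pairs standing in for MyModel rows.
records = [
    ([1, 2], 10),
    ([1], 30),
    ([2, 3], 20),
    ([2], 60),
]

durations_by_seller = defaultdict(list)
for seller_ids, duration in records:
    for seller_id in seller_ids:  # plays the role of unnest(seller_ids)
        durations_by_seller[seller_id].append(duration)

# One median per seller, i.e. the GROUP BY seller_id result.
medians = {
    seller_id: median(values)
    for seller_id, values in durations_by_seller.items()
}
```

A row with several seller IDs contributes its duration to each of those sellers, which is the effect unnest has in the original query.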
