Blu*_*gma 8 sql django django-models amazon-redshift
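I have a Django model with many fields. I am trying to get, in a single query, the average of a given field together with the average of the top 5 values of that same field (this follows my other question about pure SQL: "Average of top 5 value in a table for a given group by"). Not that it matters, but my database is Redshift.
I found two different ways to do this in SQL, but I am having trouble implementing those queries with the Django ORM.
Here is an example of what I want to do, using Cars: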
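from django.db import models

class Cars(models.Model):
    # max_length is required by most backends; 255 is an arbitrary choice here
    manufacturer = models.CharField(max_length=255)
    model = models.CharField(max_length=255)
    price = models.FloatField()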
Data:
manufacturer | model | price
Citroen      | C1    | 1
Citroen      | C2    | 2
Citroen      | C3    | 3
Citroen      | C4    | 4
Citroen      | C5    | 5
Citroen      | C6    | 6
Ford         | F1    | 7
Ford         | F2    | 8
Ford         | F3    | 9
Ford         | F4    | 10
Ford         | F5    | 11
Ford         | F6    | 12
Ford         | F6    | 19
GenMotor     | G1    | 20
GenMotor     | G3    | 25
GenMotor     | G4    | 22
Expected output:
manufacturer | average_price | average_top_5_price
Citroen      | 3.5           | 4.0
Ford         | 10.86         | 12.2
GenMotor     | 22.33         | 22.33
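As a sanity check on these numbers, here is a minimal pure-Python sketch (no Django or database involved) that groups the sample rows by manufacturer and computes both averages. The function name `summarize` and the rounding to two decimals are assumptions made for illustration. Note that the Ford average is 76/7 ≈ 10.857, which rounds to 10.86.

```python
from collections import defaultdict

# Sample rows from the question: (manufacturer, model, price).
ROWS = [
    ("Citroen", "C1", 1), ("Citroen", "C2", 2), ("Citroen", "C3", 3),
    ("Citroen", "C4", 4), ("Citroen", "C5", 5), ("Citroen", "C6", 6),
    ("Ford", "F1", 7), ("Ford", "F2", 8), ("Ford", "F3", 9),
    ("Ford", "F4", 10), ("Ford", "F5", 11), ("Ford", "F6", 12),
    ("Ford", "F6", 19),
    ("GenMotor", "G1", 20), ("GenMotor", "G3", 25), ("GenMotor", "G4", 22),
]


def summarize(rows, top_n=5):
    """Map each manufacturer to (average price, average of its top_n prices)."""
    prices = defaultdict(list)
    for manufacturer, _model, price in rows:
        prices[manufacturer].append(price)

    summary = {}
    for manufacturer, values in prices.items():
        # Keep the top_n highest prices; groups with fewer rows use them all.
        top = sorted(values, reverse=True)[:top_n]
        summary[manufacturer] = (
            round(sum(values) / len(values), 2),
            round(sum(top) / len(top), 2),
        )
    return summary


for manufacturer, (avg_all, avg_top) in sorted(summarize(ROWS).items()):
    print(manufacturer, avg_all, avg_top)
```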
Here are two pure SQL queries that produce the expected result:
SELECT
    main.manufacturer,
    AVG(main.price) AS average_price,
    AVG(CASE WHEN rank <= 5 THEN main.price END) AS average_top_5_price
FROM (
    SELECT
        manufacturer,
        price,
        ROW_NUMBER() OVER (PARTITION BY manufacturer ORDER BY price DESC) AS rank
    FROM
        cars
) main
GROUP BY
    main.manufacturer;
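For anyone who wants to try the window-function approach without a Redshift cluster, here is a small sketch that runs essentially the same query against an in-memory SQLite database through Python's standard `sqlite3` module. The assumptions: SQLite 3.25 or newer (needed for window functions), the alias `rn` in place of `rank`, an added `ORDER BY` for stable output, and a hypothetical helper named `run_query`. SQLite is only a stand-in here, so Redshift behaviour may differ in details.

```python
import sqlite3

# Sample rows from the question: (manufacturer, model, price).
ROWS = [
    ("Citroen", "C1", 1), ("Citroen", "C2", 2), ("Citroen", "C3", 3),
    ("Citroen", "C4", 4), ("Citroen", "C5", 5), ("Citroen", "C6", 6),
    ("Ford", "F1", 7), ("Ford", "F2", 8), ("Ford", "F3", 9),
    ("Ford", "F4", 10), ("Ford", "F5", 11), ("Ford", "F6", 12),
    ("Ford", "F6", 19),
    ("GenMotor", "G1", 20), ("GenMotor", "G3", 25), ("GenMotor", "G4", 22),
]

QUERY = """
SELECT
    main.manufacturer,
    AVG(main.price) AS average_price,
    AVG(CASE WHEN main.rn <= 5 THEN main.price END) AS average_top_5_price
FROM (
    SELECT
        manufacturer,
        price,
        ROW_NUMBER() OVER (PARTITION BY manufacturer ORDER BY price DESC) AS rn
    FROM cars
) AS main
GROUP BY main.manufacturer
ORDER BY main.manufacturer
"""


def run_query(rows):
    """Load rows into an in-memory table and return the rounded query result."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE cars ("
            "id INTEGER PRIMARY KEY, manufacturer TEXT, model TEXT, price REAL)"
        )
        conn.executemany(
            "INSERT INTO cars (manufacturer, model, price) VALUES (?, ?, ?)", rows
        )
        # Round in Python so the output matches the two-decimal expected table.
        return [(m, round(a, 2), round(t, 2)) for m, a, t in conn.execute(QUERY)]
    finally:
        conn.close()


for row in run_query(ROWS):
    print(row)
```

The `CASE WHEN ... END` expression yields NULL for rows ranked below the top five, and `AVG` ignores NULLs, which is why a single pass over the subquery produces both averages.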
Second approach:
SELECT A.manufacturer, A.avg_price, B.top5_price
FROM (
    SELECT manufacturer, AVG(price) AS avg_price
    FROM cars
    GROUP BY manufacturer
) A
JOIN (
    SELECT manufacturer, AVG(price) AS top5_price
    FROM (
        SELECT manufacturer, price,
               RANK() OVER (PARTITION BY manufacturer ORDER BY price DESC, id) AS rank
        FROM cars
    ) ranked
    WHERE rank <= 5
    GROUP BY manufacturer
) B
ON A.manufacturer = B.manufacturer
ORDER BY A.manufacturer;
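So far I have not managed to implement either of these queries with the Django ORM. For the first query, I cannot find a way to make Django "select from a subquery"; for the second, I cannot find a good way to force Django to "join two subqueries".
PS: Keep in mind that I reduced the table to three fields to simplify this particular problem, but my real table has about 100 columns, and I do various other calculations in the same query.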
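You can use a combination of `values` and `annotate` to group by manufacturer, and then use `Avg` to compute the average for each group.
`average_price` is easy to compute: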
from django.db.models import Avg
from django.db.models.functions import Round

# values() before annotate() turns the query into a GROUP BY manufacturer
averages = Cars.objects.values("manufacturer").annotate(
    average_price=Round(Avg("price"), precision=2)
)
Computing the top five per group is a bit more complicated (I think). For that you need a `Subquery`. So the complete code is:
from django.db.models import Subquery, OuterRef, Avg, Q
from django.db.models.functions import Round

# Correlated subquery: the five highest prices for the outer row's manufacturer
group_top_5 = (
    Cars.objects.filter(manufacturer=OuterRef("manufacturer"))
    .order_by("-price")[:5]
    .values("price")
)
query_filter = Q(price__in=group_top_5)

averages = Cars.objects.values("manufacturer").annotate(
    average_price=Round(Avg("price"), precision=2),
    # Only rows whose price is among the group's top five feed this average
    average_top_5_price=Round(Avg("price", filter=query_filter), precision=2),
)
This should give you:
{'manufacturer': 'Citroen', 'average_price': 3.5, 'average_top_5_price': 4.0}
{'manufacturer': 'Ford', 'average_price': 10.86, 'average_top_5_price': 12.2}
{'manufacturer': 'GenMotor', 'average_price': 22.33, 'average_top_5_price': 22.33}
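One caveat worth keeping in mind: a filter of the form `price__in=<top five prices>` selects rows by price value, not by row identity. If several cars of one manufacturer share the price at the cut-off, more than five rows can end up in the average, whereas the `ROW_NUMBER()` query always keeps exactly five. The following pure-Python sketch (no Django; the function names and the tied price list are made up for illustration) shows the difference.

```python
def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def top_n_by_row(prices, n=5):
    """Average of the n highest-priced rows, as ROW_NUMBER() <= n would select."""
    return average(sorted(prices, reverse=True)[:n])


def top_n_by_value(prices, n=5):
    """Average of every row whose price appears among the n highest prices,
    which mirrors value-based membership in an IN filter."""
    top_values = set(sorted(prices, reverse=True)[:n])
    return average([price for price in prices if price in top_values])


# Hypothetical group with a tie at the cut-off (two cars priced 6).
tied = [10, 9, 8, 7, 6, 6, 1]
print(top_n_by_row(tied))              # averages 10, 9, 8, 7, 6
print(round(top_n_by_value(tied), 2))  # averages 10, 9, 8, 7, 6, 6

# The Ford prices from the question have no ties, so both approaches agree.
ford = [7, 8, 9, 10, 11, 12, 19]
print(top_n_by_row(ford), top_n_by_value(ford))
```

With the sample data from the question there are no duplicate prices within a manufacturer, so the distinction does not affect the output shown above.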