Moi*_*Moi 5 django query-optimization django-orm
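I'm trying to optimize the queries for my moderation system, built with Django and DRF. I'm currently stuck on retrieving duplicates. At the moment, I have something like this: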
class AdminSerializer(ModelSerializer):
    duplicates = SerializerMethodField()

    def get_duplicates(self, item):
        if item.allowed:
            qs = []
        else:
            qs = Item.objects.filter(
                allowed=True,
                related_stuff__language=item.related_stuff.language
            ).annotate(
                similarity=TrigramSimilarity('name', item.name)
            ).filter(similarity__gt=0.2).order_by('-similarity')[:10]
        return AdminMinimalSerializer(qs, many=True).data
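This works fine, but it runs at least one extra query for every item to display. In addition, when there are duplicates, further queries are needed to populate AdminMinimalSerializer, which contains fields and related objects of the duplicates. I can probably reduce the overhead by using prefetch_related inside the serializer, but that doesn't stop me from running several queries per item (assuming I have only one related item to prefetch in AdminMinimalSerializer, I'd still have ~2N + 1 queries: 1 for the items, N for the duplicates, N for the related items of the duplicates).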
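To make that count concrete, here is a small arithmetic sketch. The function name `estimated_queries` and its parameter are hypothetical, and it assumes the worst case where no item is allowed, so every item triggers its own duplicates query.

```python
def estimated_queries(n_items, prefetch_queries_per_item=1):
    """Upper bound on queries for one list response.

    1 query fetches the items; each item then costs 1 query for its
    duplicates plus one query per prefetched relation of those duplicates.
    """
    per_item = 1 + prefetch_queries_per_item
    return 1 + n_items * per_item


print(estimated_queries(50))     # 101, i.e. 2N + 1 with N = 50
print(estimated_queries(50, 0))  # 51, i.e. N + 1
```

With `prefetch_queries_per_item=0` the estimate drops to N + 1, which is what you would get if the related rows of the duplicates came back in the same query as the duplicates themselves (for example through a join). The per-item duplicates query remains either way.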
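I have already looked at Subquery, but with it I can only retrieve an id, not an object, which is not enough in my case. I tried using it in both a Prefetch object and a .annotate.
I also tried something like Item.filter(allowed=False).prefetch(Prefetch("related_stuff__language__related_stuff_set__items", queryset=Items.filter..., to_attr="duplicates")), but the duplicates attribute gets added to "related_stuff__language__related_stuff_set", so I can't really use it...
Any ideas are welcome ;)
Edit: the real code is here. A toy example follows: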
# models.py
from django.db.models import Model, CharField, ForeignKey, CASCADE, BooleanField


class Book(Model):
    title = CharField(max_length=250)
    # String reference: Serie is defined further down in this module.
    serie = ForeignKey("Serie", on_delete=CASCADE, related_name="books")
    allowed = BooleanField(default=False)


class Serie(Model):
    title = CharField(max_length=250)
    # String reference: Language is defined further down in this module.
    language = ForeignKey("Language", on_delete=CASCADE, related_name="series")


class Language(Model):
    name = CharField(max_length=100)
# serializers.py
from django.contrib.postgres.search import TrigramSimilarity
from rest_framework.serializers import ModelSerializer, SerializerMethodField

from .models import Book, Language, Serie


# Nested serializers are declared before the serializers that use them.
class LanguageSerializer(ModelSerializer):
    class Meta:
        model = Language
        fields = ('id', 'name')


class SerieAdminAuxSerializer(ModelSerializer):
    class Meta:
        model = Serie
        fields = ("id", "language", "title")

    language = LanguageSerializer()


class BookAdminMinimalSerializer(ModelSerializer):
    class Meta:
        model = Book
        fields = ("id", "title", "serie")

    serie = SerieAdminAuxSerializer()


class BookAdminSerializer(ModelSerializer):
    class Meta:
        model = Book
        fields = ("id", "title", "serie", "duplicates", )

    serie = SerieAdminAuxSerializer()
    duplicates = SerializerMethodField()

    def get_duplicates(self, book):
        """Retrieve duplicates for book"""
        if book.allowed:
            qs = []
        else:
            # One extra query per non-allowed book: this is the N+1 in question.
            qs = (
                Book.objects.filter(
                    allowed=True, serie__language=book.serie.language)
                .annotate(similarity=TrigramSimilarity("title", book.title))
                .filter(similarity__gt=0.2)
                .order_by("-similarity")[:10]
            )
        return BookAdminMinimalSerializer(qs, many=True).data
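I'm trying to find a way to prefetch the related objects and the duplicates, so that I can get rid of the get_duplicates method in BookSerializer, which causes the N+1 queries, and keep only a duplicates field in my BookSerializer.
As for the data, here is an expected output: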
[
    {
        "id": 2,
        "title": "test2",
        "serie": {
            "id": 2,
            "language": {
                "id": 1,
                "name": "English"
            },
            "title": "series title"
        },
        "duplicates": [
            {
                "id": 1,
                "title": "test",
                "serie": {
                    "id": 1,
                    "language": {
                        "id": 1,
                        "name": "English"
                    },
                    "title": "first series title"
                }
            }
        ]
    },
    {
        "id": 3,
        "title": "random",
        "serie": {
            "id": 3,
            "language": {
                "id": 1,
                "name": "English"
            },
            "title": "random series title"
        },
        "duplicates": []
    }
]
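For illustration, one conceivable way to get output of this shape without a per-book query is to load the pending books and the allowed books as two flat lists and match them in memory. The sketch below is plain Python, not ORM code, and rests on several assumptions: the helper names (`trigrams`, `similarity`, `attach_duplicates`) and the flat dict rows are hypothetical, and the similarity function is only an ASCII approximation of what pg_trgm computes (lowercased words padded with two leading spaces and one trailing space, shared trigrams divided by distinct trigrams). It is not the database's implementation and results may differ on edge cases.

```python
import re
from collections import defaultdict


def trigrams(text):
    """Approximate pg_trgm trigram extraction (ASCII letters and digits only)."""
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "  # two leading spaces, one trailing space
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams, between 0.0 and 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def attach_duplicates(pending, allowed, threshold=0.2, limit=10):
    """Map each pending book id to the ids of its best-matching allowed books."""
    # Group allowed books by language so each pending book only scans its own group.
    by_language = defaultdict(list)
    for book in allowed:
        by_language[book["language_id"]].append(book)

    result = {}
    for book in pending:
        scored = [
            (similarity(book["title"], other["title"]), other["id"])
            for other in by_language.get(book["language_id"], [])
        ]
        # Keep scores above the threshold, best first, capped at `limit`.
        matches = sorted(
            (pair for pair in scored if pair[0] > threshold),
            key=lambda pair: -pair[0],
        )
        result[book["id"]] = [book_id for _, book_id in matches[:limit]]
    return result


# Hypothetical flat rows mirroring the sample output above.
allowed = [{"id": 1, "title": "test", "language_id": 1}]
pending = [
    {"id": 2, "title": "test2", "language_id": 1},
    {"id": 3, "title": "random", "language_id": 1},
]
duplicates = attach_duplicates(pending, allowed)
print(duplicates)  # {2: [1], 3: []}
```

The trade-off is that every allowed book is loaded into memory and the database no longer does the similarity filtering, so this would only make sense for a modest number of allowed books per language.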