Optimizing a slow Django queryset from a large and growing dataset

43T*_*cts 3 python django postgresql performance django-queryset

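My page loads far too slowly. Somehow I need to improve how I query the data (caching? partial loading/paging? etc.)

Note that I'm a Django noob and haven't fully wrapped my head around models.Manager and models.query.QuerySet, so apologies if this setup looks awkward.

Currently the page takes about 18 seconds to load the queryset, and there are only about 500 records so far. On average about 100 new records will be added per day.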

[Image: network stats]

The database is PostgreSQL.

View:

def approvals(request):
    ...
    approved_submissions = QuestSubmission.objects.all_approved()
    ...

Queryset:

class QuestSubmissionQuerySet(models.query.QuerySet):
    ...

    def approved(self):
        return self.filter(is_approved=True)

    def completed(self):
        return self.filter(is_completed=True).order_by('-time_completed')

    ...

class QuestSubmissionManager(models.Manager):
    def get_queryset(self):
        return QuestSubmissionQuerySet(self.model, using=self._db)

    def all_approved(self, user=None):
        return self.get_queryset().approved().completed()

    ...

The SQL generated by QuestSubmission.objects.all_approved():

'SELECT "quest_manager_questsubmission"."id", "quest_manager_questsubmission"."quest_id", "quest_manager_questsubmission"."user_id", "quest_manager_questsubmission"."ordinal", "quest_manager_questsubmission"."is_completed", "quest_manager_questsubmission"."time_completed", "quest_manager_questsubmission"."is_approved", "quest_manager_questsubmission"."time_approved", "quest_manager_questsubmission"."timestamp", "quest_manager_questsubmission"."updated", "quest_manager_questsubmission"."game_lab_transfer" FROM "quest_manager_questsubmission" WHERE ("quest_manager_questsubmission"."is_approved" = True AND "quest_manager_questsubmission"."is_completed" = True) ORDER BY "quest_manager_questsubmission"."time_completed" DESC'

The slow model:

class QuestSubmission(models.Model):
    quest = models.ForeignKey(Quest)
    user = models.ForeignKey(settings.AUTH_USER_MODEL, related_name="quest_submission_user")
    ordinal = models.PositiveIntegerField(default = 1, help_text = 'indicating submissions beyond the first for repeatable quests')
    is_completed = models.BooleanField(default=False)
    time_completed = models.DateTimeField(null=True, blank=True)
    is_approved = models.BooleanField(default=False)
    time_approved = models.DateTimeField(null=True, blank=True)
    timestamp = models.DateTimeField(auto_now=True, auto_now_add=False)
    updated = models.DateTimeField(auto_now=False, auto_now_add=True)
    game_lab_transfer = models.BooleanField(default = False, help_text = 'XP not counted')

    class Meta:
        ordering = ["time_approved", "time_completed"]

    objects = QuestSubmissionManager()

    #other methods
    ....

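What strategies are available to fix this? I tried using Django's Paginator, but it seems to affect only what is displayed on the page, while the whole queryset is still loaded.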

Ale*_*nor 6

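The first thing to look at is:

  • Is this query slow because it returns a very large result set?

or

  • Is this query slow because it takes a while to filter the table?

Assuming the former, you don't have many good options beyond "return less data".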

If it's the latter, you should run an EXPLAIN on the database, but right off the bat I'd say you probably want an index, likely on (is_approved, is_completed). That can be done with:

class Meta:
    index_together = [
        ["is_completed", "is_approved"],
    ]


Pau*_*soa 4

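If you display related objects on the page, try using select_related().

Without select_related(), looping over the entries makes one database query per iteration to fetch each entry's related blog.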