Django - SQL bulk get_or_create possible?

Nem*_* Ga 6 sql django postgresql django-models

I am using get_or_create to insert objects into the database, but doing 1000 at once takes too long.

I tried bulk_create, but it doesn't provide the functionality I need (it creates duplicates, ignores unique values, and doesn't trigger the post_save signals I need).

Is it even possible to do get_or_create in bulk via a custom SQL query?

Here is my example code:

import json
import urllib2  # Python 2; on Python 3 use urllib.request instead

related_data = json.loads(urllib2.urlopen(final_url).read())

for item in related_data:
    kw = item['keyword']
    e, c = KW.objects.get_or_create(KWuser=kw, author=author)
    # Add m2m to parent project
    e.project.add(id)

related_data contains 1000 rows that look like this:

[{"cmp":0,"ams":3350000,"cpc":0.71,"keyword":"apple."},
{"cmp":0.01,"ams":3350000,"cpc":1.54,"keyword":"apple -10810"}......]

The KW model also sends a signal that I use to create another parent model:

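from django.db.models.signals import post_save
from django.dispatch import receiver

@receiver(post_save, sender=KW)
def grepw(sender, **kwargs):
    # Only act on newly created KW rows.
    if kwargs.get('created', False):
        id = kwargs['instance'].id
        kww = kwargs['instance'].KWuser
        # Find or create the parent KeyO (case-insensitive match on keyword).
        a, b = KeyO.objects.get_or_create(defaults={'keyword': kww}, keyword__iexact=kww)
        # Link the KW row to its KeyO parent.
        KW.objects.filter(id=id).update(KWF=a.id)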

This works, but as you can imagine, doing thousands of rows at once takes a long time and even crashes my small server. What bulk options do I have?

Ali*_*Ali 11

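As of Django 2.2, bulk_create has an ignore_conflicts flag. According to the docs:

On databases that support it (all but Oracle), setting the ignore_conflicts parameter to True tells the database to ignore failure to insert any rows that fail constraints such as duplicate unique values.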
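As a rough sketch of how this flag could replace the get_or_create loop from the question: insert the whole batch in one query, then read the rows back in a second one. The helper name bulk_get_or_create_keywords is hypothetical, the model is passed in as a parameter, and the sketch assumes the KW model has a unique constraint on (KWuser, author), which the question does not show.

```python
def bulk_get_or_create_keywords(model, keywords, author):
    """Insert missing rows in one query, then fetch all rows in a second one.

    `model` is assumed to be the KW model from the question, with a unique
    constraint on (KWuser, author); without one, ignore_conflicts has
    nothing to deduplicate against.
    """
    # Drop duplicates within the batch while keeping the input order.
    unique_keywords = list(dict.fromkeys(keywords))

    # One INSERT for the whole batch; rows that violate the unique
    # constraint are skipped instead of raising IntegrityError.
    model.objects.bulk_create(
        [model(KWuser=kw, author=author) for kw in unique_keywords],
        ignore_conflicts=True,
    )

    # With ignore_conflicts=True the primary key is not set on the
    # instances passed in, so re-read the rows to get objects with ids.
    return list(model.objects.filter(KWuser__in=unique_keywords, author=author))
```

It could be called as bulk_get_or_create_keywords(KW, [item['keyword'] for item in related_data], author). Note that bulk_create does not call save() and does not send post_save, so the KeyO linking done in the signal receiver would have to be run separately for the new rows.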


小智 5

This post may be useful to you:

stackoverflow.com/questions/3395236/aggregating-saves-in-django

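Note that the answer there suggests the deprecated commit_on_success decorator. It has been replaced by the transaction.atomic decorator. The documentation is here:

Transactions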

from django.db import transaction

@transaction.atomic
def lot_of_saves(queryset):
    for item in queryset:
        modify_item(item)
        item.save()
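To illustrate the idea of committing once per batch instead of once per row, here is a minimal sketch that uses Python's built-in sqlite3 module rather than Django. The kw table, its columns, and the get_or_create_keywords helper are hypothetical stand-ins for the KW model in the question.

```python
import sqlite3


def get_or_create_keywords(conn, keywords, author):
    """Get-or-create each keyword, committing once for the whole batch."""
    results = []
    # The connection's context manager commits once when the block exits
    # normally and rolls back if an exception is raised inside it.
    with conn:
        for kw in keywords:
            row = conn.execute(
                "SELECT id FROM kw WHERE keyword = ? AND author = ?",
                (kw, author),
            ).fetchone()
            if row is not None:
                # Row already exists: report its id, created=False.
                results.append((row[0], False))
                continue
            cur = conn.execute(
                "INSERT INTO kw (keyword, author) VALUES (?, ?)",
                (kw, author),
            )
            results.append((cur.lastrowid, True))
    return results


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE kw ("
    "id INTEGER PRIMARY KEY, keyword TEXT, author TEXT, "
    "UNIQUE (keyword, author))"
)
created = get_or_create_keywords(conn, ["apple", "apple pie", "apple"], "bob")
```

In Django, transaction.atomic plays the role of the with block here. This only removes the per-row commit overhead; each row still costs its own queries, so it may not be enough on its own for very large batches.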