What reliable mechanisms exist for preventing data duplication in CoreData CloudKit?

Che*_*eng 7 core-data ios swift cloudkit

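Every one of our data rows contains a unique uuid column.

Previously, before adopting CloudKit, the uuid column had a unique constraint. This enabled us to prevent data duplication.

Now we are integrating CloudKit into our existing CoreData stack, so the unique constraint has been removed. The following user flow then causes data duplication.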

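Steps that cause data duplication when using CloudKit

  1. The app is launched for the first time.
  2. Since the data is empty, pre-defined data with a pre-defined uuid is generated.
  3. The pre-defined data is synced to iCloud.
  4. The app is uninstalled.
  5. The app is re-installed.
  6. The app is launched for the first time.
  7. Since the data is empty, pre-defined data with a pre-defined uuid is generated.
  8. The old pre-defined data from step 3 is synced to the device.
  9. We now have 2 pre-defined data rows with the same uuid! :(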

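I was wondering, is there a way for us to prevent such duplication?

In step 8, we would like a way to run the following logic before the data is written into CoreData:

Check whether such a uuid already exists in CoreData. If not, write to CoreData. Otherwise, pick the row with the latest update date and overwrite the existing data with it.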
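
To make the intended rule concrete, here is a minimal sketch in plain Swift, independent of Core Data. `Row`, `updatedAt`, `payload` and `upsertKeepingNewest` are hypothetical names used only for illustration, and the in-memory dictionary merely stands in for the persistent store.

```swift
import Foundation

// Hypothetical stand-in for a Core Data row; only the fields the rule needs.
struct Row: Equatable {
    let uuid: UUID
    let updatedAt: Date
    let payload: String
}

/// Applies the rule to an in-memory store keyed by uuid:
/// insert when the uuid is unknown, otherwise keep whichever row is newer.
/// Returns true when the incoming row ended up in the store.
@discardableResult
func upsertKeepingNewest(_ incoming: Row, into store: inout [UUID: Row]) -> Bool {
    if let existing = store[incoming.uuid], existing.updatedAt >= incoming.updatedAt {
        // The existing row is at least as new, so it is kept unchanged.
        return false
    }
    // Either the uuid is new or the incoming row is newer: write it.
    store[incoming.uuid] = incoming
    return true
}
```

With a rule like this, the old pre-defined data arriving in step 8 would either replace the freshly generated row or be dropped, so only one row per uuid would remain.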

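I once tried to put the above logic into https://developer.apple.com/documentation/coredata/nsmanagedobject/1506209-willsave. To prevent the save, I called self.managedObjectContext?.rollback(). But it just crashed.

Do you know of any reliable mechanism I can use to prevent data duplication in CoreData CloudKit?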


Additional information:

Before adopting CloudKit

We were using the following CoreData stack

import CoreData

class CoreDataStack {
    static let INSTANCE = CoreDataStack()
    
    private init() {
    }
    
    private(set) lazy var persistentContainer: NSPersistentContainer = {
        precondition(Thread.isMainThread)
        
        let container = NSPersistentContainer(name: "xxx", managedObjectModel: NSManagedObjectModel.wenote)
        
        container.loadPersistentStores(completionHandler: { (storeDescription, error) in
            if let error = error as NSError? {
                // This is a serious fatal error. We will just simply terminate the app, rather than using error_log.
                fatalError("Unresolved error \(error), \(error.userInfo)")
            }
        })
        
        // So that when backgroundContext write to persistent store, container.viewContext will retrieve update from
        // persistent store.
        container.viewContext.automaticallyMergesChangesFromParent = true
        
        // TODO: Not sure these are required...
        //
        //container.viewContext.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy
        //container.viewContext.undoManager = nil
        //container.viewContext.shouldDeleteInaccessibleFaults = true
        
        return container
    }()
}

Our CoreData data schema has

  1. Unique constraints.
  2. The Deny deletion rule for relationships.
  3. No default values for non-null fields.

After adopting CloudKit

import CoreData

class CoreDataStack {
    static let INSTANCE = CoreDataStack()
    
    private init() {
    }
    
    private(set) lazy var persistentContainer: NSPersistentContainer = {
        precondition(Thread.isMainThread)
        
        let container = NSPersistentCloudKitContainer(name: "xxx", managedObjectModel: NSManagedObjectModel.wenote)
        
        container.loadPersistentStores(completionHandler: { (storeDescription, error) in
            if let error = error as NSError? {
                // This is a serious fatal error. We will just simply terminate the app, rather than using error_log.
                fatalError("Unresolved error \(error), \(error.userInfo)")
            }
        })
        
        // So that when backgroundContext write to persistent store, container.viewContext will retrieve update from
        // persistent store.
        container.viewContext.automaticallyMergesChangesFromParent = true
        
        // TODO: Not sure these are required...
        //
        //container.viewContext.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy
        //container.viewContext.undoManager = nil
        //container.viewContext.shouldDeleteInaccessibleFaults = true
        
        return container
    }()
}

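We changed the CoreData data schema to have

  1. No unique constraints.
  2. The Nullify deletion rule for relationships.
  3. Default values for non-null fields.

Based on feedback from a Developer Technical Support engineer at https://developer.apple.com/forums/thread/699634?login=true, she mentioned that we can

  1. Detect relevant changes by consuming the store's persistent history
  2. Remove duplicated data

However, since the github link provided there is broken, it is not entirely clear how this should be implemented.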

Che*_*eng 4

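Once we integrate with CloudKit, the unique constraint feature is no longer available.

The workaround for this limitation is

Once duplication is detected after an insertion by CloudKit, we perform duplicated data deletion.

The challenging part of this workaround is: how can we be notified when CloudKit performs an insertion?

Here is a step-by-step description of how to be notified when CloudKit performs an insertion.

  1. Turn on the NSPersistentHistoryTrackingKey feature in CoreData.
  2. Turn on the NSPersistentStoreRemoteChangeNotificationPostOptionKey feature in CoreData.
  3. Set viewContext.transactionAuthor = "app". This is an important step, so that when we query the transaction history, we know which DB transactions were initiated by our app and which were initiated by CloudKit.
  4. Whenever we are notified automatically via the NSPersistentStoreRemoteChangeNotificationPostOptionKey feature, we start querying the transaction history. The query filters on the transaction author and the last query token. Please refer to the code example for more detail.
  5. Once we detect a transaction that is an insert and that operates on an entity we care about, we start performing duplicated data deletion for that entity.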
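
The filtering described in steps 4 and 5 can be illustrated with a small model in plain Swift, without Core Data. `ChangeKind`, `Change`, `Transaction` and `remoteInserts` are hypothetical, simplified stand-ins for NSPersistentHistoryChange and NSPersistentHistoryTransaction, an integer stands in for the opaque history token, and the history is assumed to be ordered from oldest to newest. This is only a sketch of the idea, not the actual API.

```swift
import Foundation

// Hypothetical, simplified stand-ins for the persistent history classes.
enum ChangeKind { case insert, update, delete }

struct Change {
    let entityName: String
    let kind: ChangeKind
    let objectKey: String   // stands in for NSManagedObjectID
}

struct Transaction {
    let token: Int          // stands in for NSPersistentHistoryToken
    let author: String?     // nil when no transactionAuthor was set
    let changes: [Change]
}

/// Returns the keys of objects of `entityName` inserted by anyone other than
/// `appAuthor` in transactions newer than `lastToken`, plus the token to
/// remember for the next query. Assumes `history` is ordered oldest first.
func remoteInserts(
    in history: [Transaction],
    after lastToken: Int?,
    appAuthor: String,
    entityName: String
) -> (objectKeys: [String], newToken: Int?) {
    // Keep only unseen transactions that our own app did not write.
    let fresh = history.filter { transaction in
        let isNew = lastToken.map { transaction.token > $0 } ?? true
        return isNew && transaction.author != appAuthor
    }
    // From those, keep only inserts that touch the entity we care about.
    let keys = fresh.flatMap { $0.changes }
        .filter { $0.entityName == entityName && $0.kind == .insert }
        .map { $0.objectKey }
    // With nothing new, the previous token is kept.
    return (keys, fresh.last?.token ?? lastToken)
}
```

Each returned key would then be handed to the deduplication step, and the returned token stored so that the next notification only looks at newer transactions.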

Code example

import CoreData

class CoreDataStack: CoreDataStackable {
    let appTransactionAuthorName = "app"
    
    /**
     The file URL for persisting the persistent history token.
    */
    private lazy var tokenFile: URL = {
        return UserDataDirectory.token.url.appendingPathComponent("token.data", isDirectory: false)
    }()
    
    /**
     Track the last history token processed for a store, and write its value to file.
     
     The historyQueue reads the token when executing operations, and updates it after processing is complete.
     */
    private var lastHistoryToken: NSPersistentHistoryToken? = nil {
        didSet {
            guard let token = lastHistoryToken,
                let data = try? NSKeyedArchiver.archivedData(withRootObject: token, requiringSecureCoding: true) else { return }
            
            if !UserDataDirectory.token.url.createCompleteDirectoryHierarchyIfDoesNotExist() {
                return
            }
            
            do {
                try data.write(to: tokenFile)
            } catch {
                error_log(error)
            }
        }
    }
    
    /**
     An operation queue for handling history processing tasks: watching changes, deduplicating tags, and triggering UI updates if needed.
     */
    private lazy var historyQueue: OperationQueue = {
        let queue = OperationQueue()
        queue.maxConcurrentOperationCount = 1
        return queue
    }()
    
    var viewContext: NSManagedObjectContext {
        persistentContainer.viewContext
    }
    
    static let INSTANCE = CoreDataStack()
    
    private init() {
        // Load the last token from the token file.
        if let tokenData = try? Data(contentsOf: tokenFile) {
            do {
                lastHistoryToken = try NSKeyedUnarchiver.unarchivedObject(ofClass: NSPersistentHistoryToken.self, from: tokenData)
            } catch {
                error_log(error)
            }
        }
    }
    
    deinit {
        deinitStoreRemoteChangeNotification()
    }
    
    private(set) lazy var persistentContainer: NSPersistentContainer = {
        precondition(Thread.isMainThread)
        
        let container = NSPersistentCloudKitContainer(name: "xxx", managedObjectModel: NSManagedObjectModel.xxx)
        
        // turn on persistent history tracking
        let description = container.persistentStoreDescriptions.first
        description?.setOption(true as NSNumber, forKey: NSPersistentHistoryTrackingKey)
        description?.setOption(true as NSNumber, forKey: NSPersistentStoreRemoteChangeNotificationPostOptionKey)
        
        container.loadPersistentStores(completionHandler: { (storeDescription, error) in
            if let error = error as NSError? {
                // This is a serious fatal error. We will just simply terminate the app, rather than using error_log.
                fatalError("Unresolved error \(error), \(error.userInfo)")
            }
        })
        
        // Provide transaction author name, so that we can know whether this DB transaction is performed by our app
        // locally, or performed by CloudKit during background sync.
        container.viewContext.transactionAuthor = appTransactionAuthorName
        
        // So that when backgroundContext write to persistent store, container.viewContext will retrieve update from
        // persistent store.
        container.viewContext.automaticallyMergesChangesFromParent = true
        
        // TODO: Not sure these are required...
        //
        //container.viewContext.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy
        //container.viewContext.undoManager = nil
        //container.viewContext.shouldDeleteInaccessibleFaults = true
        
        // Observe Core Data remote change notifications.
        initStoreRemoteChangeNotification(container)
        
        return container
    }()
    
    private(set) lazy var backgroundContext: NSManagedObjectContext = {
        precondition(Thread.isMainThread)
        
        let backgroundContext = persistentContainer.newBackgroundContext()

        // Provide transaction author name, so that we can know whether this DB transaction is performed by our app
        // locally, or performed by CloudKit during background sync.
        backgroundContext.transactionAuthor = appTransactionAuthorName
        
        // Similar behavior as Android's Room OnConflictStrategy.REPLACE
        // Old data will be overwritten by new data if index conflicts happen.
        backgroundContext.mergePolicy = NSMergeByPropertyObjectTrumpMergePolicy
        
        // TODO: Not sure these are required...
        //backgroundContext.undoManager = nil
        
        return backgroundContext
    }()
    
    private func initStoreRemoteChangeNotification(_ container: NSPersistentContainer) {
        // Observe Core Data remote change notifications.
        NotificationCenter.default.addObserver(
            self,
            selector: #selector(storeRemoteChange(_:)),
            name: .NSPersistentStoreRemoteChange,
            object: container.persistentStoreCoordinator
        )
    }
    
    private func deinitStoreRemoteChangeNotification() {
        NotificationCenter.default.removeObserver(self)
    }
    
    @objc func storeRemoteChange(_ notification: Notification) {
        // Process persistent history to merge changes from other coordinators.
        historyQueue.addOperation {
            self.processPersistentHistory()
        }
    }
    
    /**
     Process persistent history, posting any relevant transactions to the current view.
     */
    private func processPersistentHistory() {
        backgroundContext.performAndWait {
            
            // Fetch history received from outside the app since the last token
            let historyFetchRequest = NSPersistentHistoryTransaction.fetchRequest!
            historyFetchRequest.predicate = NSPredicate(format: "author != %@", appTransactionAuthorName)
            let request = NSPersistentHistoryChangeRequest.fetchHistory(after: lastHistoryToken)
            request.fetchRequest = historyFetchRequest

            let result = (try? backgroundContext.execute(request)) as? NSPersistentHistoryResult
            guard let transactions = result?.result as? [NSPersistentHistoryTransaction] else { return }

            if transactions.isEmpty {
                return
            }
            
            for transaction in transactions {
                if let changes = transaction.changes {
                    for change in changes {
                        let entity = change.changedObjectID.entity.name
                        let changeType = change.changeType
                        let objectID = change.changedObjectID
                        
                        if entity == "NSTabInfo" && changeType == .insert {
                            deduplicateNSTabInfo(objectID)
                        }
                    }
                }
            }
            
            // Update the history token using the last transaction.
            lastHistoryToken = transactions.last!.token
        }
    }
    
    private func deduplicateNSTabInfo(_ objectID: NSManagedObjectID) {
        do {
            guard let nsTabInfo = try backgroundContext.existingObject(with: objectID) as? NSTabInfo else { return }
            
            let uuid = nsTabInfo.uuid
            
            guard let nsTabInfos = NSTabInfoRepository.INSTANCE.getNSTabInfosInBackground(uuid) else { return }
            
            if nsTabInfos.isEmpty {
                return
            }
            
            var bestNSTabInfo: NSTabInfo? = nil
            
            for nsTabInfo in nsTabInfos {
                if let _bestNSTabInfo = bestNSTabInfo {
                    if nsTabInfo.syncedTimestamp > _bestNSTabInfo.syncedTimestamp {
                        bestNSTabInfo = nsTabInfo
                    }
                } else {
                    bestNSTabInfo = nsTabInfo
                }
            }
            
            for nsTabInfo in nsTabInfos {
                if nsTabInfo === bestNSTabInfo {
                    continue
                }
                
                // Remove old duplicated data!
                backgroundContext.delete(nsTabInfo)
            }
            
            RepositoryUtils.saveContextIfPossible(backgroundContext)
        } catch {
            error_log(error)
        }
    }
}
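
If the winner selection needs to be unit-tested apart from the managed object context, one option is to pull it out into a pure function. The sketch below is one possible shape for that, not part of the stack above: `TabInfoSnapshot` and `partitionDuplicates` are hypothetical names, and the field types (in particular `Int64` for `syncedTimestamp`) are assumptions, since the model definition is not shown here.

```swift
import Foundation

// Hypothetical value-type snapshot of an NSTabInfo row. The field types are
// assumptions made for this sketch.
struct TabInfoSnapshot: Equatable {
    let objectKey: String       // stands in for NSManagedObjectID
    let uuid: String
    let syncedTimestamp: Int64
}

/// Splits rows that share a uuid into one winner (the row with the largest
/// syncedTimestamp) and the losers that a caller would delete.
func partitionDuplicates(
    _ rows: [TabInfoSnapshot]
) -> (winner: TabInfoSnapshot?, losers: [TabInfoSnapshot]) {
    guard let winner = rows.max(by: { $0.syncedTimestamp < $1.syncedTimestamp }) else {
        // No rows at all: nothing to keep and nothing to delete.
        return (nil, [])
    }
    // Every row other than the winner is a duplicate to remove.
    let losers = rows.filter { $0.objectKey != winner.objectKey }
    return (winner, losers)
}
```

A caller could map the fetched objects to snapshots, call this function, and then delete the objects whose keys appear in `losers` before saving the background context.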

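References

  1. https://developer.apple.com/documentation/coredata/synchronizing_a_local_store_to_the_cloud - In the sample code, the file CoreDataStack.swift shows a similar example of how to remove duplicated data after cloud sync.
  2. https://developer.apple.com/documentation/coredata/consuming_relevant_store_changes - Information on the transaction history.
  3. What is the best way to pre-populate a Core Data store when using NSPersistentCloudKitContainer? - A similar question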