加快在Excel中循环遍历大型数据集

Swe*_*pot 2 excel vba loops excel-vba

我有两个数据集,我需要比较和提取匹配.我有一个来自每个数据集中5列的复合键,结束了我需要提取的第6列.列由文本,日期和整数组成.两组都略低于500k行.

目前我在表a中使用for循环并在表b中循环.使用if参数将行与if语句进行比较以获取组合键.

Sub ArraySearch()

    Dim Main As Long
    Dim Search As Long
    Dim arrData() As Variant
    Dim arrSource As Variant

    arrData = Sheets("Sheet1").Range("H3:M500000").Value
    arrSource = Sheets("Ark1").Range("A3:H500000").Value

    Main = 1
    Search = 1

    For Main = 1 To UBound(arrSource, 1)

        For Search = 1 To UBound(arrData, 1)

            If arrSource(Main, 3) = arrData(Search, 1) And _
                arrSource(Main, 4) = arrData(Search, 2) And _
                arrSource(Main, 1) = arrData(Search, 3) And _
                arrSource(Main, 2) = arrData(Search, 4) And _
                arrSource(Main, 5) = arrData(Search, 5) _
            Then
                arrSource(Main, 8) = arrData(Search, 6)
                Exit For
            End If

        Next
    Next

    Sheets("Sheet2").Range("A3:H500000") = arrSource

End Sub
Run Code Online (Sandbox Code Playgroud)

到目前为止,最快的方法是将两个表加载到一个数组中并执行内存循环.

这是永远的.我们说的是几小时而不是几分钟.

有没有什么方法可以提高速度?或者我需要使用其他一些程序吗?(将其加载到数据库并使用SQL,使用Visual Studio与普通的VB.net,SSIS)

我希望这可以在VBA中完成,所以任何指针都会非常感激.

编辑

散列5列密钥会提高速度,还是必须迭代的行的共享量会产生滞后?

小智 5

比较两个列表的最快方法是根据公共密钥向Dictionary添加值.字典经过优化,可以搜索键,并且可以更快地返回基于键的值,然后可以迭代数组.

Sub DictionarySearch()
    Dim dict
    Dim key As String
    Dim x As Long
    Dim arrData() As Variant
    Dim arrSource As Variant

    Set dict = CreateObject("Scripting.Dictionary")

    arrData = Worksheets("Sheet1").Range("H3:M500000").Value
    arrSource = Worksheets("Ark1").Range("A3:H500000").Value

    For x = 1 To UBound(arrData, 1)
        key = arrData(x, 1) & ":" & arrData(x, 2) & ":" & arrData(x, 3) & ":" & arrData(x, 4) & ":" & arrData(x, 5)
        If Not dict.Exists(key) Then dict.Add key, arrData(x, 6)

    Next

    For x = 1 To UBound(arrSource, 1)
        key = arrSource(x, 3) & ":" & arrSource(x, 4) & ":" & arrSource(x, 1) & ":" & arrSource(x, 2) & ":" & arrSource(x, 5)
        If dict.Exists(key) Then arrSource(x, 8) = dict(key)
    Next

    Sheets("Sheet2").Range("A3:H500000") = arrSource
End Sub
Run Code Online (Sandbox Code Playgroud)