Performing LINQ on the full collection of rows in SSIS?

Dyn*_*nde 0 c# linq ssis

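Well, as the title says. I want to use a script component destination and then use LINQ to select which rows to process for output.

For more background, I have an ugly merged thing with a one-to-many relationship. The rows look like: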

[ID] [Title]   [OneToManyDataID]
1    Item one   2
1    Item one   4
1    Item one   3
3    Item two   1
3    Item two   5

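We'll call the object [Item]; it has the ID and Title columns, plus [OneToMany].

I was hoping I could throw the whole thing into a script component destination and then use LINQ to group by item, taking data only from the highest OneToMany object. Something like: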

foreach(var item  in Data.GroupBy(d=>d.Item).Select(d=> new {Item = d.Key})){
     //Then pick out the highest OneToMany ID for that row to use with it.
}

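I realize there may be a better LINQ query for this, but the point is that the Script Component in SSIS seems to allow working only on a per-row basis, through the predefined ProcessInputRow method. I want to determine exactly which rows get processed and which properties are passed to that method.

How would I go about doing this?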

bil*_*nkc 5

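To restate your question: how do you make a Script Transformation stop processing row by row? By default, a Script Transformation is a synchronous component: 1 row in, 1 row out. You need to change it to an asynchronous component: 1 row in, 0 to many rows out.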

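In your Script Transformation Editor, on the Inputs and Outputs tab, select the output collection Output 0 and change the value of SynchronousInputID from whatever it is to None.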
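As a rough analogy in plain C# (this is not the SSIS API; `PipelineAnalogy`, `Synchronous`, and `Asynchronous` are names made up for illustration), a synchronous component behaves like a per-row mapping, whereas an asynchronous one can hold on to every row and decide afterwards what to emit:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustration only: hypothetical names, no SSIS types involved.
public static class PipelineAnalogy
{
    // Synchronous style: every input row produces exactly one output row,
    // and it is produced as soon as the input row is seen.
    public static IEnumerable<int> Synchronous(IEnumerable<int> rows)
    {
        foreach (var row in rows)
        {
            yield return row * 10;
        }
    }

    // Asynchronous style: buffer all input rows first, then emit
    // zero or more output rows based on the complete set.
    public static IEnumerable<int> Asynchronous(IEnumerable<int> rows)
    {
        var buffered = rows.ToList();
        if (buffered.Count > 0)
        {
            // Only knowable once every row has arrived.
            yield return buffered.Max();
        }
    }
}
```

Setting SynchronousInputID to None is what moves the component from the first shape to the second.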

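Don't throw stones at my LINQ code; I trust you can handle that part. The purpose of this code block is to demonstrate how to collect the rows for processing and then, after modifying them, pass them on to downstream consumers. I commented the methods to help you understand their role in the script component lifecycle, but if you'd rather read MSDN, they know more than I do ;)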

using System;
using System.Data;
using System.Linq;
using System.Collections.Generic;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    /// <summary>
    /// Our LINQ-able thing.
    /// </summary>
    List<Data> data;

    /// <summary>
    /// Do our preexecute tasks, in particular, we will instantiate
    /// our collection.
    /// </summary>
    public override void PreExecute()
    {
        base.PreExecute();
        this.data = new List<Data>();
    }

    /// <summary>
    /// This method is called once the last row has hit.
    /// Since we can only find the highest OneToManyDataId
    /// after receiving all the rows, this is the only time we can
    /// send rows to the output buffer.
    /// </summary>
    public override void FinishOutputs()
    {
        base.FinishOutputs();
        CreateNewOutputRows();
    }

    /// <summary>
    /// Accumulate all the input rows into an internal LINQ-able
    /// collection
    /// </summary>
    /// <param name="Row">The buffer holding the current row</param>
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        // there is probably a more graceful mechanism of spinning
        // up this struct.
        // You must also account for columns that can be NULL.
        Data d = new Data();
        d.ID = Row.ID;
        d.Title = Row.Title;
        d.OneToManyId = Row.OneToManyDataID;            
        this.data.Add(d);
    }

    /// <summary>
    /// This is the process to generate new rows. As we only want to
    /// generate rows once all the rows have arrived, only call this
    /// at the point our internal collection has accumulated all the
    /// input rows.
    /// </summary>
    public override void CreateNewOutputRows()
    {
        foreach (var item in this.data.GroupBy(d => d.ID).Select(d => new { Item = d.Key }))
        {
            //Then pick out the highest OneToMany ID for that row to use with it.
            // Magic happens
            // I don't "get" LINQ so I can't implement the poster's action
            int id = 0;
            int maxOneToManyID = 2;
            string title = string.Empty;
            id = item.Item;
            Output0Buffer.AddRow();
            Output0Buffer.ID = id;
            Output0Buffer.OneToManyDataID = maxOneToManyID;
            Output0Buffer.Title = title;
        }
    }

}
/// <summary>
/// I think this works well enough to demo
/// </summary>
public struct Data
{
    public int ID { get; set; }
    public string Title { get; set; }
    public int OneToManyId { get; set; }
}
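For the "magic happens" step, one possible LINQ query is sketched below as standalone C# so it can be tried outside SSIS. The `Data` type mirrors the struct above; `GroupingSketch`, `HighestPerItem`, and `SampleRows` are hypothetical names, and the sample rows are the ones from the question. This is one way to do it, not necessarily the best one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Mirrors the struct used in the script component above.
public struct Data
{
    public int ID { get; set; }
    public string Title { get; set; }
    public int OneToManyId { get; set; }
}

// Hypothetical helper, not part of any SSIS API.
public static class GroupingSketch
{
    // For each ID, keep the single row with the largest OneToManyId.
    public static List<Data> HighestPerItem(IEnumerable<Data> rows)
    {
        return rows
            .GroupBy(d => d.ID)
            .Select(g => g.OrderByDescending(d => d.OneToManyId).First())
            .ToList();
    }

    // The sample rows from the question.
    public static List<Data> SampleRows()
    {
        return new List<Data>
        {
            new Data { ID = 1, Title = "Item one", OneToManyId = 2 },
            new Data { ID = 1, Title = "Item one", OneToManyId = 4 },
            new Data { ID = 1, Title = "Item one", OneToManyId = 3 },
            new Data { ID = 3, Title = "Item two", OneToManyId = 1 },
            new Data { ID = 3, Title = "Item two", OneToManyId = 5 },
        };
    }
}
```

Inside the component, the same query would presumably run over `this.data` in `CreateNewOutputRows`, with each resulting row's ID, Title, and OneToManyId assigned to `Output0Buffer` after `AddRow()`. For the sample rows it keeps (1, Item one, 4) and (3, Item two, 5).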

Configuration of the Script Transformation

Inputs tab

Outputs

Results