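I have a CSV file with a string column whose values span multiple lines. I want to collapse each multi-line value into a single line.
For example: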
1, "asdsdsdsds", "John"
2, "dfdhifdkinf
dfjdfgkdnjgknkdjgndkng
dkfdkjfnjdnf", "Roy"
3, "dfjfdkgjfgn", "Rahul"
I want my output to be:
1, "asdsdsdsds", "John"
2, "dfdhifdkinf dfjdfgkdnjgknkdjgndkng dkfdkjfnjdnf", "Roy"
3, "dfjfdkgjfgn", "Rahul"
I want to produce this output using PowerShell.
Thanks.
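I want to remove line breaks from the field data in a CSV file. Several people have asked the same question on SO and elsewhere. However, the solutions offered there are scripts. I am looking for a solution in a programming language such as Python, or in Spark (not limited to these two), because my files are large.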
Previously asked questions on the same topic:
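I have a CSV file of about 1 GB and want to remove the line breaks inside its field data. The schema of the CSV file changes dynamically, so I cannot hard-code it. The line break does not always come right before a comma; it can appear at a random position inside a field.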
Sample data:
playerID,yearID,gameNum,gameName,teamName,lgID,GP,startingPos
gomezle01,1933,1,Cricket,Team1,NYA,AL,1
ferreri01,1933,2,Hockey,"This is
Team2",BOS,AL,1
gehrilo01,1933,3,"Game name is
Cricket"
,Team3,NYA,AL,1
gehrich01,1933,4,Hockey,"Here it is
Team4",DET,AL,1
dykesji01,1933,5,"Game name is
Hockey"
,"Team name
Team5",CHA,AL,1
Expected output:
playerID,yearID,gameNum,gameName,teamName,lgID,GP,startingPos
gomezle01,1933,1,Cricket,Team1,NYA,AL,1
ferreri01,1933,2,Hockey,"This is Team2",BOS,AL,1
gehrilo01,1933,3,"Game name is Cricket" ,Team3,NYA,AL,1
gehrich01,1933,4,Hockey,"Here it is Team4",DET,AL,1
dykesji01,1933,5,"Game name is Hockey","Team name Team5",CHA,AL,1
The line break can occur in the data of any field.
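For reference, here is a minimal Python sketch of one way this could be approached with the standard `csv` module. It is not taken from any of the earlier answers. The function name `flatten_csv` is made up for illustration, and the sketch rests on two assumptions: the first row is a header whose field count defines a complete record (so no schema is hard-coded), and a record that comes up short was split by a line break that fell outside the quotes, as in the `gehrilo01` row of the sample.

```python
import csv


def flatten_csv(src, dst):
    """Copy CSV records from src to dst with line breaks inside fields removed.

    Assumption: the header row defines how many fields a complete record has.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")

    header = next(reader, None)
    if header is None:  # empty input, nothing to do
        return
    writer.writerow(header)
    width = len(header)

    pending = []  # fields collected so far for the current record
    for row in reader:
        if not row:  # skip completely blank lines
            continue
        if pending:
            # Continuation of a short record: glue its first field onto the
            # last pending field, then append the remaining fields.
            pending[-1] = pending[-1] + " " + row[0]
            pending.extend(row[1:])
        else:
            pending = row
        if len(pending) >= width:
            # Collapse every run of whitespace (including \r and \n) to one
            # space. Note this also normalizes repeated spaces in the data.
            writer.writerow([" ".join(field.split()) for field in pending])
            pending = []

    if pending:  # trailing incomplete record: write what was collected
        writer.writerow([" ".join(field.split()) for field in pending])
```

The reader and writer work one record at a time, so the whole file is never held in memory, which should matter for a file of about 1 GB. With real files, open both with `newline=""`, as the `csv` module documentation recommends, for example `open("in.csv", newline="")` and `open("out.csv", "w", newline="")` (file names hypothetical). The writer only quotes fields that need it, so `"This is Team2"` would come out without quotes; the parsed values are the same.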
EDIT: Based on the code screenshot: