Nic*_*ick 1 excel split text-extraction extract powerquery
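Problem: I have been tasked with tidying up some very messy data that mixes text and numbers, and I would like to use Power Query to separate the codes from the rest of the data. Fortunately, the codes to be extracted consist only of digits and appear to be 7 characters long (assume 6 or longer).
Below is an example of how I would like the data separated:
So far I have this code: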
let
Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Input ", type text}}),
#"Replaced Value" = Table.ReplaceValue(#"Changed Type","_"," ",Replacer.ReplaceText,{"Input "}),
#"Replaced Value1" = Table.ReplaceValue(#"Replaced Value","v"," ",Replacer.ReplaceText,{"Input "}),
#"Added Custom" = Table.AddColumn(#"Replaced Value1", "TextSplit", each Text.Split([#"Input "], " ")),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "ListTransform", each List.Transform([TextSplit], each Text.Select(_,{"0".."9"}))),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "ListSelect", each List.Select([ListTransform], each Text.Length(_)>=5)),
#"Added Custom3" = Table.AddColumn(#"Added Custom2", "TextCombine", each Text.Combine([ListSelect], ", ")),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom3",{"TextSplit", "ListTransform", "ListSelect"})
in
#"Removed Columns"
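This does seem to solve the problem. However, a value such as 0102646v2.0 is pulled through as 010264620, because Text.Select keeps every digit in the token, including those of the version number. To make it work, I had to add the steps that replace "_" and "v" with " ". Is there no way for Power Query to recognise that, say, 0102646v2.0 should be extracted as 0102646?
Data: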
Input Values:
3159087 v1.0
3194070 v1.0
#8102368 V3.0 (Shine and ProtectR18)
#8102371 V4.0 (Lemon 12A Degreaser)
Marine (FF3080300 v1.0)
Green Apple (FF3080301 v1.0)
0102646v2.0 (Fresh Cotton)
TDS# 3129801 V1.0 GPA Code#3123402
FF3112964 0.1 FF3145524 0.1_3152912 0.1
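Thank you!
Update: pulling through version numbers
Regex
Here is a PQ implementation that uses a regex to extract the pattern and returns the matches separated by commas:
Add this as a custom function. I named it fnRegexExtr.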
//see http://www.thebiccountant.com/2018/04/25/regex-in-power-bi-and-power-query-in-excel-with-java-script/
// and https://gist.github.com/Hugoberry/4948d96b45d6799c47b4b9fa1b08eadf
let fx=(text,regex)=>
Web.Page(
"<script>
var x='"&text&"';
var y=new RegExp('"&regex&"','g');
var b=x.match(y);
document.write(b);
</script>")[Data]{0}[Children]{0}[Children]{1}[Text]{0}
in
fx
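To see what the script inside the function does with the pattern, here is a minimal standalone JavaScript sketch run against a few rows of the question's data. The helper name extractCodes is hypothetical, and String(...) stands in for document.write(b), which stringifies its argument; this is an illustration of the matching step only, not the Power Query function itself.

```javascript
// Sample rows copied from the question's data.
const samples = [
  "3159087 v1.0",
  "0102646v2.0 (Fresh Cotton)",
  "TDS# 3129801 V1.0 GPA Code#3123402",
  "FF3112964 0.1 FF3145524 0.1_3152912 0.1",
];

// Hypothetical helper mirroring the script body inside fnRegexExtr.
function extractCodes(text, pattern) {
  // With the 'g' flag, match() returns an array of all matches, or null if there are none.
  const matches = text.match(new RegExp(pattern, "g"));
  // Converting an array to a string joins its items with commas.
  return String(matches);
}

// "[0-9]{6,}" matches runs of six or more consecutive digits.
const results = samples.map((s) => extractCodes(s, "[0-9]{6,}"));
results.forEach((r) => console.log(r));
```

Because the pattern only matches runs of consecutive digits, the "v" in 0102646v2.0 ends the run, so the version digits are never appended to the code and the "_" and "v" replacement steps are no longer needed. Note that in plain JavaScript a row without any match stringifies to "null", and since the function pastes the input between single quotes, an input containing an apostrophe would probably need escaping first.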
You can then use it in your code like this:
let
Source = Excel.CurrentWorkbook(){[Name="Table10"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Input", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Output",
each fnRegexExtr([Input], "[0-9]{6,}"))
in
#"Added Custom"
Returns: