// Requires: using System.Collections.Generic; using System.Linq; using System.Text.RegularExpressions;
string emailBody = " holla holla testing is for NewFinancial History:\"xyz\" dsd NewFinancial History:\"abc\" NewEBTDI$:\"abc\" dsds ";
// Collapse all line breaks into single spaces.
emailBody = string.Join(" ", Regex.Split(emailBody.Trim(), @"(?:\r\n|\n|\r)"));
// Collect the distinct field names that start with "New" and end at a colon.
var keys = Regex.Matches(emailBody, @"\bNew\B(.+?):", RegexOptions.Singleline).OfType<Match>().Select(m => m.Groups[0].Value.Replace(":", "")).Distinct().ToArray();
foreach (string key in keys)
{
    List<string> valueList = new List<string>();
    // The key is concatenated into the pattern as-is.
    string regex = "" + key + ":" + "\"(?<" + GetCleanKey(key) + ">[^\"]*)\"";
    var matches = Regex.Matches(emailBody, regex, RegexOptions.Singleline);
    foreach (Match match in matches)
    {
        if (match.Success)
        {
            string value = match.Groups[GetCleanKey(key)].Value;
            if (!valueList.Contains(value.Trim()))
            {
                valueList.Add(value.Trim());
            }
        }
    }
}

// Strips characters that are not allowed in a named-group identifier.
public string GetCleanKey(string key)
{
    return key.Replace(" ", "").Replace("-", "").Replace("#", "").Replace("$", "").Replace("*", "").Replace("!", "").Replace("@", "")
        .Replace("%", "").Replace("^", "").Replace("&", "").Replace("(", "").Replace(")", "").Replace("[", "").Replace("]", "").Replace("?", "")
        .Replace("<", "").Replace(">", "").Replace("'", "").Replace(";", "").Replace("/", "").Replace("\"", "").Replace("+", "").Replace("~", "").Replace("`", "")
        .Replace("{", "").Replace("}", "").Replace("+", "").Replace("|", "");
}
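In the code above, I am trying to get the value next to NewEBTDI$:, which is "abc".
When the pattern includes the $ symbol, it does not find the value next to the field name.
If I remove the $ and specify just NewEBTDI, it does find the value.
I want to search for the value with the $ symbol included.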
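The right way to handle characters that have a special meaning in a regex but must be matched literally is to escape them. You can do that with Regex.Escape. In your case the character is the $ sign, which, if left unescaped, is an anchor that matches at the end of the input (or at the end of a line with RegexOptions.Multiline).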
string regex = "" + Regex.Escape(key) + ":" + "\"(?<" + Regex.Escape(GetCleanKey(key))
+ ">[^\"]*)\"";
or
string regex = String.Format("{0}:\"(?<{1}>[^\"]*)\"",
Regex.Escape(key),
Regex.Escape(GetCleanKey(key)));
Or, with VS 2015 (C# 6), using string interpolation:
string regex = $"{Regex.Escape(key)}:\"(?<{Regex.Escape(GetCleanKey(key))}>[^\"]*)\"";
(It looks better in the editor than it does here, because the C# editor colors the string parts and the embedded C# expressions differently.)
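To see the effect of escaping in isolation, here is a minimal self-contained sketch. The class name EscapeDemo, the helper BuildPattern, and the fixed group name "value" are assumptions made for this illustration and are not part of the original code. The sketch escapes only the key, since the group name is already plain text here.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Hypothetical helper: builds the key:"value" pattern.
    // The group name "value" is a placeholder chosen for this sketch.
    public static string BuildPattern(string key, bool escape)
    {
        string keyPart = escape ? Regex.Escape(key) : key;
        return keyPart + ":\"(?<value>[^\"]*)\"";
    }

    public static void Main()
    {
        string body = "NewFinancial History:\"xyz\" NewEBTDI$:\"abc\" dsds";
        string key = "NewEBTDI$";

        // Unescaped: $ is an end-of-input anchor, so ':' can never follow it.
        Console.WriteLine(Regex.IsMatch(body, BuildPattern(key, false))); // False

        // Escaped: Regex.Escape turns $ into \$, which matches a literal dollar sign.
        Match m = Regex.Match(body, BuildPattern(key, true));
        Console.WriteLine(m.Groups["value"].Value); // abc
    }
}
```

Keys without metacharacters, such as NewFinancial History, should match the same way with or without escaping, because Regex.Escape also escapes the space and an escaped space still matches a literal space.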