Ben*_*ack 5 c# regex linq linq-to-objects parsing
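I'm trying to write a simple program that compares the files in separate folders. I'm currently using LINQ to Objects to parse the folder, and I would like to include information extracted from the file name string in my result set.
Here is what I have so far: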
FileInfo[] fileList = new DirectoryInfo(@"G:\Norton Backups").GetFiles();
var results = from file in fileList
              orderby file.CreationTime
              select new { file.Name, file.CreationTime, file.Length };
foreach (var x in results)
    Console.WriteLine(x.Name);
This produces:
AWS025.sv2i
AWS025_C_Drive038.v2i
AWS025_C_Drive038_i001.iv2i
AWS025_C_Drive038_i002.iv2i
AWS025_C_Drive038_i003.iv2i
AWS025_C_Drive038_i004.iv2i
AWS025_C_Drive038_i005.iv2i
...
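I would like to modify the LINQ query so that:
- It only includes actual backup files (you can recognize them by the _C_Drive038 in the examples above, although the 038 and possibly the drive letter could change).
- It includes a field saying whether the file is the "main" backup file (i.e., it has no _i0XX at the end of the file name).
- It includes the image number of the file (in this case, 038).
- It includes the increment number if the file is an increment of a base file (e.g., 001 would be the increment number).
I believe the basic layout of the query looks like the following, but I'm not sure how best to complete it (I have some ideas of how to do it, but I'm interested in hearing how others might approach it):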
var results = from file in fileList
              let IsMainBackup = // ??
              let ImageNumber = // ??
              let IncrementNumber = // ??
              where // it is a backup file.
              orderby file.CreationTime
              select new { file.Name, file.CreationTime, file.Length,
                           IsMainBackup, ImageNumber, IncrementNumber };
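When looking for ImageNumber and IncrementNumber, I would like to assume that the position of this data is not always fixed, so I'd like to know a good way to parse it (if this requires a regex, please explain how I might use it).
Note: most of my past experience with parsing text involved position-based string functions such as LEFT, RIGHT, or MID. I'd rather not fall back on those if there is a better way.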
Use a regular expression:
Regex regex = new Regex(@"^.*(?<Backup>_\w_Drive(?<ImageNumber>\d+)(?<Increment>_i(?<IncrementNumber>\d+))?)\.[^.]+$");
var results = from file in fileList
              let match = regex.Match(file.Name)
              let IsMainBackup = !match.Groups["Increment"].Success
              let ImageNumber = match.Groups["ImageNumber"].Value
              let IncrementNumber = match.Groups["IncrementNumber"].Value
              where match.Groups["Backup"].Success
              orderby file.CreationTime
              select new { file.Name, file.CreationTime, file.Length,
                           IsMainBackup, ImageNumber, IncrementNumber };
Here is an explanation of the regular expression:
^                Start of string.
.*               Allow anything at the start.
(?<Backup>...)   Match a backup description (explained below).
\.               Match a literal period.
[^.]+            Match the extension (anything except periods).
$                End of string.
The backup description is:
_\w_Drive             A literal underscore, any word character (letter, digit, or underscore), another underscore, then the string "Drive".
(?<ImageNumber>\d+)   At least one digit, saved as ImageNumber.
(?<Increment>...)?    An optional increment description.
The increment description is:
_i A literal underscore, then the letter i.
(?<IncrementNumber>\d+) At least one digit, saved as IncrementNumber.
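To see what each named group described above captures, it can help to run the pattern against single file names before putting it into a query. The following is only a minimal sketch: the class `RegexGroupDemo` and the helper `Describe` are hypothetical names introduced for illustration, and the file names are the samples from the question.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexGroupDemo
{
    // Same pattern as in the answer above.
    static readonly Regex BackupPattern = new Regex(
        @"^.*(?<Backup>_\w_Drive(?<ImageNumber>\d+)(?<Increment>_i(?<IncrementNumber>\d+))?)\.[^.]+$");

    // Hypothetical helper: returns a one-line summary of what the named groups captured.
    public static string Describe(string fileName)
    {
        Match match = BackupPattern.Match(fileName);
        if (!match.Success)
            return fileName + ": not a backup file";

        // A group that did not take part in the match has an empty Value.
        return string.Format("{0}: Backup='{1}', ImageNumber='{2}', IncrementNumber='{3}'",
            fileName,
            match.Groups["Backup"].Value,
            match.Groups["ImageNumber"].Value,
            match.Groups["IncrementNumber"].Value);
    }

    static void Main()
    {
        Console.WriteLine(Describe("AWS025.sv2i"));
        Console.WriteLine(Describe("AWS025_C_Drive038.v2i"));
        Console.WriteLine(Describe("AWS025_C_Drive038_i001.iv2i"));
    }
}
```

For the main backup file the optional Increment group does not participate in the match, so IncrementNumber comes back as an empty string rather than null.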
Here is the test code I used:
using System;
using System.IO;
using System.Text.RegularExpressions;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        FileInfo[] fileList = new FileInfo[] {
            new FileInfo("AWS025.sv2i"),
            new FileInfo("AWS025_C_Drive038.v2i"),
            new FileInfo("AWS025_C_Drive038_i001.iv2i"),
            new FileInfo("AWS025_C_Drive038_i002.iv2i"),
            new FileInfo("AWS025_C_Drive038_i003.iv2i"),
            new FileInfo("AWS025_C_Drive038_i004.iv2i"),
            new FileInfo("AWS025_C_Drive038_i005.iv2i")
        };
        Regex regex = new Regex(@"^.*(?<Backup>_\w_Drive(?<ImageNumber>\d+)(?<Increment>_i(?<IncrementNumber>\d+))?)\.[^.]+$");
        var results = from file in fileList
                      let match = regex.Match(file.Name)
                      let IsMainBackup = !match.Groups["Increment"].Success
                      let ImageNumber = match.Groups["ImageNumber"].Value
                      let IncrementNumber = match.Groups["IncrementNumber"].Value
                      where match.Groups["Backup"].Success
                      orderby file.CreationTime
                      select new { file.Name, file.CreationTime,
                                   IsMainBackup, ImageNumber, IncrementNumber };
        foreach (var x in results)
        {
            Console.WriteLine("Name: {0}, Main: {1}, Image: {2}, Increment: {3}",
                x.Name, x.IsMainBackup, x.ImageNumber, x.IncrementNumber);
        }
    }
}
Here is the output I get:
Name: AWS025_C_Drive038.v2i, Main: True, Image: 038, Increment:
Name: AWS025_C_Drive038_i001.iv2i, Main: False, Image: 038, Increment: 001
Name: AWS025_C_Drive038_i002.iv2i, Main: False, Image: 038, Increment: 002
Name: AWS025_C_Drive038_i003.iv2i, Main: False, Image: 038, Increment: 003
Name: AWS025_C_Drive038_i004.iv2i, Main: False, Image: 038, Increment: 004
Name: AWS025_C_Drive038_i005.iv2i, Main: False, Image: 038, Increment: 005
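In the output above, Image and Increment are strings, and Increment is empty for the main backup. If numeric values are more convenient, one possible approach is to convert each captured group to a nullable int. This is only a sketch: `NumericGroups`, `ToNullableInt`, and `ParseIncrement` are hypothetical names that are not part of the code above.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

static class NumericGroups
{
    // Same pattern as in the test code above.
    static readonly Regex BackupPattern = new Regex(
        @"^.*(?<Backup>_\w_Drive(?<ImageNumber>\d+)(?<Increment>_i(?<IncrementNumber>\d+))?)\.[^.]+$");

    // Hypothetical helper: converts a captured group to int?, returning null
    // when the group did not participate in the match or is not a valid int.
    public static int? ToNullableInt(Group group)
    {
        int value;
        if (group.Success &&
            int.TryParse(group.Value, NumberStyles.None, CultureInfo.InvariantCulture, out value))
            return value;
        return null;
    }

    // Hypothetical helper: the increment number of a file name, or null if it has none.
    public static int? ParseIncrement(string fileName)
    {
        return ToNullableInt(BackupPattern.Match(fileName).Groups["IncrementNumber"]);
    }

    static void Main()
    {
        string[] names = { "AWS025_C_Drive038.v2i", "AWS025_C_Drive038_i001.iv2i" };
        foreach (string name in names)
        {
            int? increment = ParseIncrement(name);
            Console.WriteLine("{0}: increment = {1}",
                name, increment.HasValue ? increment.Value.ToString() : "none");
        }
    }
}
```

Inside the query, the same helper could replace the `.Value` calls, for example `let IncrementNumber = NumericGroups.ToNullableInt(match.Groups["IncrementNumber"])`. Note that the leading zeros (001) are lost once the value is an int.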
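It was fun finding a good answer for this one :)
The code below gives you what you need. Note the use of a search pattern when retrieving the files: there is no need to retrieve more files than necessary. Also note the parseNumber() function, which is only there to show how to convert the string result from the regex into a number.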
using System;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using System.Windows.Forms;

static class Program
{
    [STAThread]
    static void Main()
    {
        Application.EnableVisualStyles();
        Application.SetCompatibleTextRenderingDefault(false);
        //Application.Run(new Form1());
        GetBackupFiles(@"c:\temp\backup files");
    }

    static void GetBackupFiles(string path)
    {
        // The search pattern limits the listing to *.v2i and *.iv2i backup files.
        FileInfo[] fileList = new DirectoryInfo(path).GetFiles("*_Drive*.*v2i");
        var results = from file in fileList
                      orderby file.CreationTime
                      select new
                      {
                          file.Name,
                          file.CreationTime,
                          file.Length,
                          // Main backups end in .v2i, increments in .iv2i.
                          IsMainBackup = file.Extension.ToLower() == ".v2i",
                          // .Value yields the captured digits as a string instead of a Group object.
                          ImageNumber = Regex.Match(file.Name, @"drive([\d]{0,5})", RegexOptions.IgnoreCase).Groups[1].Value,
                          IncrementNumber = parseNumber(Regex.Match(file.Name, @"_i([\d]{0,5})\.iv2i", RegexOptions.IgnoreCase).Groups[1])
                      };
        foreach (var x in results)
            Console.WriteLine(x.Name);
    }

    // Converts a regex group (or any object) to int?; returns null if it is not a number.
    static int? parseNumber(object num)
    {
        int temp;
        if (num != null && int.TryParse(num.ToString(), out temp))
            return temp;
        return null;
    }
}
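Note that for the regular expressions I have assumed some consistency in the file names; if they deviate from the format you mentioned, you will have to adjust them.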
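One way to notice such deviations early is to list the file names that do not fit the expected shape before trusting the parsed values. The sketch below is an assumption-laden illustration: `NameCheck`, `FindUnexpected`, and the combined pattern are hypothetical and would need adapting to the real naming scheme.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class NameCheck
{
    // Assumed pattern: "_Drive" plus 1 to 5 digits, an optional "_i" plus 1 to 5 digits,
    // then a .v2i or .iv2i extension. Adjust this to your own naming scheme.
    static readonly Regex Expected = new Regex(
        @"_drive\d{1,5}(_i\d{1,5})?\.i?v2i$", RegexOptions.IgnoreCase);

    // Hypothetical helper: returns the names that do not match the expected format.
    public static List<string> FindUnexpected(IEnumerable<string> fileNames)
    {
        return fileNames.Where(name => !Expected.IsMatch(name)).ToList();
    }

    static void Main()
    {
        string[] names =
        {
            "AWS025_C_Drive038.v2i",
            "AWS025_C_Drive038_i001.iv2i",
            "AWS025_C_Drive.v2i" // no image number: deviates from the format
        };
        foreach (string name in FindUnexpected(names))
            Console.WriteLine("Unexpected name: " + name);
    }
}
```

A name such as AWS025_C_Drive.v2i would still be returned by the "*_Drive*.*v2i" search pattern, and `{0,5}` would capture an empty image number for it, which is why a check along these lines can be useful.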