Pattern matching a path + file (UNC?)

use*_*304 1 c# regex unc pattern-matching

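I'm using C# and Visual Studio 2010. I'm just trying to match a string (a path, in this case) against a pattern that will help me figure out whether it is valid. The examples below are made up, but they do contain…

So I'm trying to create a pattern that matches a UNC path that comes in as a string. For example: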

"\\\\Apple-butter27\\AliceFakePlace\\SomeDay\\Grand100\\Some File Name Stuff\\Yes these are fake words\\One more for fun2000343\\myText.txt"

The above is an example of the file path I'm trying to pattern match. I'm trying to match it with this pattern:

@"\\\\[a-zA-Z0-9-]+\\\w+\\\w+\\\w+\\((\w+)*(\s+)*)*\\((\w+)*(\s+)*)*\\((\w+)*(\s+)*)*\\w+\.txt";

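What I'm guaranteed is that there will be 7 folders before I reach my file. I'll have to look for a combination of spaces, letters, and numbers in almost all of the segments.

I tried starting small by matching little bits. For example, in my first test iteration I used this as my pattern: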

@"\\\\";

This works, in that it matches the first few characters, but if I add to it:

@"\\\\[a-zA-Z0-9-]+";

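It fails. So I thought maybe the string literal is causing the backslashes to double up, and that I might have to double my "\" as well. I tried again with 8 "\" on their own, but that failed too.

The goal of that last pattern was to match "\\\\Apple-butter27".

I have been searching Google and this whole site, but none of the UNC pattern-matching material I found addresses my problem.

I would really appreciate it if someone could tell me what I'm doing wrong with this pattern. Even a starting point would help, since I know the pattern is long and will probably get quite complicated... but perhaps someone can point out what is generally wrong with it.

As a path, in its non-string state, it looks like this: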

\\Apple-butter27\AliceFakePlace\SomeDay\Grand100\Some File Name Stuff\Yes these are fake words\One more for fun2000343\myText.txt

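I'm keen to get pattern matching working with UNC paths, but it is really starting to confuse me, so if someone could light the way I would appreciate it greatly.

I'm using the .Success property of the Regex match to see whether the pattern matched, and I simply print a message depending on whether the match succeeded or failed. My main focus is the pattern, unless someone has good insight into a solution that treats the path as something other than a string.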

Mit*_*tch 6

No regex needed

Alternatively, use the built-in parsing of the System.Uri class:

foreach (var path in new [] { @"C:\foo\bar\", @"\\server\bar" })
{
    var uri = new Uri(path);

    if (uri.IsUnc)
    {
        Console.WriteLine("Connects to host '{0}'", uri.Host);
    }
    else
    {
        Console.WriteLine("Local path");
    }
}

Prints:

Local path
Connects to host 'server'

If you are trying to match the extension, don't reinvent the wheel; use Path.GetExtension:

// verbatim literal, so \r and \f are not read as escape sequences
var path = @"\\some\really long and complicated path\foo.txt";
var extensionOfPath = Path.GetExtension(path);

if (string.Equals(".txt", extensionOfPath, StringComparison.CurrentCultureIgnoreCase))
{
    Console.WriteLine("It's a txt");
}
else
{
    Console.WriteLine("It's a '{0}', which is not a txt", extensionOfPath);
}

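In general, I'd suggest you avoid jumping straight to regex when solving a problem. First ask yourself whether someone else has already solved the problem for you (e.g. HTML). There is a good discussion of why regex has a bad rep on CodingHorror, and a less serious one on xkcd.

Regex version

If you are set on using regex, which I maintain is not the best tool for the job, it can be done. Use spacing and comments to keep the code readable.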
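On the escaping confusion in the question, it helps to keep the two layers apart: the C# compiler processes the string literal first, and the regex engine then processes whatever characters remain. The sketch below is only an illustration; `EscapingDemo` and `MatchServerPrefix` are made-up names, and the last case is a guess at one way the match could fail, since the question does not show how the input was actually declared.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapingDemo
{
    // Hypothetical helper: returns the leading \\server part, or null if absent.
    public static string MatchServerPrefix(string input)
    {
        // In a verbatim pattern the regex engine reads \\ as one literal
        // backslash, so \\\\ matches the two leading backslashes of a UNC path.
        Match match = Regex.Match(input, @"^\\\\[a-zA-Z0-9-]+");
        return match.Success ? match.Value : null;
    }

    public static void Main()
    {
        // Regular literal: the compiler turns each \\ into one backslash.
        string regular = "\\\\Apple-butter27\\AliceFakePlace";
        // Verbatim literal: backslashes are taken exactly as written.
        string verbatim = @"\\Apple-butter27\AliceFakePlace";
        Console.WriteLine(regular == verbatim);          // True

        Console.WriteLine(MatchServerPrefix(regular));   // \\Apple-butter27

        // Assumption: if the input were a verbatim literal with doubled
        // backslashes, it would really start with four backslashes and the
        // anchored pattern would no longer match.
        string doubled = @"\\\\Apple-butter27\\AliceFakePlace";
        Console.WriteLine(MatchServerPrefix(doubled) == null); // True
    }
}
```

So a pattern written as a verbatim string needs four backslashes to match the two that open a UNC path, and the input should be written either as a regular literal with doubled backslashes or as a verbatim literal with single ones, not a mix of both.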

using System;
using System.Linq;                      // for Cast<Capture>()
using System.Text.RegularExpressions;

string input = @"\\Apple-butter27\AliceFakePlace\SomeDay\Grand100\Some File Name Stuff\Yes these are fake words\One more for fun2000343\myText.txt";
Regex regex = new Regex(@"
    ^
    (?:
        # if server is present, capture to a named group
        # use a noncapturing group to remove the surrounding slashes
        # * is a greedy match, so it will butt up against the following directory search
        # this group may or may not occur, so we allow either this or the drive to match (|)
        (?:\\\\(?<server>[^\\]*)\\)
        # if there is no server, then we best have a drive letter
        |(?:(?<drive>[A-Z]):\\)
    )
    # then we have a repeating group (+) to capture all the directory components
    (?:
        # each directory is composed of a name (which does not contain \\)
        # followed by \\
        (?<directory>[^\\]*)\\
    )+
    # then we have a file name, which is identifiable as we already ate the rest of
    # the string.  So, it is just all non-\\ characters at the end.
    (?<file>[^\\]*)
    $", RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace);

var matches = regex.Match(input).Groups;

foreach (var group in regex.GetGroupNames())
{
    Console.WriteLine("Matched {0}:", group);
    foreach (var value in matches[group].Captures.Cast<Capture>())
    {
        Console.WriteLine("\t{0}", value.Value);
    }
}

Prints:

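Matched 0:
        \\Apple-butter27\AliceFakePlace\SomeDay\Grand100\Some File Name Stuff\Yes these are fake words\One more for fun2000343\myText.txt
Matched server: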
        Apple-butter27
Matched drive:
Matched directory:
        AliceFakePlace
        SomeDay
        Grand100
        Some File Name Stuff
        Yes these are fake words
        One more for fun2000343
Matched file:
        myText.txt
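Since the question mentions a fixed number of folders and a .txt file, the captures can also be counted rather than spelling out every segment in the pattern. This is a minimal sketch under a few assumptions: `UncPathCheck` and `IsExpectedPath` are hypothetical names, the pattern is narrowed to UNC paths only, and the folder count is passed in (the sample path has six folder components after the server name).

```csharp
using System;
using System.Text.RegularExpressions;

public static class UncPathCheck
{
    // Same shape as the pattern above, restricted to UNC paths and requiring
    // non-empty components.
    private static readonly Regex UncPattern = new Regex(@"
        ^
        \\\\(?<server>[^\\]+)\\          # two backslashes, server name, backslash
        (?:(?<directory>[^\\]+)\\)+      # one or more folder components
        (?<file>[^\\]+)                  # file name: whatever is left
        $", RegexOptions.IgnorePatternWhitespace);

    // Hypothetical helper: true when the path has exactly folderCount folders
    // between the server and a file ending in the given extension.
    public static bool IsExpectedPath(string path, int folderCount, string extension)
    {
        Match match = UncPattern.Match(path);
        return match.Success
            && match.Groups["directory"].Captures.Count == folderCount
            && match.Groups["file"].Value.EndsWith(extension, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        string input = @"\\Apple-butter27\AliceFakePlace\SomeDay\Grand100\Some File Name Stuff\Yes these are fake words\One more for fun2000343\myText.txt";
        Console.WriteLine(IsExpectedPath(input, 6, ".txt"));                      // True
        Console.WriteLine(IsExpectedPath(@"\\server\share\file.txt", 6, ".txt")); // False
    }
}
```

Counting captures keeps the pattern short and lets the expected depth change without touching the regex.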

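Now I'm just guessing...

It sounds like you have some kind of application that takes the directory it lives in and builds a multi-level structure beneath it. Something like the following: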

C:\
  root directory for the application\
    site name\
      date of work\
        project name\
          bar\
            actual.txt
            files.txt

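Whether or not you are looking for the actual files, I don't know. Either way, we know about C:\root directory\ and think it may contain actual files. We can then take the directory tree and enumerate it to find them: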

var diRoot = new DirectoryInfo(@"C:\drop");

var projectDirectories = FindProjects(diRoot);

// get all of the files in all of the project directories of type .txt
var projectFiles = projectDirectories.SelectMany(di => di.GetFiles("*.txt"));

// projectFiles now contains:
//  actual.txt
//  files.txt

private static IEnumerable<DirectoryInfo> FindProjects(DirectoryInfo cDir, int depth = 0)
{
    foreach (var di in cDir.GetDirectories())
    {
        // assume projects are three levels deep
        if (depth == 3)
        {
            // it's a project, so we can return it
            yield return di;
        }
        else
        {
            // pass it through, return the results
            foreach (var d in FindProjects(di, depth + 1))
                yield return d;
        }
    }
}

Since we are not doing any string manipulation on the paths, local and UNC paths are handled transparently.