Check whether 2 URLs are equal

Mar*_*man 17 c# asp.net

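Is there a method that tests whether 2 URLs are equal, i.e. point to the same place? I am not talking about 2 URLs with different domain names that point to the same IP address, but, for example, 2 URLs that point to the same .aspx page:

is equal to these: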

Notes / assumptions

  1. QueryString values are ignored
  2. ASP.NET (preferably C#)
  3. Default.aspx is the default page

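---- UPDATE ----

Here is a very crude method that tests a URL to see whether it matches the current URL. I tried creating a new Uri() from both the local URL and the URL being checked, but I was not sure that would work, so I went down the string-checking route. The SiteMapProvider implementation skips this step if the URL starts with "Http", since that is assumed to be an external URL. Because I have a SaaS framework that always ensures relative paths (as these can be on different subdomains), it is easier to strip things down.

Any comments on optimization? I guess for a start we could pass in a variable containing the current URL? I am not sure about the overhead of calling HttpContext.Current.Request.Url.LocalPath multiple times.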

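    /// <summary>
    /// Returns true if the given URL matches the current request URL.
    /// Assumes URL is relative aspx page or folder path
    /// </summary>
    /// <param name="url">Relative URL to compare with the current request path.</param>
    /// <returns>True if both resolve to the same page; otherwise false.</returns>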
    public static bool CurrentURLMatch(string url)
    {
        string localURL = HttpContext.Current.Request.Url.LocalPath;
        if (HttpContext.Current.Request.Url.Host == "localhost")
        {
            localURL = localURL.Substring(localURL.IndexOf('/') + 1);
            localURL = localURL.Substring(localURL.IndexOf('/'));
        }
        string compareURL = url.ToLower();

        // Remove QueryString Values
        if (localURL.Contains("?"))
        {
            localURL = localURL.Split('?')[0];
        }

        if (compareURL.Contains("?"))
        {
            compareURL = compareURL.Split('?')[0];
        }

        if (localURL.Contains("#"))
        {
            localURL = localURL.Split('#')[0];
        }
        // Remove the fragment from the compare URL (the original tested for "?" here by mistake)
        if (compareURL.Contains("#"))
        {
            compareURL = compareURL.Split('#')[0];
        }

        // Prepare End of Local URL
        if (!localURL.Contains("aspx"))
        {
            if (!localURL.EndsWith("/"))
            {
                localURL = String.Concat(localURL, "/");
            }
        }

        // Prepare End of Compare URL
        if (!compareURL.Contains("aspx"))
        {
            if (!compareURL.EndsWith("/"))
            {
                compareURL = String.Concat(compareURL, "/");
            }
        }

        if (localURL.EndsWith(@"/"))
        {
            localURL = String.Concat(localURL, "Default.aspx");
        }

        if (compareURL.EndsWith(@"/"))
        {
            compareURL = String.Concat(compareURL, "Default.aspx");
        }

        if (compareURL.Contains(@"//"))
        {
            compareURL = compareURL.Replace(@"//", String.Empty);
            compareURL = compareURL.Substring(compareURL.IndexOf("/") + 1);
        }

        compareURL = compareURL.Replace("~", String.Empty);

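        // compareURL was lower-cased above but localURL was not, so compare ignoring case
        return String.Equals(localURL, compareURL, StringComparison.OrdinalIgnoreCase);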
    }

Vah*_*idN 10

For the record, here is a translation of http://en.wikipedia.org/wiki/URL%5Fnormalization to C#:

using System;
using System.Web;

namespace UrlNormalizationTest
{
    public static class UrlNormalization
    {
        public static bool AreTheSameUrls(this string url1, string url2)
        {
            url1 = url1.NormalizeUrl();
            url2 = url2.NormalizeUrl();
            return url1.Equals(url2);
        }

        public static bool AreTheSameUrls(this Uri uri1, Uri uri2)
        {
            var url1 = uri1.NormalizeUrl();
            var url2 = uri2.NormalizeUrl();
            return url1.Equals(url2);
        }

        public static string[] DefaultDirectoryIndexes = new[]
            {
                "default.asp",
                "default.aspx",
                "index.htm",
                "index.html",
                "index.php"
            };

        public static string NormalizeUrl(this Uri uri)
        {
            var url = urlToLower(uri);
            url = limitProtocols(url);
            url = removeDefaultDirectoryIndexes(url);
            url = removeTheFragment(url);
            url = removeDuplicateSlashes(url);
            url = addWww(url);
            url = removeFeedburnerPart(url);
            return removeTrailingSlashAndEmptyQuery(url);
        }

        public static string NormalizeUrl(this string url)
        {
            return NormalizeUrl(new Uri(url));
        }

        private static string removeFeedburnerPart(string url)
        {
            var idx = url.IndexOf("utm_source=", StringComparison.Ordinal);
            return idx == -1 ? url : url.Substring(0, idx - 1);
        }

        private static string addWww(string url)
        {
            if (new Uri(url).Host.Split('.').Length == 2 && !url.Contains("://www."))
            {
               return url.Replace("://", "://www.");
            }
            return url;
        }

        private static string removeDuplicateSlashes(string url)
        {
            var path = new Uri(url).AbsolutePath;
            return path.Contains("//") ? url.Replace(path, path.Replace("//", "/")) : url;
        }

        private static string limitProtocols(string url)
        {
            return new Uri(url).Scheme == "https" ? url.Replace("https://", "http://") : url;
        }

        private static string removeTheFragment(string url)
        {
            var fragment = new Uri(url).Fragment;
            return string.IsNullOrWhiteSpace(fragment) ? url : url.Replace(fragment, string.Empty);
        }

        private static string urlToLower(Uri uri)
        {
            return HttpUtility.UrlDecode(uri.AbsoluteUri.ToLowerInvariant());
        }

        private static string removeTrailingSlashAndEmptyQuery(string url)
        {
            return url
                    .TrimEnd(new[] { '?' })
                    .TrimEnd(new[] { '/' });
        }

        private static string removeDefaultDirectoryIndexes(string url)
        {
            foreach (var index in DefaultDirectoryIndexes)
            {
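                // Match whole file names only, and cut by length: TrimEnd(char[]) strips
                // any trailing characters from the set, not the exact suffix.
                if (url.EndsWith("/" + index, StringComparison.Ordinal))
                {
                    url = url.Substring(0, url.Length - index.Length);
                    break;
                }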
            }
            return url;
        }
    }
}

It passes the following tests:

using NUnit.Framework;
using UrlNormalizationTest;

namespace UrlNormalization.Tests
{
    [TestFixture]
    public class UnitTests
    {
        [Test]
        public void Test1ConvertingTheSchemeAndHostToLowercase()
        {
            var url1 = "HTTP://www.Example.com/".NormalizeUrl();
            var url2 = "http://www.example.com/".NormalizeUrl();

            Assert.AreEqual(url1, url2);
        }

        [Test]
        public void Test2CapitalizingLettersInEscapeSequences()
        {
            var url1 = "http://www.example.com/a%c2%b1b".NormalizeUrl();
            var url2 = "http://www.example.com/a%C2%B1b".NormalizeUrl();

            Assert.AreEqual(url1, url2);
        }

        [Test]
        public void Test3DecodingPercentEncodedOctetsOfUnreservedCharacters()
        {
            var url1 = "http://www.example.com/%7Eusername/".NormalizeUrl();
            var url2 = "http://www.example.com/~username/".NormalizeUrl();

            Assert.AreEqual(url1, url2);
        }

        [Test]
        public void Test4RemovingTheDefaultPort()
        {
            var url1 = "http://www.example.com:80/bar.html".NormalizeUrl();
            var url2 = "http://www.example.com/bar.html".NormalizeUrl();

            Assert.AreEqual(url1, url2);
        }

        [Test]
        public void Test5AddingTrailing()
        {
            var url1 = "http://www.example.com/alice".NormalizeUrl();
            var url2 = "http://www.example.com/alice/?".NormalizeUrl();

            Assert.AreEqual(url1, url2);
        }

        [Test]
        public void Test6RemovingDotSegments()
        {
            var url1 = "http://www.example.com/../a/b/../c/./d.html".NormalizeUrl();
            var url2 = "http://www.example.com/a/c/d.html".NormalizeUrl();

            Assert.AreEqual(url1, url2);
        }

        [Test]
        public void Test7RemovingDirectoryIndex1()
        {
            var url1 = "http://www.example.com/default.asp".NormalizeUrl();
            var url2 = "http://www.example.com/".NormalizeUrl();

            Assert.AreEqual(url1, url2);
        }

        [Test]
        public void Test7RemovingDirectoryIndex2()
        {
            var url1 = "http://www.example.com/default.asp?id=1".NormalizeUrl();
            var url2 = "http://www.example.com/default.asp?id=1".NormalizeUrl();

            Assert.AreEqual(url1, url2);
        }

        [Test]
        public void Test7RemovingDirectoryIndex3()
        {
            var url1 = "http://www.example.com/a/index.html".NormalizeUrl();
            var url2 = "http://www.example.com/a/".NormalizeUrl();

            Assert.AreEqual(url1, url2);
        }

        [Test]
        public void Test8RemovingTheFragment()
        {
            var url1 = "http://www.example.com/bar.html#section1".NormalizeUrl();
            var url2 = "http://www.example.com/bar.html".NormalizeUrl();

            Assert.AreEqual(url1, url2);
        }

        [Test]
        public void Test9LimitingProtocols()
        {
            var url1 = "https://www.example.com/".NormalizeUrl();
            var url2 = "http://www.example.com/".NormalizeUrl();

            Assert.AreEqual(url1, url2);
        }

        [Test]
        public void Test10RemovingDuplicateSlashes()
        {
            var url1 = "http://www.example.com/foo//bar.html".NormalizeUrl();
            var url2 = "http://www.example.com/foo/bar.html".NormalizeUrl();

            Assert.AreEqual(url1, url2);
        }

        [Test]
        public void Test11AddWww()
        {
            var url1 = "http://example.com/".NormalizeUrl();
            var url2 = "http://www.example.com".NormalizeUrl();

            Assert.AreEqual(url1, url2);
        }

        [Test]
        public void Test12RemoveFeedburnerPart()
        {
            var url1 = "http://site.net/2013/02/firefox-19-released/?utm_source=rss&utm_medium=rss&utm_campaign=firefox-19-released".NormalizeUrl();
            var url2 = "http://site.net/2013/02/firefox-19-released".NormalizeUrl();

            Assert.AreEqual(url1, url2);
        }
    }
}


Sam*_*ijo 8

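You are probably looking for URL normalization techniques. They might be a good starting point :)

Once you have normalized the URLs, you simply need to check whether they are equal (keeping in mind your assumptions, for instance, that you discard the query string).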
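
As a rough illustration of "normalize, then compare" under the question's assumptions (query string ignored, Default.aspx as the default page), a minimal sketch could look like the following. The class and method names are hypothetical, and it assumes both URLs are absolute and belong to the same site, so the host is not compared. It also assumes a path without a trailing slash refers to the same location as the one with it.

```csharp
using System;

// Hypothetical helper names; a sketch, not a complete normalizer.
public static class UrlComparisonSketch
{
    // Reduces an absolute URL to a lower-cased path key that can be compared.
    public static string ToComparablePath(string url, string defaultPage = "default.aspx")
    {
        var uri = new Uri(url, UriKind.Absolute);

        // AbsolutePath excludes the query string and the fragment.
        var path = uri.AbsolutePath.ToLowerInvariant();

        // Treat ".../default.aspx" as the folder URL ".../".
        if (path.EndsWith("/" + defaultPage, StringComparison.Ordinal))
        {
            path = path.Substring(0, path.Length - defaultPage.Length);
        }

        // Assumption: "/products" and "/products/" are the same location.
        return path.TrimEnd('/');
    }

    public static bool PointToSamePage(string url1, string url2)
    {
        return ToComparablePath(url1) == ToComparablePath(url2);
    }
}
```

With this sketch, `PointToSamePage("http://example.com/Products/Default.aspx?A=B", "http://example.com/products/")` evaluates to true, while two URLs with different folder paths do not match.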


Joh*_*ski 5

You can probably use the Uri class to check the individual parts of the URLs, after converting each one to the right format.

// Create the URI objects
// TODO: Use the right constructor overloads,
// or do some processing beforehand to accommodate the different scenarios
Uri uri1 = new Uri(url1);
Uri uri2 = new Uri(url2);

// There are overloads for the constructor too
Uri uri3 = new Uri(url3, UriKind.Absolute);

// Check the correct properties
// TODO: Use the right properties...
if (uri1.AbsolutePath == uri2.AbsolutePath)
{
    // Urls match
}
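
One possible way to fill in the "right properties" TODO is the static `Uri.Compare` method, which compares only the selected components of two URIs. The sketch below is an assumption about what a suitable comparison might be, not the answerer's own code: it compares just the path, ignoring case, so the query string, fragment and host play no part. The class and method names are hypothetical.

```csharp
using System;

// Hypothetical wrapper around Uri.Compare; a sketch under the assumptions above.
public static class UriCompareSketch
{
    public static bool SamePath(string url1, string url2)
    {
        var uri1 = new Uri(url1, UriKind.Absolute);
        var uri2 = new Uri(url2, UriKind.Absolute);

        // UriComponents.Path selects the path only (no query string, no fragment).
        // Uri.Compare returns 0 when the selected components are equal.
        return Uri.Compare(
                   uri1,
                   uri2,
                   UriComponents.Path,
                   UriFormat.Unescaped,
                   StringComparison.OrdinalIgnoreCase) == 0;
    }
}
```

Note that this does not treat a folder URL and its Default.aspx page as equal; that rule would still need to be applied to the URLs before comparing.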