Tan*_*ner 13 .net c# encoding azure azure-storage
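I'm using Azure Table Storage, and the data going into my RowKey contains slashes. According to this MSDN page, the following characters are not allowed in PartitionKey and RowKey values:
The forward slash (/) character
The backslash (\) character
The number sign (#) character
The question mark (?) character
Control characters from U+0000 to U+001F, including:
The horizontal tab (\t) character
The linefeed (\n) character
The carriage return (\r) character
Control characters from U+007F to U+009F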
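As a rough illustration of the rules listed above, a key can be checked against them before it is written. This is only a sketch: the names `KeyValidator`, `IsForbidden`, and `IndexOfFirstForbidden` are hypothetical, and the check covers only the characters in this list, not any other constraint the service may impose.

```csharp
using System;

public static class KeyValidator
{
    // True when c is one of the characters listed above as disallowed.
    public static bool IsForbidden(char c)
    {
        return c == '/' || c == '\\' || c == '#' || c == '?'
            || c <= '\u001F'
            || (c >= '\u007F' && c <= '\u009F');
    }

    // Index of the first disallowed character, or -1 if none is found.
    public static int IndexOfFirstForbidden(string key)
    {
        if (key == null) throw new ArgumentNullException(nameof(key));
        for (int i = 0; i < key.Length; i++)
        {
            if (IsForbidden(key[i])) return i;
        }
        return -1;
    }
}
```

For example, `KeyValidator.IndexOfFirstForbidden("users/42")` returns 5, the position of the slash.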
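I've seen some people use URL encoding to get around this. Unfortunately, that can cause problems of its own, such as being able to insert certain entities but not delete them. I've also seen people use Base64 encoding, but its output can contain disallowed characters too.
How can I efficiently encode the RowKey without running into disallowed characters or rolling my own encoding?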
Jas*_*ber 12
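When a URL is Base64 encoded, the only character in the output that is invalid in an Azure Table Storage key column is the forward slash ('/'). To work around this, simply replace the forward slash with another character that (1) is valid in an Azure Table Storage key column and (2) is not a Base64 character. The most common example I've found (cited in other answers) is replacing the forward slash ('/') with an underscore ('_').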
    private static String EncodeUrlInKey(String url)
    {
        var keyBytes = System.Text.Encoding.UTF8.GetBytes(url);
        var base64 = System.Convert.ToBase64String(keyBytes);
        return base64.Replace('/','_');
    }
When decoding, simply undo the character replacement (first!), then Base64-decode the resulting string. That's all there is to it.
    private static String DecodeUrlInKey(String encodedKey)
    {
        var base64 = encodedKey.Replace('_', '/');
        byte[] bytes = System.Convert.FromBase64String(base64);
        return System.Text.Encoding.UTF8.GetString(bytes);
    }
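To see the substitution at work, here is a small self-contained sketch that repeats the two helpers and adds a round-trip check. The class name `Base64KeyDemo` and the `RoundTrips` method are hypothetical names chosen for this illustration. The input "???" is picked because its Base64 form, "Pz8/", ends in a forward slash.

```csharp
using System;
using System.Text;

public static class Base64KeyDemo
{
    // Base64-encode the UTF-8 bytes, then swap '/' for '_'.
    public static string EncodeUrlInKey(string url)
    {
        string base64 = Convert.ToBase64String(Encoding.UTF8.GetBytes(url));
        return base64.Replace('/', '_');
    }

    // Undo the swap first, then Base64-decode.
    public static string DecodeUrlInKey(string encodedKey)
    {
        string base64 = encodedKey.Replace('_', '/');
        return Encoding.UTF8.GetString(Convert.FromBase64String(base64));
    }

    // True when decoding the encoded key gives back the original string.
    public static bool RoundTrips(string value)
    {
        return DecodeUrlInKey(EncodeUrlInKey(value)) == value;
    }
}
```

With these definitions, `Base64KeyDemo.EncodeUrlInKey("???")` yields "Pz8_", and decoding that key returns "???" again.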
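Some have suggested that other Base64 characters also need to be encoded. According to the Azure Table Storage documentation, that is not the case.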
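I ran into the same need.
I wasn't happy with Base64 encoding because it turns a human-readable string into an unrecognizable one, and it inflates the size of every string whether or not it already follows the rules (a waste when the vast majority of characters are not illegal characters that need escaping).
Here is an encoder/decoder that uses '!' as an escape character, in much the same way the backslash character is traditionally used.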
    using System.Globalization;
    using System.Text;

    public static class TableKeyEncoding
    {
        // https://msdn.microsoft.com/library/azure/dd179338.aspx
        //
        // The following characters are not allowed in values for the PartitionKey and RowKey properties:
        //   The forward slash (/) character
        //   The backslash (\) character
        //   The number sign (#) character
        //   The question mark (?) character
        //   Control characters from U+0000 to U+001F, including:
        //     The horizontal tab (\t) character
        //     The linefeed (\n) character
        //     The carriage return (\r) character
        //   Control characters from U+007F to U+009F
        public static string Encode(string unsafeForUseAsAKey)
        {
            StringBuilder safe = new StringBuilder();
            foreach (char c in unsafeForUseAsAKey)
            {
                switch (c)
                {
                    case '/':
                        safe.Append("!f");
                        break;
                    case '\\':
                        safe.Append("!b");
                        break;
                    case '#':
                        safe.Append("!p");
                        break;
                    case '?':
                        safe.Append("!q");
                        break;
                    case '\t':
                        safe.Append("!t");
                        break;
                    case '\n':
                        safe.Append("!n");
                        break;
                    case '\r':
                        safe.Append("!r");
                        break;
                    case '!':
                        // The escape character itself is doubled.
                        safe.Append("!!");
                        break;
                    default:
                        if (c <= 0x1f || (c >= 0x7f && c <= 0x9f))
                        {
                            // Remaining control characters become "!x" plus two hex digits.
                            int charCode = c;
                            safe.Append("!x" + charCode.ToString("x2"));
                        }
                        else
                        {
                            safe.Append(c);
                        }
                        break;
                }
            }
            return safe.ToString();
        }

        public static string Decode(string key)
        {
            StringBuilder decoded = new StringBuilder();
            int i = 0;
            while (i < key.Length)
            {
                char c = key[i++];
                if (c != '!' || i == key.Length)
                {
                    // There's no escape character ('!'), or the escape should be ignored because it's the end of the string
                    decoded.Append(c);
                }
                else
                {
                    char escapeCode = key[i++];
                    switch (escapeCode)
                    {
                        case 'f':
                            decoded.Append('/');
                            break;
                        case 'b':
                            decoded.Append('\\');
                            break;
                        case 'p':
                            decoded.Append('#');
                            break;
                        case 'q':
                            decoded.Append('?');
                            break;
                        case 't':
                            decoded.Append('\t');
                            break;
                        case 'n':
                            decoded.Append('\n');
                            break;
                        case 'r':
                            decoded.Append('\r');
                            break;
                        case '!':
                            decoded.Append('!');
                            break;
                        case 'x':
                            // Expect two hex digits; a truncated or malformed sequence is dropped.
                            if (i + 2 <= key.Length)
                            {
                                string charCodeString = key.Substring(i, 2);
                                int charCode;
                                if (int.TryParse(charCodeString, NumberStyles.HexNumber, NumberFormatInfo.InvariantInfo, out charCode))
                                {
                                    decoded.Append((char)charCode);
                                }
                                i += 2;
                            }
                            break;
                        default:
                            // Unknown escape code: keep the '!' and discard the code character.
                            decoded.Append('!');
                            break;
                    }
                }
            }
            return decoded.ToString();
        }
    }
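To put a number on the size argument, the following sketch compares the length of a Base64-encoded key with the length the escape scheme above would produce. The class `KeySizeComparison` and its method names are hypothetical. `EscapedLength` only mirrors the length rules of the encoder above (two characters for the named escapes, four for other control characters, one otherwise) without building the string.

```csharp
using System;
using System.Text;

public static class KeySizeComparison
{
    // Length of the key when the whole string is Base64-encoded as UTF-8.
    public static int Base64Length(string value)
    {
        return Convert.ToBase64String(Encoding.UTF8.GetBytes(value)).Length;
    }

    // Length of the key under the '!' escape scheme.
    public static int EscapedLength(string value)
    {
        int length = 0;
        foreach (char c in value)
        {
            if (c == '/' || c == '\\' || c == '#' || c == '?' ||
                c == '\t' || c == '\n' || c == '\r' || c == '!')
            {
                length += 2; // "!f", "!b", "!p", "!q", "!t", "!n", "!r", "!!"
            }
            else if (c <= 0x1f || (c >= 0x7f && c <= 0x9f))
            {
                length += 4; // "!x" plus two hex digits
            }
            else
            {
                length += 1; // copied through unchanged
            }
        }
        return length;
    }
}
```

For the 8-character string "users/42", `Base64Length` gives 12 while `EscapedLength` gives 9, since only the slash needs escaping.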
Because you should be extra careful when writing your own encoder, I wrote some unit tests for it as well.
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using Xunit;

    namespace xUnit_Tests
    {
        public class TableKeyEncodingTests
        {
            const char Unicode0X1A = (char) 0x1a;

            // Helper, not a test: checks both directions for one pair of strings.
            private static void RoundTripTest(string unencoded, string encoded)
            {
                Assert.Equal(encoded, TableKeyEncoding.Encode(unencoded));
                Assert.Equal(unencoded, TableKeyEncoding.Decode(encoded));
            }

            [Fact]
            public void RoundTrips()
            {
                RoundTripTest("!\n", "!!!n");
                RoundTripTest("left" + Unicode0X1A + "right", "left!x1aright");
            }

            // The following characters are not allowed in values for the PartitionKey and RowKey properties:
            //   The forward slash (/) character
            //   The backslash (\) character
            //   The number sign (#) character
            //   The question mark (?) character
            //   Control characters from U+0000 to U+001F, including:
            //     The horizontal tab (\t) character
            //     The linefeed (\n) character
            //     The carriage return (\r) character
            //   Control characters from U+007F to U+009F
            [Fact]
            public void EncodesAllForbiddenCharacters()
            {
                List<char> forbiddenCharacters = "\\/#?\t\n\r".ToCharArray().ToList();
                forbiddenCharacters.AddRange(Enumerable.Range(0x00, 1+(0x1f-0x00)).Select(i => (char)i));
                forbiddenCharacters.AddRange(Enumerable.Range(0x7f, 1+(0x9f-0x7f)).Select(i => (char)i));
                string allForbiddenCharacters = String.Join("", forbiddenCharacters);
                string allForbiddenCharactersEncoded = TableKeyEncoding.Encode(allForbiddenCharacters);

                // Make sure decoding reverses encoding
                Assert.Equal(allForbiddenCharacters, TableKeyEncoding.Decode(allForbiddenCharactersEncoded));

                // Ensure the encoded string does not contain any forbidden characters
                Assert.Equal(0, allForbiddenCharacters.Count( c => allForbiddenCharactersEncoded.Contains(c) ));
            }
        }
    }