Converting an NSString with accented characters to a C string

Ahm*_*mad 5 encoding nsstring ios

I have an NSString whose value is José (note the accent on the e). I am trying to convert it to a C string like this:

char str [[myAccentStr length] + 1];
[myAccentStr getCString:str maxLength:[myAccentStr length] + 1 encoding:NSUTF32StringEncoding];

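But str ends up as an empty string. What gives? I have also tried UTF8 and UTF16. The buffer is later passed to another function, and when that function calls lstrlen on it, the length comes back as zero.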

Sam*_*Sam 2

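The documentation for NSString's getCString:maxLength:encoding: says:

You can use canBeConvertedToEncoding: to check whether a string can be losslessly converted to encoding. If it can't, you can use dataUsingEncoding:allowLossyConversion: to get a C-string representation using encoding, allowing some loss of information (note that the data returned by dataUsingEncoding:allowLossyConversion: is not a strict C-string since it does not have a NULL terminator).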

Using the NSString method dataUsingEncoding:allowLossyConversion: gets around the problem. Here is a code sample:

NSString *myAccentStr = @"José";

// NSString * to C string (char *)
NSData *strData = [myAccentStr dataUsingEncoding:NSMacOSRomanStringEncoding
                                allowLossyConversion:YES];
// Size the buffer by the encoded byte count, plus one byte for the terminator.
char str[[strData length] + 1];
// NSData carries no NUL terminator, so copy only its bytes and append one.
memcpy(str, [strData bytes], [strData length]);
str[[strData length]] = '\0';
NSLog(@"str (from NSString* to c string): %s", str);

// C string (char *) to NSString *
NSString *newAccentStr = [NSString stringWithCString:str
                                            encoding:NSMacOSRomanStringEncoding];
NSLog(@"newAccentStr (from c string to NSString*):  %@", newAccentStr);

The NSLog output is:

str (from NSString* to c string): José
newAccentStr (from c string to NSString*):  José

So far I have only seen this work correctly when using NSMacOSRomanStringEncoding.
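A possible explanation, offered as a hedged sketch rather than a confirmed diagnosis: the %s specifier in NSLog interprets bytes in the system default encoding, so UTF-8 bytes can look wrong in the log even when the buffer itself is fine. The helper below (CopyCString is a hypothetical name, not a Foundation API) builds a NUL-terminated copy from dataUsingEncoding:allowLossyConversion: and sizes the buffer by the encoded byte count rather than by [NSString length].

```objc
#import <Foundation/Foundation.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper (not part of Foundation): returns a malloc'ed,
// NUL-terminated copy of `string` in `encoding`, or NULL on failure.
// The caller is responsible for calling free() on the result.
static char *CopyCString(NSString *string, NSStringEncoding encoding) {
    NSData *data = [string dataUsingEncoding:encoding allowLossyConversion:YES];
    if (data == nil) {
        return NULL;
    }
    // Byte count of the encoded form; this can differ from [string length].
    NSUInteger byteCount = [data length];
    char *buffer = malloc(byteCount + 1);
    if (buffer == NULL) {
        return NULL;
    }
    memcpy(buffer, [data bytes], byteCount);
    buffer[byteCount] = '\0';   // NSData has no terminator, so add one
    return buffer;
}
```

For example, CopyCString(@"Jos\u00e9", NSUTF8StringEncoding) yields a 5-byte string (the é takes two bytes in UTF-8). To check it in the log, converting back with [NSString stringWithUTF8String:] and printing with %@ avoids the encoding guess that %s makes.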

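Edit

Changed this to community wiki. Feel free to edit.

hooleyhoop made some great points, so I thought I would try to write the code as verbosely as possible. If I have missed anything, please add it.

Also: I am not sure why [NSString canBeConvertedToEncoding:] returns YES even though [NSString getCString:maxLength:encoding:] is clearly not working (as the output shows).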
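One plausible reason, stated as an assumption drawn from the method's documentation rather than something verified against the output in this answer: maxLength has to include room for the NUL terminator, and getCString:maxLength:encoding: returns NO when the buffer is too small, whereas canBeConvertedToEncoding: only reports whether the characters are representable in the encoding. A minimal sketch that checks the return value (FitsInBuffer is a hypothetical name):

```objc
#import <Foundation/Foundation.h>
#include <stdlib.h>

// Hypothetical helper: reports whether `string`, encoded as UTF-8, fits into
// a buffer of `capacity` bytes, where `capacity` includes the NUL terminator.
static BOOL FitsInBuffer(NSString *string, NSUInteger capacity) {
    if (capacity == 0) {
        return NO;
    }
    char *buffer = calloc(capacity, 1);
    if (buffer == NULL) {
        return NO;
    }
    // The BOOL result is the only reliable signal; the buffer contents
    // should not be used when this returns NO.
    BOOL ok = [string getCString:buffer
                       maxLength:capacity
                        encoding:NSUTF8StringEncoding];
    free(buffer);
    return ok;
}
```

Under that reading, @"Jos\u00e9" needs 5 bytes of UTF-8 plus one terminator byte, so a capacity of 5 should fail and a capacity of 6 should succeed.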

Here is some code to help analyze what works and what does not:

// Define a block variable to test different encodings
void (^tryGetCStringUsingEncoding)(NSString *, NSStringEncoding) = ^(NSString *originalNSString, NSStringEncoding encoding) {
    NSLog(@"Trying to convert \"%@\" using encoding: 0x%lX", originalNSString, (unsigned long)encoding);
    BOOL canEncode = [originalNSString canBeConvertedToEncoding:encoding];
    if (!canEncode)
    {
        NSLog(@"    Can not encode \"%@\" using encoding %lX", originalNSString, (unsigned long)encoding);
    }
    else
    {
        // Try encoding using NSString getCString:maxLength:encoding:
        // Note: maxLength is documented to include the NUL terminator, and
        // cStrLength leaves no room for it; the results below used this sizing.
        NSUInteger cStrLength = [originalNSString lengthOfBytesUsingEncoding:encoding];
        char cstr[cStrLength];
        [originalNSString getCString:cstr maxLength:cStrLength encoding:encoding];
        NSLog(@"    Converted(1): \"%s\"  (expected length: %lu)",
              cstr, (unsigned long)cStrLength);

        // Try encoding using NSString dataUsingEncoding:allowLossyConversion:
        NSData *strData = [originalNSString dataUsingEncoding:encoding allowLossyConversion:YES];
        char cstr2[[strData length] + 1];
        // Copy only the data bytes (NSData has no terminator), then add one.
        memcpy(cstr2, [strData bytes], [strData length]);
        cstr2[[strData length]] = '\0';
        NSLog(@"    Converted(2): \"%s\"  (expected length: %lu)",
              cstr2, (unsigned long)[strData length]);
    }
};

NSString *myAccentStr = @"José";

// Try out whatever encoding you want
tryGetCStringUsingEncoding(myAccentStr, NSUTF8StringEncoding);
tryGetCStringUsingEncoding(myAccentStr, NSUTF16StringEncoding);
tryGetCStringUsingEncoding(myAccentStr, NSUTF32StringEncoding);
tryGetCStringUsingEncoding(myAccentStr, NSMacOSRomanStringEncoding);

Results:

> Trying to convert "José" using encoding: 0x4
>     Converted(1): ""  (expected length: 5)
>     Converted(2): "Jos√©"  (expected length: 5)
> Trying to convert "José" using encoding: 0xA
>     Converted(1): ""  (expected length: 8)
>     Converted(2): "ˇ˛J"  (expected length: 10)
> Trying to convert "José" using encoding: 0x8C000100
>     Converted(1): ""  (expected length: 16)
>     Converted(2): "ˇ˛"  (expected length: 20)
> Trying to convert "José" using encoding: 0x1E
>     Converted(1): "-"  (expected length: 4)
>     Converted(2): "José"  (expected length: 4)
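The truncated Converted(2) strings in the UTF-16 and UTF-32 rows are consistent with those encodings containing zero bytes, which C string functions and %s treat as the end of the string. The sketch below is illustrative only (FirstZeroByteIndex is a hypothetical helper, not a Foundation API); it reports where the encoded bytes would be cut off.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the index of the first zero byte in the
// encoded form of `string`, or NSNotFound if the bytes contain none.
static NSUInteger FirstZeroByteIndex(NSString *string, NSStringEncoding encoding) {
    NSData *data = [string dataUsingEncoding:encoding allowLossyConversion:YES];
    const unsigned char *bytes = [data bytes];
    NSUInteger count = [data length];   // 0 if data is nil, so the loop is skipped
    for (NSUInteger i = 0; i < count; i++) {
        if (bytes[i] == 0) {
            return i;   // strlen() and %s would stop here
        }
    }
    return NSNotFound;
}
```

With @"Jos\u00e9", NSUTF8StringEncoding produces no zero byte, while NSUTF16LittleEndianStringEncoding puts one at index 1 (the high byte of "J"). That is why a char * consumer such as lstrlen is usually given UTF-8 or a single-byte encoding rather than UTF-16 or UTF-32.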