iOS HTML Unicode to NSString?

Sco*_*ott 2 html unicode ios

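I'm porting an Android app to iOS and have run into a small problem. I'm pulling HTML-encoded data from a web page, but some of it uses Unicode entities to represent foreign characters... so Russian text (Лети за мной) comes through as "&#1051;&#1077;&#1090;..."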

On Android I solved this by calling HTML.fromHTML(). Is there something similar in iOS?

Lil*_*ard 6

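It's pretty easy to write your own HTML entity decoder. Just scan the string looking for &, read up to the following ;, then interpret what you read. If it's "amp", "lt", "gt", or "quot", replace it with the corresponding character. If it starts with #, it's a numeric entity. If the # is followed by an "x", treat the rest as hexadecimal, otherwise as decimal. Read the number, then insert the character into your string (if you're writing to an NSMutableString you can use [str appendFormat:@"%C", thechar]). NSScanner makes the string scanning pretty easy, especially since it already knows how to read hex numbers.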

I just whipped up a function that should do this for you. Note that I haven't actually tested it, so you should put it through its paces:

- (NSString *)stringByDecodingHTMLEntitiesInString:(NSString *)input {
    NSMutableString *results = [NSMutableString string];
    NSScanner *scanner = [NSScanner scannerWithString:input];
    [scanner setCharactersToBeSkipped:nil];
    while (![scanner isAtEnd]) {
        NSString *temp;
        if ([scanner scanUpToString:@"&" intoString:&temp]) {
            [results appendString:temp];
        }
        if ([scanner scanString:@"&" intoString:NULL]) {
            BOOL valid = YES;
            unsigned c = 0;
            NSUInteger savedLocation = [scanner scanLocation];
            if ([scanner scanString:@"#" intoString:NULL]) {
                // it's a numeric entity
                if ([scanner scanString:@"x" intoString:NULL]) {
                    // hexadecimal
                    unsigned int value;
                    if ([scanner scanHexInt:&value]) {
                        c = value;
                    } else {
                        valid = NO;
                    }
                } else {
                    // decimal
                    int value;
                    if ([scanner scanInt:&value] && value >= 0) {
                        c = value;
                    } else {
                        valid = NO;
                    }
                }
                if (![scanner scanString:@";" intoString:NULL]) {
                    // not ;-terminated, bail out and emit the whole entity
                    valid = NO;
                }
            } else {
                if (![scanner scanUpToString:@";" intoString:&temp]) {
                    // &; is not a valid entity
                    valid = NO;
                } else if (![scanner scanString:@";" intoString:NULL]) {
                    // there was no trailing ;
                    valid = NO;
                } else if ([temp isEqualToString:@"amp"]) {
                    c = '&';
                } else if ([temp isEqualToString:@"quot"]) {
                    c = '"';
                } else if ([temp isEqualToString:@"lt"]) {
                    c = '<';
                } else if ([temp isEqualToString:@"gt"]) {
                    c = '>';
                } else {
                    // unknown entity
                    valid = NO;
                }
            }
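            if (!valid || c > 0x10FFFF) {
                // Not a decodable entity: emit the "&" literally and rescan
                // from just after it, so the "&" is not dropped and a later
                // entity in the skipped text still gets decoded.
                [results appendString:@"&"];
                [scanner setScanLocation:savedLocation];
            } else if (c > 0xFFFF) {
                // Outside the BMP: %C takes a single unichar, so encode the
                // code point as a UTF-16 surrogate pair instead.
                unichar pair[2] = {
                    (unichar)(0xD800 + ((c - 0x10000) >> 10)),
                    (unichar)(0xDC00 + ((c - 0x10000) & 0x3FF))
                };
                [results appendString:[NSString stringWithCharacters:pair length:2]];
            } else {
                [results appendFormat:@"%C", (unichar)c];
            }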
        }
    }
    return results;
}
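
The numeric-entity rule described above ("x" after the # means hexadecimal, otherwise decimal) can also be tried in isolation. The following is only a minimal sketch in plain C, which compiles unchanged inside an Objective-C file. The name decode_numeric_entity is hypothetical and not part of the function above, and it assumes the caller has already cut out the text between the & and the ;.

```c
/* Hypothetical helper, not part of the answer's code. Takes the text between
 * "&" and ";" of an entity (for example "#1051" or "#x41B") and returns the
 * code point, or -1 if the text is not a valid numeric entity. */
static long decode_numeric_entity(const char *body)
{
    if (body[0] != '#') {
        return -1;                       /* named entity or plain text */
    }
    const char *digits = body + 1;
    int base = 10;
    if (*digits == 'x' || *digits == 'X') {
        base = 16;                       /* "#x..." means hexadecimal */
        digits++;
    }
    if (*digits == '\0') {
        return -1;                       /* "#" or "#x" with no digits */
    }
    unsigned long value = 0;
    for (; *digits != '\0'; digits++) {
        char ch = *digits;
        int d;
        if (ch >= '0' && ch <= '9') {
            d = ch - '0';
        } else if (base == 16 && ch >= 'a' && ch <= 'f') {
            d = ch - 'a' + 10;
        } else if (base == 16 && ch >= 'A' && ch <= 'F') {
            d = ch - 'A' + 10;
        } else {
            return -1;                   /* stray character in the number */
        }
        value = value * (unsigned long)base + (unsigned long)d;
        if (value > 0x10FFFF) {
            return -1;                   /* beyond the Unicode range */
        }
    }
    return (long)value;
}
```

With this sketch, decode_numeric_entity("#1051") and decode_numeric_entity("#x41B") both return 1051 (U+041B, the Л from the question), while decode_numeric_entity("amp") returns -1.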