Igo*_*yuk 25 html objective-c nsstring ios
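I have a large amount of HTML text inside an NSString. The string is more than 3,500,000 characters long. How can I convert this HTML text into plain text inside an NSString? I have been using a scanner, but it is far too slow. Any ideas?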
o15*_*1s2 67
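It depends on which iOS version you are targeting. Since iOS 7 there has been a built-in method that not only strips the HTML tags but also applies the formatting to the string: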
Xcode 9/Swift 4
    if let htmlStringData = htmlString.data(using: .utf8),
       let attributedString = try? NSAttributedString(data: htmlStringData,
                                                      options: [.documentType: NSAttributedString.DocumentType.html],
                                                      documentAttributes: nil) {
        // `string` is the plain text with the markup removed.
        print(attributedString.string)
    }
You can even create an extension like this:
    extension String {
        var htmlToAttributedString: NSAttributedString? {
            guard let data = self.data(using: .utf8) else {
                return nil
            }
            do {
                return try NSAttributedString(data: data,
                                              options: [.documentType: NSAttributedString.DocumentType.html,
                                                        .characterEncoding: String.Encoding.utf8.rawValue],
                                              documentAttributes: nil)
            } catch {
                print("Cannot convert html string to attributed string: \(error)")
                return nil
            }
        }
    }
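Note that this sample code uses UTF-8 encoding. You could also write a function instead of a computed property and take the encoding as a parameter.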
Swift 3
    let attributedString = try NSAttributedString(data: htmlString.data(using: .utf8)!,
                                                  options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType],
                                                  documentAttributes: nil)
Objective-C
    [[NSAttributedString alloc] initWithData:[htmlString dataUsingEncoding:NSUTF8StringEncoding]
                                     options:@{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
                                               NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)}
                          documentAttributes:nil
                                       error:nil];
If you only need to remove everything between < and > (the dirty way!), which can cause problems if the text itself contains those characters, use this:
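    - (NSString *)stringByStrippingHTML {
        NSRange r;
        // Written for manual reference counting; under ARC, drop the autorelease call.
        NSString *s = [[self copy] autorelease];
        // Remove one tag per iteration until no "<...>" match is left.
        while ((r = [s rangeOfString:@"<[^>]+>" options:NSRegularExpressionSearch]).location != NSNotFound)
            s = [s stringByReplacingCharactersInRange:r withString:@""];
        return s;
    }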
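For Swift, the same "dirty" pattern can be applied in a single call rather than one match per loop iteration. The following is only a sketch: `stripTagsWithRegex` is a hypothetical name, not part of the answer above, and it shares the same limitations (a literal `<` followed later by `>` in the text is treated as a tag, and HTML entities such as `&amp;` are left undecoded).

```swift
import Foundation

/// Hypothetical helper (not from the original answer): removes every
/// `<...>` run from `html` with one regular-expression replace.
func stripTagsWithRegex(_ html: String) -> String {
    // `.regularExpression` makes Foundation treat the search string as a
    // regex, so all matches are replaced in one call.
    return html.replacingOccurrences(of: "<[^>]+>",
                                     with: "",
                                     options: .regularExpression)
}

print(stripTagsWithRegex("<p>Hello <b>world</b></p>"))  // prints "Hello world"
```

Because the input is several million characters long, avoiding a fresh search from the start of the string for every tag is likely to matter more than the choice of pattern.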
Igo*_*yuk 16
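I solved my problem with a scanner, but I don't run it over the whole text at once. I apply it to each 10,000-character chunk of the text before concatenating all the parts together. My code is below: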
    - (NSString *)convertHTML:(NSString *)html {
        NSScanner *myScanner = [NSScanner scannerWithString:html];
        while (![myScanner isAtEnd]) {
            NSString *text = nil;
            // Skip ahead to the next tag, then capture it up to (not including) ">".
            [myScanner scanUpToString:@"<" intoString:NULL];
            [myScanner scanUpToString:@">" intoString:&text];
            if (text != nil) {
                // Remove every occurrence of this tag from the string.
                html = [html stringByReplacingOccurrencesOfString:[NSString stringWithFormat:@"%@>", text] withString:@""];
            }
        }
        // Trim leading and trailing whitespace and newlines.
        html = [html stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
        return html;
    }
Swift 4:
    func htmlToString(html: String) -> String {
        var htmlStr = html
        let scanner = Scanner(string: htmlStr)
        while !scanner.isAtEnd {
            var text: NSString? = nil
            // Skip ahead to the next tag, then capture it up to (not including) ">".
            scanner.scanUpTo("<", into: nil)
            scanner.scanUpTo(">", into: &text)
            if let tag = text {
                // Remove every occurrence of this tag from the string.
                htmlStr = htmlStr.replacingOccurrences(of: "\(tag)>", with: "")
            }
        }
        return htmlStr.trimmingCharacters(in: .whitespacesAndNewlines)
    }
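In both versions above, every tag found triggers a replace over the entire string, so the work grows roughly with the number of tags times the length of the text. That is probably why splitting the input into 10,000-character chunks helps. As an alternative to chunking, here is a minimal sketch of a single left-to-right pass that copies only the characters outside tags. `stripTagsSinglePass` is a hypothetical name and this is a different technique from the scanner code above, not a rewrite of it.

```swift
import Foundation

/// Hypothetical helper: drops everything between `<` and the next `>`
/// in one pass, then trims surrounding whitespace and newlines.
func stripTagsSinglePass(_ html: String) -> String {
    var result = ""
    result.reserveCapacity(html.utf8.count)  // avoid repeated reallocation
    var insideTag = false
    for character in html {
        if character == "<" {
            insideTag = true              // start skipping
        } else if character == ">" && insideTag {
            insideTag = false             // tag closed, resume copying
        } else if !insideTag {
            result.append(character)      // ordinary text, including a stray ">"
        }
    }
    return result.trimmingCharacters(in: .whitespacesAndNewlines)
}

print(stripTagsSinglePass("  <div><p>Hello</p> world</div>\n"))  // prints "Hello world"
```

Like the other "dirty" approaches, this treats any `<` as the start of a tag, so an unmatched `<` in the text swallows everything after it, and HTML entities are not decoded.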