Har*_*vey 3 html objective-c ios nsscanner
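I'm trying to create an iOS app whose only job is to extract part of a web page.
I already have working code that connects to the URL and stores the HTML in an NSString.
I tried the following, but my result is always an empty string: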
NSScanner *newScanner = [NSScanner scannerWithString:htmlData];
// Create a new scanner and give it the HTML data to parse.
while (![newScanner isAtEnd])
{
    [newScanner scanUpToString:@"<body>" intoString:NULL];
    // Scan until the <body> tag is found.
    [newScanner scanUpToString:@"</body>" intoString:&bodyText];
    // Everything up to the end tag is stored in bodyText.
}
I also tried a different approach:
NSScanner *newScanner = [NSScanner scannerWithString:htmlData];
// Create a new scanner and give it the HTML data to parse.
while (![newScanner isAtEnd])
{
    [newScanner scanUpToString:@"<body" intoString:NULL];
    // Scan until the start of the opening <body> tag is found.
    [newScanner scanUpToString:@">" intoString:NULL];
    // Go to the end of the opening <body> tag.
    [newScanner scanUpToString:@"</body>" intoString:&bodyText];
    // Everything up to the end tag is stored in bodyText.
}
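The second approach returns a string that starts with >< script...etc
To be honest, I don't have a good URL to test this against, and I think it would be easier with some help removing the tags inside the body (such as <p></p>).
Any help would be much appreciated.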
I don't know why your first approach doesn't work. I'm assuming you defined bodyText before that snippet. This code works fine for me:
- (void)viewDidLoad {
    [super viewDidLoad];
    NSString *htmlData = @"This is some stuff before <body> this is the body </body> with some more stuff";
    NSScanner *newScanner = [NSScanner scannerWithString:htmlData];
    NSString *bodyText;
    while (![newScanner isAtEnd]) {
        [newScanner scanUpToString:@"<body>" intoString:NULL];
        [newScanner scanString:@"<body>" intoString:NULL];
        [newScanner scanUpToString:@"</body>" intoString:&bodyText];
    }
    NSLog(@"%@",bodyText); // 2015-01-28 15:58:00.360 ScanningOfHTMLProblem[1373:661934] this is the body
}
Note that I added a call to scanString:intoString: to move past the first "<body>".
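The same idea of stepping past what was just found may also explain both symptoms in the question. scanUpToString:intoString: stops in front of the string it searches for, so the second attempt never consumes the ">" that closes the opening tag, and the result starts with ">". If the real page has attributes on the tag (for example <body class="...">), the literal "<body>" never matches and the first attempt has nothing to capture. That second point is a guess, since the question does not show the actual HTML. Below is a minimal sketch under those assumptions. The helper names BodyHTMLFromString and StringByStrippingTags are hypothetical and not from the original post, and the tag stripping is deliberately naive: it does not handle comments, script contents, or HTML entities.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): returns the markup between
// the opening <body ...> tag and </body>, or nil if either tag is missing.
static NSString *BodyHTMLFromString(NSString *html) {
    NSScanner *scanner = [NSScanner scannerWithString:html];
    scanner.charactersToBeSkipped = nil; // keep whitespace exactly as it appears

    // Move to the opening tag. Searching for "<body" (no ">") also matches
    // tags that carry attributes, such as <body class="main">.
    [scanner scanUpToString:@"<body" intoString:NULL];
    if (![scanner scanString:@"<body" intoString:NULL]) {
        return nil; // no opening tag
    }

    // Skip any attributes, then consume the ">" that ends the opening tag.
    [scanner scanUpToString:@">" intoString:NULL];
    if (![scanner scanString:@">" intoString:NULL]) {
        return nil; // opening tag was never closed
    }

    // Capture everything in front of the closing tag.
    NSString *body = nil;
    [scanner scanUpToString:@"</body>" intoString:&body];
    if ([scanner isAtEnd]) {
        return nil; // ran off the end without finding </body>
    }
    return body ?: @""; // body stays nil when the element is empty
}

// Hypothetical helper: drops anything between "<" and ">" and keeps the rest.
static NSString *StringByStrippingTags(NSString *html) {
    NSScanner *scanner = [NSScanner scannerWithString:html];
    scanner.charactersToBeSkipped = nil;
    NSMutableString *result = [NSMutableString string];

    while (![scanner isAtEnd]) {
        // Copy plain text up to the next tag, if there is any.
        NSString *text = nil;
        if ([scanner scanUpToString:@"<" intoString:&text]) {
            [result appendString:text];
        }
        // Consume the tag itself: "<", its contents, and the closing ">".
        if ([scanner scanString:@"<" intoString:NULL]) {
            [scanner scanUpToString:@">" intoString:NULL];
            [scanner scanString:@">" intoString:NULL];
        }
    }
    return result;
}
```

For instance, passing @"<html><body class=\"main\"><p>Hello</p> <p>world</p></body></html>" to BodyHTMLFromString should give <p>Hello</p> <p>world</p>, and passing that result to StringByStrippingTags should give Hello world. Neither helper needs a while loop around the body search, because only the first <body> element is of interest.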