Jac*_*nkr 5 parsing objective-c nsdictionary
Sample text:
1
00:00:00,000 --> 00:00:01,000
This is the first line

2
00:00:01,000 --> 00:00:02,000
This is the second line

3
00:00:02,000 --> 00:00:03,000
This is the last line
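In JavaScript I would parse this with a regular expression. I'm just wondering: is that also the best way to do it in Objective-C? I'm sure I could find a way to do it, but I'd like to do it the proper way.
I only need to know where to start and I'm happy to do the rest, but to make clear what I'm after, I would end up with something like this (pseudocode):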
NSDictionary
index -> [0-9]+
start -> hh:mm:ss,mmm
end -> hh:mm:ss,mmm
text -> one of the lines of text
In this case I would end up with three parsed entries in my dictionary.
Ext*_*ire 11
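Some background: I wrote a small app and created a file named stuff.srt, containing your sample, that resides in the app bundle; hence the way I access it.
This is just a quick-and-dirty proof of concept. Note that it does not check its results; real apps always check their results. As you can see, the work takes place in the -applicationDidFinishLaunching: method (I'm working on Mac OS X, not iOS).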
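To illustrate what checking results might look like, here is a minimal sketch. The helper names LoadSRT and ScanCueHeader are hypothetical and not part of the posted code; the sketch only relies on the fact that NSString's file-reading method returns nil on failure and that NSScanner's scan methods return NO when nothing was scanned.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: reads the file and reports failure through `error`
// instead of passing NULL and ignoring it.
static NSString *LoadSRT(NSString *path, NSError **error)
{
    // Returns nil (and fills in *error) if the file cannot be read as UTF-8.
    return [NSString stringWithContentsOfFile:path
                                     encoding:NSUTF8StringEncoding
                                        error:error];
}

// Hypothetical helper: scans one cue header (the index line and the timing
// line). Returns nil as soon as any expected piece is missing.
static NSDictionary *ScanCueHeader(NSScanner *scanner)
{
    NSString *indexString = nil;
    NSString *startString = nil;
    NSString *endString = nil;
    NSCharacterSet *newlines = [NSCharacterSet newlineCharacterSet];

    // Every scan method returns NO when it scanned nothing.
    if (![scanner scanUpToCharactersFromSet:newlines intoString:&indexString]) return nil;
    if (![scanner scanUpToString:@" --> " intoString:&startString]) return nil;
    if (![scanner scanString:@"-->" intoString:NULL]) return nil;
    if (![scanner scanUpToCharactersFromSet:newlines intoString:&endString]) return nil;

    return [NSDictionary dictionaryWithObjectsAndKeys:
            indexString, @"index",
            startString, @"start",
            endString,   @"end",
            nil];
}
```

A caller would stop parsing, or log the NSError, as soon as either helper returns nil. This sketch does not validate that the index is numeric or that the timestamps are well formed.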
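EDIT:
It was pointed out that the originally posted code did not handle multiple lines of text correctly. To fix this, I took advantage of the fact that SRT files use CRLF as the line break, and I search for two consecutive occurrences of that sequence. I then replace every CRLF inside the text string with a space, based on what I observed here. This does not account for leading or trailing whitespace on the individual lines of text.
I changed the contents of the stuff.srt file to: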
1
00:00:00,000 --> 00:00:01,000
This is the first line
and it has a secondary line

2
00:00:01,000 --> 00:00:02,000
This is the second line

3
00:00:02,000 --> 00:00:03,000
This is the last line
and it has a secondary line too
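And the code has been modified as follows (I also wrapped everything in an @autoreleasepool block; parsing a file can generate a large number of autoreleased objects!):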
- (void)applicationDidFinishLaunching:(NSNotification *)aNotification
{
    NSString *path = [[NSBundle mainBundle] pathForResource:@"stuff" ofType:@"srt"];
    NSString *string = [NSString stringWithContentsOfFile:path encoding:NSUTF8StringEncoding error:NULL];
    NSScanner *scanner = [NSScanner scannerWithString:string];
    while (![scanner isAtEnd])
    {
        @autoreleasepool
        {
            // Initialized to nil so a failed scan never leaves garbage behind.
            NSString *indexString = nil;
            (void) [scanner scanUpToCharactersFromSet:[NSCharacterSet newlineCharacterSet] intoString:&indexString];
            NSString *startString = nil;
            (void) [scanner scanUpToString:@" --> " intoString:&startString];
            // My string constant doesn't begin with spaces because scanners
            // skip spaces and newlines by default.
            (void) [scanner scanString:@"-->" intoString:NULL];
            NSString *endString = nil;
            (void) [scanner scanUpToCharactersFromSet:[NSCharacterSet newlineCharacterSet] intoString:&endString];
            NSString *textString = nil;
            // (void) [scanner scanUpToCharactersFromSet:[NSCharacterSet newlineCharacterSet] intoString:&textString];
            // BEGIN EDIT
            (void) [scanner scanUpToString:@"\r\n\r\n" intoString:&textString];
            textString = [textString stringByReplacingOccurrencesOfString:@"\r\n" withString:@" "];
            // Addresses trailing space added if CRLF is on a line by itself at the end of the SRT file
            textString = [textString stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
            // END EDIT
            NSDictionary *dictionary = [NSDictionary dictionaryWithObjectsAndKeys:
                                        indexString, @"index",
                                        startString, @"start",
                                        endString  , @"end",
                                        textString , @"text",
                                        nil];
            NSLog(@"%@", dictionary);
        }
    }
}
The modified output is as follows:
2013-02-09 16:10:17.727 SRTFileScan[4846:303] {
end = "00:00:01,000";
index = 1;
start = "00:00:00,000";
text = "This is the first line and it has a secondary line";
}
2013-02-09 16:10:17.729 SRTFileScan[4846:303] {
end = "00:00:02,000";
index = 2;
start = "00:00:01,000";
text = "This is the second line";
}
2013-02-09 16:10:17.730 SRTFileScan[4846:303] {
end = "00:00:03,000";
index = 3;
start = "00:00:02,000";
text = "This is the last line and it has a secondary line too";
}
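The code above depends on the file using CRLF line breaks. As a hedged alternative for files that may use LF or bare CR instead, the sketch below first normalizes the line endings and then splits the text on blank lines. NormalizeNewlines and ParseSRT are hypothetical names introduced for illustration; this is a different technique from the NSScanner loop above, not a drop-in replacement tested against real-world SRT files.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: converts CRLF and bare CR line endings to LF.
static NSString *NormalizeNewlines(NSString *s)
{
    s = [s stringByReplacingOccurrencesOfString:@"\r\n" withString:@"\n"];
    return [s stringByReplacingOccurrencesOfString:@"\r" withString:@"\n"];
}

// Hypothetical helper: returns an array of dictionaries with the keys
// "index", "start", "end" and "text". Blocks that do not look like a cue
// (fewer than three lines, or no " --> " on the second line) are skipped.
static NSArray *ParseSRT(NSString *contents)
{
    NSMutableArray *cues = [NSMutableArray array];
    NSCharacterSet *ws = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSArray *blocks = [NormalizeNewlines(contents) componentsSeparatedByString:@"\n\n"];

    for (NSString *rawBlock in blocks) {
        NSString *block = [rawBlock stringByTrimmingCharactersInSet:ws];
        NSArray *lines = [block componentsSeparatedByString:@"\n"];
        if ([lines count] < 3) continue;

        NSArray *times = [[lines objectAtIndex:1] componentsSeparatedByString:@" --> "];
        if ([times count] != 2) continue;

        // Join all remaining lines with a space, as the answer does.
        NSArray *textLines = [lines subarrayWithRange:NSMakeRange(2, [lines count] - 2)];
        [cues addObject:[NSDictionary dictionaryWithObjectsAndKeys:
            [lines objectAtIndex:0], @"index",
            [[times objectAtIndex:0] stringByTrimmingCharactersInSet:ws], @"start",
            [[times objectAtIndex:1] stringByTrimmingCharactersInSet:ws], @"end",
            [textLines componentsJoinedByString:@" "], @"text",
            nil]];
    }
    return cues;
}
```

Like the original, this sketch does not strip leading or trailing whitespace from the individual text lines before joining them.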
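One more thing I learned from what I read today: the SRT file format originated in France, and the comma shown in the input is the decimal separator used there.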
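Because the comma is a decimal separator, a timestamp such as 00:00:01,000 can be turned into a number of seconds. The following is a minimal sketch under the assumption that the fraction is given in milliseconds (three digits, as in the samples above); SecondsFromSRTTimestamp is a hypothetical helper name, and returning -1 for malformed input is an arbitrary choice made for this example.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: converts "hh:mm:ss,mmm" to seconds.
// Returns -1.0 if the string does not match that shape.
static NSTimeInterval SecondsFromSRTTimestamp(NSString *timestamp)
{
    NSScanner *scanner = [NSScanner scannerWithString:timestamp];
    NSInteger hours = 0, minutes = 0, seconds = 0, millis = 0;

    // Each scan returns NO on failure, so the chain stops at the first
    // missing component. isAtEnd rejects trailing non-whitespace text.
    BOOL ok = [scanner scanInteger:&hours]
           && [scanner scanString:@":" intoString:NULL]
           && [scanner scanInteger:&minutes]
           && [scanner scanString:@":" intoString:NULL]
           && [scanner scanInteger:&seconds]
           && [scanner scanString:@"," intoString:NULL]
           && [scanner scanInteger:&millis]
           && [scanner isAtEnd];
    if (!ok) return -1.0;

    // Assumes the fractional part is expressed in milliseconds.
    return hours * 3600.0 + minutes * 60.0 + seconds + millis / 1000.0;
}
```

With the start and end values converted this way, the duration of a cue is simply the difference between the two results. The sketch does not range-check the components (for example, minutes above 59 are accepted).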