解析.srt文件

use*_*165 4 php parsing srt

1
00:00:00,074 --> 00:00:02,564
Previously on Breaking Bad...

2
00:00:02,663 --> 00:00:04,393
Words...
Run Code Online (Sandbox Code Playgroud)

我需要用php解析srt文件并用变量打印文件中的所有subs.

我找不到合适的reg exps.当这样做时,我需要采用id,时间和字幕变量.并且当打印时,不存在array()s等等必须与原始文件中的相同.

我的意思是我必须打印出来;

$number <br> (e.g. 1)
$time <br> (e.g. 00:00:00,074 --> 00:00:02,564)
$subtitle <br> (e.g. Previously on Breaking Bad...)
Run Code Online (Sandbox Code Playgroud)

顺便说一句,我有这个代码.但它没有看到线条.它必须编辑,但如何?

$srt_file = file('test.srt',FILE_IGNORE_NEW_LINES);
$regex = "/^(\d)+ ([\d]+:[\d]+:[\d]+,[\d]+) --> ([\d]+:[\d]+:[\d]+,[\d]+) (\w.+)/";

foreach($srt_file as $srt){

    preg_match($regex,$srt,$srt_lines);

    print_r($srt_lines);
    echo '<br />';

}
Run Code Online (Sandbox Code Playgroud)

dre*_*010 12

这是一个简短的状态机,用于逐行解析SRT文件:

define('SRT_STATE_SUBNUMBER', 0);
define('SRT_STATE_TIME',      1);
define('SRT_STATE_TEXT',      2);
define('SRT_STATE_BLANK',     3);

$lines   = file('test.srt');

$subs    = array();
$state   = SRT_STATE_SUBNUMBER;
$subNum  = 0;
$subText = '';
$subTime = '';

foreach($lines as $line) {
    switch($state) {
        case SRT_STATE_SUBNUMBER:
            $subNum = trim($line);
            $state  = SRT_STATE_TIME;
            break;

        case SRT_STATE_TIME:
            $subTime = trim($line);
            $state   = SRT_STATE_TEXT;
            break;

        case SRT_STATE_TEXT:
            if (trim($line) == '') {
                $sub = new stdClass;
                $sub->number = $subNum;
                list($sub->startTime, $sub->stopTime) = explode(' --> ', $subTime);
                $sub->text   = $subText;
                $subText     = '';
                $state       = SRT_STATE_SUBNUMBER;

                $subs[]      = $sub;
            } else {
                $subText .= $line;
            }
            break;
    }
}

if ($state == SRT_STATE_TEXT) {
    // if file was missing the trailing newlines, we'll be in this
    // state here.  Append the last read text and add the last sub.
    $sub->text = $subText;
    $subs[] = $sub;
}

print_r($subs);
Run Code Online (Sandbox Code Playgroud)

结果:

Array
(
    [0] => stdClass Object
        (
            [number] => 1
            [stopTime] => 00:00:24,400
            [startTime] => 00:00:20,000
            [text] => Altocumulus clouds occur between six thousand
        )

    [1] => stdClass Object
        (
            [number] => 2
            [stopTime] => 00:00:27,800
            [startTime] => 00:00:24,600
            [text] => and twenty thousand feet above ground level.
        )

)
Run Code Online (Sandbox Code Playgroud)

然后,您可以循环遍历subs数组或通过数组偏移量访问它们:

echo $subs[0]->number . ' says ' . $subs[0]->text . "\n";
Run Code Online (Sandbox Code Playgroud)

通过循环遍历每个潜在客户并显示它来显示所有潜艇:

foreach($subs as $sub) {
    echo $sub->number . ' begins at ' . $sub->startTime .
         ' and ends at ' . $sub->stopTime . '.  The text is: <br /><pre>' .
         $sub->text . "</pre><br />\n";
}
Run Code Online (Sandbox Code Playgroud)

进一步阅读:SubRip文本文件格式