Multibyte-safe wordwrap() function for UTF-8

phi*_*reo 14 php string utf-8 word-wrap multibyte

PHP's wordwrap() function does not work properly for multibyte strings such as UTF-8.
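To illustrate the problem, here is a minimal sketch (the sample string and the width of 3 are my own choices, and it assumes the mbstring extension is loaded). wordwrap() measures width in bytes, so a forced cut can land in the middle of a multibyte character:

```php
<?php
// Each "é" is 2 bytes in UTF-8, so this word is 5 characters but 10 bytes.
$word = "ééééé";

echo strlen($word), "\n";             // byte count: 10
echo mb_strlen($word, 'UTF-8'), "\n"; // character count: 5

// With $cut = true, wordwrap() cuts every 3 *bytes*, which splits a
// 2-byte character in half and leaves invalid UTF-8 behind.
$wrapped = wordwrap($word, 3, "\n", true);

foreach (explode("\n", $wrapped) as $line) {
    echo strlen($line), " bytes, valid UTF-8: ",
        var_export(mb_check_encoding($line, 'UTF-8'), true), "\n";
}
```

Even without $cut, the byte-based width means lines containing multibyte characters wrap earlier than expected.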

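There are a few examples of mb-safe functions in the comments of the PHP manual, but with different test data they all seem to have some problems.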

The function should take exactly the same parameters as wordwrap().

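In particular, make sure it works for:

  • Cutting mid-word if $cut = true, and not cutting mid-word otherwise
  • Not inserting extra spaces into words if $break = ' '
  • Also working for $break = "\n"
  • Working for ASCII as well as all valid UTF-8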
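The requirements above could be turned into a small checker along these lines. This is only a sketch: `max_line_chars()` and `check_wrapper()` are hypothetical helper names, the test strings are made up, and it assumes mbstring is available. It accepts any callable with the same signature as wordwrap(), so candidate answers can be passed in by name.

```php
<?php
/** Hypothetical helper: length in characters of the longest line in $text. */
function max_line_chars(string $text, string $break = "\n"): int
{
    $max = 0;
    foreach (explode($break, $text) as $line) {
        $max = max($max, mb_strlen($line, 'UTF-8'));
    }
    return $max;
}

/** Hypothetical checker: returns a list of the requirements that $wrap fails. */
function check_wrapper(callable $wrap): array
{
    $failures = [];
    $long = str_repeat('日', 30);          // one 30-character word (90 bytes)
    $text = 'Ünïcödé wörds äre hërë töö';  // short words separated by single spaces

    // $cut = true: long words are cut, and the result stays valid UTF-8.
    $out = $wrap($long, 10, "\n", true);
    if (!mb_check_encoding($out, 'UTF-8')) {
        $failures[] = 'cut produced invalid UTF-8';
    }
    if (max_line_chars($out) > 10) {
        $failures[] = 'cut left a line longer than the width';
    }

    // $cut = false: a long word must be left intact.
    if ($wrap($long, 10, "\n", false) !== $long) {
        $failures[] = 'word was cut although $cut is false';
    }

    // Wrapping may only turn spaces into breaks, never add or drop characters.
    if (str_replace("\n", ' ', $wrap($text, 10, "\n", false)) !== $text) {
        $failures[] = 'text changed beyond replacing spaces with "\n"';
    }

    // With $break = ' ' the text must come back unchanged.
    if ($wrap($text, 10, ' ', false) !== $text) {
        $failures[] = 'extra spaces inserted with $break = " "';
    }

    return $failures;
}

// The native function is expected to fail at least the UTF-8 check.
print_r(check_wrapper('wordwrap'));
```

An empty array from `check_wrapper('mb_wordwrap')` would suggest a candidate covers these cases, though it is far from an exhaustive test of all valid UTF-8.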

Fos*_*for 20

I couldn't find any code that worked for me, so here is what I wrote. It works for me, though it may not be the fastest.

function mb_wordwrap($str, $width = 75, $break = "\n", $cut = false) {
    $lines = explode($break, $str);
    foreach ($lines as &$line) {
        $line = rtrim($line);
        if (mb_strlen($line) <= $width)
            continue;
        $words = explode(' ', $line);
        $line = '';
        $actual = '';
        foreach ($words as $word) {
            if (mb_strlen($actual.$word) <= $width)
                $actual .= $word.' ';
            else {
                if ($actual != '')
                    $line .= rtrim($actual).$break;
                $actual = $word;
                if ($cut) {
                    while (mb_strlen($actual) > $width) {
                        $line .= mb_substr($actual, 0, $width).$break;
                        $actual = mb_substr($actual, $width);
                    }
                }
                $actual .= ' ';
            }
        }
        $line .= trim($actual);
    }
    return implode($break, $lines);
}


Fle*_*der 6

Since no answer handles every use case, here is one that does. The code is based on Drupal's AbstractStringWrapper::wordWrap.

<?php

/**
 * Wraps any string to a given number of characters.
 *
 * This implementation is multi-byte aware and relies on {@link
 * http://www.php.net/manual/en/book.mbstring.php PHP's multibyte
 * string extension}.
 *
 * @see wordwrap()
 * @link https://api.drupal.org/api/drupal/core%21vendor%21zendframework%21zend-stdlib%21Zend%21Stdlib%21StringWrapper%21AbstractStringWrapper.php/function/AbstractStringWrapper%3A%3AwordWrap/8
 * @param string $string
 *   The input string.
 * @param int $width [optional]
 *   The number of characters at which <var>$string</var> will be
 *   wrapped. Defaults to <code>75</code>.
 * @param string $break [optional]
 *   The line is broken using the optional break parameter. Defaults
 *   to <code>"\n"</code>.
 * @param boolean $cut [optional]
 *   If the <var>$cut</var> is set to <code>TRUE</code>, the string is
 *   always wrapped at or before the specified <var>$width</var>. So if
 *   you have a word that is larger than the given <var>$width</var>, it
 *   is broken apart. Defaults to <code>FALSE</code>.
 * @return string
 *   Returns the given <var>$string</var> wrapped at the specified
 *   <var>$width</var>.
 */
function mb_wordwrap($string, $width = 75, $break = "\n", $cut = false) {
  $string = (string) $string;
  if ($string === '') {
    return '';
  }

  $break = (string) $break;
  if ($break === '') {
    trigger_error('Break string cannot be empty', E_USER_ERROR);
  }

  $width = (int) $width;
  if ($width === 0 && $cut) {
    trigger_error('Cannot force cut when width is zero', E_USER_ERROR);
  }

  if (strlen($string) === mb_strlen($string)) {
    return wordwrap($string, $width, $break, $cut);
  }

  $stringWidth = mb_strlen($string);
  $breakWidth = mb_strlen($break);

  $result = '';
  $lastStart = $lastSpace = 0;

  for ($current = 0; $current < $stringWidth; $current++) {
    $char = mb_substr($string, $current, 1);

    $possibleBreak = $char;
    if ($breakWidth !== 1) {
      $possibleBreak = mb_substr($string, $current, $breakWidth);
    }

    if ($possibleBreak === $break) {
      $result .= mb_substr($string, $lastStart, $current - $lastStart + $breakWidth);
      $current += $breakWidth - 1;
      $lastStart = $lastSpace = $current + 1;
      continue;
    }

    if ($char === ' ') {
      if ($current - $lastStart >= $width) {
        $result .= mb_substr($string, $lastStart, $current - $lastStart) . $break;
        $lastStart = $current + 1;
      }

      $lastSpace = $current;
      continue;
    }

    if ($current - $lastStart >= $width && $cut && $lastStart >= $lastSpace) {
      $result .= mb_substr($string, $lastStart, $current - $lastStart) . $break;
      $lastStart = $lastSpace = $current;
      continue;
    }

    if ($current - $lastStart >= $width && $lastStart < $lastSpace) {
      $result .= mb_substr($string, $lastStart, $lastSpace - $lastStart) . $break;
      $lastStart = $lastSpace = $lastSpace + 1;
      continue;
    }
  }

  if ($lastStart !== $current) {
    $result .= mb_substr($string, $lastStart, $current - $lastStart);
  }

  return $result;
}

?>


小智 5

/**
 * wordwrap for utf8 encoded strings
 *
 * @param string $str
 * @param integer $width
 * @param string $break
 * @param boolean $cut
 * @return string
 * @author Milian Wolff <mail@milianw.de>
 */

function utf8_wordwrap($str, $width, $break, $cut = false) {
    if (!$cut) {
        $regexp = '#^(?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){'.$width.',}\b#U';
    } else {
        $regexp = '#^(?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){'.$width.'}#';
    }
    if (function_exists('mb_strlen')) {
        $str_len = mb_strlen($str,'UTF-8');
    } else {
        $str_len = preg_match_all('/[\x00-\x7F\xC0-\xFD]/', $str, $var_empty);
    }
    $while_what = ceil($str_len / $width);
    $i = 1;
    $return = '';
    while ($i < $while_what) {
        if (!preg_match($regexp, $str, $matches)) {
            break; // no further break point found; the rest is appended as-is below
        }
        $string = $matches[0];
        $return .= $string.$break;
        $str = substr($str, strlen($string));
        $i++;
    }
    return $return.$str;
}

Total time: 0.0020880699, which is a good time :)
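For anyone wondering how the fallback branch counts characters without mbstring: the pattern matches only ASCII bytes and multibyte lead bytes, and skips continuation bytes (0x80-0xBF). A minimal sketch, assuming valid UTF-8 input (`utf8_char_count()` is a hypothetical name used for illustration, and mb_strlen() is called only for comparison):

```php
<?php
// Hypothetical helper: counts UTF-8 characters by counting every byte that
// starts a character (ASCII bytes and lead bytes 0xC0-0xFD).
function utf8_char_count(string $str): int
{
    return (int) preg_match_all('/[\x00-\x7F\xC0-\xFD]/', $str);
}

$sample = 'héllo wörld 日本';

// For valid UTF-8 both lines should print the same number.
echo utf8_char_count($sample), "\n";
echo mb_strlen($sample, 'UTF-8'), "\n";
```

On malformed input the two counts can differ, so the fallback is only as reliable as the input encoding.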


phi*_*reo -2

This one seems to work well...

function mb_wordwrap($str, $width = 75, $break = "\n", $cut = false, $charset = null) {
    if ($charset === null) $charset = mb_internal_encoding();

    $pieces = explode($break, $str);
    $result = array();
    foreach ($pieces as $piece) {
      $current = $piece;
      while ($cut && mb_strlen($current, $charset) > $width) {
        $result[] = mb_substr($current, 0, $width, $charset);
        // null length means "to the end of the string" (no arbitrary 2048 limit)
        $current = mb_substr($current, $width, null, $charset);
      }
      $result[] = $current;
    }
    return implode($break, $result);
}