Split a string by character count and words in C#

use*_*609 1 c# regex string split substring

I need to split a string on word boundaries so that each line has at most 25 characters. For example:

string ORIGINAL_TEXT = "Please write a program that breaks this text into small chucks. Each chunk should have a maximum length of 25 ";

The output should be:

"Please write a program",
"that breaks this text",
"into small chucks. Each",
"chunk should have a",
"maximum length of 25"

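I tried using Substring, but it breaks words, like this:

"Please write a program th" - wrong

"Please write a program" - correct

"Please write a program " is only 23 characters, counting the trailing space. It could take 2 more characters, but that would break the next word.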

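// Split on punctuation only; a segment can still be longer than 25 characters.
string[] splitSampArr = splitSamp.Split(',', '.', ';');
string[] myText = new string[splitSampArr.Length + 1];

int i = 0;
foreach (string splitSampArrVal in splitSampArr)
{
    if (splitSampArrVal.Length > 25)
    {
        // Substring cuts at exactly 25 characters, even in the middle of a word.
        myText[i] = splitSampArrVal.Substring(0, 25);
        i++;
    }
    myText[i] = splitSampArrVal;
    i++;
}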

Wik*_*żew 5

You can achieve this with the following regex:

@"(\b.{1,25})(?:\s+|$)"

See the regex demo.

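This regex starts each match at a word boundary (with \b), so a match never starts in the middle of a word. It then captures into Group 1 any character but a newline (with .), 1 to 25 occurrences (thanks to the limiting quantifier {1,25}), and finally matches either 1 or more whitespace characters (with \s+) or the end of the string ($). Because the quantifier is greedy, the engine first tries 25 characters and backtracks until the captured text is followed by whitespace or the end of the string, so a chunk never ends in the middle of a word.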

See a code demo:

using System;
using System.Linq;
using System.Collections.Generic;
using System.Text.RegularExpressions;
public class Test
{
    public static void Main()
    {
        var str = "Please write a program that breaks this text into small chucks. Each chunk should have a maximum length of 25 ";
        // Take Group 1 of every match; the whitespace matched by \s+ is not part of the group.
        var chunks = Regex.Matches(str, @"(\b.{1,25})(?:\s+|$)")
            .Cast<Match>()
            .Select(p => p.Groups[1].Value)
            .ToList();
        // Print one chunk per line.
        Console.WriteLine(string.Join("\n", chunks));
    }
}
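For comparison, here is a minimal sketch of the same wrapping done without a regex, by packing whitespace-separated words into lines one at a time. This is not the method from the answer above: the class name WordWrapSketch and the method Wrap are hypothetical names chosen for illustration, and the choice to keep a word longer than the limit whole on its own line is an assumption, since the question does not say what should happen in that case.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class WordWrapSketch
{
    // Hypothetical helper: greedily packs whitespace-separated words
    // into lines of at most maxLength characters.
    public static List<string> Wrap(string text, int maxLength)
    {
        var lines = new List<string>();
        var current = new StringBuilder();

        // A null separator array makes Split break on any whitespace.
        var words = text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        foreach (var word in words)
        {
            // Flush the current line if appending " " + word would exceed the limit.
            if (current.Length > 0 && current.Length + 1 + word.Length > maxLength)
            {
                lines.Add(current.ToString());
                current.Clear();
            }
            if (current.Length > 0)
            {
                current.Append(' ');
            }
            // Assumption: a word longer than maxLength is kept whole on its own line.
            current.Append(word);
        }
        if (current.Length > 0)
        {
            lines.Add(current.ToString());
        }
        return lines;
    }

    public static void Main()
    {
        var str = "Please write a program that breaks this text into small chucks. Each chunk should have a maximum length of 25 ";
        foreach (var line in Wrap(str, 25))
        {
            Console.WriteLine(line);
        }
    }
}
```

For the sample text from the question, this sketch yields the five lines listed there, and the last line carries no trailing space because the words are split on whitespace before being joined again.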