How to split a string while keeping whole words?

jlp*_*jlp 11 .net c# console formatting string-concatenation

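I need to split a long sentence into parts while preserving whole words. Each part should contain at most a given number of characters (including spaces, periods, etc.). For example: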

int partLength = 35;
string sentence = "Silver badges are awarded for longer term goals. Silver badges are uncommon.";

Output:

1 part: "Silver badges are awarded for"
2 part: "longer term goals. Silver badges are"
3 part: "uncommon."

Tom*_*son 19

Try this:

    static void Main(string[] args)
    {
        int partLength = 35;
        string sentence = "Silver badges are awarded for longer term goals. Silver badges are uncommon.";
        string[] words = sentence.Split(' ');
        var parts = new Dictionary<int, string>();
        string part = string.Empty;
        int partCounter = 0;
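        foreach (var word in words)
        {
            // Strict "<" (not "<=") leaves room for the joining space.
            if (part.Length + word.Length < partLength)
            {
                part += string.IsNullOrEmpty(part) ? word : " " + word;
            }
            else
            {
                // Do not store an empty part when the very first word
                // is already as long as the limit or longer.
                if (part.Length > 0)
                {
                    parts.Add(partCounter, part);
                    partCounter++;
                }
                part = word;
            }
        }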
        parts.Add(partCounter, part);
        foreach (var item in parts)
        {
            Console.WriteLine("Part {0} (length = {2}): {1}", item.Key, item.Value, item.Value.Length);
        }
        Console.ReadLine();
    }


Jon*_*Jon 13

I knew there had to be a nice LINQ-y way to do this, so here it is, just for the fun of it:

var input = "The quick brown fox jumps over the lazy dog.";
var charCount = 0;
var maxLineLength = 11;

var lines = input.Split(' ', StringSplitOptions.RemoveEmptyEntries)
    .GroupBy(w => (charCount += w.Length + 1) / maxLineLength)
    .Select(g => string.Join(" ", g));

// That's all :)

foreach (var line in lines) {
    Console.WriteLine(line);
}

Obviously, this code only works as long as the query is not parallelized, because it depends on charCount being incremented "in word order".
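For comparison, here is a minimal sketch of a variant that avoids the shared counter by folding the words into lines with `Aggregate`, which processes an `IEnumerable` strictly in order. The names `WrapSketch` and `WrapSequential` are hypothetical and not part of the answer above. The sketch also assumes a hard limit, so its output can differ from the `GroupBy` version, and a single word longer than the limit still gets a line of its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WrapSketch
{
    // Hypothetical helper: builds the lines one word at a time, in input order.
    public static List<string> WrapSequential(string input, int maxLineLength)
    {
        return input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Aggregate(new List<string>(), (lines, word) =>
            {
                int last = lines.Count - 1;
                // Append only if the separator plus the word still fits on the current line.
                if (last >= 0 && lines[last].Length + 1 + word.Length <= maxLineLength)
                    lines[last] += " " + word;
                else
                    lines.Add(word); // first word, or the word does not fit
                return lines;
            });
    }
}
```

Called with the sample input and a limit of 11, this should yield "The quick", "brown fox", "jumps over", "the lazy" and "dog.".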


小智 11

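I have been testing Jon's and Lessan's answers, but they do not work properly if your maximum length needs to be absolute rather than approximate. As their counter increments, it does not account for the empty space left at the end of a line.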

Running their code against the OP's example, you get:

1 part: "Silver badges are awarded for " - 29 Characters
2 part: "longer term goals. Silver badges are" - 36 Characters
3 part: "uncommon. " - 13 Characters

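The "are" on the second line should be on the third line. This happens because the counter does not include the 6 characters left unused at the end of the first line.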
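The drift can be reproduced with a small sketch. It applies the same running-count grouping as the LINQ answer to the OP's sentence; `DriftDemo` and `GroupByRunningCount` are hypothetical names used only for this illustration.

```csharp
using System;
using System.Linq;

public static class DriftDemo
{
    // Hypothetical helper: groups words by (running character count / max),
    // the same key the LINQ answer uses.
    public static string[] GroupByRunningCount(string text, int max)
    {
        var charCount = 0;
        return text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .GroupBy(w => (charCount += w.Length + 1) / max)
            .Select(g => string.Join(" ", g))
            .ToArray();
    }
}
```

With the OP's sentence and a limit of 35, the running count is 30 after "for" and 67 after the second "are". Since 67 / 35 is still 1, "are" stays in the second group, which ends up 36 characters long.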

I came up with the following modification of Lessan's answer:

public static class ExtensionMethods
{
    public static string[] Wrap(this string text, int max)
    {
        var charCount = 0;
        var lines = text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        return lines.GroupBy(w => (charCount += (((charCount % max) + w.Length + 1 >= max) 
                        ? max - (charCount % max) : 0) + w.Length + 1) / max)
                    .Select(g => string.Join(" ", g.ToArray()))
                    .ToArray();
    }
}


Ode*_*ded 6

Split the string on a space, then build new strings from the resulting array, stopping before the limit is reached for each new segment.

Untested pseudocode:

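// Requires: using System; using System.Collections.Generic;
// Assumes sentence (string) and myLimit (int, max characters per part) are already defined.
string[] words = sentence.Split(new char[] {' '}, StringSplitOptions.RemoveEmptyEntries);
IList<string> sentenceParts = new List<string>();
sentenceParts.Add(string.Empty);

int partCounter = 0;

foreach (var word in words)
{
  string current = sentenceParts[partCounter];

  // Start a new part when the separator plus this word would exceed the limit.
  if (current.Length > 0 && current.Length + 1 + word.Length > myLimit)
  {
     partCounter++;
     sentenceParts.Add(word);
  }
  else
  {
     // The first word of a part gets no separator, so no part carries a stray space.
     sentenceParts[partCounter] = current.Length == 0 ? word : current + " " + word;
  }
}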


sǝɯ*_*ɯɐſ 6

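It seems everyone is using some form of "Split, then rebuild the sentence"...

I thought I'd take a stab at it the way my brain would logically go about doing this by hand, which is:

  • Split by length
  • Move backward to the nearest space and use that chunk
  • Remove the used chunk and start over

The code ended up a bit more complex than I had hoped, but I believe it handles most (all?) edge cases, including words longer than maxLength, words that end exactly at maxLength, and so on.

Here is my function: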

private static List<string> SplitWordsByLength(string str, int maxLength)
{
    List<string> chunks = new List<string>();
    while (str.Length > 0)
    {
        if (str.Length <= maxLength)                    //if remaining string is less than length, add to list and break out of loop
        {
            chunks.Add(str);
            break;
        }

        string chunk = str.Substring(0, maxLength);     //Get maxLength chunk from string.

        if (char.IsWhiteSpace(str[maxLength]))          //if next char is a space, we can use the whole chunk and remove the space for the next line
        {
            chunks.Add(chunk);
            str = str.Substring(chunk.Length + 1);      //Remove chunk plus space from original string
        }
        else
        {
            int splitIndex = chunk.LastIndexOf(' ');    //Find last space in chunk.
            if (splitIndex != -1)                       //If space exists in string,
                chunk = chunk.Substring(0, splitIndex); //  remove chars after space.
            str = str.Substring(chunk.Length + (splitIndex == -1 ? 0 : 1));      //Remove chunk plus space (if found) from original string
            chunks.Add(chunk);                          //Add to list
        }
    }
    return chunks;
}

Test usage:

string testString = "Silver badges are awarded for longer term goals. Silver badges are uncommon.";
int length = 35;

List<string> test = SplitWordsByLength(testString, length);

foreach (string chunk in test)
{
    Console.WriteLine(chunk);  
}

Console.ReadLine();