JavaScript equivalent of C#'s BinaryReader.ReadString()

Question

Los*_*ost 5 javascript c# typescript

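I am converting some C# code to JavaScript. The file I am reading contains several data types, and I have found matching JavaScript functionality for most of them in various libraries, but there is one particular function I cannot find a JS counterpart for.

The function is https://learn.microsoft.com/en-us/dotnet/api/system.io.binaryreader.readstring?view=net-7.0

I have a couple of questions:

First, what confuses me is that a string is inherently variable-length, isn't it? If so, how can this function work without taking a length parameter?
Suppose there is some upper bound on the string length. If so, does JS/TS have similar functionality, or is there a package I can download that mimics the C# function?

Thanks in advance.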

Answer 1

Evk*_*Evk 3

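BinaryReader expects strings to be encoded in a specific format: the format BinaryWriter writes them in. As the documentation states:

Reads a string from the current stream. The string is prefixed with the length, encoded as an integer seven bits at a time.

So the length of the string is stored right before the string itself, encoded as an "integer seven bits at a time". We can get more information from BinaryWriter.Write7BitEncodedInt:

The integer of the value parameter is written out seven bits at a time, starting with the seven least-significant bits. The high bit of a byte indicates whether there are more bytes to be written after this one.

If value will fit in seven bits, it takes only one byte of space. If value will not fit in seven bits, the high bit is set on the first byte and written out. value is then shifted by seven bits and the next byte is written. This process is repeated until the entire integer has been written.

So it is a variable-length encoding: unlike the usual approach of always using 4 bytes for an Int32 value, this one uses a variable number of bytes. That way the length prefix of a short string can take fewer than 4 bytes (for example, for a string shorter than 128 bytes the length takes only 1 byte).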
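To make the encoding concrete, here is a minimal sketch of the writing side. The name encode7BitInt is my own (it is not a library function), and I restrict it to non-negative 32-bit values, which is all a string length needs:

```typescript
// Hypothetical helper: encodes a non-negative 32-bit integer seven bits at a
// time, low bits first, as described in the Write7BitEncodedInt documentation.
function encode7BitInt(value: number): number[] {
  if (!Number.isInteger(value) || value < 0 || value > 0x7fffffff) {
    throw new RangeError("value must be an integer in [0, 2^31 - 1]");
  }
  const bytes: number[] = [];
  let remaining = value;
  while (remaining >= 0x80) {
    // Low 7 bits with the high bit set: more bytes follow.
    bytes.push((remaining & 0x7f) | 0x80);
    remaining >>>= 7;
  }
  // Last byte has the high bit clear.
  bytes.push(remaining);
  return bytes;
}

console.log(encode7BitInt(12));  // [ 12 ]
console.log(encode7BitInt(127)); // [ 127 ]
console.log(encode7BitInt(128)); // [ 128, 1 ]
console.log(encode7BitInt(300)); // [ 172, 2 ]
```

Lengths up to 127 fit in one byte; 128 is the first value that needs a second byte.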

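You can reproduce this logic in JavaScript by reading one byte at a time. The lowest 7 bits carry (part of) the length, and the highest bit tells you whether the next byte also belongs to the length (otherwise the next byte is the start of the actual string).

Then, once you have the length, use TextDecoder to decode the byte array into a string in the given encoding. Here is the same function in TypeScript. It accepts a buffer (Uint8Array), an offset into that buffer, and an encoding (UTF-8 by default; check the TextDecoder documentation for the other available encodings):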

class BinaryReader {
  getString(buffer: Uint8Array, offset: number, encoding: string = "utf-8") {
      let length = 0; // length of following string
      let cursor = 0;
      let nextByte: number;
      do {
          // just grab next byte
          nextByte = buffer[offset + cursor];          
          // grab 7 bits of current byte, then shift them according to this byte position
          // that is if that's first byte - do not shift, second byte - shift by 7, etc
          // then merge into length with or.
          length = length | ((nextByte & 0x7F) << (cursor * 7));          
          cursor++;
      }
      while (nextByte >= 0x80); // do this while most significant bit is 1

      // get a slice of the length we got
      let sliceWithString = buffer.slice(offset + cursor, offset + cursor + length);      
      let decoder = new TextDecoder(encoding);      
      return decoder.decode(sliceWithString);
  }
}


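If this is going to be used in production, it is worth adding various sanity checks to the code above (that we do not read too many bytes while reading the length, that the computed length actually fits within the bounds of the buffer, and so on).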
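As a rough sketch of what such checks might look like (the function name readPrefixedString and the returned bytesRead field are my own additions, not part of any library), assuming a 32-bit length that never needs more than 5 prefix bytes:

```typescript
// Hypothetical helper: reads a 7-bit-encoded length prefix followed by that
// many bytes of text, and reports how many bytes were consumed in total.
function readPrefixedString(
  buffer: Uint8Array,
  offset: number,
  encoding: string = "utf-8"
): { value: string; bytesRead: number } {
  if (!Number.isInteger(offset) || offset < 0) {
    throw new RangeError("offset must be a non-negative integer");
  }
  let length = 0;
  let cursor = 0;
  let nextByte = 0;
  do {
    // A 32-bit length never needs more than 5 prefix bytes.
    if (cursor >= 5) throw new RangeError("length prefix is longer than 5 bytes");
    if (offset + cursor >= buffer.length) {
      throw new RangeError("buffer ends inside the length prefix");
    }
    nextByte = buffer[offset + cursor] ?? 0;
    length |= (nextByte & 0x7f) << (cursor * 7);
    cursor++;
  } while (nextByte >= 0x80);

  const start = offset + cursor;
  // A negative length means the prefix overflowed into the sign bit.
  if (length < 0 || start + length > buffer.length) {
    throw new RangeError("string length exceeds the buffer");
  }
  // subarray creates a view over the bytes instead of copying them.
  const value = new TextDecoder(encoding).decode(buffer.subarray(start, start + length));
  return { value, bytesRead: cursor + length };
}

// Two strings written back to back: "Hi" and "abc".
const sample = new Uint8Array([2, 72, 105, 3, 97, 98, 99]);
const first = readPrefixedString(sample, 0);
const second = readPrefixedString(sample, first.bytesRead);
console.log(first.value, second.value); // Hi abc
```

Returning the number of bytes consumed lets the caller advance to the next field, which the class above leaves to the caller to work out. This sketch does not try to replicate every validation the .NET reader performs on the fifth prefix byte.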

A small test, using the binary representation of the string "TEST STRING!" as written by BinaryWriter.Write(string) in C#:

let buffer = new Uint8Array([12, 84, 69, 83, 84, 32, 83, 84, 82, 73, 78, 71, 33]);
let reader = new BinaryReader();
console.log(reader.getString(buffer, 0, "utf-8"));
// outputs TEST STRING!


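Update. You mentioned in the comments that in your data the length of the string is represented by 4 bytes, so for example a length of 29 is represented as [0, 0, 0, 29]. That means your data was not written with BinaryWriter and so cannot be read with BinaryReader, so you do not actually need an analog of BinaryReader.ReadString, contrary to what your question asks.

In any case, if you need to handle that situation, you can do it like this: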

class BinaryReader {
  getString(buffer: Uint8Array, offset: number, encoding: string = "utf-8") {
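      // create a view over the 4 bytes starting at offset; buffer.byteOffset
      // is added in case buffer is itself a view into a larger ArrayBuffer
      let view = new DataView(buffer.buffer, buffer.byteOffset + offset, 4);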
      // read those 4 bytes as int 32 (big endian, since your example is like that)
      let length = view.getInt32(0);
      // get a slice of the length we got
      let sliceWithString = buffer.slice(offset + 4, offset + 4 + length);      
      let decoder = new TextDecoder(encoding);      
      return decoder.decode(sliceWithString);
  }
}
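For a quick check of this layout, here is a standalone sketch with a bounds check added. The function name readInt32PrefixedString and the sample bytes are my own, made up for illustration; the only assumption about your data is the 4-byte big-endian length described in your comment:

```typescript
// Hypothetical helper for the layout [4-byte big-endian length][string bytes].
function readInt32PrefixedString(
  buffer: Uint8Array,
  offset: number,
  encoding: string = "utf-8"
): string {
  if (!Number.isInteger(offset) || offset < 0 || offset + 4 > buffer.length) {
    throw new RangeError("not enough bytes for the 4-byte length");
  }
  // byteOffset accounts for buffer being a view into a larger ArrayBuffer.
  const view = new DataView(buffer.buffer, buffer.byteOffset + offset, 4);
  const length = view.getInt32(0, false); // false = big endian
  const start = offset + 4;
  if (length < 0 || start + length > buffer.length) {
    throw new RangeError("string length exceeds the buffer");
  }
  return new TextDecoder(encoding).decode(buffer.subarray(start, start + length));
}

// Two filler bytes, then length 5, then the bytes of "Hello".
const bytes = new Uint8Array([255, 255, 0, 0, 0, 5, 72, 101, 108, 108, 111]);
console.log(readInt32PrefixedString(bytes, 2));             // Hello
console.log(readInt32PrefixedString(bytes.subarray(2), 0)); // Hello
```

The second call passes a subarray, whose underlying ArrayBuffer still starts at the filler bytes; that is the case the byteOffset term is there for.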

