Ben*_*zun 11 c# text-files offset
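My simple requirement: read a huge text file (more than a million lines; for this example assume it is some kind of CSV) and keep a reference to the beginning of each line for faster lookup in the future (read a line, starting at X).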
I tried the naive and easy way first, using a StreamReader and accessing the underlying BaseStream.Position. Unfortunately, that doesn't work as I intended:
Given a file containing the following
Foo
Bar
Baz
Bla
Fasel
and this very simple code
using (var sr = new StreamReader(@"C:\Temp\LineTest.txt")) {
    string line;
    long pos = sr.BaseStream.Position;
    while ((line = sr.ReadLine()) != null) {
        Console.Write("{0:d3} ", pos);
        Console.WriteLine(line);
        pos = sr.BaseStream.Position;
    }
}
the output is:
000 Foo
025 Bar
025 Baz
025 Bla
025 Fasel
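I can imagine that the stream is trying to be helpful and efficient, and probably reads in (big) chunks whenever new data is needed. In this output the whole 25-byte file was buffered on the first read, so the position already sits at the end of the file. For me this is bad.
The question, finally: is there any way to get the (byte, char) offset while reading a file line by line, without using a basic Stream and manually messing with \r, \n, \r\n, string encoding and so on? Not a big deal, really, I just don't like to build things that might already exist.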
Tho*_*que 11
You could create a TextReader wrapper that tracks the current position in the underlying TextReader:
public class TrackingTextReader : TextReader
{
    private TextReader _baseReader;
    private int _position;
    public TrackingTextReader(TextReader baseReader)
    {
        _baseReader = baseReader;
    }
    public override int Read()
    {
        int c = _baseReader.Read();
        // Count only characters that were actually read, not the end-of-input marker (-1).
        if (c != -1)
            _position++;
        return c;
    }
    public override int Peek()
    {
        return _baseReader.Peek();
    }
    // Number of characters (not bytes) consumed so far.
    public int Position
    {
        get { return _position; }
    }
}
You could then use it as follows:
string text = @"Foo
Bar
Baz
Bla
Fasel";
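using (var reader = new StringReader(text))
using (var trackingReader = new TrackingTextReader(reader))
{
    string line;
    // Remember the offset before each read so that it marks the start of the line.
    int lineStart = trackingReader.Position;
    while ((line = trackingReader.ReadLine()) != null)
    {
        Console.WriteLine("{0:d3} {1}", lineStart, line);
        lineStart = trackingReader.Position;
    }
}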
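Note that a wrapper of this kind counts characters, not bytes, so its offsets can only be passed to Stream.Seek when every character occupies exactly one byte. If byte offsets are needed, one possible approach is to scan the raw bytes for line feeds. The following is only a sketch under stated assumptions: the file uses an ASCII-compatible encoding such as UTF-8 (where the byte 0x0A can only be a line feed), lines end in \n or \r\n, and a byte order mark, if present, counts as part of the first line. The names LineIndex and Build are made up for this illustration.

```csharp
using System.Collections.Generic;
using System.IO;

public static class LineIndex
{
    // Returns the byte offset at which each line starts.
    // Assumption: ASCII-compatible encoding (e.g. UTF-8), lines terminated by "\n" or "\r\n".
    // A lone "\r" is NOT treated as a line terminator here.
    public static List<long> Build(Stream stream)
    {
        var offsets = new List<long>();
        var buffer = new byte[64 * 1024];
        long chunkStart = 0;      // absolute offset of buffer[0] within the stream
        bool atLineStart = true;  // true when the next byte begins a new line
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                if (atLineStart)
                {
                    offsets.Add(chunkStart + i);
                    atLineStart = false;
                }
                if (buffer[i] == (byte)'\n')
                    atLineStart = true;
            }
            chunkStart += read;
        }
        return offsets;
    }
}
```

For the five-line sample file from the question (CRLF line endings, no trailing newline), calling LineIndex.Build on the opened file stream would yield the offsets 0, 5, 10, 15 and 20.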
小智 5
After searching, testing and doing some crazy things, here is my code to solve this (I am currently using this code in my product).
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public sealed class TextFileReader : IDisposable
{
    FileStream _fileStream = null;
    BinaryReader _binReader = null;
    StreamReader _streamReader = null;
    List<string> _lines = null;
    long _length = -1;
    /// <summary>
    /// Initializes a new instance of the <see cref="TextFileReader"/> class with default encoding (UTF8).
    /// </summary>
    /// <param name="filePath">The path to text file.</param>
    public TextFileReader(string filePath) : this(filePath, Encoding.UTF8) { }
    /// <summary>
    /// Initializes a new instance of the <see cref="TextFileReader"/> class.
    /// </summary>
    /// <param name="filePath">The path to text file.</param>
    /// <param name="encoding">The encoding of text file.</param>
    public TextFileReader(string filePath, Encoding encoding)
    {
        if (!File.Exists(filePath))
            throw new FileNotFoundException("File (" + filePath + ") is not found.");
        _fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.Read);
        _length = _fileStream.Length;
        _binReader = new BinaryReader(_fileStream, encoding);
    }
    /// <summary>
    /// Reads a line of characters from the current stream at the current position and returns the data as a string.
    /// </summary>
    /// <returns>The next line from the input stream, or null if the end of the input stream is reached</returns>
    public string ReadLine()
    {
        if (_binReader.PeekChar() == -1)
            return null;
        // Collect characters in a StringBuilder; repeated string concatenation is slow on long lines.
        var line = new StringBuilder();
        int nextChar = _binReader.Read();
        while (nextChar != -1)
        {
            char current = (char)nextChar;
            if (current == '\n')
                break;
            if (current == '\r')
            {
                // Treat "\r\n" as a single line terminator.
                if (_binReader.PeekChar() == '\n')
                    _binReader.Read();
                break;
            }
            line.Append(current);
            nextChar = _binReader.Read();
        }
        return line.ToString();
    }
    /// <summary>
    /// Reads some lines of characters from the current stream at the current position and returns the data as a collection of string.
    /// </summary>
    /// <param name="totalLines">The total number of lines to read (set as 0 to read from current position to end of file).</param>
    /// <returns>The next lines from the input stream, or empty collection if the end of the input stream is reached</returns>
    public List<string> ReadLines(int totalLines)
    {
        if (totalLines < 1 && this.Position == 0)
            return this.ReadAllLines();
        _lines = new List<string>();
        int counter = 0;
        string line = this.ReadLine();
        while (line != null)
        {
            _lines.Add(line);
            counter++;
            if (totalLines > 0 && counter >= totalLines)
                break;
            line = this.ReadLine();
        }
        return _lines;
    }
    /// <summary>
    /// Reads all lines of characters from the current stream (from the beginning to the end) and returns the data as a collection of string.
    /// </summary>
    /// <returns>All lines from the input stream, or empty collection if the stream is empty</returns>
    public List<string> ReadAllLines()
    {
        if (_streamReader == null)
            _streamReader = new StreamReader(_fileStream);
        _streamReader.BaseStream.Seek(0, SeekOrigin.Begin);
        _lines = new List<string>();
        string line = _streamReader.ReadLine();
        while (line != null)
        {
            _lines.Add(line);
            line = _streamReader.ReadLine();
        }
        return _lines;
    }
    /// <summary>
    /// Gets the length of text file (in bytes).
    /// </summary>
    public long Length
    {
        get { return _length; }
    }
    /// <summary>
    /// Gets or sets the current reading position.
    /// </summary>
    public long Position
    {
        get
        {
            if (_binReader == null)
                return -1;
            else
                return _binReader.BaseStream.Position;
        }
        set
        {
            if (_binReader == null)
                return;
            else if (value >= this.Length)
                this.SetPosition(this.Length);
            else
                this.SetPosition(value);
        }
    }
    void SetPosition(long position)
    {
        _binReader.BaseStream.Seek(position, SeekOrigin.Begin);
    }
    /// <summary>
    /// Gets the lines after reading.
    /// </summary>
    public List<string> Lines
    {
        get
        {
            return _lines;
        }
    }
    /// <summary>
    /// Performs application-defined tasks associated with freeing, releasing, or resetting unmanaged resources.
    /// </summary>
    public void Dispose()
    {
        if (_binReader != null)
            _binReader.Close();
        if (_streamReader != null)
            _streamReader.Dispose();
        if (_fileStream != null)
            _fileStream.Dispose();
    }
    // No finalizer: this class holds only managed objects, and a finalizer must not touch
    // other finalizable objects such as FileStream, which releases its own file handle.
}
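Once the byte offset of a line start is known, the lookup half (seek, then read one line) can also be sketched with a plain StreamReader instead of a character-by-character loop. This is a minimal illustration, not part of the class above: LineLookup and ReadLineAt are hypothetical names, and the sketch assumes the offset really points at the first byte of a line rather than into the middle of a multi-byte character.

```csharp
using System.IO;
using System.Text;

public static class LineLookup
{
    // Reads the single line that starts at byteOffset.
    // Assumption: byteOffset is a genuine line start recorded earlier.
    public static string ReadLineAt(Stream stream, long byteOffset, Encoding encoding)
    {
        stream.Seek(byteOffset, SeekOrigin.Begin);
        // A fresh reader starts with an empty buffer, so it decodes from the new position.
        // detectEncodingFromByteOrderMarks is false because we may be in the middle of the file;
        // leaveOpen is true so the caller keeps ownership of the stream.
        using (var reader = new StreamReader(stream, encoding, false, 1024, true))
        {
            return reader.ReadLine();
        }
    }
}
```

If a single long-lived StreamReader is preferred over creating one per lookup, calling DiscardBufferedData() on it after each Seek on the base stream serves the same purpose of dropping stale buffered bytes.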