I am reading in a CSV file, and the records come back as a string[]. I want to take each record and convert it into a custom object.

    T GetMyObject<T>();

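Currently I do this through reflection, and it is really slow. I am testing with a 515 MB file containing several million records. Parsing takes under 10 seconds. Creating the custom objects with manual conversions (Convert.ToSomeType) takes about 20 seconds, but converting into the objects through reflection takes about 4 minutes.
What is a good way to handle this automatically?
A lot of the time seems to be spent in the PropertyInfo.SetValue method. I tried caching the property's setter MethodInfo and invoking that instead, but it was actually slower.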
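I have also tried converting it into a delegate, as Jon Skeet suggested here: "Improving performance reflection, what alternatives should I consider". The problem is that I don't know the property type ahead of time. I am able to get the delegate like this: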

    var myObject = Activator.CreateInstance<T>();
    foreach( var property in typeof( T ).GetProperties() )
    {
        var d = Delegate.CreateDelegate( typeof( Action<,> )
            .MakeGenericType( typeof( T ), property.PropertyType ), property.GetSetMethod() );
    }

The problem here is that I can't cast the delegate to a concrete type such as Action<T, int>, because I don't know the property type ahead of time.
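The first thing I would say is: write some sample code by hand that tells you the absolute best case you can expect, and then see whether your current code is worth fixing.
If you are using PropertyInfo.SetValue etc., then you can absolutely make it quicker, even if the values have to pass through object. HyperDescriptor might be a good start (it is significantly faster than raw reflection, but without making the code any more complicated).
For optimal performance, dynamic IL methods are the way to go (precompiled once); in 2.0/3.0 that perhaps means DynamicMethod, but in 3.5 I would favor Expression (with Compile()). Let me know if you want more detail?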
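To make the Expression route concrete for the setter problem in the question, here is a minimal sketch of compiling a property setter into a single delegate shape, Action<T, object>, so the caller never needs to know the property type at compile time. The names SetterFactory, Create and Person are hypothetical and exist only for illustration; the sketch assumes T is a reference type with public setters.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical sample type, used only to exercise the sketch.
class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

// Hypothetical helper; not part of any library.
static class SetterFactory
{
    // Builds (target, value) => target.Property = (TProperty)value
    // and compiles it once. The cast (an unbox for value types) is
    // baked into the compiled delegate, so callers pass plain object.
    public static Action<T, object> Create<T>(PropertyInfo property)
        where T : class
    {
        ParameterExpression target = Expression.Parameter(typeof(T), "target");
        ParameterExpression value = Expression.Parameter(typeof(object), "value");

        MethodCallExpression call = Expression.Call(
            target,
            property.GetSetMethod(),
            Expression.Convert(value, property.PropertyType));

        return Expression.Lambda<Action<T, object>>(call, target, value).Compile();
    }
}
```

A caller would build each delegate once, for example `var setAge = SetterFactory.Create<Person>(typeof(Person).GetProperty("Age"));`, cache it, and then invoke `setAge(person, 42)` per record. Compilation is the expensive part, so it should happen once per property rather than once per row.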
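Here is an implementation using Expression and CsvReader that uses the column headers to provide the mapping (it invents some data along the same lines). It returns IEnumerable<T> to avoid having to buffer the data (since you seem to have quite a lot of it):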

    using System;
    using System.Collections.Generic;
    using System.Globalization;
    using System.IO;
    using System.Linq;
    using System.Linq.Expressions;
    using System.Reflection;
    using LumenWorks.Framework.IO.Csv;

    class Entity
    {
        public string Name { get; set; }
        public DateTime DateOfBirth { get; set; }
        public int Id { get; set; }
    }

    static class Program
    {
        static void Main()
        {
            string path = "data.csv";
            InventData(path);
            int count = 0;
            foreach (Entity obj in Read<Entity>(path))
            {
                count++;
            }
            Console.WriteLine(count);
        }

        static IEnumerable<T> Read<T>(string path)
            where T : class, new()
        {
            using (TextReader source = File.OpenText(path))
            using (CsvReader reader = new CsvReader(source, true, delimiter))
            {
                string[] headers = reader.GetFieldHeaders();
                Type type = typeof(T);
                List<MemberBinding> bindings = new List<MemberBinding>();
                // The compiled lambda takes the reader itself as its "row" argument.
                ParameterExpression param = Expression.Parameter(typeof(CsvReader), "row");
                // Getter of the int indexer, i.e. reader[i].
                MethodInfo method = typeof(CsvReader).GetProperty("Item", new[] { typeof(int) }).GetGetMethod();
                Expression invariantCulture = Expression.Constant(
                    CultureInfo.InvariantCulture, typeof(IFormatProvider));

                // One binding per column; each header must name a field or property on T.
                for (int i = 0; i < headers.Length; i++)
                {
                    MemberInfo member = type.GetMember(headers[i]).Single();
                    Type finalType;
                    switch (member.MemberType)
                    {
                        case MemberTypes.Field: finalType = ((FieldInfo)member).FieldType; break;
                        case MemberTypes.Property: finalType = ((PropertyInfo)member).PropertyType; break;
                        default: throw new NotSupportedException();
                    }
                    Expression val = Expression.Call(
                        param, method, Expression.Constant(i, typeof(int)));
                    if (finalType != typeof(string))
                    {
                        // Calls the static finalType.Parse(string, IFormatProvider).
                        val = Expression.Call(
                            finalType, "Parse", null, val, invariantCulture);
                    }
                    bindings.Add(Expression.Bind(member, val));
                }

                // row => new T { Member1 = ..., Member2 = ... }, compiled once per call to Read.
                Expression body = Expression.MemberInit(
                    Expression.New(type), bindings);
                Func<CsvReader, T> func = Expression.Lambda<Func<CsvReader, T>>(body, param).Compile();
                while (reader.ReadNextRecord())
                {
                    yield return func(reader);
                }
            }
        }

        const char delimiter = '\t';

        // Writes a tab-delimited sample file with a header row.
        static void InventData(string path)
        {
            Random rand = new Random(123456);
            using (TextWriter dest = File.CreateText(path))
            {
                dest.WriteLine("Id" + delimiter + "DateOfBirth" + delimiter + "Name");
                for (int i = 0; i < 10000; i++)
                {
                    dest.Write(rand.Next(5000000));
                    dest.Write(delimiter);
                    dest.Write(new DateTime(
                        rand.Next(1960, 2010),
                        rand.Next(1, 13),
                        rand.Next(1, 28)).ToString(CultureInfo.InvariantCulture));
                    dest.Write(delimiter);
                    dest.Write("Fred");
                    dest.WriteLine();
                }
                dest.Close();
            }
        }
    }

Second version (see comments) that uses TypeConverter rather than Parse:

    using System;
    using System.Collections.Generic;
    using System.ComponentModel;
    using System.Globalization;
    using System.IO;
    using System.Linq;
    using System.Linq.Expressions;
    using System.Reflection;
    using LumenWorks.Framework.IO.Csv;

    class Entity
    {
        public string Name { get; set; }
        public DateTime DateOfBirth { get; set; }
        public int Id { get; set; }
    }

    static class Program
    {
        static void Main()
        {
            string path = "data.csv";
            InventData(path);
            int count = 0;
            foreach (Entity obj in Read<Entity>(path))
            {
                count++;
            }
            Console.WriteLine(count);
        }

        static IEnumerable<T> Read<T>(string path)
            where T : class, new()
        {
            using (TextReader source = File.OpenText(path))
            using (CsvReader reader = new CsvReader(source, true, delimiter))
            {
                string[] headers = reader.GetFieldHeaders();
                Type type = typeof(T);
                List<MemberBinding> bindings = new List<MemberBinding>();
                ParameterExpression param = Expression.Parameter(typeof(CsvReader), "row");
                // Getter of the int indexer, i.e. reader[i].
                MethodInfo method = typeof(CsvReader).GetProperty("Item", new[] { typeof(int) }).GetGetMethod();
                // One converter constant per target type, shared across columns.
                var converters = new Dictionary<Type, ConstantExpression>();

                for (int i = 0; i < headers.Length; i++)
                {
                    MemberInfo member = type.GetMember(headers[i]).Single();
                    Type finalType;
                    switch (member.MemberType)
                    {
                        case MemberTypes.Field: finalType = ((FieldInfo)member).FieldType; break;
                        case MemberTypes.Property: finalType = ((PropertyInfo)member).PropertyType; break;
                        default: throw new NotSupportedException();
                    }
                    Expression val = Expression.Call(
                        param, method, Expression.Constant(i, typeof(int)));
                    if (finalType != typeof(string))
                    {
                        ConstantExpression converter;
                        if (!converters.TryGetValue(finalType, out converter))
                        {
                            converter = Expression.Constant(TypeDescriptor.GetConverter(finalType));
                            converters.Add(finalType, converter);
                        }
                        // ConvertFromInvariantString returns object, so cast to the member type.
                        val = Expression.Convert(Expression.Call(converter, "ConvertFromInvariantString", null, val),
                            finalType);
                    }
                    bindings.Add(Expression.Bind(member, val));
                }

                // row => new T { Member1 = ..., Member2 = ... }, compiled once per call to Read.
                Expression body = Expression.MemberInit(
                    Expression.New(type), bindings);
                Func<CsvReader, T> func = Expression.Lambda<Func<CsvReader, T>>(body, param).Compile();
                while (reader.ReadNextRecord())
                {
                    yield return func(reader);
                }
            }
        }

        const char delimiter = '\t';

        // Writes a tab-delimited sample file with a header row.
        static void InventData(string path)
        {
            Random rand = new Random(123456);
            using (TextWriter dest = File.CreateText(path))
            {
                dest.WriteLine("Id" + delimiter + "DateOfBirth" + delimiter + "Name");
                for (int i = 0; i < 10000; i++)
                {
                    dest.Write(rand.Next(5000000));
                    dest.Write(delimiter);
                    dest.Write(new DateTime(
                        rand.Next(1960, 2010),
                        rand.Next(1, 13),
                        rand.Next(1, 28)).ToString(CultureInfo.InvariantCulture));
                    dest.Write(delimiter);
                    dest.Write("Fred");
                    dest.WriteLine();
                }
                dest.Close();
            }
        }
    }

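Taken in isolation, the TypeConverter lookup that the second version bakes into the expression tree behaves roughly as in the sketch below. The ConverterDemo class and its Parse method are hypothetical names used only for illustration; TypeDescriptor.GetConverter and ConvertFromInvariantString are the same calls the code above relies on.

```csharp
using System;
using System.ComponentModel;

// Hypothetical helper showing the conversion step on its own.
static class ConverterDemo
{
    // Looks up the converter registered for targetType and parses
    // the text using the invariant culture. The result is boxed,
    // which is why the expression tree wraps the call in a Convert.
    public static object Parse(Type targetType, string text)
    {
        TypeConverter converter = TypeDescriptor.GetConverter(targetType);
        return converter.ConvertFromInvariantString(text);
    }
}
```

The trade-off is that the converter route goes through object, so value types are boxed and then unboxed, whereas the Parse route calls a strongly typed static method. In exchange it also covers member types, such as enums, that have no static Parse(string, IFormatProvider) method for the first version to bind to.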