jkh*_*ler 0 java parsing out-of-memory
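I am writing some code to parse very large flat text files into objects that are persisted to a database. This works on sections of the file (i.e. if I 'top' the first 2000 lines), but I run into a java.lang.OutOfMemoryError: Java heap space error when I try to process the full file.
I am using a BufferedReader to read the file line by line, and I was under the impression that this removes the need to load the entire text file into memory. Hopefully my code is fairly self-explanatory. I have run my code through the Eclipse Memory Analyzer, which informs me that:
The thread java.lang.Thread @ 0x27ee0478 main keeps local variables with total size 69,668,888 (98.76%) bytes.
The memory is accumulated in one instance of "char[]" loaded by "<system class loader>".
Helpful comments much appreciated!
Jonathan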
public ArrayList<Statement> parseGMIFile(String filePath)
        throws IOException {
    ArrayList<Statement> statements = new ArrayList<Statement>();
    // Statement Properties
    String sAccount = "";
    String sOffice = "";
    String sFirm = "";
    String sDate1 = "";
    String sDate2 = "";
    Date date = new Date();
    StringBuffer sData = new StringBuffer();
    BufferedReader in = new BufferedReader(new FileReader(filePath));
    String line;
    String prevCode = "";
    int lineCounter = 1;
    int globalLineCounter = 1;
    while ((line = in.readLine()) != null) {
        // We extract the GMI code from the end of the first line
        String newCode = line.substring(GMICODE_START_POS).trim();
        // Extract date
        if (newCode.equals(prevCode)) {
            if (lineCounter == DATE_LINE) {
                sDate1 = line.substring(DATE_START_POS, DATE_END_POS).trim();
            }
            if (lineCounter == DATE_LINE2) {
                sDate2 = line.substring(DATE_START_POS, DATE_END_POS).trim();
            }
            if (sDate1.equals("")) {
                sDate1 = sDate2;
            }
            SimpleDateFormat formatter = new SimpleDateFormat("MMM dd, yyyy");
            try {
                date = formatter.parse(sDate1);
            } catch (ParseException e) {
                e.printStackTrace();
            }
            sFirm = line.substring(FIRM_START_POS, FIRM_END_POS);
            sOffice = line.substring(OFFICE_START_POS, OFFICE_END_POS);
            sAccount = line.substring(ACCOUNT_START_POS,
                    ACCOUNT_END_POS);
            lineCounter++;
            globalLineCounter++;
            sData.append(line.substring(0, END_OF_DATA)).append("\n");
        } else {
            // Instantiate New Statement Object
            Statement stmt = new Statement(sAccount, sOffice, sFirm,
                    date, sData.toString());
            // Add to collection
            statements.add(stmt);
            // log.info("-----------NEW STATEMENT--------------");
            sData.setLength(0);
            lineCounter = 1;
        }
        prevCode = newCode;
    }
    return statements;
}
STACKTRACE: Exception in thread "main" org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'dbPopulator' defined in class path resource [app-context.xml]: Invocation of init method failed; nested exception is java.lang.OutOfMemoryError: Java heap space
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1401)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:512)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:450)
at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:290)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:222)
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:287)
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:189)
at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:557)
at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:842)
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:416)
at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:139)
at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:93)
at Main.main(Main.java:11)
Caused by: java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2882)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
at java.lang.StringBuffer.append(StringBuffer.java:224)
at services.GMILogParser.parseGMIFile(GMILogParser.java:133)
at services.DBPopulator.init(DBPopulator.java:27)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeCustomInitMethod(AbstractAutowireCapableBeanFactory.java:1529)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1468)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1398)
... 12 more
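Adding more memory through the startup parameters is, IMHO, a mistake. Those parameters apply to the whole application, and a larger heap may be penalized by longer GC times. Also, you may not know the file size in advance.
You can use memory-mapped files; look at java.nio.* to do this. That way the data is loaded as it is read, and the memory is not placed in the normal heap space.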
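As a rough illustration of the memory-mapped approach, here is a minimal sketch that scans a file through fixed-size mapped windows instead of loading it onto the heap. The class name MappedLineCounter, the method countLines and the windowSize parameter are made up for this example, and it assumes Java 7 or later (try-with-resources and java.nio.file). It only counts newline bytes; real parsing would decode and process each line where the counter is incremented.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedLineCounter {

    /**
     * Counts '\n' bytes in the file by mapping it window by window.
     * Each window covers at most windowSize bytes, so only a bounded
     * region of the file is mapped at any one time.
     */
    public static long countLines(Path file, int windowSize) throws IOException {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        long lines = 0;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = channel.size();
            for (long pos = 0; pos < size; pos += windowSize) {
                long len = Math.min(windowSize, size - pos);
                // The mapped region lives outside the ordinary Java heap.
                MappedByteBuffer window =
                        channel.map(FileChannel.MapMode.READ_ONLY, pos, len);
                while (window.hasRemaining()) {
                    if (window.get() == '\n') {
                        lines++;
                    }
                }
            }
        }
        return lines;
    }
}
```

A call such as MappedLineCounter.countLines(Paths.get(filePath), 1 << 20) would walk the file in 1 MiB windows. The window size is a tuning choice, not something the original question or answer specifies.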
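With low-level reads you can use variable-length chunks. Speed matters: if your file is large, it may take too long to read. And the more Objects you store in the JVM, the more work the GC has to do and the slower the application becomes. From the Java reference:
A byte buffer can be allocated as a direct buffer, in which case the Java virtual machine will make a best effort to perform native I/O operations directly upon it.
A byte buffer can be created by mapping a region of a file directly into memory, in which case a few additional file-related operations defined in the MappedByteBuffer class are available.
A byte buffer provides access to its content as either a heterogeneous or homogeneous sequence of binary data of any non-boolean primitive type, in either big-endian or little-endian byte order.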
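To make the chunked, low-level reading idea concrete, the following sketch reads a file through a single reusable direct ByteBuffer and hands each complete line to a callback, so neither the whole file nor all parsed lines need to sit in memory at once. The names ChunkedLineReader, forEachLine and chunkSize are hypothetical, the ISO-8859-1 decoding is an assumption about the file's encoding, and the Consumer callback assumes Java 8 or later.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.function.Consumer;

public class ChunkedLineReader {

    /**
     * Reads the file in chunks of chunkSize bytes using one direct buffer
     * and passes every line (without its line terminator) to the handler.
     */
    public static void forEachLine(Path file, int chunkSize, Consumer<String> handler)
            throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        ByteBuffer chunk = ByteBuffer.allocateDirect(chunkSize);
        // Holds the bytes of a line that may span several chunks.
        ByteArrayOutputStream pending = new ByteArrayOutputStream();
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            while (channel.read(chunk) != -1) {
                chunk.flip(); // switch the buffer from filling to draining
                while (chunk.hasRemaining()) {
                    byte b = chunk.get();
                    if (b == '\n') {
                        handler.accept(decode(pending));
                        pending.reset();
                    } else if (b != '\r') {
                        pending.write(b);
                    }
                }
                chunk.clear(); // reuse the same buffer for the next read
            }
        }
        // Emit a final line that has no trailing newline.
        if (pending.size() > 0) {
            handler.accept(decode(pending));
        }
    }

    // Assumption: the file uses a single-byte encoding such as ISO-8859-1.
    private static String decode(ByteArrayOutputStream bytes) {
        return new String(bytes.toByteArray(), StandardCharsets.ISO_8859_1);
    }
}
```

The handler is where each line, or each completed statement, could be written to the database straight away rather than collected into a list, which keeps the number of live objects small.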