Reading a CSV file column by column

dru*_*dev 22 java csv file-io multiple-columns

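I want to read specific columns from a multi-column CSV file and print those columns to another CSV file using Java. Any help? Below is my code, which prints every token line by line, but I intend to print only a few columns of the multi-column CSV.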

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.StringTokenizer;

public class ParseCSV {

    public static void main(String[] args) {

        // csv file containing data
        String strFile = "C:\\Users\\rsaluja\\CMS_Evaluation\\Drupal_12_08_27.csv";

        // create BufferedReader to read csv file (closed automatically)
        try (BufferedReader br = new BufferedReader(new FileReader(strFile))) {
            String strLine;
            int lineNumber = 0;

            // read comma separated file line by line
            while ((strLine = br.readLine()) != null) {
                lineNumber++;
                int tokenNumber = 0; // restart the token count on each line

                // break comma separated line using ","
                StringTokenizer st = new StringTokenizer(strLine, ",");

                while (st.hasMoreTokens()) {
                    // display csv values
                    tokenNumber++;
                    System.out.println("Line # " + lineNumber
                            + ", Token # " + tokenNumber
                            + ", Token : " + st.nextToken());
                }

                // goal: print only selected columns instead, e.g. cols[4]
            }
        } catch (IOException e) {
            System.out.println("Exception while reading csv file: " + e);
        }
    }
}

Jas*_*ske 44

You should use the excellent OpenCSV to read and write CSV files. Adapted to use the library, your example would look like this:

import com.opencsv.CSVReader; // OpenCSV 2.x used au.com.bytecode.opencsv.CSVReader
import java.io.FileReader;

public class ParseCSV {
  public static void main(String[] args) {
    //csv file containing data
    String strFile = "C:/Users/rsaluja/CMS_Evaluation/Drupal_12_08_27.csv";
    try (CSVReader reader = new CSVReader(new FileReader(strFile))) {
      String[] nextLine;
      int lineNumber = 0;
      while ((nextLine = reader.readNext()) != null) {
        lineNumber++;
        System.out.println("Line # " + lineNumber);

        // nextLine[] is an array of values from the line
        System.out.println(nextLine[4] + "etc...");
      }
    } catch (Exception e) {
      // opening, reading or closing the file can fail
      e.printStackTrace();
    }
  }
}

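  • +1 Agreed. Trying to hack together a few lines of code to parse CSV data usually ends in shouting and tears. For CSV, use an API designed for the job. (6 upvotes)
  • Yes, agreed. I can use OpenCSV. But what I'm looking for is that I only need selective columns. I have already parsed the file correctly with all the tokens, but it parses line by line; here I want to read the file and then print out only a few selected columns. Thanks for your reply anyway! :) (2 upvotes)
  • *"But what I'm looking for is that I only need selective columns."* That in no way precludes using an API, so I'm confused why you started it with *"But..."*. (2 upvotes)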
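
The point made in the comments can be sketched in code: once a row is available as a String[] (from readNext() or any other parser), keeping only a few columns is an indexing step, and the result can be written out as a new CSV line. The following is a minimal sketch under assumptions and is not taken from any answer: ColumnSelector, select, escape and writeSelected are hypothetical names, the rows are hard-coded sample values, and a StringWriter stands in for the output file.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

class ColumnSelector {

    // Returns the requested columns of one parsed row, in the given order.
    // A row that is too short yields null for the missing column.
    static String[] select(String[] row, int... indexes) {
        String[] out = new String[indexes.length];
        for (int i = 0; i < indexes.length; i++) {
            int idx = indexes[i];
            out[i] = (idx >= 0 && idx < row.length) ? row[idx] : null;
        }
        return out;
    }

    // Quotes a field when it contains a comma, a quote or a line break,
    // doubling embedded quotes. null is written as an empty field.
    static String escape(String field) {
        if (field == null) return "";
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    // Writes the selected columns of every row to out, one line per row.
    static void writeSelected(List<String[]> rows, Writer out, int... indexes)
            throws IOException {
        for (String[] row : rows) {
            String[] picked = select(row, indexes);
            StringBuilder line = new StringBuilder();
            for (int i = 0; i < picked.length; i++) {
                if (i > 0) line.append(',');
                line.append(escape(picked[i]));
            }
            out.write(line.append('\n').toString());
        }
    }

    public static void main(String[] args) throws IOException {
        // Hard-coded sample rows; in practice they would come from a CSV reader.
        List<String[]> rows = Arrays.asList(
            new String[] {"1.0.0.0", "1.0.0.255", "16777216", "16777471", "AU", "Australia"},
            new String[] {"1.0.1.0", "1.0.3.255", "16777472", "16778239", "CN", "China"});
        StringWriter out = new StringWriter(); // a FileWriter could be passed instead
        writeSelected(rows, out, 4, 5);
        System.out.print(out); // AU,Australia and CN,China, one per line
    }
}
```

Keeping the selection separate from the parsing means the same helper works regardless of which reader produced the rows.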

Shi*_*mar 13

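Reading a CSV file in Java is very simple and common. In fact, you don't need to load any extra third-party library to do it for you. A CSV (comma-separated values) file is just a plain text file that stores data column by column, split by a separator (e.g. a comma ",").

There are several ways to read specific columns from a CSV file. The simplest is as follows:

Code to read a CSV without any third-party library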

// csvFile holds the path of the CSV file to read
String line;
String cvsSplitBy = ",";
try (BufferedReader br = new BufferedReader(new FileReader(csvFile))) {
    while ((line = br.readLine()) != null) {
        // use comma as separator
        String[] cols = line.split(cvsSplitBy);
        System.out.println("Column 4= " + cols[4] + " , Column 5=" + cols[5]);
    }
}

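As you can see, there is nothing special here. It just reads a text file and splits each line on the separator ",".

Consider an extract of the legacy country CSV data from the GeoLite free downloadable databases: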

"1.0.0.0","1.0.0.255","16777216","16777471","AU","Australia"
"1.0.1.0","1.0.3.255","16777472","16778239","CN","China"
"1.0.4.0","1.0.7.255","16778240","16779263","AU","Australia"
"1.0.8.0","1.0.15.255","16779264","16781311","CN","China"
"1.0.16.0","1.0.31.255","16781312","16785407","JP","Japan"
"1.0.32.0","1.0.63.255","16785408","16793599","CN","China"
"1.0.64.0","1.0.127.255","16793600","16809983","JP","Japan"
"1.0.128.0","1.0.255.255","16809984","16842751","TH","Thailand"

The code above will output the following:

Column 4= "AU" , Column 5="Australia"
Column 4= "CN" , Column 5="China"
Column 4= "AU" , Column 5="Australia"
Column 4= "CN" , Column 5="China"
Column 4= "JP" , Column 5="Japan"
Column 4= "CN" , Column 5="China"
Column 4= "JP" , Column 5="Japan"
Column 4= "TH" , Column 5="Thailand"

In fact, you can put the columns in a Map and then retrieve them simply by using the key.
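
The Map idea could look roughly like the sketch below. It is an illustration under assumptions, not the answer's own code: RowAsMap and toMap are hypothetical names, and the column names are invented for the example because the GeoLite extract has no header row.

```java
import java.util.HashMap;
import java.util.Map;

class RowAsMap {

    // Pairs each header name with the value at the same position in the row.
    // Positions that exist in only one of the two arrays are ignored.
    static Map<String, String> toMap(String[] header, String[] cols) {
        Map<String, String> byName = new HashMap<>();
        int n = Math.min(header.length, cols.length);
        for (int i = 0; i < n; i++) {
            byName.put(header[i], cols[i]);
        }
        return byName;
    }

    public static void main(String[] args) {
        // Hypothetical column names, chosen only for this example.
        String[] header = {"startIp", "endIp", "startNum", "endNum", "code", "country"};
        String line = "\"1.0.0.0\",\"1.0.0.255\",\"16777216\",\"16777471\",\"AU\",\"Australia\"";
        String[] cols = line.split(","); // naive split, as in the code above
        Map<String, String> row = toMap(header, cols);
        // The naive split keeps the surrounding quotes in each value.
        System.out.println(row.get("code") + " " + row.get("country"));
    }
}
```

Looking values up by name keeps the calling code readable, but it inherits every limitation of the split that produced cols.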

Shishir

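  • Is it really that simple? Your example breaks when a value contains a comma. For example, "1,0,0,0","1.0.0.255","16777216" does not work (but is a valid CSV file). That is why using purpose-built APIs makes your life easier: these edge cases have already been considered and (hopefully) tested. (11 upvotes)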
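
The failure described in this comment is easy to reproduce. The sketch below compares String.split with a small hand-written splitter that tracks whether it is inside double quotes. QuotedSplit and splitLine are hypothetical names, and the splitter assumes that a field never spans several lines; it only illustrates the edge case and is not a replacement for a tested CSV library.

```java
import java.util.ArrayList;
import java.util.List;

class QuotedSplit {

    // Splits one CSV line on commas that are outside double quotes.
    // Surrounding quotes are dropped and "" inside a quoted field becomes ".
    // Assumption: a field never spans several lines.
    static List<String> splitLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"' && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // escaped quote inside a quoted field
                    i++;
                } else if (c == '"') {
                    inQuotes = false;    // closing quote
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;         // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        String line = "\"1,0,0,0\",\"1.0.0.255\",\"16777216\"";
        System.out.println(line.split(",").length); // 6: the first value is cut apart
        System.out.println(splitLine(line));        // [1,0,0,0, 1.0.0.255, 16777216]
    }
}
```

Even this version ignores quoted line breaks and other separators, which is the kind of detail a dedicated parser already handles.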

Jer*_*kes 6

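Sorry, but none of these answers provides the best solution. If you use a library such as OpenCSV, you have to write a lot of code to handle special cases in order to extract information from specific columns.

For example, if a row has fewer columns than the ones you are looking for, you have to write a lot of code to handle it. Using the OpenCSV example: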

  CSVReader reader = new CSVReader(new FileReader(strFile));
  String[] nextLine;
  while ((nextLine = reader.readNext()) != null) {
       //let's say you are interested in getting columns 20, 30, and 40
       String[] outputRow = new String[3];
       // index n only exists when the row has more than n columns
       if (nextLine.length <= 40) {
            outputRow[2] = null;
       } else {
            outputRow[2] = nextLine[40];
       }
       if (nextLine.length <= 30) {
            outputRow[1] = null;
       } else {
            outputRow[1] = nextLine[30];
       }
       if (nextLine.length <= 20) {
            outputRow[0] = null;
       } else {
            outputRow[0] = nextLine[20];
       }

  }

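That is a lot of code for a simple requirement. It gets worse if you try to get column values by name. You should use a more modern parser, such as the one provided by uniVocity-parsers.

To get the columns you want reliably and easily, simply write: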

CsvParserSettings settings = new CsvParserSettings();
// keep only the columns at these indexes
settings.selectIndexes(20, 30, 40);
CsvParser parser = new CsvParser(settings);
List<String[]> allRows = parser.parseAll(new FileReader(yourFile));

Disclosure: I am the author of this library. It is open source and free (Apache V2.0 license).