Reading HDFS and local files in Java

Ven*_*k K 20 java hadoop mapreduce hdfs

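I want to read files from paths that may be either HDFS or local. At the moment I pass local paths with the prefix file:// and HDFS paths with the prefix hdfs://, and I wrote the following code: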

Configuration configuration = new Configuration();
FileSystem fileSystem = null;
if (filePath.startsWith("hdfs://")) {
  fileSystem = FileSystem.get(configuration);
} else if (filePath.startsWith("file://")) {
  fileSystem = FileSystem.getLocal(configuration).getRawFileSystem();
}

From there I use the FileSystem API to read the file.

Is there a better way to do this? If so, could you let me know?

Tar*_*riq 34

Does this make sense?

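import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public static void main(String[] args) throws IOException {

    Configuration conf = new Configuration();
    // Load the cluster settings so that hdfs:// paths and the default file system resolve.
    conf.addResource(new Path("/hadoop/projects/hadoop-1.0.4/conf/core-site.xml"));
    conf.addResource(new Path("/hadoop/projects/hadoop-1.0.4/conf/hdfs-site.xml"));

    BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
    System.out.println("Enter the file path...");
    String filePath = br.readLine();

    Path path = new Path(filePath);
    // The scheme of the path (hdfs://, file://, ...) selects the FileSystem implementation.
    FileSystem fs = path.getFileSystem(conf);
    FSDataInputStream inputStream = fs.open(path);
    System.out.println(inputStream.available());
    // Close the stream before closing the file system.
    inputStream.close();
    fs.close();
}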

If you do it this way, you don't need the prefix check. Get the FileSystem directly from the Path, then do whatever you want with it.


zsx*_*ing 12

You can get the FileSystem like this:

Configuration conf = new Configuration();
Path path = new Path(stringPath);
FileSystem fs = FileSystem.get(path.toUri(), conf);

You don't need to check whether the path starts with hdfs:// or file://. This API does that for you.

  • Tariq's solution does the same thing. Path.getFileSystem calls this FileSystem.get(URI, Configuration) method. (2 upvotes)
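
To illustrate why no prefix check is needed: both answers rely on the scheme part of the path's URI to pick the file system. The sketch below uses only java.net.URI from the JDK to show which scheme each kind of path string carries. It is an approximation of the idea, not Hadoop's own Path parsing. The class name SchemeDemo, the helper schemeOf, and the host namenode:8020 are hypothetical. When a path has no scheme at all, FileSystem.get(URI, Configuration) falls back to the default file system from the configuration (fs.default.name in Hadoop 1.x).

```java
import java.net.URI;

final class SchemeDemo {
    // Hypothetical helper: returns the URI scheme of a path string,
    // or "(default)" when the string carries no scheme.
    static String schemeOf(String pathString) {
        String scheme = URI.create(pathString).getScheme();
        return scheme == null ? "(default)" : scheme;
    }

    public static void main(String[] args) {
        String[] samples = {
            "hdfs://namenode:8020/data/in.txt", // hypothetical NameNode address
            "file:///tmp/in.txt",
            "/data/in.txt"                      // no scheme: the configured default applies
        };
        for (String p : samples) {
            System.out.println(p + " -> " + schemeOf(p));
        }
    }
}
```

So a bare path such as /data/in.txt is read from whatever file system the loaded core-site.xml names as the default, which is why Tariq's answer adds the site files to the Configuration.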