小智 23
类中的getSpaceConsumed()函数ContentSummary将返回文件/目录在集群中占用的实际空间,即它考虑了为集群设置的复制因子.
例如,如果hadoop集群中的复制因子设置为3,目录大小为1.5GB,则该getSpaceConsumed()函数将返回4.5GB的值.
使用类中的getLength()函数ContentSummary将返回实际的文件/目录大小.
Tar*_*riq 14
您可以使用getContentSummary(Path f)该类提供的方法FileSystem.它返回一个可以调用该方法的ContentSummary对象getSpaceConsumed(),它将为您提供以字节为单位的目录大小.
用法:
package org.myorg.hdfsdemo;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class GetDirSize {
/**
* @param args
* @throws IOException
*/
public static void main(String[] args) throws IOException {
// TODO Auto-generated method stub
Configuration config = new Configuration();
config.addResource(new Path(
"/hadoop/projects/hadoop-1.0.4/conf/core-site.xml"));
config.addResource(new Path(
"/hadoop/projects/hadoop-1.0.4/conf/core-site.xml"));
FileSystem fs = FileSystem.get(config);
Path filenamePath = new Path("/inputdir");
System.out.println("SIZE OF THE HDFS DIRECTORY : " + fs.getContentSummary(filenamePath).getSpaceConsumed());
}
}
Run Code Online (Sandbox Code Playgroud)
HTH
谢谢你们。
斯卡拉版本
package com.beloblotskiy.hdfsstats.model.hdfs
import java.nio.file.{Files => NioFiles, Paths => NioPaths}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.fs.Path
import org.apache.commons.io.IOUtils
import java.nio.file.{Files => NioFiles}
import java.nio.file.{Paths => NioPaths}
import com.beloblotskiy.hdfsstats.common.Settings
/**
* HDFS utilities
* @author v-abelablotski
*/
object HdfsOps {
private val conf = new Configuration()
conf.addResource(new Path(Settings.pathToCoreSiteXml))
conf.addResource(new Path(Settings.pathToHdfsSiteXml))
private val fs = FileSystem.get(conf)
/**
* Calculates disk usage with replication factor.
* If function returns 3G for folder with replication factor = 3, it means HDFS has 1G total files size multiplied by 3 copies space usage.
*/
def duWithReplication(path: String): Long = {
val fsPath = new Path(path);
fs.getContentSummary(fsPath).getSpaceConsumed()
}
/**
* Calculates disk usage without pay attention to replication factor.
* Result will be the same with hadopp fs -du /hdfs/path/to/directory
*/
def du(path: String): Long = {
val fsPath = new Path(path);
fs.getContentSummary(fsPath).getLength()
}
//...
}
Run Code Online (Sandbox Code Playgroud)
| 归档时间: |
|
| 查看次数: |
9374 次 |
| 最近记录: |