Append operation in a Hadoop WebHDFS client

mrk*_*afk 4 python java hadoop webhdfs

A Java client I put together works:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;

public class HdfsAppend {

        public static final String hdfs = "hdfs://my222host.com";
        public static final String hpath = "/tmp/odp/testfile";
        public static final String message = "Hello, world!\n";

        public static void main(String[] args) throws IOException {

                // Point the client at the NameNode; the user comes from the local login
                Configuration conf = new Configuration();
                conf.set("fs.defaultFS", hdfs);
                Path filenamePath = new Path(hpath);

                // try-with-resources closes the stream so the appended bytes are flushed
                try (FileSystem fs = FileSystem.get(conf);
                     FSDataOutputStream out = fs.append(filenamePath)) {
                        out.writeBytes(message);
                }
        }
}

But both curl and the Python whoops client fail in a similar way. Here is curl:

curl -i -X POST   "http://my222host:50070/webhdfs/v1/tmp/odp/testfile?op=APPEND"
HTTP/1.1 307 TEMPORARY_REDIRECT
Cache-Control: no-cache
Expires: Tue, 13 Aug 2013 13:26:22 GMT
Date: Tue, 13 Aug 2013 13:26:22 GMT
Pragma: no-cache
Expires: Tue, 13 Aug 2013 13:26:22 GMT
Date: Tue, 13 Aug 2013 13:26:22 GMT
Pragma: no-cache
Content-Type: application/octet-stream
Location: http://my333host:50075/webhdfs/v1/tmp/odp/testfile?op=APPEND&namenoderpcaddress=my222host:8020
Content-Length: 0
Server: Jetty(6.1.26.cloudera.2)


curl -i -X POST -T /tmp/abc "http://my333host:50075/webhdfs/v1/tmp/odp/testfile?op=APPEND&namenoderpcaddress=my222host:8020"
HTTP/1.1 100 Continue

HTTP/1.1 403 Forbidden
Cache-Control: no-cache
Expires: Tue, 13 Aug 2013 13:26:26 GMT
Date: Tue, 13 Aug 2013 13:26:26 GMT
Pragma: no-cache
Expires: Tue, 13 Aug 2013 13:26:26 GMT
Date: Tue, 13 Aug 2013 13:26:26 GMT
Pragma: no-cache
Content-Type: application/json
Transfer-Encoding: chunked
Server: Jetty(6.1.26.cloudera.2)

{"RemoteException":{"exception":"AccessControlException","javaClassName":"org.apache.hadoop.security.AccessControlException","message":"Permission denied: user=dr.who, access=WRITE, inode=\"/tmp/odp/testfile\":root:hadoop:-rw-r--r--\n\tat org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:224)\n\tat org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:155)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4716)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4698)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:4660)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1837)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:2105)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2081)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:434)\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:224)\n\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44944)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1701)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1697)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:396)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)\n\tat 
org.apache.hadoop.ipc.Server$Handler.run(Server.java:1695)\n"}}

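The whoops client fails with "connection refused". What could be wrong here? My only clue is the "user=dr.who" in the Java exception returned to curl, but I don't know which user the Configuration class uses or how to find it out (if that is the root of the problem). Please help!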

Mik*_*ark 5

Assuming your username is hdfs, add &user.name=hdfs to your URL. Write operations require a valid user.
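The same request can be sketched from Python with only the standard library. This is a rough sketch rather than the whoops client's API: `append_url` and `webhdfs_append` are hypothetical names, the host, port, and path are the ones from the question, and it assumes a cluster using simple (non-Kerberos) authentication.

```python
import urllib.error
import urllib.request
from urllib.parse import urlencode


def append_url(host, port, path, user):
    """Build a WebHDFS APPEND URL that names the acting user via user.name."""
    query = urlencode({"op": "APPEND", "user.name": user})
    return f"http://{host}:{port}/webhdfs/v1{path}?{query}"


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Never follow redirects, so the 307 Location header can be read."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def webhdfs_append(host, port, path, user, data):
    """Two-step append: ask the NameNode, then POST the bytes to the DataNode."""
    opener = urllib.request.build_opener(_NoRedirect)
    first = urllib.request.Request(
        append_url(host, port, path, user), data=b"", method="POST"
    )
    try:
        opener.open(first)
        raise RuntimeError("expected a 307 redirect from the NameNode")
    except urllib.error.HTTPError as err:
        if err.code != 307:
            raise
        datanode_url = err.headers["Location"]
    # Assumption: if the redirect does not carry user.name, add it ourselves.
    if "user.name=" not in datanode_url:
        datanode_url += "&" + urlencode({"user.name": user})
    second = urllib.request.Request(datanode_url, data=data, method="POST")
    with opener.open(second) as resp:
        return resp.status
```

A call such as `webhdfs_append("my222host", 50070, "/tmp/odp/testfile", "hdfs", b"Hello, world!\n")` would mirror the two curl commands in the question, with the user supplied on both hops.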

Your Java code works because it picks up your user information from the Unix environment.
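To find a value to pass as user.name, a script can approximate what the Java client does. The sketch below assumes simple authentication, where (as far as I know) the Hadoop client uses the `HADOOP_USER_NAME` environment variable if it is set and otherwise the OS login name. The function name `effective_hadoop_user` is hypothetical, and none of this applies to a Kerberos-secured cluster.

```python
import getpass
import os


def effective_hadoop_user(env=None):
    """Guess the user a simple-auth Hadoop client would act as."""
    env = os.environ if env is None else env
    # Assumption: HADOOP_USER_NAME, when set, overrides the OS login name.
    return env.get("HADOOP_USER_NAME") or getpass.getuser()
```

Running the Java client and this function under the same account should then give a user that has write access to the file, which can be passed as user.name in the WebHDFS URL.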

If you see the user dr.who anywhere, it is probably because you did not set user.name in the request.
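A client can check for this case by inspecting the JSON error body that WebHDFS returns. The following is a small sketch with hypothetical helper names (`denied_user`, `missing_user_name`); it assumes the body has the `RemoteException` shape shown in the question and that dr.who is the cluster's static fallback user (the usual default of `hadoop.http.staticuser.user`, which an administrator may have changed).

```python
import json
import re


def denied_user(body):
    """Return the user named in a WebHDFS AccessControlException, else None."""
    try:
        exc = json.loads(body)["RemoteException"]
    except (ValueError, KeyError, TypeError):
        return None
    if not isinstance(exc, dict):
        return None
    if exc.get("exception") != "AccessControlException":
        return None
    # The message looks like "Permission denied: user=dr.who, access=WRITE, ..."
    match = re.search(r"user=([^,\s]+)", exc.get("message", ""))
    return match.group(1) if match else None


def missing_user_name(body):
    """True when the request was evaluated as the fallback user dr.who."""
    return denied_user(body) == "dr.who"
```

When `missing_user_name` returns True for a 403 response, the likely fix is to retry with user.name set rather than to change permissions on the file.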