Joh*_*lap 6 server networking hosts torque
Following this guide:
https://jabriffa.wordpress.com/2015/02/11/installing-torquepbs-job-scheduler-on-ubuntu-14-04-lts/
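I installed TORQUE on Ubuntu 16.04 LTS (the author claims the procedure works the same on 16.04).
A short summary of his installation instructions, so that this question is self-contained: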
apt-get install torque-server torque-client torque-mom torque-pam
/etc/init.d/torque-mom stop
/etc/init.d/torque-scheduler stop
/etc/init.d/torque-server stop
pbs_server -t create
killall pbs_server
echo SERVER.DOMAIN > /etc/torque/server_name
echo SERVER.DOMAIN > /var/spool/torque/server_priv/acl_svr/acl_hosts
echo root@SERVER.DOMAIN > /var/spool/torque/server_priv/acl_svr/operators
echo root@SERVER.DOMAIN > /var/spool/torque/server_priv/acl_svr/managers
echo "SERVER.DOMAIN np=4" > /var/spool/torque/server_priv/nodes
echo SERVER.DOMAIN > /var/spool/torque/mom_priv/config
/etc/init.d/torque-server start
/etc/init.d/torque-scheduler start
/etc/init.d/torque-mom start
# set scheduling properties
qmgr -c 'set server scheduling = true'
qmgr -c 'set server keep_completed = 300'
qmgr -c 'set server mom_job_sync = true'
After following his instructions, when I run:
qmgr -c 'set server scheduling = true'
I get this error message:
qmgr obj=master.node svr=master.node: Unauthorized Request
I grepped the logs as he suggests (grep Unauthorized /var/spool/torque/server_logs/*) and found only this unhelpful snippet:
08/25/2018 15:48:43;0080;PBS_Server;Req;req_reject;Reject reply code=15007(Unauthorized Request ), aux=0, type=Manager, from root@master.node
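A reject line like this does carry three useful fields: the reply code, the request type, and the identity the server saw the request coming from. As a minimal sketch (the function name summarize_rejects is hypothetical, and the pattern assumes the log format shown in the line above), they can be pulled out like this:

```shell
# Hypothetical helper: read pbs_server log lines on stdin and print each
# distinct rejected request as "reply-code request-type requester".
# Assumes the "Reject reply code=N(...), aux=..., type=T, from U@H" layout
# seen in the log line quoted in this question.
summarize_rejects() {
    sed -n 's/.*Reject reply code=\([0-9]*\).*type=\([A-Za-z]*\), from \(.*\)$/\1 \2 \3/p' |
        sort -u
}
```

Feeding it the logs with cat /var/spool/torque/server_logs/* | summarize_rejects would, for the line above, print 15007 Manager root@master.node, which is the user@host string the server compared against its access lists.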
Here is my hostname:
master
Here is my hosts file:
127.0.1.1 master master
127.0.0.1 localhost
10.136.7.155 master.node
10.136.7.155 master
10.136.65.29 slave1
10.136.73.247 slave2
10.136.44.128 slave3
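Because the same name appears on more than one line of this file, it can help to see which address a given name hits first. The following is only a sketch: first_hosts_match is a hypothetical helper that reads a hosts-format file top to bottom, and it does not replace asking the real resolver (for example with getent hosts NAME).

```shell
# Hypothetical helper: print the first address that a hosts-format FILE
# maps NAME to. Comment lines and lines without a name are skipped.
first_hosts_match() {
    file=$1
    name=$2
    awk -v name="$name" '
        /^[[:space:]]*#/ || NF < 2 { next }
        {
            # Field 1 is the address; every later field is a name or alias.
            for (i = 2; i <= NF; i++) {
                if ($i == name) { print $1; exit }
            }
        }
    ' "$file"
}
```

Run against the file above, first_hosts_match /etc/hosts master gives 127.0.1.1, while first_hosts_match /etc/hosts master.node gives 10.136.7.155, so the short name and the dotted name do not point at the same address. Whether that difference is what triggers the error is not established here.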
Here is how I set up the various config files:
echo master.node > /etc/torque/server_name
echo master.node > /var/spool/torque/server_priv/acl_svr/acl_hosts
echo root@master.node > /var/spool/torque/server_priv/acl_svr/operators
echo root@master.node > /var/spool/torque/server_priv/acl_svr/managers
echo "master.node np=4" > /var/spool/torque/server_priv/nodes
echo master.node > /var/spool/torque/mom_priv/config
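Since six files have to agree on one name, a quick consistency check can rule out a typo in any of them. This is a rough sketch, not part of the guide: check_torque_names is a hypothetical helper, the file list is copied from the commands above, and the ROOT prefix is an added assumption so the check can be tried on a scratch copy before the live system.

```shell
# Hypothetical helper: report which of the TORQUE files listed above do not
# mention NAME. ROOT is a path prefix ("" for the live system).
# Returns 0 if every file mentions NAME, 1 otherwise.
check_torque_names() {
    root=$1
    name=$2
    status=0
    for f in \
        /etc/torque/server_name \
        /var/spool/torque/server_priv/acl_svr/acl_hosts \
        /var/spool/torque/server_priv/acl_svr/operators \
        /var/spool/torque/server_priv/acl_svr/managers \
        /var/spool/torque/server_priv/nodes \
        /var/spool/torque/mom_priv/config
    do
        # -F treats NAME as a fixed string, so the dot is not a wildcard.
        if grep -qF -- "$name" "$root$f" 2>/dev/null; then
            echo "ok       $f"
        else
            echo "MISMATCH $f"
            status=1
        fi
    done
    return $status
}
```

For example, check_torque_names "" master.node checks the live files as root. Note that this is a substring match, so a file containing master.node also counts as mentioning master.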
Every time I change something, I restart the various daemons:
/etc/init.d/torque-server restart
/etc/init.d/torque-scheduler restart
/etc/init.d/torque-mom restart
I am currently running as root.
I have no idea what TORQUE wants here. Why am I not authorized?
In addition, qmgr reports that there are no nodes, despite the /var/spool/torque/server_priv/nodes file. Why?
Qmgr: list node
No Active Nodes, nothing done.
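I followed the instructions from the same link and got the same error.
The problem is that the server is running on localhost, so if you specify an FQDN other than localhost, the request appears to come from an unauthorized user.
In my case I had to change the server domain to localhost: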
echo localhost > /etc/torque/server_name
echo localhost > /var/spool/torque/server_priv/acl_svr/acl_hosts
echo root@localhost > /var/spool/torque/server_priv/acl_svr/operators
...
...