I am using COMPSs to run the Increment application shown in the COMPSs Sample Applications manual. I added the -m flag to enable monitoring:
$ runcompss -m --debug increment.Increment 5 1 2 3
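The application runs and finishes correctly (no errors appear in standard output or standard error, and the runtime.log in the .COMPSs folder contains no stack traces).
I also started the COMPSs Monitor service with the following command (its output is included as well):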
$ /etc/init.d/compss-monitor start
* Starting COMPSs Monitor
* Checking JAVA Installation...
Success
* Checking IT_HOME...
WARNING: IT_HOME not defined. Trying default location /opt/COMPSs/
Success
* Checking IT_MONITOR...
IT_MONITOR=/root/.COMPSs/
Success
* Checking COMPSs Monitor Port...
Warning: COMPSs_MONITOR_PORT not defined.
Loading from configuration file.
COMPSs_MONITOR_PORT=8080
Success
* Checking COMPSs Monitor Timeout...
Warning: COMPSs_MONITOR_TIMEOUT not defined.
Loading from configuration file.
COMPSs_MONITOR_TIMEOUT=20000
Success
* Configuring COMPSs Monitor …

After submitting a COMPSs application, I get the following error message and the application is not executed.
MPI_CMD=mpirun -timestamp-output -n 1 -H s00r0
/apps/COMPSs/1.3/Runtime/scripts/user/runcompss
--project=/tmp/1668183.tmpdir/project_1458303603.xml
--resources=/tmp/1668183.tmpdir/resources_1458303603.xml
--uuid=2ed20e6a-9f02-49ff-a71c-e071ce35dacc
/apps/FILESPACE/pycompssfile arg1 arg2 : -n 1 -H s00r0
/apps/COMPSs/1.3/Runtime/scripts/system/adaptors/nio/persistent_worker_starter.sh
/apps/INTEL/mkl/lib/intel64 null
/home/myhome/kmeans_python/src/ true
/tmp/1668183.tmpdir 4 5 5 s00r0-ib0 43001 43000 true 1
/apps/COMPSs/1.3/Runtime/scripts/system/2ed20e6a-9f02-49ff-a71c-e071ce35dacc : -n 1 -H s00r0
/apps/COMPSs/1.3/Runtime/scripts/system/adaptors/nio/persistent_worker_starter.sh
/apps/INTEL/mkl/lib/intel64 null
/home/myhome/kmeans_python/src/ true
/tmp/1668183.tmpdir 4 5 5 s00r0-ib0 43001 43000 true 2
/apps/COMPSs/1.3/Runtime/scripts/system/2ed20e6a-9f02-49ff-a71c-e071ce35dacc
--------------------------------------------------------------------------
All nodes which are allocated for this job are already filled.
--------------------------------------------------------------------------
I am using COMPSs 1.3.
Why is this happening?
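One way to read the MPI_CMD line above: mpirun separates application contexts with a standalone `:`, and each of the three contexts here (one runcompss master and two persistent_worker_starter.sh workers) requests one process (`-n 1`) on the same host (`-H s00r0`). The following is a minimal sketch, assuming the message comes from the MPI launcher running out of slots on that host, which the log itself does not confirm. It only tallies the requested processes per host. The helper name `processes_per_host` is hypothetical, and the command string is abbreviated from the output above (script arguments dropped).

```python
import shlex
from collections import Counter

# Abbreviated from the MPI_CMD output above: script arguments are omitted,
# only the launcher flags and the script paths are kept.
MPI_CMD = (
    "mpirun -timestamp-output -n 1 -H s00r0 "
    "/apps/COMPSs/1.3/Runtime/scripts/user/runcompss "
    ": -n 1 -H s00r0 "
    "/apps/COMPSs/1.3/Runtime/scripts/system/adaptors/nio/persistent_worker_starter.sh "
    ": -n 1 -H s00r0 "
    "/apps/COMPSs/1.3/Runtime/scripts/system/adaptors/nio/persistent_worker_starter.sh"
)


def processes_per_host(mpi_cmd: str) -> Counter:
    """Sum the -n values requested for each -H host across app contexts."""
    # Split the command into application contexts at each standalone ':'.
    contexts, current = [], []
    for token in shlex.split(mpi_cmd):
        if token == ":":
            contexts.append(current)
            current = []
        else:
            current.append(token)
    contexts.append(current)

    totals = Counter()
    for ctx in contexts:
        count, host = 0, None
        # Look at each (flag, value) pair of neighbouring tokens.
        for flag, value in zip(ctx, ctx[1:]):
            if flag == "-n":
                count = int(value)
            elif flag == "-H":
                host = value
        if host is not None:
            totals[host] += count
    return totals


if __name__ == "__main__":
    print(processes_per_host(MPI_CMD))  # Counter({'s00r0': 3})
```

If the scheduler granted fewer than three slots on s00r0, that mismatch would be a candidate explanation worth checking against the job's allocation.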

When I execute the sample application Increment given in the manual (http://compss.bsc.es/releases/compss/latest/docs/COMPSs_User_Manual_App_Exec.pdf), the runtime blocks and no error message is shown in the terminal.
OUTPUT:
$ runcompss increment.Increment 3 1 2 3
Using default location for project file: /opt/COMPSs/Runtime/configuration/xml/projects/project.xml
Using default location for resources file: /opt/COMPSs/Runtime/configuration/xml/resources/resources.xml
----------------- Executing increment.Increment --------------------------
WARNING: IT Properties file is null. Setting default values
[ API] - Deploying COMPSs Runtime v1.3
[ API] - Starting COMPSs Runtime v1.3
Initial counter values:
- Counter1 value is 1
- Counter2 value is 2
- Counter3 value is 3
How can I find out what is blocking my application?
Thanks in advance.
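A minimal sketch of one way to start looking, assuming the runtime log under $HOME/.COMPSs/ marks stalled tasks with the word `Blocked` (the marker used in the grep in the edit to this question). The helper names `blocked_lines` and `scan_logs` are hypothetical and not part of COMPSs.

```python
from pathlib import Path


def blocked_lines(lines, marker="Blocked"):
    """Return the lines that contain the marker, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if marker in line]


def scan_logs(base=None, pattern="increment*/runtime.log"):
    """Map each matching runtime.log to its lines containing the marker."""
    base = Path.home() / ".COMPSs" if base is None else Path(base)
    results = {}
    if not base.is_dir():
        return results  # nothing to scan on this machine
    for log in sorted(base.glob(pattern)):
        with log.open(errors="replace") as handle:
            results[log] = blocked_lines(handle)
    return results


if __name__ == "__main__":
    for log, hits in scan_logs().items():
        print(f"{log}: {len(hits)} line(s) containing 'Blocked'")
```

Counting the hits per execution folder makes it easier to tell whether every task is affected or only some of them.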
EDIT: Checking $HOME/.COMPSs/increment*/runtime.log, all tasks appear to be blocked:
grep "Blocked" runtime.log …

I am trying to run COMPSs with the tracing system (Extrae) activated. I first ran into an installation problem, but I solved it:
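How to fix "libpapi.so.*: cannot open shared object file" when running (Py)COMPSs with tracing?
However, I am now facing a new problem with PAPI. The COMPSs runtime seems to load correctly, but Extrae reports these errors: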
Extrae: Error! Hardware counter PAPI_L3_TCM (0x80000008) cannot be added in set 1 (thread 0)
Extrae: Error! Hardware counter PAPI_FP_INS (0x80000034) cannot be added in set 1 (thread 0)
Extrae: Error! Hardware counter PAPI_SR_INS (0x80000036) cannot be added in set 2 (thread 0)
Extrae: Error! Hardware counter PAPI_BR_UCN (0x8000002a) cannot be added in set 2 (thread 0)
Extrae: Error! Hardware counter PAPI_BR_CN (0x8000002b) cannot be added in set 2 (thread 0)
Extrae: Error! Hardware counter PAPI_VEC_SP (0x80000069) cannot …
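To see at a glance which counters were rejected in each set, the messages can be grouped by set number. This is a minimal sketch, assuming the message format shown above stays the same across lines. `rejected_counters` is a hypothetical helper name, not part of Extrae or COMPSs, and the sample lines are copied from the complete messages above (the truncated last line is left out).

```python
import re
from collections import defaultdict

# Matches lines such as:
# Extrae: Error! Hardware counter PAPI_L3_TCM (0x80000008) cannot be added in set 1 (thread 0)
PATTERN = re.compile(
    r"Hardware counter (\w+) \((0x[0-9a-fA-F]+)\) cannot be added in set (\d+)"
)

SAMPLE = [
    "Extrae: Error! Hardware counter PAPI_L3_TCM (0x80000008) cannot be added in set 1 (thread 0)",
    "Extrae: Error! Hardware counter PAPI_FP_INS (0x80000034) cannot be added in set 1 (thread 0)",
    "Extrae: Error! Hardware counter PAPI_SR_INS (0x80000036) cannot be added in set 2 (thread 0)",
    "Extrae: Error! Hardware counter PAPI_BR_UCN (0x8000002a) cannot be added in set 2 (thread 0)",
    "Extrae: Error! Hardware counter PAPI_BR_CN (0x8000002b) cannot be added in set 2 (thread 0)",
]


def rejected_counters(lines):
    """Group the names of rejected counters by the set they were meant for."""
    by_set = defaultdict(list)
    for line in lines:
        match = PATTERN.search(line)
        if match:
            name, _code, set_id = match.groups()
            by_set[int(set_id)].append(name)
    return dict(by_set)


if __name__ == "__main__":
    for set_id, names in sorted(rejected_counters(SAMPLE).items()):
        print(f"set {set_id}: {', '.join(names)}")
```

The grouped list can then be compared with the counters the machine actually supports.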