Execution time and resource summary printed after a long-running process

Bic*_*hoy 5 cpu rhel io time

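This may be a basic question, but I'm having a hard time finding the answer. I'm using RHEL6. After running any process that takes a lot of CPU time, I get a line that automatically reports (I guess) the run time and I/O usage. It reads: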

77410.101u 124.968s 1:42:43.49 657.9%    0+0k 0+1353384io 0pf+0w

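I have the following questions:

  1. How do I interpret each field in this message? I can guess at some of them: timings, I/O usage, possibly CPU usage... but I'm not sure.
  2. What actually prints this line? Is it the shell? The terminal emulator? Is there a daemon running that handles it? What is this feature/service called, whatever it is?
  3. Is it possible to control this message, for example by setting the CPU-usage threshold at which it is printed?
  4. Can I add extra information, such as the absolute path of the process, peak memory usage, peak disk usage, etc.?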

slm*_*slm 6

Tips for debugging what's going on

Assuming you're using Bash, I'd suggest turning on the shell's debugging facility.

$ set -x

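When you then run the command that produces this output, the trace will show you what is happening behind the scenes.

The output

That output is what a time command prints, as if every command you run were being prefixed with time. Given its format, I'm guessing you're using the C shell (csh) or the TENEX C shell (tcsh).

Example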

$ tcsh
$ time sleep 2
0.000u 0.000s 0:02.00 0.0%  0+0k 0+0io 0pf+0w

The reason I suspect this is a tcsh shell is that when I run /usr/bin/time in a Bash shell, the output looks like this:

$ /usr/bin/time sleep 2
0.00user 0.00system 0:02.02elapsed 0%CPU (0avgtext+0avgdata 580maxresident)k
0inputs+0outputs (0major+180minor)pagefaults 0swaps

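The output can be controlled with the -f or --format switch, so the output you're seeing could also be produced in Bash, but it would have to be done deliberately.

Meaning of the output

If you run /usr/bin/time in verbose mode (-v), you get a labelled report of every field, like so: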

$ /usr/bin/time -v sleep 2
    Command being timed: "sleep 2"
    User time (seconds): 0.00
    System time (seconds): 0.00
    Percent of CPU this job got: 0%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:02.00
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 584
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 184
    Voluntary context switches: 2
    Involuntary context switches: 4
    Swaps: 0
    File system inputs: 0
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

If you line up your original output:

77410.101u 124.968s 1:42:43.49 657.9%    0+0k 0+1353384io 0pf+0w
^^^^^^^^^^ ^^^^^^^^ ^^^^^^^^^^ ^^^^^^    ^^^^ ^^^^^^^^^^^ ^^^^^^
     1        2         3       4         5        6        7
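The numbered fields are:

  1. User time (seconds)
  2. System time (seconds)
  3. Elapsed (wall clock) time (h:mm:ss or m:ss)
  4. Percentage of the CPU that this job got
  5. Average shared text size (kbytes) + average unshared data size (kbytes)
  6. Number of file system inputs by the process + number of file system outputs by the process
  7. Number of major page faults that occurred while the process was running (faults where the page has to be read in from disk) + number of times the process was swapped out of main memory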
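
As a rough illustration of that field layout, the sketch below splits the summary line from the question into its seven parts using only POSIX shell features. The variable names (fs_in, fs_out, major, swaps and so on) are made up for this example and are not anything tcsh or time defines.

```shell
# Summary line copied from the question.
line='77410.101u 124.968s 1:42:43.49 657.9%    0+0k 0+1353384io 0pf+0w'

# Default IFS splitting collapses the run of spaces, giving seven fields.
read -r user sys elapsed cpu mem io faults <<EOF
$line
EOF

# Strip the unit suffixes, then split each "a+b" pair at the plus sign.
user=${user%u}
sys=${sys%s}
mem=${mem%k}
io=${io%io}
faults=${faults%w}
text=${mem%%+*};    data=${mem#*+}
fs_in=${io%%+*};    fs_out=${io#*+}
major=${faults%%pf*}; swaps=${faults#*+}

echo "user CPU seconds:       $user"
echo "system CPU seconds:     $sys"
echo "elapsed wall clock:     $elapsed"
echo "CPU percentage:         $cpu"
echo "avg shared text (KB):   $text"
echo "avg unshared data (KB): $data"
echo "file system inputs:     $fs_in"
echo "file system outputs:    $fs_out"
echo "major page faults:      $major"
echo "swaps:                  $swaps"
```

This only relabels what is already on the line; it assumes the default seven-field format and would need adjusting for a customized one.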

You can reproduce the same format by hand like this:

$ /usr/bin/time -f '%Uu %Ss %E %P %X+%Dk %I+%Oio %Fpf+%Ww' sleep 2
0.00u 0.00s 0:02.00 0% 0+0k 0+0io 0pf+0w

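Customizing the output

Once you've determined where /usr/bin/time is being called, you can customize the output by picking format specifiers from the man page for time. There are many options that can be included in this output.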

$ man time

Excerpt

   Time
   %E     Elapsed real time (in [hours:]minutes:seconds).
   %e     (Not in tcsh.) Elapsed real time (in seconds).
   %S     Total number of CPU-seconds that the process spent in kernel mode.
   %U     Total number of CPU-seconds that the process spent in user mode.
   %P     Percentage of the CPU that this job got, computed as (%U + %S) / %E.

   Memory
   %M     Maximum resident set size of the process during its lifetime, in Kbytes.
   %t     (Not in tcsh.) Average resident set size of the process, in Kbytes.
   %K     Average total (data+stack+text) memory use of the process, in Kbytes.
   %D     Average size of the process's unshared data area, in Kbytes.
   %p     (Not in tcsh.) Average size of the process's unshared stack space, in Kbytes.
   %X     Average size of the process's shared text space, in Kbytes.
   %Z     (Not in tcsh.) System's page size, in bytes.  This is a per-system constant, but varies between systems.
   %F     Number  of major page faults that occurred while the process was running.  These are faults where the page has to be read
          in from disk.
   %R     Number of minor, or recoverable, page faults.  These are faults for pages that are not valid but which have not yet  been
          claimed by other virtual pages.  Thus the data in the page is still valid but the system tables must be updated.
   %W     Number of times the process was swapped out of main memory.
   %c     Number of times the process was context-switched involuntarily (because the time slice expired).
   %w     Number of waits: times that the program was context-switched voluntarily, for instance while waiting for an I/O operation
          to complete.

   I/O
   %I     Number of file system inputs by the process.
   %O     Number of file system outputs by the process.
   %r     Number of socket messages received by the process.
   %s     Number of socket messages sent by the process.
   %k     Number of signals delivered to the process.
   %C     (Not in tcsh.) Name and command-line arguments of the command being timed.
   %x     (Not in tcsh.) Exit status of the command.

See the man page for more details.
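
To make the specifiers above concrete, here is a minimal sketch that extends the tcsh-style line with the command line (%C), the maximum resident set size (%M) and the exit status (%x). The format string and the name fmt are only illustrative, and the sketch assumes GNU time is installed at /usr/bin/time. Note that %C reports the command as it was typed, not an absolute path.

```shell
# Illustrative format: the tcsh-style fields plus %C, %M and %x
# (all three are listed in the man page excerpt above).
fmt='%C: %Uu %Ss %E %P %X+%Dk %I+%Oio %Fpf+%Ww maxrss=%MKB exit=%x'

# GNU time writes its report to stderr. The guard skips the run on
# systems without /usr/bin/time instead of failing.
if [ -x /usr/bin/time ]; then
    /usr/bin/time -f "$fmt" sleep 1 ||
        echo "this time(1) did not accept the format" >&2
fi
```

The specifiers marked "(Not in tcsh.)" in the excerpt, such as %C and %x, work here because this runs the external command rather than the tcsh builtin.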

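Edit #1: your question

It turns out that the automatically displayed output you asked about is caused by this shell variable being set in csh/tcsh.

From the tcsh man page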

   The time shell variable can be set to execute the time builtin command 
     after the completion of any process that takes more  than a given number 
     of CPU seconds.

Example

Set time to 5 seconds.

$ set time=5

Confirm:

$ set|grep time
time    5

Test it:

$ bash -c "while [ 1 ];do echo hi; done"
hi
hi
...
...waited ~5 seconds, then Ctrl-C to stop it

5.650u 1.471s 0:09.68 73.5% 0+0k 0+0io 0pf+0w

The output is shown only when the task you ran consumes more CPU seconds than the value of the $time variable.
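
The same variable also seems to cover the questions about the threshold and about extra fields. According to the tcsh man page, when time has a second word, that word is used as the format string for the time builtin. The fragment below is a sketch for ~/.tcshrc. The threshold of 60 and the max=%MKB addition are example choices, not defaults, and sequences marked "(Not in tcsh.)" in the excerpt above cannot be used here.

```tcsh
# ~/.tcshrc fragment (tcsh syntax, not Bash).
# First word:  CPU-seconds threshold that triggers the automatic report.
# Second word: format string for the time builtin.
# %M (maximum memory in use, in Kbytes) is appended to the default fields;
# 60 is an arbitrary example threshold.
set time = ( 60 "%Uu %Ss %E %P %X+%Dk %I+%Oio %Fpf+%Ww max=%MKB" )

# To switch the automatic report off again:
# unset time
```

Because the threshold counts CPU seconds rather than wall-clock time, a command that mostly sleeps or waits on I/O may never trigger the report.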
