Is low CPU idle the result of bad disks in this case?

Low CPU idle can have a variety of causes, including insufficient RAM or a slow hard disk drive.

Our RHEL server has enough RAM, but dmesg shows a couple of errors about the disk drives.

We suspect the disks, for example sdk and sdc, because dmesg shows errors such as [sdk] tag#0 Add. Sense: Unrecovered read error

Here are the details from the sar command, showing the CPU idle values:

    09:43:56 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
    09:44:01 AM     all     98.57      0.00      0.62      0.00      0.00      0.80
    09:44:06 AM     all     98.26      0.00      0.92      0.01      0.00      0.81
    09:44:11 AM     all     97.29      0.00      1.66      0.01      0.00      1.03
    09:44:16 AM     all     92.81      0.00      6.06      0.03      0.00      1.10
    09:44:21 AM     all     92.31      0.00      6.43      0.05      0.00      1.21
    Average:        all     95.85      0.00      3.14      0.02      0.00      0.99


09:44:21 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
09:44:22 AM     all     96.52      0.00      3.10      0.00      0.00      0.38
09:44:22 AM       0     98.00      0.00      2.00      0.00      0.00      0.00
09:44:22 AM       1     98.00      0.00      2.00      0.00      0.00      0.00
09:44:22 AM       2    100.00      0.00      0.00      0.00      0.00      0.00
09:44:22 AM       3     98.00      0.00      2.00      0.00      0.00      0.00
09:44:22 AM       4     98.00      0.00      2.00      0.00      0.00      0.00
09:44:22 AM       5     98.00      0.00      2.00      0.00      0.00      0.00
09:44:22 AM       6     97.98      0.00      2.02      0.00      0.00      0.00
09:44:22 AM       7     97.98      0.00      2.02      0.00      0.00      0.00
09:44:22 AM       8     98.99      0.00      1.01      0.00      0.00      0.00
09:44:22 AM       9     98.00      0.00      2.00      0.00      0.00      0.00
09:44:22 AM      10     98.00      0.00      2.00      0.00      0.00      0.00
09:44:22 AM      11     98.02      0.00      0.99      0.00      0.00      0.99
09:44:22 AM      12     97.00      0.00      1.00      0.00      0.00      2.00
09:44:22 AM      13     96.97      0.00      3.03      0.00      0.00      0.00
09:44:22 AM      14     98.02      0.00      0.99      0.00      0.00      0.99
09:44:22 AM      15     94.00      0.00      6.00      0.00      0.00      0.00
09:44:22 AM      16     83.00      0.00     16.00      0.00      0.00      1.00
09:44:22 AM      17     98.00      0.00      1.00      0.00      0.00      1.00
09:44:22 AM      18     96.97      0.00      2.02      0.00      0.00      1.01
09:44:22 AM      19     96.00      0.00      4.00      0.00      0.00      0.00
09:44:22 AM      20     97.98      0.00      1.01      0.00      0.00      1.01
09:44:22 AM      21     95.05      0.00      4.95      0.00      0.00      0.00
09:44:22 AM      22     94.95      0.00      5.05      0.00      0.00      0.00
09:44:22 AM      23     98.99      0.00      1.01      0.00      0.00      0.00
09:44:22 AM      24     98.99      0.00      1.01      0.00      0.00      0.00
09:44:22 AM      25     99.00      0.00      1.00      0.00      0.00      0.00
09:44:22 AM      26     98.99      0.00      1.01      0.00      0.00      0.00
09:44:22 AM      27     98.99      0.00      1.01      0.00      0.00      0.00
09:44:22 AM      28     98.00      0.00      2.00      0.00      0.00      0.00
09:44:22 AM      29     98.00      0.00      2.00      0.00      0.00      0.00
09:44:22 AM      30     94.95      0.00      5.05      0.00      0.00      0.00
09:44:22 AM      31     97.03      0.00      1.98      0.00      0.00      0.99
09:44:22 AM      32     98.02      0.00      1.98      0.00      0.00      0.00
09:44:22 AM      33     99.00      0.00      1.00      0.00      0.00      0.00
09:44:22 AM      34     98.00      0.00      1.00      0.00      0.00      1.00
09:44:22 AM      35     97.98      0.00      2.02      0.00      0.00      0.00
09:44:22 AM      36     94.00      0.00      5.00      0.00      0.00      1.00
09:44:22 AM      37     98.02      0.00      0.99      0.00      0.00      0.99
09:44:22 AM      38     97.98      0.00      1.01      0.00      0.00      1.01
09:44:22 AM      39     89.00      0.00     11.00      0.00      0.00      0.00
09:44:22 AM      40     83.00      0.00     13.00      0.00      0.00      4.00
09:44:22 AM      41     97.00      0.00      3.00      0.00      0.00      0.00
09:44:22 AM      42     91.92      0.00      8.08      0.00      0.00      0.00
09:44:22 AM      43     94.06      0.00      5.94      0.00      0.00      0.00
09:44:22 AM      44     92.93      0.00      7.07      0.00      0.00      0.00
09:44:22 AM      45     97.00      0.00      3.00      0.00      0.00      0.00
09:44:22 AM      46     99.00      0.00      1.00      0.00      0.00      0.00
09:44:22 AM      47     98.99      0.00      1.01      0.00      0.00      0.00

Output of sar -B 2 5:

09:44:24 AM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
09:44:26 AM  14852.00  71776.00 101443.50      0.00 216420.00      0.00      0.00      0.00      0.00
09:44:28 AM  14336.00    184.00   5123.00      0.00  47167.50      0.00      0.00      0.00      0.00
09:44:30 AM  14418.00 203778.00  67194.50      0.00 132952.50      0.00      0.00      0.00      0.00
09:44:32 AM  14352.00 220796.00   2475.00      0.00  59666.00      0.00      0.00      0.00      0.00
09:44:34 AM  13318.00  56996.00  16290.00      0.00   9599.00      0.00      0.00      0.00      0.00
Average:     14255.20 110706.00  38505.20      0.00  93161.00      0.00      0.00      0.00      0.00

Output from the vmstat command:

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
65  0 3505188 6265864 4828612 304096576    0    0   137   127    0    0 49  1 50  0  0
63  1 3505188 6068484 4828660 304294848    0    0 12292 41500 95782 88751 98  2  1  0  0
66  0 3505188 5933464 4828672 304429248    0    0 14668 130968 85788 90844 97  2  1  0  0

r: The number of processes waiting for run time.

From the kernel messages we get:

[117426425.532990] blk_update_request: critical medium error, dev sdc, sector 116127985
[117426431.038365] sd 0:0:3:0: [sdc] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[117426431.038374] sd 0:0:3:0: [sdc] tag#0 Sense Key : Medium Error [current] [descriptor] 
[117426431.038378] sd 0:0:3:0: [sdc] tag#0 Add. Sense: Unrecovered read error
[117426431.038383] sd 0:0:3:0: [sdc] tag#0 CDB: Read(16) 88 00 00 00 00 00 06 eb f8 f0 00 00 00 08 00 00
[117426431.038386] blk_update_request: critical medium error, dev sdc, sector 116127985
[139602560.596832] traps: polkitd[27641] general protection ip:7f7996318cf2 sp:7ffe7a28e5b0 error:0 in libmozjs-17.0.so[7f79961da000+3b3000]
[144770588.094226] sd 0:0:11:0: [sdk] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[144770588.094238] sd 0:0:11:0: [sdk] tag#0 Sense Key : Medium Error [current] [descriptor] 
[144770588.094242] sd 0:0:11:0: [sdk] tag#0 Add. Sense: Unrecovered read error
[144770588.094248] sd 0:0:11:0: [sdk] tag#0 CDB: Read(16) 88 00 00 00 00 00 01 15 20 00 00 00 02 00 00 00

So, based on the output above, does it make sense that the root cause of the very low CPU idle is the disk errors reported in the kernel messages?

Asked By: yael

||

Based on the timestamps, nearly a year passed between the two disk errors in your logs, so no, they’re not the reason your system isn’t idling.
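To make the timestamp argument concrete, here is a minimal sketch of the arithmetic. It assumes the bracketed dmesg values are seconds since boot (the usual format) and uses the last sdc line and the last sdk line from the question. The function and constant names are only illustrative.

```python
# Seconds-since-boot timestamps copied from the dmesg lines in the question.
SDC_ERROR_TS = 117426431.038386  # last sdc "critical medium error" line
SDK_ERROR_TS = 144770588.094248  # last sdk "Unrecovered read error" line

SECONDS_PER_DAY = 86400


def days_between(earlier: float, later: float) -> float:
    """Return the gap between two dmesg timestamps, in days."""
    return (later - earlier) / SECONDS_PER_DAY


gap_days = days_between(SDC_ERROR_TS, SDK_ERROR_TS)
print(f"{gap_days:.1f} days between the sdc and sdk errors")  # 316.5 days
```

Two isolated errors roughly 316 days apart cannot explain CPUs that are busy right now. The sar output points the same way: %iowait stays near 0 while %user is around 96%.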

As an aside, note that

r: The number of processes waiting for run time.

isn’t accurate: in vmstat, the r column shows the number of runnable processes, i.e. the number of processes either running or waiting to run. If you have many logical CPUs then a high number here isn’t a problem.
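As a rough illustration of why the r value has to be read against the CPU count, the sketch below divides the r samples from the vmstat output by the number of logical CPUs. The count of 48 is inferred from the per-CPU sar listing (CPUs 0 to 47), and "runnable tasks per CPU" is only a rule-of-thumb reading, not an official metric.

```python
# r values taken from the three vmstat samples in the question.
R_SAMPLES = [65, 63, 66]

# Assumption: 48 logical CPUs, inferred from sar listing CPUs 0 through 47.
LOGICAL_CPUS = 48


def runnable_per_cpu(runnable: int, logical_cpus: int) -> float:
    """Return how many runnable (running or waiting) tasks there are per CPU."""
    return runnable / logical_cpus


for r in R_SAMPLES:
    ratio = runnable_per_cpu(r, LOGICAL_CPUS)
    print(f"r={r}: {ratio:.2f} runnable tasks per logical CPU")
```

A ratio a little above 1 means every CPU has something to run and a few tasks are queued. That matches the high %user figures and suggests a CPU-bound workload rather than one stalled on disk.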

Answered By: Stephen Kitt