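While doing a routine inspection for a project recently, I found the following entries in the error log:
9F3821FD 0305125711 P S topsvcs NIM excessive packet traffic
864D2CE3 0305114911 P S topsvcs NIM thread blocked
864D2CE3 0305114911 P S topsvcs NIM thread blocked
After some research, the problem can be handled as follows.
Description: This message indicates that a thread in the NIM was blocked.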
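Detailed explanation: This message shows that a thread in the NIM did not respond for a long time, or was blocked for a period of time. Depending on the type of thread and how long it was blocked, the adapter served by that NIM process may be considered down.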
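The summary lines at the top of this post can be counted per identifier to see how often the event recurs. The following is a minimal sketch, assuming the usual errpt short-listing columns (identifier, timestamp as MMDDhhmmyy, type, class, resource name, description). The function names parse_summary_line and count_by_identifier are hypothetical helpers, not part of any AIX tool.

```python
from collections import Counter
from datetime import datetime

# Sample lines copied from the summary listing in this post.
SAMPLE = """\
9F3821FD 0305125711 P S topsvcs NIM excessive packet traffic
864D2CE3 0305114911 P S topsvcs NIM thread blocked
864D2CE3 0305114911 P S topsvcs NIM thread blocked
"""


def parse_summary_line(line):
    """Split one summary line into its six columns (hypothetical helper)."""
    identifier, stamp, err_type, err_class, resource, description = line.split(None, 5)
    return {
        "identifier": identifier,
        # Assumption: the timestamp column is MMDDhhmmyy.
        "time": datetime.strptime(stamp, "%m%d%H%M%y"),
        "type": err_type,
        "class": err_class,
        "resource": resource,
        "description": description.strip(),
    }


def count_by_identifier(text, resource="topsvcs"):
    """Count entries per identifier for one resource name."""
    counts = Counter()
    for line in text.splitlines():
        # Skip blank lines and the column header line, if present.
        if not line.strip() or line.startswith("IDENTIFIER"):
            continue
        entry = parse_summary_line(line)
        if entry["resource"] == resource:
            counts[entry["identifier"]] += 1
    return counts


if __name__ == "__main__":
    for identifier, total in count_by_identifier(SAMPLE).items():
        print(identifier, total)
```

On a live system the text would come from saved errpt output rather than the embedded sample.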
Example:
LABEL:           NIM thread blocked
IDENTIFIER:      864D2CE3

Date/Time:       Tue Jun 29 00:10:42 EDT
Sequence Number: 57941
Machine Id:      0027078A4C00
Node Id:         ammk37
Class:           S
Type:            PERM
Resource Name:   topsvcs

Description
NIM thread blocked

Probable Causes
A thread in a Topology Services Network Interface Module (NIM) process
was blocked
Topology Services NIM process cannot get timely access to CPU

User Causes
Excessive memory consumption is causing high memory contention
Excessive disk I/O is causing high memory contention

        Recommended Actions
        Examine I/O and memory activity on the system
        Reduce load on the system
        Tune virtual memory parameters
        Call IBM Service if problem persists

Failure Causes
Excessive virtual memory activity prevents NIM from making progress
Excessive disk I/O traffic is interfering with paging I/O

        Recommended Actions
        Examine I/O and memory activity on the system
        Reduce load on the system
        Tune virtual memory parameters
        Call IBM Service if problem persists

Detail Data
DETECTING MODULE
rsct,nim_control.C,1.29,5242
ERROR ID
6XnGH400jCs./RNU.0pK4g0...................
REFERENCE CODE

Thread which was blocked
receive thread
Interval in seconds during which process was blocked
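30
Interface name
tty2

In the example above, the system reports that this is a NIM thread blocked error, together with the probable causes and the recommended actions. This kind of error is usually caused by exhausted system resources or an abnormally large amount of I/O. "Thread which was blocked" identifies the thread that was blocked. "Interval in seconds during which process was blocked" gives how long the thread was blocked. "Interface name" gives the affected adapter.

Solution: If no adapter down event was generated, this message can be ignored, because it is not an error report that makes the cluster raise an alert. However, the following two methods can prevent or reduce this kind of error report.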
1. Upgrade the bos.rte.libpthreads fileset to the latest level.
2. Lower the NIM failure detection rate:
smitty hacmp
cluster config
cluster topology
configure Network Modules
Change a Network Module using Predefined Values
Set the values for both rs232 and Ethernet to a slower rate.
For example:
Network Module Name        ether
Description                Ethernet Protocol
Failure Detection Rate     Normal                 +
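To apply the rule that the message can be ignored when no adapter down event occurred, the three Detail Data fields discussed earlier can be pulled out of a saved detailed report (for instance, output captured from errpt -a -j 864D2CE3). This is only a sketch. It assumes each label sits on its own line with the value on the next line. The names parse_detail and triage are hypothetical, and whether an adapter down event was seen is supplied by the operator rather than detected by the code.

```python
# Labels taken from the Detail Data section of the example report.
BLOCKED_THREAD = "Thread which was blocked"
BLOCKED_SECONDS = "Interval in seconds during which process was blocked"
INTERFACE = "Interface name"
DETAIL_LABELS = (BLOCKED_THREAD, BLOCKED_SECONDS, INTERFACE)

# Sample detail lines copied from the example report in this post.
SAMPLE = """\
Thread which was blocked
receive thread
Interval in seconds during which process was blocked
30
Interface name
tty2
"""


def parse_detail(report):
    """Return {label: value}, assuming the value is on the line after the label."""
    lines = [line.strip() for line in report.splitlines()]
    fields = {}
    for index, line in enumerate(lines[:-1]):
        if line in DETAIL_LABELS:
            fields[line] = lines[index + 1]
    return fields


def triage(report, adapter_down_seen):
    """Summarize one report; adapter_down_seen is supplied by the operator."""
    fields = parse_detail(report)
    interface = fields.get(INTERFACE, "unknown interface")
    thread = fields.get(BLOCKED_THREAD, "unknown thread")
    seconds = fields.get(BLOCKED_SECONDS, "?")
    verdict = "investigate" if adapter_down_seen else "safe to ignore"
    return f"{interface}: {thread} blocked for {seconds}s, {verdict}"


if __name__ == "__main__":
    print(triage(SAMPLE, adapter_down_seen=False))
```

If entries keep appearing for the same interface, that is a hint to try the two methods above.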