How to determine which process used up the memory

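Hello everyone. This is my first time posting here; if this is the wrong board, please let me know or help move it to the right one.

First, the system architecture: there are two servers (one LDAP server for security authentication; one AP server hosting DB2/Informix/Cognos/Web service, etc.)
AIX 6.1.8.3 (6 CPU, 2 core, 60 GB RAM; stacksize (hard: unlimited, soft: 8192) + openfiles (hard: unlimited, soft: 8192))
Cognos 10.2, no fix pack (DB2 is the content DB, Informix is the report DB, Cognos is the report presentation tool)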

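The problem occurs on the AP server:
When the problem occurs (during stress tests on PROD or UAT), a core is produced. The dbx where output from the core points to a tcmalloc problem. (MALLOCTYPE=tcmalloc)
The developers therefore suggested disabling tcmalloc, so we switched to MALLOCTYPE=watson to work around the problem.
With MALLOCTYPE=watson the errors and cores did go away, but Cognos report performance suffered badly. (The monthly report run went from 50 minutes to 3 hours.)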
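To put a number on the allocator trade-off described above, the same job could be timed under each setting. The following is only a minimal sketch: the helper name run_with_malloctype and the stand-in child command are hypothetical, and the MALLOCTYPE values are simply the two mentioned in this post. The real report job would replace the stand-in command.

```python
import os
import subprocess
import sys
import time


def run_with_malloctype(command, malloctype):
    """Run command with MALLOCTYPE set; return (elapsed_seconds, stdout)."""
    env = dict(os.environ)          # copy so the parent environment is untouched
    env["MALLOCTYPE"] = malloctype  # value as used in the post, e.g. "watson"
    start = time.monotonic()
    result = subprocess.run(command, env=env, capture_output=True,
                            text=True, check=True)
    return time.monotonic() - start, result.stdout


if __name__ == "__main__":
    # Stand-in workload (hypothetical): a child that only echoes the variable.
    child = [sys.executable, "-c",
             "import os; print(os.environ['MALLOCTYPE'])"]
    for setting in ("tcmalloc", "watson"):
        elapsed, out = run_with_malloctype(child, setting)
        print(f"MALLOCTYPE={out.strip()} took {elapsed:.2f}s")
```

For reference, the figures quoted in the post (50 minutes versus 3 hours) correspond to a slowdown of about 3.6x.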

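Figure 1 shows the results of the two stress tests, based on the MEM section of the nmon analysis (#1 is MALLOCTYPE=tcmalloc, core produced; #2 is MALLOCTYPE=watson)
Figure 2 shows the MEMNEW section of the nmon analysis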
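The low point of free memory in each test can also be read straight from the raw nmon file instead of from the charts. The sketch below assumes the usual nmon CSV layout, where the first row of a section names its columns and later rows carry a snapshot tag such as T0001. The function name section_column is hypothetical, and the sample column names and values are invented, so the real header in your file should be checked first.

```python
import csv
import re

SNAPSHOT = re.compile(r"^T\d+$")  # nmon snapshot tags look like T0001


def section_column(lines, section, column):
    """Return [(snapshot, value)] for one named column of one nmon section."""
    header = None
    series = []
    for row in csv.reader(lines):
        if len(row) < 2 or row[0] != section:
            continue
        if not SNAPSHOT.match(row[1]):
            header = row  # first non-snapshot row of a section names its columns
            continue
        if header is None or column not in header:
            continue
        idx = header.index(column)
        if idx < len(row) and row[idx] != "":
            series.append((row[1], float(row[idx])))
    return series


# Hypothetical sample: column names and values are invented for illustration.
SAMPLE = """\
MEM,Memory demo,Real Free %,Real free(MB)
MEM,T0001,40.0,24576.0
MEM,T0002,1.5,921.6
MEM,T0003,12.0,7372.8
""".splitlines()

if __name__ == "__main__":
    free = section_column(SAMPLE, "MEM", "Real free(MB)")
    lowest = min(free, key=lambda item: item[1])
    print("lowest free memory:", lowest)
```

The snapshot tag of the lowest value can then be matched against the timestamp of the core file to see whether the crash coincides with the memory low point.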

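My questions now are:
Q1: NMON shows that available memory was indeed very low at the time, so why did only #1 produce a core, while stress test #2 produced no core or error?
Q2: I want to know whether this is an AIX OOM or an OOM inside the Cognos process. Where should I look?
Q3: UARG in NMON shows which processes were running at the time, but how do I find out which process used up the memory?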
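For Q3, one possible approach is to capture a process listing at intervals during the stress test and rank the processes by resident set size. This is a sketch under assumptions, not a confirmed procedure: it assumes a listing whose header row contains PID and RSS columns (as Berkeley-style ps aux typically prints), the function name top_rss is hypothetical, and the sample rows are invented.

```python
def top_rss(ps_text, limit=5):
    """Return up to `limit` (pid, rss_kb) pairs, largest resident set first."""
    lines = [ln for ln in ps_text.splitlines() if ln.strip()]
    header = lines[0].split()
    pid_col = header.index("PID")  # column positions come from the header row
    rss_col = header.index("RSS")
    rows = []
    for line in lines[1:]:
        fields = line.split()
        try:
            rows.append((int(fields[pid_col]), int(fields[rss_col])))
        except (IndexError, ValueError):
            continue  # skip rows that do not parse cleanly
    rows.sort(key=lambda r: r[1], reverse=True)
    return rows[:limit]


# Hypothetical listing: users, PIDs, sizes and commands are invented.
SAMPLE = """\
USER PID %CPU %MEM SZ RSS TTY STAT STIME TIME COMMAND
cognos 1001 12.0 30.0 900000 880000 - A 10:00:00 5:00 report_worker
db2inst 1002 3.0 5.0 200000 150000 - A 09:00:00 1:00 db_engine
root 1 0.0 0.0 700 600 - A 08:00:00 0:01 /etc/init
"""

if __name__ == "__main__":
    for pid, rss_kb in top_rss(SAMPLE, limit=2):
        print(f"pid {pid}: {rss_kb} KB resident")
```

Comparing several snapshots taken over the run shows which PID keeps growing, and that PID can then be matched against the UARG entries in NMON.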

Thanks in advance, everyone. If any more information is needed, I can add it.

mem_1.png



mem_2.png


anda's answer

Reply to #2 a156580801


There are roughly three types, but they all point to tcmalloc

#1

[using memory image in /tmp/cog_core/1/core.11665636.17061313]
reading symbolic information ...

Segmentation fault in unnamed block in tcmalloc::CentralFreeList::RemoveRange(void**,void**,int) at line 48 in file "" ($t29)
couldn't read "src/linked_list.h"
(dbx) where
unnamed block in tcmalloc::CentralFreeList::RemoveRange(void**,void**,int)(this = 0x00000378, start = 0x5bf98a40, end = 0x38a6cad0, N = 1499309056), line 48 in "linked_list.h"
unnamed block in tcmalloc::CentralFreeList::RemoveRange(void**,void**,int)(this = 0x00000378, start = 0x5bf98a40, end = 0x38a6cad0, N = 1499309056), line 48 in "linked_list.h"
tcmalloc::CentralFreeList::RemoveRange(void**,void**,int)(this = 0x00000378, start = 0x5bf98a40, end = 0x38a6cad0, N = 1499309056), line 48 in "linked_list.h"
tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*,unsigned long,int)(this = 0xd5599e10, src = 0xf17d24c0, cl = 950455056, N = -243360916), line 167 in "thread_cache.cc"
__malloc__(??) at 0xd5599e0c
malloc(??) at 0xd0112410
uprv_malloc_48_cognos() at 0xd9d626e0
unistr.icu_48_cognos::UnicodeString::allocate(int)() at 0xd9d7b5ac
unistr.icu_48_cognos::UnicodeString::cloneArrayIfNeeded(int,int,signed char,int**,signed char)() at 0xd9d7b1d0
unistr.icu_48_cognos::UnicodeString::doReplace(int,int,const wchar_t*,int,int)() at 0xd9d7add0
unistr.icu_48_cognos::UnicodeString::doReplace(int,int,const icu_48_cognos::UnicodeString&,int,int)() at 0xd9d7aa74
unistr.icu_48_cognos::UnicodeString::extractBetween(int,int,icu_48_cognos::UnicodeString&) const() at 0xd9d7a8f4
i18n_string.I18NString::insert(int,const I18NString&).I18NString::substring(int,int) const() at 0xd92d274c

#2
[using memory image in /tmp/cog_core/1/core.7930534.17061233]
reading symbolic information ...

Segmentation fault in unnamed block in tcmalloc::CentralFreeList::RemoveRange(void**,void**,int) at line 48 in file "" ($t36)
couldn't read "src/linked_list.h"
(dbx) where
unnamed block in tcmalloc::CentralFreeList::RemoveRange(void**,void**,int)(this = 0xd55ad458, start = 0x00000004, end = (nil), N = -247692128), line 48 in "linked_list.h"
unnamed block in tcmalloc::CentralFreeList::RemoveRange(void**,void**,int)(this = 0xd55ad458, start = 0x00000004, end = (nil), N = -247692128), line 48 in "linked_list.h"
tcmalloc::CentralFreeList::RemoveRange(void**,void**,int)(this = 0xd55ad458, start = 0x00000004, end = (nil), N = -247692128), line 48 in "linked_list.h"
tcmalloc::ThreadCache::FetchFromCentralCache(unsigned long,unsigned long)(this = 0xd5598304, cl = 980815824, byte_size = 980808784), line 167 in "thread_cache.cc"
unnamed block in tcmalloc.cc-0::do_memalign(unsigned long,unsigned long)(align = 0, size = 0), line 336 in "thread_cache.h"
unnamed block in tcmalloc.cc-0::do_memalign(unsigned long,unsigned long)(align = 0, size = 0), line 336 in "thread_cache.h"
unnamed block in tcmalloc.cc-0::do_memalign(unsigned long,unsigned long)(align = 0, size = 0), line 336 in "thread_cache.h"
tcmalloc.cc-0::do_memalign(unsigned long,unsigned long)(align = 0, size = 0), line 336 in "thread_cache.h"
__posix_memalign__(??, ??, ??) at 0xd559b5fc
posix_memalign(??, ??, ??) at 0xd0112168
_Fancy_malloc(unsigned long)(??) at 0xd135e9fc
newop.init_pres(unsigned long)(??) at 0xd135ec24
LogAuditIndication::clone() const() at 0xd433f750

#3
[using memory image in /tmp/cog_core/1/core.9437810.17061954]
reading symbolic information ...

Segmentation fault in unnamed block in tcmalloc.cc-0::do_memalign(unsigned long,unsigned long) at line 44 in file "" ($t5)
couldn't read "src/linked_list.h"
(dbx) where
unnamed block in tcmalloc.cc-0::do_memalign(unsigned long,unsigned long)(align = 3645120600, size = 762), line 44 in "linked_list.h"
unnamed block in tcmalloc.cc-0::do_memalign(unsigned long,unsigned long)(align = 3645120600, size = 762), line 44 in "linked_list.h"
unnamed block in tcmalloc.cc-0::do_memalign(unsigned long,unsigned long)(align = 3645120600, size = 762), line 44 in "linked_list.h"
tcmalloc.cc-0::do_memalign(unsigned long,unsigned long)(align = 3645120600, size = 762), line 44 in "linked_list.h"
__posix_memalign__(??, ??, ??) at 0xd559b5fc
posix_memalign(??, ??, ??) at 0xd0112168
_Fancy_malloc(unsigned long)(??) at 0xd135e9fc
newop.init_pres(unsigned long)(??) at 0xd135ec24
std::vector >::insert(std::_Ptrit,unsigned long,const RSAOMObjectRegistry::RegistrationEntry&)() at 0xd76eb3bc
RSAOMObjectRegistry.RSAOMObjectRegistry::create().RSAOMObjectRegistry::registerPointer(char*,RSAOMObjectRegistryI::RSDeleteMode)() at 0xd76ebdd0
RSAOMMessageIContentHandler::startElement(const unsigned short*,const unsigned short*,const unsigned short*,const xercesc_2_7::Attributes&)() at 0xd76f90cc
xercesc_2_7::SAX2XMLReaderImpl::startElement(const xercesc_2_7::XMLElementDecl&,unsigned int,const unsigned short*,const xercesc_2_7::RefVectorOf&,unsigned int,bool,bool)() at 0xd979a2c0
xercesc_2_7::IGXMLScanner::scanStartTagNS(bool&)() at 0xd96bc918
xercesc_2_7::IGXMLScanner::scanContent()() at 0xd96bb894
xercesc_2_7::IGXMLScanner::scanDocument(const xercesc_2_7::InputSource&)() at 0xd96bb17c
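With many core files, the grouping into types can be done mechanically. The sketch below reads a dbx where listing, reports whether the faulting frame is inside tcmalloc, and picks out the first frame that is not an allocator routine. The function name summarise_where and the ALLOCATOR_MARKERS list are hypothetical choices, and the sample is a shortened copy of core #1 above.

```python
ALLOCATOR_MARKERS = ("tcmalloc", "malloc", "memalign")  # assumed allocator hints


def summarise_where(dbx_text):
    """Summarise a dbx 'where' listing, or return None if there is none."""
    lines = [ln.strip() for ln in dbx_text.splitlines()]
    start = next((i + 1 for i, ln in enumerate(lines)
                  if ln == "(dbx) where"), None)
    if start is None:
        return None
    frames = [ln for ln in lines[start:] if ln]
    if not frames:
        return None
    # First frame whose text carries none of the allocator markers.
    caller = next((f for f in frames
                   if not any(m in f for m in ALLOCATOR_MARKERS)), None)
    return {
        "top": frames[0],
        "in_tcmalloc": "tcmalloc" in frames[0],
        "caller": caller,
    }


# Shortened copy of the core #1 trace from this thread.
SAMPLE = """\
(dbx) where
tcmalloc::CentralFreeList::RemoveRange(void**,void**,int)(this = 0x00000378, start = 0x5bf98a40, end = 0x38a6cad0, N = 1499309056), line 48 in "linked_list.h"
__malloc__(??) at 0xd5599e0c
malloc(??) at 0xd0112410
uprv_malloc_48_cognos() at 0xd9d626e0
unistr.icu_48_cognos::UnicodeString::allocate(int)() at 0xd9d7b5ac
"""

if __name__ == "__main__":
    summary = summarise_where(SAMPLE)
    print("faults in tcmalloc:", summary["in_tcmalloc"])
    print("first non-allocator frame:", summary["caller"])
```

If every core reports a tcmalloc top frame but the non-allocator callers differ, as in the three traces here, that is consistent with the fault surfacing in the allocator rather than in one specific caller.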
IT consulting services · 2015-08-25