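Could someone experienced help me analyze this javacore file? I have narrowed the problem down to Thread-1. According to the stack, the failure happened during the call to java.lang.Class.forName1, inside clSearchForNameCache. Beyond that I can't tell much, so please take a look and share any ideas.
Also, this javacore was not triggered by an OOM; it was produced because signal 11 was received.
[b]Additional details:[/b]
Thanks to RednaxelaFX for the reply. Judging from the stack trace, Class.forName was called while Shutdown was running, but I really can't tell what triggered the error. I'm posting the IBM JDK source, though only for the two classes that appear in the stack. Any advice is welcome.
[b]Additional details:[/b]
The error does not happen every time; it is somewhat random, but whenever it does happen the cause is the same. I'm uploading the core dump file produced at the time of the failure in the hope that it helps. How should I analyze this dump to get more information? With gdb?
[b]Additional details:[/b]
Important finding (output after loading the core dump in gdb):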
Core was generated by `/usr/local/java14/bin/java -classpath /home/mdebt/app/dist-NO_JRE//config:/home'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /lib/tls/libpthread.so.0...done.
Loaded symbols for /lib/tls/libpthread.so.0
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /usr/local/java14/jre/bin/classic/libjvm.so...done.
Loaded symbols for /usr/local/java14/jre/bin/classic/libjvm.so
Reading symbols from /lib/tls/libm.so.6...done.
Loaded symbols for /lib/tls/libm.so.6
Reading symbols from /usr/local/java14/jre/bin/libute.so...done.
Loaded symbols for /usr/local/java14/jre/bin/libute.so
Reading symbols from /usr/local/java14/jre/bin/libjsig.so...done.
Loaded symbols for /usr/local/java14/jre/bin/libjsig.so
Reading symbols from /usr/local/java14/jre/bin/libdbgmalloc.so...done.
Loaded symbols for /usr/local/java14/jre/bin/libdbgmalloc.so
Reading symbols from /usr/local/java14/jre/bin/libxhpi.so...done.
Loaded symbols for /usr/local/java14/jre/bin/libxhpi.so
Reading symbols from /usr/local/java14/jre/bin/libhpi.so...done.
Loaded symbols for /usr/local/java14/jre/bin/libhpi.so
Reading symbols from /usr/local/java14/jre/bin/libjava.so...done.
Loaded symbols for /usr/local/java14/jre/bin/libjava.so
Reading symbols from /usr/local/java14/jre/bin/classic/libcore.so...done.
Loaded symbols for /usr/local/java14/jre/bin/classic/libcore.so
Reading symbols from /usr/local/java14/jre/bin/libzip.so...done.
Loaded symbols for /usr/local/java14/jre/bin/libzip.so
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /usr/local/java14/jre/bin/libjitc.so...done.
Loaded symbols for /usr/local/java14/jre/bin/libjitc.so
Reading symbols from /lib/libgcc_s.so.1...done.
Loaded symbols for /lib/libgcc_s.so.1
#0 clSearchForNameCache (ee=0x80762d4, name=0x10100bf0, classloader=0x0, init=TRUE) at /userlvl/cxia32142/src/jvm/sov/cl/clresolver.c:1542
1542 /userlvl/cxia32142/src/jvm/sov/cl/clresolver.c: No such file or directory.
in /userlvl/cxia32142/src/jvm/sov/cl/clresolver.c
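Signal 11 is SIGSEGV, i.e. a segfault, or segmentation fault in full. It usually points to an invalid memory access, most often a null pointer dereference. Still, the native part of the library hitting a segfault and dumping core right away... so much for "robustness".
It looks like java.lang.Class.forName was still being interpreted at the time of the dump. When execution reached the call to the native Class.forName1(), it used an invokestatic instruction; that instruction had been rewritten to invokestatic_quick, which is executed by the L0_invokestatic_quick__ function in the VM. clSearchForNameCache() most likely contains an instruction that performs an indirect read through EAX, something like mov eax, dword ptr [eax] (movl (%eax), %eax in AT&T syntax), but because EAX was 0, i.e. a null pointer, it segfaulted. It would help if the original poster could disassemble clSearchForNameCache in that libjvm.so; with the shared library loaded as it was in this process, the faulting instruction sits at address B7EE5301.
I know very little about IBM's JVM, and the analysis above is only a guess based on the information in the core dump. An accurate analysis will have to come from someone more knowledgeable.
A null pointer problem on the native side of the library is most likely not your fault... Reporting a bug to IBM is also a good option. Think about whether the class you are passing to forName can actually be found. If it can, the VM is probably at fault; if it cannot, find a way to make sure the VM can load the class you want.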
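The "can the class actually be found?" check suggested above could be done with a small standalone program run on the same classpath as the application. This is only a minimal sketch: the class name `ForNameCheck` and the helper `isLoadable` are hypothetical, not part of the poster's code or of the IBM JDK. It passes `initialize=false`, so it only tests whether the loader can see the class; it does not reproduce the `init=TRUE` call seen in the gdb frame, and it cannot rule out a VM bug.

```java
class ForNameCheck {
    /**
     * Hypothetical helper: returns true if the given loader can locate and
     * load the named class. initialize=false means static initializers are
     * not run, so this only checks visibility on the class path.
     */
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            // The loader (and its parents) could not find the class.
            return false;
        } catch (LinkageError e) {
            // Covers NoClassDefFoundError and other loading failures.
            return false;
        }
    }

    public static void main(String[] args) {
        // Check every class name given on the command line.
        ClassLoader loader = ForNameCheck.class.getClassLoader();
        for (int i = 0; i < args.length; i++) {
            System.out.println(args[i] + " loadable: " + isLoadable(args[i], loader));
        }
    }
}
```

Running it with the same `-classpath` as the failing process and the names of the classes the application loads reflectively shows whether any of them is simply missing.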
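Hmm, the Java side of the class library in IBM's JDK does indeed come mostly from Sun. But those two files don't seem to offer much useful information. Only one thing is particularly odd: in the Java-side stack trace, both active Shutdown methods are stopped at synchronized(Shutdown.class), and the very next activation record is Class.forName. In the native-side stack trace, the two unnamed functions are probably call stubs, and the program died inside the native Class.forName1(). I really don't understand who called Class.forName()...
Next, looking at the locks, the ones currently held are: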
[quote]2LKREGMON JITC CHA lock (0x081CC210): owner "Thread-1" (80764B0), entry count 1
2LKREGMON Heap lock (0x08081E88): owner "Thread-1" (80764B0), entry count 1
3LKWAITERQ Waiting to enter:
3LKWAITER "Thread-0" (82BCAE8)
2LKREGMON Monitor Cache lock (0x08081DC8): owner "Thread-1" (80764B0), entry count 1
2LKREGMON Thread queue lock (0x080768A8): owner "Thread-1" (80764B0), entry count 1
2LKREGMON Monitor Registry lock (0x08081F48): owner "Thread-1" (80764B0), entry count 1[/quote]
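All of these locks are held by Thread-1, the current thread. CHA stands for class hierarchy analysis; the lock is probably taken because Class.forName has to keep the state of the class hierarchy consistent. Anyway, even with this information I still can't tell what called Class.forName(). Strange.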
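If the application itself makes reflective loads, one way to find out which code path requests a class is to route those calls through a small logging wrapper. The sketch below is hypothetical (`ForNameTracer` and `tracedForName` are made-up names) and only catches calls that go through it; a Class.forName issued from inside the JDK's own Shutdown code would not be visible this way. It uses `Class<?>`, which needs Java 5 or later; on a 1.4 VM like the one in this thread the raw `Class` type would be used instead.

```java
class ForNameTracer {
    /**
     * Hypothetical wrapper: logs the requesting thread and its call stack
     * to stderr, then delegates to Class.forName. Note that the class is
     * resolved with ForNameTracer's own class loader, not the caller's.
     */
    static Class<?> tracedForName(String name) throws ClassNotFoundException {
        StackTraceElement[] stack = new Throwable().getStackTrace();
        System.err.println("Class.forName(\"" + name + "\") requested on thread "
                + Thread.currentThread().getName());
        // Frame 0 is tracedForName itself, so start at 1 to print its callers.
        for (int i = 1; i < stack.length; i++) {
            System.err.println("    at " + stack[i]);
        }
        return Class.forName(name);
    }

    public static void main(String[] args) throws ClassNotFoundException {
        Class<?> c = tracedForName("java.lang.String");
        System.out.println("loaded " + c.getName());
    }
}
```

If the last trace logged before a crash comes from a shutdown hook or a finalizer, that would at least narrow down which application code was active when the VM failed.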
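Also, as the earlier reply already mentioned, the error occurred inside the native Class.forName1, whose source would be C or C++ rather than Java. If the error can be reproduced reliably under certain conditions, it would be worth using gdb to set a breakpoint on IBMJVM_ForName or clSearchForNameCache in libjvm.so and check what values are actually being passed in...
Sorry for rambling so much; I really am not familiar with IBM's JVM T T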
[url]http://www.javakb.com/Uwe/Forum.aspx/java-jvm/166/JNI-DetachCurrentThread-leads-to-Signal-11-SIGSEGV-on-Linux[/url]
[url]http://www.theserverside.com/discussions/thread.tss?thread_id=32403[/url]