<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Neuroning: Graphical Thread Dumps</title>
    <link>http://neuroning.com/articles/2005/11/24/graphical-thread-dumps</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>My mind workout on Software Development</description>
    <item>
      <title>Graphical Thread Dumps</title>
      <description>&lt;p&gt;I am surprised by the high number of Java developers I meet who do
not know what a Java &lt;a href="http://java.sun.com/developer/technicalArticles/Programming/Stacktrace/"&gt;Thread Dump&lt;/a&gt; is or how to generate one. I find it
a very powerful tool, and it is always available as part of the JVM.
I haven&amp;#8217;t played much with Java 5 yet, but it comes with &lt;a href="http://java.sun.com/j2se/1.5.0/docs/tooldocs/share/jstack.html"&gt;jstack&lt;/a&gt;, a new tool that makes it easier to generate thread dumps.&lt;/p&gt;

&lt;p&gt;Earlier this year, I was working on a load test for a well-known
airline. We were tuning the environment as much as we could, monitoring and
profiling to know where to focus our optimization efforts. The
solution involved a fairly high stack: Apache httpd, WebSphere,
FuegoBPM, Tibco messaging, Oracle RAC.&lt;/p&gt;

&lt;p&gt;The system was holding load pretty well up to a certain point, at which
it immediately halted and stopped processing new requests. Every time
we ran the load testing scripts we experienced the same symptoms. Not
even the official testers &amp;#8211;with allegedly powerful testing and
monitoring tools&amp;#8211; were able to identify the cause of the problem.&lt;/p&gt;

&lt;p&gt;So, I decided to get a few Thread Dumps of WebSphere&amp;#8217;s JVM. On Unix, you do &amp;#8220;&lt;code&gt;kill -3 &amp;lt;pid&amp;gt;&lt;/code&gt;&amp;#8221; and the dump goes to WebSphere&amp;#8217;s &lt;code&gt;native_stdout.log&lt;/code&gt;. We
inspected the dumps but couldn&amp;#8217;t identify deadlocks or any other
obvious anomaly, although the answer was right before our eyes.&lt;/p&gt;

&lt;p&gt;Since the thread dump was daunting, I decided to spend a bit of my time on some fun work: I
wrote a short Ruby script to create a graphical representation of the
dump, showing the locks each thread was holding, and the locks each
thread was waiting on. The heavy work of actually drawing and laying
out the graph was left to &lt;a href="http://www.graphviz.org"&gt;GraphViz&lt;/a&gt;&amp;#8217;s &lt;code&gt;dot&lt;/code&gt;.&lt;/p&gt;
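
&lt;p&gt;The idea can be sketched in a few lines of Ruby. This is only a minimal illustration, not the original &lt;code&gt;tdg.rb&lt;/code&gt;: the helper names &lt;code&gt;parse_dump&lt;/code&gt; and &lt;code&gt;to_dot&lt;/code&gt; are made up for this example, and it assumes Sun-style dump lines of the form &lt;code&gt;- locked &amp;lt;0x...&amp;gt;&lt;/code&gt; and &lt;code&gt;- waiting to lock &amp;lt;0x...&amp;gt;&lt;/code&gt; under a quoted thread name.&lt;/p&gt;

```ruby
# Sketch only: hypothetical helpers, not the author's tdg.rb.
# Assumes each thread starts with a line like "Thread Name" ... and that
# lock lines look like "- locked (0x...)" / "- waiting to lock (0x...)"
# with the lock id wrapped in angle brackets.

# Returns { thread name => { held: [lock ids], wanted: [lock ids] } }.
def parse_dump(text)
  threads = {}
  current = nil
  text.each_line do |line|
    if (m = line.match(/^"([^"]+)"/))
      # A quoted name at column 0 starts a new thread section.
      current = threads[m[1]] = { held: [], wanted: [] }
    elsif current and (m = line.match(/- locked\W+(0x\h+)/))
      current[:held].push(m[1])
    elsif current and (m = line.match(/- waiting to lock\W+(0x\h+)/))
      current[:wanted].push(m[1])
    end
  end
  threads
end

# Emits a GraphViz digraph: lock -> holder, and waiting thread -> lock.
def to_dot(threads)
  out = ["digraph thread_dump {"]
  threads.each do |name, locks|
    locks[:held].each   { |id| out.push(%(  "#{id}" -> "#{name}" [label="held by"];)) }
    locks[:wanted].each { |id| out.push(%(  "#{name}" -> "#{id}" [label="waits for"];)) }
  end
  out.push("}")
  out.join("\n")
end
```

&lt;p&gt;Printing &lt;code&gt;to_dot(parse_dump(File.read(path)))&lt;/code&gt; to a file (with &lt;code&gt;path&lt;/code&gt; pointing at a saved dump) and running it through something like &lt;code&gt;dot -Tpng graph.dot -o graph.png&lt;/code&gt; should produce a picture in which a heavily contended lock shows up as a node with many incoming edges.&lt;/p&gt;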

&lt;p&gt;Once the script was usable, I generated the graph for the
above-mentioned dump. To our delight, the graph immediately exposed the
problem. &lt;a href="/pages/thread-dump-graph"&gt;See for yourself&lt;/a&gt;.&lt;/p&gt;
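
&lt;p&gt;What the picture makes obvious can also be approximated in text by counting how many threads are blocked on each lock. The following is a rough sketch under the same assumptions about the dump format; &lt;code&gt;contended_locks&lt;/code&gt; is a hypothetical helper written for this example, not part of the original script.&lt;/p&gt;

```ruby
# Sketch only: ranks locks by the number of threads waiting to acquire them.
# Assumes Sun-style dump lines; the helper name is hypothetical.
def contended_locks(dump_text)
  holder  = {}                              # lock id => thread holding it
  waiters = Hash.new { |h, k| h[k] = [] }   # lock id => blocked thread names
  thread  = nil
  dump_text.each_line do |line|
    if (m = line.match(/^"([^"]+)"/))
      thread = m[1]
    elsif thread and (m = line.match(/- locked\W+(0x\h+)/))
      holder[m[1]] = thread
    elsif thread and (m = line.match(/- waiting to lock\W+(0x\h+)/))
      waiters[m[1]].push(thread)
    end
  end
  # Most contended lock first; holder is nil if no thread in the dump holds it.
  waiters.map { |id, names| { lock: id, holder: holder[id], waiters: names.size } }
         .sort_by { |row| -row[:waiters] }
end
```

&lt;p&gt;The first row of the result names the lock with the most blocked threads and the thread that holds it, which is the thread whose stack is worth reading first.&lt;/p&gt;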

&lt;p&gt;Every thread was waiting on a lock that was held by thread
&lt;em&gt;Servlet.Engine.Transports 1405&lt;/em&gt;. What was this thread doing?
Here&amp;#8217;s the stack, taken from the thread dump:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;...
"Servlet.Engine.Transports : 1405" daemon prio=5 tid=0x020cee40
                      nid=0x515ea runnable [8648f000..864919c0]
  at java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
  at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:221)
  at java.io.File.exists(File.java:680)
  at sun.misc.URLClassPath$FileLoader.getResource(URLClassPath.java:887)
  at sun.misc.URLClassPath.getResource(URLClassPath.java:157)
  at java.net.URLClassLoader$1.run(URLClassLoader.java:191)
  at java.security.AccessController.doPrivileged(Native Method)
  at java.net.URLClassLoader.findClass(URLClassLoader.java:187)
  at com.ibm.ws.bootstrap.ExtClassLoader.findClass(ExtClassLoader.java:79)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:289)
  - locked &amp;lt;0xb4c1be08&amp;gt; (a com.ibm.ws.bootstrap.ExtClassLoader)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:235)
...
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It was accessing the filesystem! After sharing the findings with the rest of
the team, one of the admins revealed that the WebSphere installation
was on a mounted NFS drive&amp;#8230; WebSphere&amp;#8217;s JVM was trying to reload
some .jar files, but it choked NFS under heavy load.&lt;/p&gt;

&lt;p&gt;Unfortunately, moving WebSphere to a non-NFS drive was not trivial (you
know&amp;#8230; a big company, with procedures, standards, bureaucracy), and
since the managers were already happy with the results of the load
test, we never had a chance to run it all again. I am still wondering
how much load it would have withstood without NFS (and
why WebSphere was reading those .jars so often to begin with).&lt;/p&gt;

&lt;p&gt;So, we found the bottleneck thanks to a thread dump, and &lt;em&gt;a picture is worth a thousand lines of thread dumps&lt;/em&gt; :-)&lt;/p&gt;

&lt;p&gt;Here is the quick and dirty script: &lt;a href="/images/articles/tdg.rb"&gt;tdg.rb&lt;/a&gt;&lt;/p&gt;</description>
      <pubDate>Thu, 24 Nov 2005 23:00:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:e2be0033ccb7ef365ebf1fb09a3a9639</guid>
      <author>F</author>
      <link>http://neuroning.com/articles/2005/11/24/graphical-thread-dumps</link>
      <category>Java</category>
      <category>Debugging &amp; Optimizing</category>
      <trackback:ping>http://neuroning.com/articles/trackback/7</trackback:ping>
    </item>
    <item>
      <title>"Graphical Thread Dumps" by Wiz</title>
      <description>Excellent Stuff F..!</description>
      <pubDate>Fri, 18 Nov 2005 22:47:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:</guid>
      <link>http://neuroning.com/articles/2005/11/24/graphical-thread-dumps#comment-20</link>
    </item>
  </channel>
</rss>
