Thursday, 7 August 2014

XPATH to check if a given node is present or not

To check whether a node is present in an XML document, simply use the boolean() function.

So for the XML


<person id="1">
    <name>John Smith</name>
    <phone_home>12345</phone_home>
    <phone_office></phone_office>
</person>





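If we want to know whether phone_home is present, simply say

boolean(/person/phone_home)

Note that an empty node still counts as present, so boolean(/person/phone_office) is also true for this XML.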

Similarly if you want to check if any phone is present or not

boolean(/person/*[contains(name(.),"phone")])

and in the same way it can be extended to more complex queries, like if any phone is empty

boolean(/person/*[contains(name(.),"phone") and .=""])

or if any of the nodes are empty:

boolean(/person/*[.=""])
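To try these expressions from Java, a minimal sketch follows. The class name NodePresenceCheck and the helper check are hypothetical names picked for illustration, and /person/fax is a made-up path for a node that does not exist. The XPath strings use single quotes instead of double quotes only to avoid escaping inside Java string literals; XPath accepts both.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class NodePresenceCheck {

    // Same person document as above, inlined so the example is self-contained.
    static final String PERSON_XML =
            "<person id=\"1\"><name>John Smith</name>"
            + "<phone_home>12345</phone_home><phone_office></phone_office></person>";

    // Evaluates an XPath expression against the XML text and returns its boolean value.
    static boolean check(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            InputSource source = new InputSource(new StringReader(xml));
            return (Boolean) xpath.evaluate(expression, source, XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            "boolean(/person/phone_home)",                               // node present
            "boolean(/person/fax)",                                      // hypothetical node, absent
            "boolean(/person/*[contains(name(.),'phone')])",             // any phone node
            "boolean(/person/*[contains(name(.),'phone') and .=''])",    // any empty phone node
            "boolean(/person/*[.=''])"                                   // any empty node
        };
        for (String expression : expressions) {
            System.out.println(expression + " -> " + check(PERSON_XML, expression));
        }
    }
}
```

For the person XML above, only the /person/fax check should come back false, since phone_office is present but empty.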


The same approach can be applied based on node content.

So for XML


<jobs>
    <job seq="1">Job 1 Completed with status success</job>
    <job seq="2">Job 2 Completed with status success</job>
    <job seq="3">Job 3 Completed with status failure</job>
    <job seq="4">Job 4 Completed with status success</job>
</jobs>







If we need to know if any of the jobs failed:

boolean(/jobs/job[contains(.,"status failure")])
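A small Java sketch of the failed-job check is below. It goes one step beyond the expression above by also reading the seq attribute of the first failed job with string(), which is my addition and not part of the original expression. FailedJobCheck and its method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class FailedJobCheck {

    // Same jobs document as above, inlined so the example is self-contained.
    static final String JOBS_XML =
            "<jobs>"
            + "<job seq=\"1\">Job 1 Completed with status success</job>"
            + "<job seq=\"2\">Job 2 Completed with status success</job>"
            + "<job seq=\"3\">Job 3 Completed with status failure</job>"
            + "<job seq=\"4\">Job 4 Completed with status success</job>"
            + "</jobs>";

    // Evaluates the expression against JOBS_XML and returns the result as the requested type.
    static Object eval(String expression, QName type) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(JOBS_XML)), type);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    // True if the text of any job contains "status failure".
    static boolean anyFailed() {
        return (Boolean) eval("boolean(/jobs/job[contains(.,'status failure')])",
                XPathConstants.BOOLEAN);
    }

    // seq of the first failed job in document order, or "" if none failed.
    static String firstFailedSeq() {
        return (String) eval("string(/jobs/job[contains(.,'status failure')]/@seq)",
                XPathConstants.STRING);
    }

    public static void main(String[] args) {
        System.out.println("Any job failed: " + anyFailed());
        System.out.println("First failed seq: " + firstFailedSeq());
    }
}
```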

Also, the same approach can be used based on attribute name

So for XML


<details> <detail name="John Smith"/> <detail phone_home="12345"/> <detail phone_office="54321"/> </details>






If we need to know whether any attribute has phone in its name, then:

boolean(/details/detail[contains(name(./@*), "phone")])

And, just as with node names, we can check if any phone attribute is empty:

boolean(/details/detail[contains(name(./@*), "phone") and ./@* = ""])

or if any of the attributes is empty:

boolean(/details/detail[./@* = ""])
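One caveat worth sketching in Java: in XPath 1.0, name() applied to a node-set looks only at the first node in document order, so name(./@*) works here because every detail element carries exactly one attribute. The variant below, which steps onto the attribute axis and filters each attribute on its own, is my suggestion rather than something from the expressions above. AttributeNameCheck, check and the two-attribute sample input are hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class AttributeNameCheck {

    // Same details document as above, inlined so the example is self-contained.
    static final String DETAILS_XML =
            "<details><detail name=\"John Smith\"/><detail phone_home=\"12345\"/>"
            + "<detail phone_office=\"54321\"/></details>";

    // Hypothetical input: one element with two attributes, the phone one being empty.
    static final String MIXED_XML =
            "<details><detail id=\"7\" phone_home=\"\"/></details>";

    // Evaluates an XPath expression against the XML text and returns its boolean value.
    static boolean check(String xml, String expression) {
        try {
            return (Boolean) XPathFactory.newInstance().newXPath().evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Expression from the post: fine while each element has a single attribute.
        String original = "boolean(/details/detail[contains(name(./@*), 'phone')])";
        // Variant that tests every attribute individually.
        String anyPhone = "boolean(/details/detail/@*[contains(name(), 'phone')])";
        String emptyPhone = "boolean(/details/detail/@*[contains(name(), 'phone') and . = ''])";

        System.out.println(check(DETAILS_XML, original));
        System.out.println(check(DETAILS_XML, anyPhone));
        System.out.println(check(DETAILS_XML, emptyPhone));
        System.out.println(check(MIXED_XML, emptyPhone));
    }
}
```

With several attributes on one element, the result of the name(./@*) form depends on which attribute the processor treats as first, so the per-attribute variant is the safer choice there.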

Hope this helps

Thursday, 23 January 2014

Java process memory: JVM heap size constraints in 32 bit environment

In theory, the addressable memory on a given computer system is 2^b bytes, where b is the bit width of the processor architecture. (Although there is much more to it, such as PAE.)

So the maximum addressable memory on a 32-bit system is:
2^32 bytes = 4 × 1024 × 1024 × 1024 bytes = 4096 MB = 4 GB

Out of this 4 GB, some memory is reserved for the OS; the amount varies from OS to OS. So the maximum memory any process can occupy is less than 4 GB. Further, a Java process has several components to its memory usage: JVM heap, PermGen space and native thread stacks.

Roughly,
JVM heap size + PermGen space + (thread stack size × number of threads) < maximum space the OS allows for a process.
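The rough inequality above can be worked through with some numbers. The sketch below is only arithmetic; the heap, PermGen, stack size and thread count in main are hypothetical values chosen for illustration, not measurements, and ProcessMemoryBudget is a made-up class name.

```java
class ProcessMemoryBudget {

    static final long MB = 1024L * 1024L;

    // Rough footprint in bytes: heap + PermGen + (thread stack size * number of threads).
    static long estimateBytes(long heapMb, long permGenMb, long stackKb, int threads) {
        return heapMb * MB + permGenMb * MB + stackKb * 1024L * threads;
    }

    public static void main(String[] args) {
        long addressable = 1L << 32; // 2^32 bytes on a 32-bit system

        // Hypothetical settings: 1024 MB heap, 64 MB PermGen, 512 KB stacks, 200 threads.
        long estimate = estimateBytes(1024, 64, 512, 200);

        System.out.println("Addressable:         " + addressable / MB + " MB");
        System.out.println("Estimated footprint: " + estimate / MB + " MB");
    }
}
```

With these assumed values the estimate is 1024 + 64 + 100 = 1188 MB against 4096 MB of addressable space; what the OS actually grants a single process is lower than the addressable figure.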

So there is a practical limit on the maximum heap size you can have, and it depends on the underlying OS.

I did a few tests to see what happens when requesting different maximum heap sizes on different systems. The following table lists the result for each range of Xmx values on each system:


| Xmx (CentOS 32 bit) | Xmx (Ubuntu 32 bit) | Xmx (RHEL 64 bit) | Xmx (Win7 64 bit) | Xmx (Win XP 32 bit) | Xmx (SunOS 32 bit) | Result |
|---|---|---|---|---|---|---|
| > 4095m | > 4095m | > 4095m | > 4095m | > 4095m | > 4095m | The specified size exceeds the maximum representable size. |
| 4095m | 4095m | 4095m | 4095m | 4095m | 4095m | Incompatible minimum and maximum heap sizes specified |
| 4030m - 4094m | 4030m - 4094m | 4030m - 4094m | 4000m - 4094m | 4000m - 4094m | 4000m - 4094m | The size of the object heap + VM data exceeds the maximum representable size |
| 2700m - 4030m | 2685m - 4030m | 3730m - 4030m | 1500m - 4000m | 1460m - 4000m | 3750m - 4000m | Could not reserve enough space for object heap |
| < 2700m (approx) | < 2685m (approx) | < 3730m (approx) | < 1500m (approx) | < 1460m (approx) | < 3750m | JVM initialized |

A little more on the results:
  1. When you request 4096M (i.e. 4 GB) or more, the error is "The specified size exceeds the maximum representable size.", which is expected since 4 GB is the maximum addressable memory.
  2. At 4095M, strangely, it says "Incompatible minimum and maximum heap sizes specified". It looks like the value 4095 is treated the same as 0. I am not sure about this and need to do more research.
  3. From roughly 4000m to 4094m it says "The size of the object heap + VM data exceeds the maximum representable size", which is again understandable: as mentioned before, a Java process has other components too, so the overall process size goes beyond 4096M (the limit on addressable memory).
  4. Going lower on heap, we get "Could not reserve enough space for object heap", so now we see the OS-specific limits. Each OS allows a different amount of memory for a process, and subtracting the non-heap part of the Java process gives the limit on heap size. Even the non-heap part (thread stack size and limit on threads) is OS specific.
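One way to repeat this kind of probing is to launch a child JVM with a given -Xmx and capture what it prints. The sketch below is an assumption about how such a test could be automated, not a record of how the numbers above were collected. It assumes a java executable on the PATH, and XmxProbe, buildCommand and probe are hypothetical names. On a 64-bit JVM these heap sizes will usually just start, so the 32-bit messages above only show up with a 32-bit JDK.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Arrays;
import java.util.List;

class XmxProbe {

    // Command that starts a JVM with the given max heap and only prints its version.
    static List<String> buildCommand(int maxHeapMb) {
        return Arrays.asList("java", "-Xmx" + maxHeapMb + "m", "-version");
    }

    // Runs the command and returns the exit code plus everything the child JVM printed.
    static String probe(int maxHeapMb) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(buildCommand(maxHeapMb));
        builder.redirectErrorStream(true); // merge stderr into stdout so no message is missed
        Process process = builder.start();
        StringBuilder output = new StringBuilder();
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append(System.lineSeparator());
            }
        }
        int exitCode = process.waitFor();
        return "exit " + exitCode + ": " + output.toString().trim();
    }

    public static void main(String[] args) throws Exception {
        // Sample sizes spanning the ranges in the table; adjust to narrow down a boundary.
        for (int mb : new int[] {1024, 2048, 3072, 4000, 4095, 4096}) {
            System.out.println("-Xmx" + mb + "m -> " + probe(mb));
        }
    }
}
```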

All of these tests were performed using the 32-bit jdk1.7.0_04 with default PermGen space, default thread stack size and default thread limits. If you tweak these, the heap size can be increased a little.
So in these limited test environments (with default settings) the trend is roughly as follows:
  • Whether the OS is 32 bit or 64 bit has very little bearing on the results.
  • Server class Unix based OS (SunOS and RHEL) > Desktop Linux (Ubuntu, CentOS) > Windows Desktop (Win7, WinXP)
I still need to check this on Windows Servers, but I think the results will fall between the Windows desktop and Linux desktop values, closer to the Windows desktop ones.

Wednesday, 1 January 2014

Evaluate XPATH in Java

This is straightforward and, I think, better than iterating through the document tree. There is an evaluate method available in the XPath interface.

Here is some sample code for it:

import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;


public class ApplyXPATH {

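    public static void main(String[] args) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The InputSource is given the system ID (path or URL) of the XML file to query.
        InputSource inputSource = new InputSource("XMLs/MultiNode.xml");
        String expression = "//node/data";
        try {
            // NODESET makes evaluate() return every matching node as a NodeList.
            NodeList outPut = (NodeList) xpath.evaluate(expression, inputSource, XPathConstants.NODESET);
            for (int i = 0; i < outPut.getLength(); i++) {
                // Assumes each matched element has a text child, as in the sample input.
                System.out.println(outPut.item(i).getFirstChild().getNodeValue());
            }
        } catch (XPathExpressionException e) {
            // Evaluation failed, so there is no node list to print.
            e.printStackTrace();
        }
    }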
}


Sample Input:
<root>
    <node>
        <data>123</data>
        <data>456</data>
        <junk>abc</junk>
    </node>
    <junknode>
        <junk>def</junk>
    </junknode>
    <node>
        <data>789</data>
        <junk>ghi</junk>
        <data>012</data>
    </node>
</root>


Sample Output:
123
456
789
012

Here I am just printing the node values but you can easily create a new XML document and append this nodelist to it.
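A minimal sketch of that idea follows: it copies the matched nodes under a new root element and serializes the result. CollectMatches, collect and the root element name matches are hypothetical, and the input is a shortened inline version of the sample rather than the XMLs/MultiNode.xml file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CollectMatches {

    // Copies every node matched by the expression under a new root and returns the XML text.
    static String collect(String xml, String expression, String rootName) {
        try {
            NodeList matches = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);

            Document result =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = result.createElement(rootName);
            result.appendChild(root);
            for (int i = 0; i < matches.getLength(); i++) {
                // Matched nodes belong to the source document, so deep-copy them first.
                root.appendChild(result.importNode(matches.item(i), true));
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(result), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not collect nodes for " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><node><data>123</data><junk>abc</junk></node>"
                + "<node><data>789</data></node></root>";
        System.out.println(collect(xml, "//node/data", "matches"));
    }
}
```

The importNode call is the important step: appending a node from another document directly would fail with a wrong-document error.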

PS: if you change the expression to //node/data/text() then to access the value you just need to do outPut.item(i).getNodeValue()