
Saturday, October 10, 2009

String manipulation in Java

One topic that many developers overlook early in their careers is string manipulation. This article sheds some light on it so that you can build a habit of writing more efficient code. I will give some insight into how Java handles strings and show some better ways to manipulate them.
In Java, String objects are immutable: once you create a String object, it cannot be changed or modified. For example, look at the following code.
   String a = "A String Object";
   a = "Second String Object";
In the above code, we create a String object with the value "A String Object" at line#1 and assign its reference to the variable 'a'. At line#2 we appear to change the value of 'a' to "Second String Object". As mentioned a little earlier, a String object cannot be changed once it is created. What actually happens at line#2 is that the JVM [Java Virtual Machine] makes 'a' refer to a different String object holding the new value. The first object is left untouched, and once nothing refers to it any longer it becomes eligible for garbage collection. (String literals like these are kept in the JVM's string pool, so they typically outlive the variable.) Let's look at another piece of code.
   String a = "A String Object, ";
   a = a + "created in Java";
It looks almost the same as the first piece of code. The difference is that after creating a String object at line#1, at line#2 we append another string, "created in Java", to the existing one. The result we expect is "A String Object, created in Java". Internally, the JVM creates a new String object containing the combined text and assigns its reference to 'a'; the old object becomes eligible for garbage collection once nothing else references it.
The cost of this string manipulation [in the above pieces of code] might look negligible, but think about an application with code like this spread all over its source. Think how many String objects get created and discarded in a single run. All of this costs CPU cycles, both for creating the String objects and for garbage collecting the unreferenced ones.
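To make this concrete, here is a minimal sketch that keeps a second reference to the original string and checks what happens after the append. The class and method names (ImmutabilityDemo, concatenationCreatesNewObject) are made up for illustration only.

```java
final class ImmutabilityDemo {

    /** Returns true when '+' produced a new object and left the original one untouched. */
    static boolean concatenationCreatesNewObject() {
        String a = "A String Object, ";
        String original = a;            // second reference to the same String object
        a = a + "created in Java";      // 'a' now refers to a newly created String

        return a != original                                  // two different objects
                && original.equals("A String Object, ")       // the old object is unchanged
                && a.equals("A String Object, created in Java");
    }

    public static void main(String[] args) {
        System.out.println(concatenationCreatesNewObject()); // prints "true"
    }
}
```

The comparison with != is deliberate here: it compares references, not contents, which is exactly what shows that a second object was created.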
Now let's look at a better way of doing string manipulation. Java provides two classes that let you build and modify character sequences in place: StringBuffer and StringBuilder.
These two classes are almost the same, with only one difference: StringBuilder is unsynchronized, whereas StringBuffer is synchronized. In short, StringBuffer is thread safe and StringBuilder is not. We can use these classes to do string manipulation efficiently in our code. When should you use StringBuilder and when StringBuffer? If the object is accessed by only one thread at a time, it is better to use StringBuilder, which gives better performance in that scenario. If the same object is accessed by multiple threads, you need to use StringBuffer, because it is thread safe.
Now let's look at an example using one of these classes. Suppose you have String objects representing the parts of an address of a user, a company, or something like that. Say you have four String variables, 'streetAddress', 'city', 'state', and 'country', all properly initialized, and you want to generate one string from all these values. For example, if the above variables are initialized as given below,
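As a rough illustration of the multi-threaded case, the sketch below lets several threads append to one shared StringBuffer. The class name SharedBufferDemo and the method appendFromThreads are hypothetical, chosen only for this example. Because StringBuffer's append methods are synchronized, no characters are lost; swapping in StringBuilder here would give no such guarantee.

```java
final class SharedBufferDemo {

    /** Starts threadCount threads that each append appendsPerThread characters to one StringBuffer. */
    static int appendFromThreads(int threadCount, int appendsPerThread) {
        StringBuffer shared = new StringBuffer();   // synchronized, safe to share between threads
        Thread[] workers = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < appendsPerThread; j++) {
                    shared.append('x');
                }
            });
            workers[i].start();
        }

        for (Thread worker : workers) {
            try {
                worker.join();                      // wait until every worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return shared.length();
    }

    public static void main(String[] args) {
        System.out.println(appendFromThreads(4, 1000)); // prints "4000"
    }
}
```

When the builder never leaves a single method, as in most code, none of this locking is needed and StringBuilder is the lighter choice.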
   streetAddress = "Plot#10, Road#7";
   city = "Hyderabad";
   state = "Andhrapradesh";
   country = "India";
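And you want to have a composite string like "Plot#10, Road#7, Hyderabad, Andhrapradesh, India". Using the overloaded '+' operator, which becomes inefficient as soon as the string is built up across several statements or inside a loop (for a single expression the compiler already generates StringBuilder calls on its own), this looks like
   streetAddress + ", " + city + ", " + state + ", " + country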
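Now the better way, using the StringBuilder class and calling toString() once at the end, is
   StringBuilder sb = new StringBuilder();
   String address = sb.append(streetAddress).append(", ").append(city).append(", ").append(state).append(", ").append(country).toString();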
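A fuller, runnable version of the same idea might look like the sketch below. The class name AddressDemo and the helper joinParts are assumptions made for this example; the point is that one StringBuilder collects every part, and only a single String is created at the end.

```java
final class AddressDemo {

    /** Joins the given parts with ", " using a single StringBuilder. */
    static String joinParts(String... parts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(", ");        // separator goes between parts, not after the last one
            }
            sb.append(parts[i]);
        }
        return sb.toString();           // the only String created for the result
    }

    public static void main(String[] args) {
        String streetAddress = "Plot#10, Road#7";
        String city = "Hyderabad";
        String state = "Andhrapradesh";
        String country = "India";

        // prints "Plot#10, Road#7, Hyderabad, Andhrapradesh, India"
        System.out.println(joinParts(streetAddress, city, state, country));
    }
}
```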
The most common place where developers can improve string manipulation code is JSP pages. Taking the above example, to print the address some developers write
   <%=streetAddress + ", " + city + ", " + state + ", " + country %>
In a JSP page, one object [the implicit 'out' object, an instance of JspWriter] is already created to generate the output. Code like the above makes the JVM build an intermediate String just to hand it over to 'out'. Having read this far, you might think of using StringBuilder/StringBuffer, but that is not the right approach here either. Why? As mentioned above, the 'out' object already exists, so creating another StringBuilder/StringBuffer object doesn't make sense. The obvious solution is to use 'out' directly. You can write something like the following to generate the address string.
   <%=streetAddress%>, <%=city%>, <%=state%>, <%=country%>
If you think the above code makes the JSP look a little complex, you can write a static method to do the job for you. This method can take the 'out' object [JspWriter instance] and the values to be appended, and call the 'append()' method on 'out' to append the values.
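Here is one possible sketch of such a helper. It assumes the method accepts java.lang.Appendable rather than JspWriter itself; JspWriter extends java.io.Writer, which implements Appendable, so the 'out' object can be passed in, and the same method can be tried outside a servlet container with a StringBuilder. The names AddressWriter, writeAddress, and render are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

final class AddressWriter {

    /** Appends the parts, separated by ", ", straight to the given target. */
    static void writeAddress(Appendable out, String... parts) throws IOException {
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                out.append(", ");
            }
            out.append(parts[i]);
        }
    }

    /** Convenience wrapper for trying the helper without a JSP: collects the output in a StringBuilder. */
    static String render(String... parts) {
        StringBuilder target = new StringBuilder();
        try {
            writeAddress(target, parts);
        } catch (IOException e) {
            // StringBuilder never throws IOException; rethrow unchecked just in case
            throw new UncheckedIOException(e);
        }
        return target.toString();
    }

    public static void main(String[] args) {
        // prints "Plot#10, Road#7, Hyderabad, Andhrapradesh, India"
        System.out.println(render("Plot#10, Road#7", "Hyderabad", "Andhrapradesh", "India"));
    }
}
```

In a JSP, a scriptlet such as <% AddressWriter.writeAddress(out, streetAddress, city, state, country); %> would then write each value directly to 'out', without building any intermediate String.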
That is all I have for now about string manipulation. I hope it helps some of you write more efficient code.


Mohammed said...

In Java, Interfaces make things simple.

Notice that StringBuilder, StringBuffer, and JspWriter all implement the Appendable interface.

For the example of generating the complete address on a JSP, write a static method that takes an Appendable as an argument and performs the append operations on it, without worrying about the implementation class.

sharath said...

Why do we need an object to do simple string manipulation?

Rakesh Reddy said...

In Java, all strings are objects, instances of the String class :-)
