Java Code Splitter

Struggling with Servlet 3.0

A couple of days ago, I wrote a blog post about a session which I had attended during JavaOne. It was a session about the upcoming Servlet 3.0 specification, also known as JSR-315. Back at work, I thought it might be fun to try some of the new features myself. We have this one application running in our production site, which stores events that occur during Poker games. The application uses a Service based on Hessian, which clients can call to store the events. The Hessian service is exposed using the HessianServlet from Caucho, so that it may be invoked from HTTP. The Servlet runs in Jetty 6 container. The application is heavily used and around 20 to 30 calls are made per second.

My idea was to rewrite some parts within the application, so that everything is based on Servlet 3.0. I have written a test case where I measure the execution time of 30 parallel Threads persisting 30 Poker events. I am hoping that using Servlet 3.0 asynchronous requests, I can see a performance improvement. I know that this is not the perfect use case where asynchronous requests can shine as there is nothing that can be optimized using parallel processing within a single Thread. Anyway, we will see how it goes. I am curious about the results anyway.

The first step towards Servlet 3.0 is to use a JSR-315 compliant Servlet container. At JavaOne everyone was talking about the upcoming Jetty 7 release and that it supports Servlet 3.0. This was my disappointment. Jetty 7 is not built on top of Servlet 3.0 but still only Servlet 2.5 compatible. The semiofficial reason is that Jetty has moved from Codehaus to Eclipse and JSR-315 is delayed anyways. My next pick would have been Tomcat 7 as I have seen a comparison matrix that indicated, that the next Tomcat would support Servlet 3.0 as well. Unfortunately I don't think development is that far. I found some source code in SVN but I don't know if it was official and it was also only Servlet 2.5 based. So my last resort was Glassfish v3 which is the reference implementation for Java EE6. The preview release of Glassfish v3 comes with a Servlet container that implements JSR-315. Perfect.

It was the first time I installed Glassfish. It was very easy. The application server ships with a web administration interface and is easy to maintain. Currently there are not so many tutorials and examples for Servlet 3.0 on the Web, so I looked forward to check some samples which ship with Glassfish. "After installation, samples are located in install-dir/glassfish/samples/javaee6" - well this directory just does not exist. Not in the standard preview nor in the web profile. Too bad, they were supposed to have a sample for asynchronous requests as well as adding Servlet's dynamically.

Anyway, I changed my application from a standalone jar distribution that starts a Jetty 6 container to a war distribution that is deployed in Glassfish v3. To deploy something in Glassfish, just copy it into the autodeploy directory of your domain in the glassfish directory. Since the old version uses the HessianServlet directly, I had to download the source from Caucho and modify it, so that it uses asynchronous requests from Servlet 3.0. Unfortunately the HessianServlet is not really built for extensibility, so I just got a copy of the whole file to play with. To use another Sevlet 3.0 feature, I decided to add the Servlet at runtime using the new ServletContext.addServlet method. I looked up a sample on how to write a ServletContextListener. Some old documentation about JSR-315 indicated that you had to annotate the Listener with the @WebServletContextListener annotation. This annotation does not exist anymore in the final draft of JSR-315. Instead you do it the oldschool way. Write a class that implements ServletContextListener and add it to the web.xml as a context-listener. Then in the contextInitialized method, I added my AsynchronousHessianServlet.

In the next posting I will write about asynchronous requests and if this really makes an existing application faster.

Update: the javaee6 samples will be downloaded and installed using the Glassfish updater tool. In my first install attempt, the updater would not work with my companies firewall, so I never got the samples folder. It works fine if your updater tools works. Would have been nice to mention on the Glassfish or Sun website.

Session of the day: Java NIO2 in JDK7

Today has been a good day at the JavaOne. I have seen quite a few great and useful talks. For the session of the day I have picked a talk by Alan Bateman and Carl Quinn from Netflix about the new IO API (JSR203) that will be available in JDK7.

In my own private projects I still use the old Java IO API and I guess that's perfectly fine if your application is not IO critical. At work however, we have multiple projects that make use of java.nio and are very much defendant on a good performance when it comes to files and directories. So what can JSR-203 do for us?

First of all there will be a class Path which is an abstraction to a physical file or directory resource. Path is basically what File used to be in plain Java IO. To create a Path instance you have a bunch of options. You can call FileSystems.getDefault().getPath("/foo/bar") or just Paths.get("/foo/bar"). One nice thing is that Path will implement java.lang.Iterable so that you can iterate over a physical path from root to current directory. If you want to know at which depth you currently are from the root just call Path.getNameCount(). Another nice thing, when you iterate over Path using the legacy Iterator idiom and you invoke iterator.remove, then the physical file gets deleted.

In the example code above you already saw something called FileSystem. This is the a of all Paths in NIO2. In JDK7 there will also be something called a Provider which you can leverage to create your own FileSystem. It will be possible to create a memory based FileSystem, a Desktop FS or a Hadoop FileSystem or anything else you can think of. You can even make your FileSystem the default FileSystem so that whenever your application calls FileSystems.getDefault() will return your custom FileSystem.

Another cool thing is the possibility to traverse a directory tree. NIO2 contains a Interface called FileVisitor. The Interface has a bunch of methods like preVisitDirectory, postVisitDirectory, visitFile, visitFileFailed etc. Each of the methods will be invoked at certain stages when traversing a file tree. For convinced JSR-203 ships with a bunch of implementations of FileVisitor like SimpleFileVisitor or InvokeFileVisitor. You can use one of these FileVisitor's and then only overwrite the methods that are interesting for you. To kick off traversal of the file tree you would call Files.walkFileTree(Path path, FileVisitor fileVisitor).

This becomes really handy when you use it in conjunction with another new class in JDK7 called PathMatcher. This is an Interface similar to FileFilter in old Java IO maybe. This is how to create a PathMatcher: FileSystems.getDefault().getPathMatcher("glob:*.log"). In this example it will select all files matching *.log. You can also use regular expressions instead of glob syntax.

If you look at the method signature of visitFile in the FileVisitor Interface you will notice that the second parameter is of type BasicFileAttributes, which are the attributes of the current file that visitFile is invoked with. So let's say you create your FileVisitor with a PathMather that selects *.log files. What you could do in the visitFile method, is to invoke PathMatcher.match and if it is a log file, check the file size attribute using the given BasicFileAttributes. If the file is bigger than a certain size, delete it. Pretty handy or? A piece of very short Java code that traverses a File tree and deleted logfiles of a certain size.




PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:*.{java,class}");

Path filename = ...;
if (matcher.matches(filename)) {
    System.out.println(filename);
}

A entirely different use case can be covered with WatchService, Watchable, WatchEvent and WatchKey. These guys make it possible to sit, listen and react to changes that occur to Path objects. First you get yourself a WatchService using the default FileSystem. WatchService watcher = FileSystems.getDefault().newWatchService(). The next step is to get the Path like before Paths.get("/foo/bar/old.log"). Then you register the Path with the WatchService to get a WatchKey: WatchKey key = path.register(watcher, ENTRY_CREATE, ENTRY_DELETE, ENTRY_MODIFY). The last parameters in the register method are varargs. In the example you will be watching create, delete and modification events on the specified Path. Finally you need to create an invinite loop that is constantly polling the events out of the WatchKey. Your code can then react to these events.




for (;;) {

    //wait for key to be signaled
    WatchKey key;
    try {
 key = watcher.take();
    } catch (InterruptedException x) {
 return;
    }

    for (WatchEvent event: key.pollEvents()) {
 WatchEvent.Kind kind = event.kind();

        //This key is registered only for ENTRY_CREATE events,
        //but an OVERFLOW event can occur regardless if events are
        //lost or discarded.
 if (kind == OVERFLOW) {
     continue;
 }

        // do your stuff
    }

    boolean valid = key.reset();
    if (!valid) {
        break;
    }
}

One new feature that is particular interesting for our applications is the new DirectoryStream class. You use it to access the contents of a directory. Well you could do this before but DirectoryStream scales much better and uses less resources. This will not be an issue if you have a couple of hunded files in your directory but make a huge difference if there are hundreds of thousand files in the directory. Here is how to use DirectoryStream.




Path dir = new File("/foo/bar").toPath(); // new method on File in JDK7
DirectoryStream stream = null;
try {
    stream = dir.newDirectoryStream();
    for (Path file: stream) {
        System.out.println(file.getName());
    }
} catch (IOException x) {
    ....
} finally {
    if (stream != null) stream.close();
}

Okay this post was maybe a bit too theoretical. You can check out what nio2 feels like with the OpenJDK or this great tutorial.

Session of the day: JVM debugging

The second day at JavaOne was surprisingly just average. Kohsuke had a great talk about distributed Hudson builds and Hudson EC2 integration but the rest of the sessions was pretty normal. Then I am going into this session “Debugging your production JVM” from Ken Sipe and the guy blasts the roof off. He is showing all these nifty, cool tools that can help to get an insight what is going on in your JVM. Not just one tool but many. I am really having a hard time to write everything down. A lot of command-line tools that already come with Java like jstat, jps or jmap. They are ready to be used, you just need to know how to use them with the right parameters etc.

Then Ken starts to talk about something really fancy - BTrace. So how can BTrace help to debug your production VM. Essentially it is a little tool that you can use at runtime to at debugging “aspects” to your running Java bytecode. I use the word Aspect here because when I first saw it, BTrace felt quite similar to AspectJ. What BTrace does, it takes a little Script that your write in Java and dynamically injects it as tracing code into the running JVM.

What I called Script here is pure Java code. What you write looks almost exactly like a plain Java Classes with a lot of Annotations. The Code you can write as BTrace Script is really limited however. Ken had a slide in the presentation about all the Java stuff that is not doable. I only remember that you could not use the new keyword to create new objects. Luckily I found the other restrictions on the BTrace website:

can not create new
can not create new arrays.
can not throw exceptions.
can not catch exceptions.
can not make arbitrary instance or static method calls - only the public static methods of com.sun.btrace.BTraceUtils class may be called from a BTrace program.
can not assign to static or instance fields of target program's classes and objects. But, BTrace class can assign to it's own static fields ("trace state" can be mutated).
can not have instance fields and methods. Only static public void returning methods are allowed for a BTrace class. And all fields have to be static.
can not have outer, inner, nested or local classes.
can not have synchronized blocks or synchronized methods.
can not have loops (for, while, do..while)
can not extend arbitrary class (super class has to be java.lang.Object)
can not implement interfaces.
can not contains assert statements.
can not use class literals.

Your hands are tied. Well almost. So let's have a look at a sample from the BTrace website.




import com.sun.btrace.annotations.*;
import static com.sun.btrace.BTraceUtils.*;

// @BTrace annotation tells that this is a BTrace program
@BTrace
public class HelloWorld {

// @OnMethod annotation tells where to probe.
// In this example, we are interested in entry
// into the Thread.start() method.
@OnMethod(
    clazz="java.lang.Thread",
    method="start"
)
public static void func() {
    // println is defined in BTraceUtils
    // you can only call the static methods of BTraceUtils
    println("about to start a thread!");
}
}

You can see, it is a standard Java class annotated with @BTrace. Then it says @OnMethod with two parameters which translates to – every time the start method is invoked in java.lang.Thread ... What it will do in that case is invoke the static function it will find within the @BTrace annotated class. It has to be a static method. I forgot if it had to follow a naming convention too. So the static method will be invoked every time a Thread is started. In the sample, it will just print out something fixed on the console. You could also count the number of Threads or other things.

Here is another example.




@BTrace public class Memory {
@OnTimer(4000)
public static void printMem() {
    println("Heap:");
    println(heapUsage());
    println("Non-Heap:");
    println(nonHeapUsage());
}
}

This will print out memory usage every 4 seconds. Awesome. Now something really huge.




@BTrace public class HistogramBean {
// @Property exposes this field as MBean attribute
@Property
private static Map histo = newHashMap();

@OnMethod(
    clazz="javax.swing.JComponent",
    method=""
)
public static void onnewObject(@Self Object obj) {
    ....
}

@OnTimer(4000)
public static void print() {
    if (size(histo) != 0) {
        printNumberMap("Component Histogram", histo);
    }
}
}

Don't worry about the details what it does right now. The important part is that you can annotate fields with @Property and have BTrace expose these fields as Mbean. All you need to do is inject the BTrace script a little bit different from the command line.

Some words on the @OnMethod, @OnTimer stuff. These Annotations are called probe points and there are more like @OnMemoryLow, @OnExit, @OnError etc. Another example is to use BTrace to monitor entering and leaving of synchronization blocks.

Unfortunately BTrace requires Java 6+, it will not work with 5. Now you have a good reason to step up your Java version.

Session of the day: Servlet 3.0

SVG and Canvas is really cool stuff and “Cross Browser Vector Graphics with SVG and Canvas” came really close to be my session of the day here at JavaOne, but the Servlet 3.0 stuff topped it all. There are so many great features in JSR-315. The final draft is now out and the guys said it will go live with Java EE6.

web.xml = history
First of all, the biggest difference is that you do not need a web.xml anymore. Servlet 3.0 fully relies on Java Annotations. If you want to declare a Servlet use the @WebServlet Annotation on your Servlet class. The class still has to inherit from HttpServlet, this remains unchanged. That means it is not possible to use other methods instead of doPost, doGet etc. and annotate them to mark them as Request handler methods. This is something you can do in the Jersey library (JSR-311). In theory I guess it would have been possible to not inherit from HttpServlet but the create a rule that for every class annotated with @WebServlet you have to have methods annotated with @PostMethod or @GetMethod. I guess there are good reasons not to do so in Servlet 3.0.

At the very minimum, you have to annotate specifying a URL under which to invoke your Servlet. If omitted, the full class name will be used as the Servlet name. Other parameters you can specify in the top-level @WebServlet Annotation are for instance if you want the Servlet to be usable in asynchronous requests. More on asynchronous requests later.




@WebServlet(url = "/foo", asynchronous = true)
public class SomeServlet extends HttpServlet {
 public void doGet(...) {
     ....
 }
}

So the web.xml is gone. Filters are added to the ServletContext using @WebFilter Annotation, Listeners are added using @WebListener Annotation. The deployment descriptor File web.xml is still useful though. It can be used to overwrite whatever you have specified using Class Annotations. So if you create a web.xml file, whatever you have in there has the final word when the Container starts up the ServletContext.

Servlets, Listeners and Filters can now also be added programatically. The ServletContext class has new methods like addServlet, addServletMapping, addFilter or addFilterMapping. On Container start up, you can hook in and add Servlets or whatever you want at Runtime.

Web Frameworks can plug in
Something that I think is really cool, is the possibility for Web Frameworks like Apache Wicket, Tapestry or Spring MVC to plug-in into the ServletContext creation. Remember that in the past, whenever you learned about a new web framework, there was this one section in the documentation where you had to add some Servlet, some Filter or some Listener to the web.xml?




<web-app>
 <display-name>Wicket Examples</display-name>
 <filter>
     <filter-name>HelloWorldApplication</filter-name>
     <filter-class>org.apache.wicket.protocol.http.WicketFilter</filter-class>
     <init-param>
       <param-name>applicationClassName</param-name>
       <param-value>org.apache.wicket.examples.helloworld.HelloWorldApplication</param-value>
     </init-param>
 </filter>
 <filter-mapping>
     <filter-name>HelloWorldApplication</filter-name>
     <url-pattern>/*</url-pattern>
 </filter-mapping>
</web-app>

This is now history in Servlet 3.0. Web Frameworks can supply something in their JAR-file deployables that is called web-fragment.xml. It is basically a light version of the web.xml and has almost the exact XML structure. What the Container does when it loads up, it will go into all the JAR files in WEB-INF/lib and scan for a web-fragment.xml File in the META-INF directory. If it finds one, the file will be used when creating the final ServletContext. Sometimes you want more control over how fragments are being pulled into the ServletContext creation. There are ways to control the ordering in which web-fragment.xml files are being put together. The library itself can specify a relative ordering in the web-fragment.xml file. For instance it can say, I am not interested in a particular order, load me last. It is also possible to define an absolute ordering in your own web.xml file which will have again the final word. One important thing of notice. Only the WEB-INF/lib directory will be used to scan for web fragments, not the classes directory.

Third party libraries can also add resources on their own. Whatever the web framework has in META-INF/resources becomes available from the context in which the library is loaded. For instance if you have META-INF/resources/foo.jsp then the resource is available from http://localhost:8080/foo.jsp. Kind of useful too.

New security, New Error Page

Security constraints can now also be expressed using Annotations. I forgot the correct Annotation names, I think it was something like @DenyRoles or something that you could put on for instance doPost methods to secure them. I vaguely remember also, that the standard security mechanism to display a Web Form (Form Based Authentication) is removed. What you now do instead is to properly annotate a method which will authenticate the User for you. You are therefore basically free to choose however you want to authenticate your Users. Unfortunately this is a feature in the Servlet Specification that I have almost never used in the last 10 years, so I did not pay too much attention when they talked about this and the security part was also kept short in the session.

Which brings me to the next improvement, default error pages. Remember back in the old Servlet 2.3+ days that you had to add a default error page for every error code? What a copy and paste mess. JSR-315 gives you the possibility to define default error pages. You can have for instance something like “show this page for all errors except 404”. Very handy.

New methods in Response
Remember that it was a pain in the ass to work with HttpResponse sometimes? It was not apparent in which phase the response was, what the status was. Servlet 3.0 will add new methods to HttpRepsonse that will make our lives easier. You can get Header name, Headers and the Response status now using API calls.

Asynchronous Requests
Finally to the most impressive new feature - asynchronous requests. This is huge! Imagine that you have HTTP POST method in a Servlet and what it does is to call out to a WebService. While the Service is doing it's work, the HttpRequest is being held by the Servlet Container. Standard Stuff. This however becomes problematic if your thread pool limit is reached. So lets say you have defined that 50 Threads should be in the Pool. You receive 30 Requests per second. If the web service call takes 2 seconds, you have a problem because 60 Requests are coming in and only 50 Threads are in the pool.

Here is what Servlet 3.0 does. Well, it is kind of hard to explain and I did not understand it fully. But here is what I think it does. You can specify on Servlet and Filters that they may be used for asynchronous requests. This is something you have to declare as an attribute of the @WebServlet or @WebFilter Annotation. The request comes in, it will look at the Filter chain and the Servlet. It will figure out if it can run asynchronously or not. The original request calls out and the Request is suspended, therefore freeing a Thread and returning it to the pool. A callback method is given along. The callback method is invoked when the external resource becomes available and the Container will use a new Thread to generate the Response. The Request therefore resumes it's work. New methods for resuming and suspending as well as querying the current status of the request have been added. There was also something called asynchronous Handle in the presentation slides but I forgot how it was used.

Anyway the guy at JavaOne had an example ready where he had written a Servlet that queried the Ebay Rest Interface for 3 keywords. He then had written the same using asynchronous Requests from Servlet 3.0 and it was like 3 or 4 times faster. This is because not only are the Threads returned to the pool while they are doing nothing (Thread starvation) but also can multiple Requests be run in parallel. This is a major improvement I think. Our current web applications can be made faster just by using features of a new Servlet specification.

I hope this was useful. I will experiment with Servlet 3.0 before the final release comes out. I think I might be able to do this today already using some experimental version of Jetty or Glassfish.

Welcome bag

I made it through the first day of the CommunityOne West. Unfortunately my drive back from Fresno to California took longer than I thought. I was only be able to attend two conference sessions in the afternoon. The first session was about database architectures for web 2.0 applications. Personally I found the talk a bit superficial. Too much high level stuff. I wrote down a few buzzwords like Memcache, Multi Statement Requests, Horizontal Partitioning and Hibernate Shards.

The second session was an introduction to Amazon EC2 and it was very interesting. F

inally I know what Cloud Computing stands for and how it works. I realized, that I was programming in the cloud already, as I had used Google App Engine. Chris Richardson, who did the session, sounded a bit negative every time he mentioned Google App Engine. At least I got these vibes. He did not like that GAE was not fully 100% Java compliant and that database access was limited to Appengine's own datastore. Well, at least it is free I'd say.

Before I went to the CommunityOne West, I checked in for tomorrows Java One and got my welcome bag. It contained a T-shirt, a jacket, some papers, an Open Solaris CD and something, you would have had a big question mark above your head for - antibacterial hand gel. Pig influenza is just around the corner.

Zeitgeist - what you can do as Java developer

Since JavaOne is in San Francisco, I used the chance that I am in the Bay Area and visited a friend, that I had not seen a few years, in Fresno. We ended up hacking around and watching documentaries on DVD. One particular documentary that made quite an impression on me was Zeitgeist Addendum. You can watch it here on the blog but beware it has full-movie length.

In Zeitgeist Addendum the author evaluates how things tick today by looking at our system and the society. He starts by looking at the global financial system and how it is doomed to fail sooner or later. In the second part, someone talks about how small countries are controlled and corrupted by organizations in the worlds leading countries. It becomes apparent that this system will never change by itself. The last two parts, which I found most interesting, talk about alternatives to our current society structures. There is the Venus Project as a different society experiment. It is shown how different things are handled in the Venus Project than they are in our current society.

One of the biggest problems today is that we live in a profit orientated society. You need money to survive. Companies need to make more money every year. A lot of bad things we see in our world today are directly connected to the profit orientated society. The destroying of our worlds environment. The wars that countries fight against each other. In Zeitgeist Addendum it is proposed that the profit based system must be weakened for mankind to move on. Today, environmental friendly technology is held back or killed by competition just to make more profit. The solution to a lot of problems are not politicians but technology. How can I help?

This is where I think even as a, maybe unimportant, Java developer you can help. Because this is what we create – technology. Start being a better developer. Educate yourself. Read a book at least once in three months. This will give you new ideas, this will motivate you, this will maybe help you to be more efficient, to create more in a shorter time. Help other developers around you. Do not think about why you should help or teach them. Do not be jealous to what they have and you don't. Do not be angry why they get away being inefficient every day. Give them something back instead, from what you have learned and therefore help others to become better and more efficient too.

Automate things you are doing repeatedly and slow you down. If you spent 3 hours every week, looking at various log files, investigating why something did not work as expected - create something that will make it easier and more efficient for you. First of all, you will learn something while you do it. Secondly, if you only need 30 minutes instead of 3 hours, you have gained 2 ½ hours in which you can work on something more challenging - maybe the next MapReduce algorithm.

Support the community. Participate in writing open source software. Give answers to other peoples questions. Write a blog. Publish the super-cool code snippet you have written, so that others save time. How can this help to weaken the profit based, monetary system? Well, maybe your answer to some question, your code, your open-source plugin will be used by someone else to create better technology. Technology that will makes it possible to travel faster using renewable energies, so that we don't need airplanes anymore. Or you indirectly help to create software that examines data intensive DNA patterns and cancer can be cured.

Start using Linux. Almost everything that you do on a Windows powered machine, you can do using free software and Linux too. By using Linux and showing others (developers, family) how easy it is to use, or that it even exists, you help weakening the influence of Microsoft. I do not say Windows is bad, it is really great even. It just costs too much and Microsoft has for a long time leveraged the fact that there was no alternative. Unfortunatly an OS is something every computer needs. You cannot sell it for a couple of hunded bucks. Come on. Support the idea of free software. Imagine a world in which software is free and everyone has access to it. So many people that could help creating new, helpful technology.

Support companies creating modern technology like Google. Even though Google is a profit orientated company, they have in the last years given so much great stuff out for free. Take Google App Engine. Isn't it great to have a place that does not cost you anything, where you can put your own Python slash Java applications and share it with others? Take the MapReduce paradigm I mentioned earlier. Google has shared it with others. It can now be leveraged by till example Health Researchers for data mining. There is a backside of the coin though. Google is dominating search. The world of search is paralyzed. People are searching for contents like they did in 1999. There cannot be new development in search if everyone uses Google. Think about alternatives, even though it hurts, I know :)

You work in IT. You are making probably more money than the average salesman in your town. Think about switching to energy providers offering energy from renewable sources. This will likely costs more. It can even be that the energy is not from a renewable source at all (some people call that cheating by the way). But it will create awareness if people start paying extra for nice energy. As more and more people will hop on, companies will fully shift renewable energy. They will invest into renewable energy and it will become cheaper. Start to turn off computers when you are not using them. I had mine always running during the night. Probably because I was lazy to turn them off or I wanted to save time turning them on again next day. Save yourself some money and some non-renewable energy.

Alright now it seems to be getting too far off. Just watch Zeitgeist Addendum and think about it. I am off for JavaOne now.