Don't let these ThreadLocals escape

A while ago I wrote about ThreadLocals and how useful and tricky they can be. In that blog post, I also wrote that it was a good practice to clean up your ThreadLocals so that, if the Thread was reused, it would not have access to the previous values. The recommended way to do this for a web application was to use a ServletFilter.

After running this setup for a while, I have to add that you really have to know all the entry points into your application in order to achieve 100% cleanup coverage. Normally the ServletFilter adds what is called a "Cleaner" into the ServletRequestCleanupService; this is required before cleanup callbacks for any of your ThreadLocals can be registered. In our log file I saw that we were not always adding a "Cleaner". This was an indication that we were running code which had not passed through the ServletFilter, so I reviewed our application.

It turned out that the first problem was a missing url-pattern element inside the filter-mapping block in the web.xml file. Unless you are mapping to /*, make sure you are catching all the possible URLs. The good news is that since Servlet 2.5 you are allowed to have multiple url-pattern elements inside each filter-mapping element. So this was an easy fix.
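For illustration, a filter-mapping with several url-pattern elements could look like this (the filter name and patterns are made-up examples):

```xml
<filter-mapping>
    <filter-name>cleanupFilter</filter-name>
    <!-- Servlet 2.5+: several url-pattern elements per filter-mapping -->
    <url-pattern>/app/*</url-pattern>
    <url-pattern>/api/*</url-pattern>
    <url-pattern>/admin/*</url-pattern>
</filter-mapping>
```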

Some people have written that you can also separate multiple patterns with a comma. I haven't tried that myself.

Another problem area is application-internal Thread pools. For instance, we use custom Events in the Spring framework which are passed between Spring beans. By default this is done synchronously. You can change to asynchronous Event delivery by using an ApplicationEventMulticaster together with a proper TaskExecutor (i.e. org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor instead of the default org.springframework.core.task.SyncTaskExecutor). However, if you do this, you are creating yourself a thread pool. Listening to and handling of the Events will then happen in a separate Thread and not pass through the ServletFilter. So I was looking into ways to make sure that each Event Listener adds a "Cleaner" and executes the cleaning logic afterwards. This was a good candidate for an Aspect.
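A sketch of such an aspect (the listener class names and the cleanup method names are placeholders, and it needs Spring AOP plus the AspectJ annotations on the classpath):

```java
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Pointcut;

@Aspect
public class EventListenerCleanupAspect {

    @Pointcut("execution(* com.example.FooEventListener.onApplicationEvent(..))")
    public void fooListener() {}

    @Pointcut("execution(* com.example.BarEventListener.onApplicationEvent(..))")
    public void barListener() {}

    // Combine the listener Pointcuts into one logical unit with a meaningful name.
    @Pointcut("fooListener() || barListener()")
    public void listenersRequiringCleanup() {}

    @Around("listenersRequiringCleanup()")
    public Object cleanUpThreadLocals(ProceedingJoinPoint pjp) throws Throwable {
        CleanupService.addCleaner(); // same logic as in the ServletFilter (method names assumed)
        try {
            return pjp.proceed();
        } finally {
            CleanupService.cleanAll();
        }
    }
}
```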

I have used the @AspectJ syntax for this Aspect and made it a Spring bean. This means I can compile the Aspect using a regular Java compiler. Instead of using load-time or compile-time weaving, we are using the Spring proxy-based AOP approach. In the Aspect, I am creating two Pointcuts, each one mapping to one of the Listeners where we actually have to do cleanup. This is probably not very future-proof: someone else might write another event listener in the future, which would then not have a Pointcut mapped to it. On the other hand, the proxy-based AOP approach is probably slower than real weaving, and some of the listeners we have (5 currently) really receive a lot of events. So I sacrificed a future-proof implementation for maximum performance.

The two Pointcuts selecting listener execution are then combined in another Pointcut, which is finally used to create an Around advice with cleanup logic similar to the ServletFilter's. Splitting Pointcuts into logical units with meaningful names is a good AOP practice which I can recommend. I also took the liberty to rename a few classes: ServletRequestCleanupService became just CleanupService and ServletRequestCleanupCallback became CleanupCallback, which is more fitting now that not everything passes through the ServletRequestCleanupFilter anymore.
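To make this concrete, here is a minimal sketch of what such a CleanupService and CleanupCallback pair could look like - the method names and details are my assumptions, not the original code:

```java
import java.util.ArrayDeque;
import java.util.Deque;

class CleanupService {

    /** A callback that knows how to clear one ThreadLocal. */
    interface CleanupCallback {
        void cleanup();
    }

    // The "Cleaner": per thread, the callbacks registered during one unit of work.
    private static final ThreadLocal<Deque<CleanupCallback>> CLEANER = new ThreadLocal<>();

    /** Called at every entry point (the ServletFilter or the Around advice). */
    static void addCleaner() {
        CLEANER.set(new ArrayDeque<>());
    }

    /** Called by any code that stores something in a ThreadLocal. */
    static void register(CleanupCallback callback) {
        Deque<CleanupCallback> cleaner = CLEANER.get();
        if (cleaner == null) {
            // No Cleaner added - this is the kind of log line that reveals uncovered entry points.
            System.err.println("No Cleaner registered for thread " + Thread.currentThread().getName());
            return;
        }
        cleaner.add(callback);
    }

    /** Called by the entry point once the request or event has been handled. */
    static void cleanAll() {
        Deque<CleanupCallback> cleaner = CLEANER.get();
        if (cleaner != null) {
            for (CleanupCallback callback : cleaner) {
                callback.cleanup();
            }
        }
        CLEANER.remove(); // don't leak the registry itself
    }
}
```

The entry point calls addCleaner() before the work starts and cleanAll() in a finally block afterwards.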

Time to wrap this up. If you need to clean up ThreadLocals from your Threads, investigate carefully and make sure you have covered all entry points into your application. At least add some logging so you can find "holes" easily.

Mimicking a circular buffer

Today I needed a Java collection with some non-standard properties. I wanted to continuously iterate over the collection, pretty much like you would over a circular buffer. This alone would be simple enough to implement with any Java List, I guess, but I also wanted to be able to remove elements from the Collection while going through it. I could have written my own linked list and unlinked the elements that I wanted to remove while cycling through it. However, I wasn't really interested in adding a custom linked list to our project just for this specific purpose. Unfortunately you cannot remove elements from a Java List while iterating over it with a for-each loop - that gets you a ConcurrentModificationException. If you use an index-based for loop instead, the control variable has to be adapted after removing elements - so the code becomes more complex.

Google Guava to the rescue. It has a nice utility class, Iterators. The cycle method in Iterators returns an indefinitely looping Iterator over any Iterable given as argument. This gave me the behavior of the ring buffer and, because it is an Iterator, I was able to remove elements from the underlying collection. The loop stops once the Collection is exhausted. Pretty neat.
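The idea is easy to sketch with the plain JDK - roughly what Iterators.cycle gives you (a simplified version, not Guava's actual implementation):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class CycleDemo {

    // An Iterator that restarts at the beginning of the Iterable until it is
    // empty, and whose remove() removes from the underlying collection.
    static <T> Iterator<T> cycle(Iterable<T> iterable) {
        return new Iterator<T>() {
            private Iterator<T> it = iterable.iterator();

            @Override
            public boolean hasNext() {
                if (!it.hasNext()) {
                    it = iterable.iterator(); // wrap around
                }
                return it.hasNext(); // false only once the collection is empty
            }

            @Override
            public T next() {
                hasNext(); // wrap around first, if needed
                return it.next();
            }

            @Override
            public void remove() {
                it.remove(); // removes from the underlying collection
            }
        };
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(List.of(1, 2, 3, 4, 5));
        Iterator<Integer> it = cycle(numbers);
        int steps = 0;
        while (it.hasNext() && steps < 20) { // cap the endless loop for the demo
            int n = it.next();
            steps++;
            if (n % 2 == 0) {
                it.remove(); // drop the even numbers on the first pass
            }
        }
        System.out.println(numbers); // [1, 3, 5]
    }
}
```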

Testing just got better

This week I managed to work a bit with our test suite and make it run faster. When you work on a project with a codebase that grows and grows, it is natural that more and more tests get added. After some months you will be sitting with a test suite that runs for a minute or even longer. This is the execution time for all our unit tests in the different Maven modules.

Last year I read an article in the German Java Magazin about a library called org.patterntesting. The library comes with a TestRunner that can be used to run all test methods within a test class in parallel. Just change your test to look like this:
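Roughly like so - note that the ParallelRunner import below is from memory, so verify it against the patterntesting documentation:

```java
// package name from memory - check the patterntesting docs
import org.junit.Test;
import org.junit.runner.RunWith;
import patterntesting.concurrent.junit.ParallelRunner;

@RunWith(ParallelRunner.class)   // runs all test methods in parallel
public class MyServiceTest {

    @Test
    public void testOneThing() { /* ... */ }

    @Test
    public void testAnotherThing() { /* ... */ }
}
```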

This will of course not work for all your tests immediately, as not all tests can be run in parallel. Often this is due to bad test or software design: tests requiring write access to the same physical File, tests altering shared fields within a test class, tests changing static field values - just to name a few. As you refactor your tests so that they can run concurrently, you automatically improve the design and testability of your application. We had a couple of these "smelling" unit tests that needed to be refactored. So this is what the execution time looked like after running the tests with patterntesting.

Saving 40 seconds does not seem like a lot. But 40 seconds times 15 builds per day times 3 developers times 21 working days a month brings you to 10.5 hours. Unfortunately, it isn't always that easy. Sometimes your test is already using a custom TestRunner, so you cannot just switch to the ParallelRunner. This is the case for all our Spring tests, which were using the SpringJUnit4ClassRunner from Spring. I contacted one of the authors of the patterntesting library and got some help. In the latest version, patterntesting 1.2, there is a new TestRunner class ParallelProxyRunner, which can be used together with the DelegateTo annotation to delegate to the original TestRunner while running the tests in parallel. This works for the SpringJUnit4ClassRunner, but you have to be aware that the SpringJUnit4ClassRunner is not thread-safe (a problem that will be fixed in Spring 3.2). As a user of the patterntesting library you will never be affected by this - the ParallelProxyRunner hides the problem from you.
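For the Spring tests this would look roughly like the following (package names again from memory - check the patterntesting documentation):

```java
import org.junit.runner.RunWith;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
import patterntesting.concurrent.junit.DelegateTo;
import patterntesting.concurrent.junit.ParallelProxyRunner;

@RunWith(ParallelProxyRunner.class)          // runs the test methods in parallel...
@DelegateTo(SpringJUnit4ClassRunner.class)   // ...but delegates to the Spring runner
public class MySpringServiceTest {
    // ...
}
```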

This isn't everything patterntesting has to offer. My favorite is the @Broken annotation, which replaces the @Ignore annotation from JUnit.

One big anti-pattern in test-driven development is developers adding @Ignore annotations and then never looking at the test case again. When I introduced the patterntesting library to other EA developers, I got a lot of responses like: "why do you have tests flagged as ignored or broken in the first place?" - it's bad practice. Yes, you are all right. But reality is often different. Game producers can get very pushy. Developers are forced to commit hot-fixes which can potentially break existing tests. Then the developer might not be able to fix the test for various reasons:

  • He or she is new in the team and doesn't have the big picture.
  • He or she is junior and doesn't know how stuff works.
  • The test is overly complicated so that only the author understands it.
  • It takes too long to fix it and something else has higher priority.

Just to name a few. Patterntesting adds other useful tools to the testing toolbox as well.
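As one example, a test marked with @Broken could look like this - the package names and annotation attributes below are from memory, so check them against the patterntesting documentation:

```java
// Package names and attributes from memory - verify against the patterntesting docs.
import org.junit.Test;
import org.junit.runner.RunWith;
import patterntesting.runtime.annotation.Broken;
import patterntesting.runtime.junit.SmokeRunner;

@RunWith(SmokeRunner.class)   // a patterntesting runner is needed to honor @Broken
public class PaymentServiceTest {

    // Unlike @Ignore, @Broken documents why and until when the test is skipped,
    // and the test comes back automatically after the given date.
    @Test
    @Broken(why = "hot-fix broke the fixture", till = "01-Jul-2012")
    public void testRefund() {
        // ...
    }
}
```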

More examples can be found here.

Starting Dependency WAR Artifact using maven-jetty-plugin

I have worked in a lot of Maven projects that used the maven-jetty-plugin. Normally the plugin is used to start a Jetty container with the WAR artifact produced by the current project. This works like a charm. Sometimes, however, you want to host the WAR artifact of another project. This can be the case if you are developing the client for a service that is reached via HTTP. The integration tests in the client project require the service to be running and reachable, so that these tests can verify the client code. To start the Jetty container before the integration tests and shut it down afterwards, you define two executions and bind them to the correct phases in Maven's lifecycle.

Also, starting any external WAR artifact is straightforward with the maven-jetty-plugin and the deploy-war goal.
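A sketch of the two executions combined with the deploy-war goal (the coordinates, ports and path are made-up examples):

```xml
<plugin>
  <groupId>org.mortbay.jetty</groupId>
  <artifactId>maven-jetty-plugin</artifactId>
  <configuration>
    <stopPort>9966</stopPort>
    <stopKey>jetty-stop</stopKey>
    <!-- the path to the external WAR artifact in the local repository -->
    <webApp>${settings.localRepository}/com/example/some-service/1.2.3/some-service-1.2.3.war</webApp>
  </configuration>
  <executions>
    <execution>
      <id>start-jetty</id>
      <phase>pre-integration-test</phase>
      <goals><goal>deploy-war</goal></goals>
      <configuration>
        <daemon>true</daemon><!-- return control to the build -->
      </configuration>
    </execution>
    <execution>
      <id>stop-jetty</id>
      <phase>post-integration-test</phase>
      <goals><goal>stop</goal></goals>
    </execution>
  </executions>
</plugin>
```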

As you can see, the version is hard-coded in the path. This is somewhat OK as long as you are not violating the DRY principle: if you use the version somewhere else in your pom.xml, make sure to use a custom Maven property. Another way, which is perhaps a bit nicer, is to use the cargo-maven2-plugin, which out of the box can start the WAR artifact of any Maven dependency from the dependencies section. Here is an example:
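Along these lines (the container id and coordinates are made-up examples):

```xml
<plugin>
  <groupId>org.codehaus.cargo</groupId>
  <artifactId>cargo-maven2-plugin</artifactId>
  <configuration>
    <container>
      <containerId>jetty7x</containerId>
      <type>embedded</type>
    </container>
    <deployables>
      <!-- resolved from the dependencies section, version included -->
      <deployable>
        <groupId>com.example</groupId>
        <artifactId>some-service</artifactId>
        <type>war</type>
      </deployable>
    </deployables>
  </configuration>
  <executions>
    <execution>
      <id>start-container</id>
      <phase>pre-integration-test</phase>
      <goals><goal>start</goal></goals>
    </execution>
    <execution>
      <id>stop-container</id>
      <phase>post-integration-test</phase>
      <goals><goal>stop</goal></goals>
    </execution>
  </executions>
</plugin>
```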

As you can see, either the cargo-maven2-plugin or the maven-jetty-plugin can be used for the simple use cases. It gets a bit trickier if you want to start the external WAR artifact and set some System properties during startup of the container. Both the maven-jetty-plugin and the cargo-maven2-plugin allow you to define individual system properties. However, only the maven-jetty-plugin can read a pre-existing properties file instead of individual System properties. This was required in one of the projects I am working on. Copying all the System properties out of the properties file to add them as individual System properties is a lot of work and would again violate the DRY principle.

Also, when you are developing both the service and the client project and are not using the SNAPSHOT mechanism, it can be tedious to update the version of the server WAR artifact that gets started during the integration tests of the client project. Maven knows two fixed keywords which you can use instead of specifying an exact version or a range: use LATEST to download the latest snapshot or released version of a dependency from a repository, and use RELEASE to download the latest released version of a dependency from a repository. Unfortunately you cannot use LATEST or RELEASE to start the WAR artifact of a dependency if you are using the maven-jetty-plugin. This is because you specify the location of the WAR artifact as a full path inside the configuration - webApp element of the maven-jetty-plugin. The plugin does not use the syntax which is used to define Maven dependencies.

There is however a little trick. You use the maven-dependency-plugin, which uses the default syntax for dependencies and therefore understands LATEST and RELEASE, to copy the WAR artifact to a fixed location. While copying, you should also rename the war file. This makes your life easier, as you never have to adapt the path in the webApp element of the maven-jetty-plugin when the version changes. Here is an example:
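A sketch of such a setup (the coordinates and file names are made-up examples; declare this plugin before the jetty plugin so it runs first within the phase):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <executions>
    <execution>
      <id>copy-service-war</id>
      <phase>pre-integration-test</phase>
      <goals><goal>copy</goal></goals>
      <configuration>
        <artifactItems>
          <artifactItem>
            <groupId>com.example</groupId>
            <artifactId>some-service</artifactId>
            <version>RELEASE</version><!-- or LATEST -->
            <type>war</type>
            <outputDirectory>${project.build.directory}</outputDirectory>
            <!-- fixed name, so the jetty webApp path never changes -->
            <destFileName>service-under-test.war</destFileName>
          </artifactItem>
        </artifactItems>
      </configuration>
    </execution>
  </executions>
</plugin>
```

The webApp element of the maven-jetty-plugin then points at the fixed path ${project.build.directory}/service-under-test.war.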

Test our new game

As some readers of this blog might know, I work for the EA studio Playfish. Currently we are heading into the closed beta phase for a game which I helped to develop. This is a so-called social game which is played on Facebook. The backend of the game is developed by our team in Java. If you want to be one of the first to play the game and become a beta tester, fill in this application. I am not allowed to tell anything about the game at this point; just in response to this comment - yes, the game is different from Adventure World and Cloudforest Expedition and much more fun to play.

Maybe I should have used a Lock here

Java 5 added some really nice classes in the java.util.concurrent package. For instance there is the ConcurrentMap interface, which allows you to add an item to a Map only if it is not already contained. The code you would normally write if the ConcurrentMap didn't exist needs to do this as an atomic check-then-act sequence if the Map is shared between Threads. With the ConcurrentMap interface you get all of this for free using the putIfAbsent method.
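To make the check-then-act point concrete, a small runnable comparison (the key and values are mine, not from the original code):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class PutIfAbsentDemo {
    public static void main(String[] args) {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<>();

        // Without a ConcurrentMap, this check-then-act sequence would have to
        // be synchronized manually if the Map were shared between Threads:
        //   if (!map.containsKey(key)) { map.put(key, value); }
        // putIfAbsent does it atomically and returns the previous value,
        // or null if the key was absent.
        System.out.println(map.putIfAbsent("answer", 42)); // null - this call won
        System.out.println(map.putIfAbsent("answer", 7));  // 42 - already present
        System.out.println(map.get("answer"));             // 42 - the 7 was ignored
    }
}
```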

I shot myself in the foot today with a small piece of code which one of my unit tests was executing from a large number of Threads. The test was failing randomly, maybe 5% of the time. If you see tests failing randomly, it often indicates a Date problem or a concurrency problem. Here is the class under test. Can you spot the problem?

In order to understand what the class does, you need to know what a DynamicProperty is. This is a class wrapping a value which comes from a remote source, i.e. over the network. So instead of having fixed System properties, you ask the remote source for the value. The value is then cached for a couple of seconds and refreshed once expired. The ChangeAwareDynamicProperty does the same thing but additionally reacts to value changes. So every time the value changes in the remote source, there is a costly operation that the ChangeAwareDynamicProperty needs to perform. This code is not shown as it is not relevant.

What is the ConcurrentMap for? Obviously the costly operation should only be done once per value change, right? A simple approach is to use locking: let only one Thread at a time go into a critical section where the current value is compared to the new value, and if a change is detected, run the costly operation. This would totally work, but the throughput would be horrible. Especially when a Thread detects a value change while holding the Lock, running the costly operation would block all other Threads for a while. So my idea was to use a ConcurrentMap instead of a Lock. If Threads detect a value change, they try to put the new value into a ConcurrentMap. By definition, only one Thread can put the new value into the Map; for this Thread the putIfAbsent call will return null. This Updater Thread will then run the costly operation, while other Threads still return the previous value. Once the Updater Thread is finished, it updates the current value and removes the new current value from the ConcurrentMap. This is to prevent the Map from growing eternally. Sounds straightforward, right?
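Based on that description, the update path might have looked roughly like this - a reconstruction, so the names are assumptions, and a Supplier stands in for the DynamicProperty wrapping the remote value:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

class ChangeAwareDynamicProperty {

    private final Supplier<String> remoteValue;
    private final ConcurrentMap<String, Boolean> updatesInProgress = new ConcurrentHashMap<>();
    private volatile String currentValue;

    ChangeAwareDynamicProperty(Supplier<String> remoteValue) {
        this.remoteValue = remoteValue;
        this.currentValue = remoteValue.get();
    }

    String get() {
        String newValue = remoteValue.get();
        if (!newValue.equals(currentValue)) {
            // Only one Thread gets null back and becomes the Updater Thread.
            if (updatesInProgress.putIfAbsent(newValue, Boolean.TRUE) == null) {
                costlyOperation(newValue);
                currentValue = newValue;
                // BUG: removing the entry here re-opens the race. A Thread that
                // already passed the value comparison but has not yet called
                // putIfAbsent can now win the call and repeat the costly operation.
                updatesInProgress.remove(newValue);
            }
        }
        return currentValue;
    }

    void costlyOperation(String newValue) {
        // expensive reaction to the value change - not relevant here
    }
}
```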

Well, obviously there was a problem, as the unit test failed every now and then. Every time it failed, I could see that it was trying to run the costly operation twice for the same changed value, as indicated by the following log statement:

I started to believe that the putIfAbsent method was buggy and returned null even if the value was already present in the Map. I asked a co-worker to check my code, to see if he could spot a problem. After a few minutes we realized that the problem was in my code - as was to be expected. Like I said, I don't want the Map to grow forever, so the Updater Thread removes the new changed value after the costly operation is finished. The problem is that another Thread could be waiting for the CPU in line 14, the line that invokes putIfAbsent. So once the Updater Thread is done and the waiting Thread gets active, it will actually do exactly the same work again. Not good!

Our immediate solution was to not remove the Map entry after the Updater Thread is finished. Instead, we remove the old value from the Map before assigning the new changed value to currentVersion. As the Map never contains more than one entry, it is still possible that the costly operation runs again for a value that has already been handled, for example if the value changes back to a previous value. This change only fixes the problem that a single value change could trigger consecutive executions of the costly operation.

git clone and remote end hung up unexpectedly

Yesterday morning before going to work, I created a git repository for a new hobby project of mine. I have done this a couple of times before, and my git hosting provider of choice is Assembla. They offer private git repositories, and I have never had any trouble in the past.

After creating the repository, I tried to clone it. I needed to use sudo because I was cloning into a directory which is not owned by me. I am using the /web directory (or rather the directories under /web) directly as the docroot for Apache.
sudo git clone
Initialized empty Git repository in /web/my-new-repository/.git/
Permission denied (publickey).
fatal: The remote end hung up unexpectedly

As you can see, something went wrong. I verified that my public ssh key was added to my Assembla account, and it was. I decided it was a problem on the Assembla side and tried again this morning, but the problem was still there. I created another git repository over at Bitbucket and tried again - same problem, wtf. Finally I had the idea to try and clone the repository into my user directory, and voila, it worked. So it turns out that combining sudo with ssh public/private key authentication for git does not work. There is a good explanation about it on github:
If you are using sudo with git commands (e.g. using sudo git clone because you are deploying to a root-owned folder), ensure that you also generated the key using sudo. Otherwise, you will have generated a key for your current user, but when you are doing sudo git, you are actually the root user – thus, the keys will not match.