Having Maven create a nice zip File and separate Configuration

A co-worker and I are preparing a presentation about Amazon EC2 for other developers in my company. To show some stuff in action, we decided to write two JMS-powered applications. One sends messages, the other receives them and persists them into a database. During the presentation we will roll this out onto 3 EC2 nodes. Each application is built using Maven. To make this as convenient as possible, I changed the Maven package build phase to produce a single zip-file. The zip-file can be copied over to the EC2 node, where it is extracted. The zip-file contains one big "uber-jar" (with all the third party dependencies included) and a single properties-file for setting the host and port of the JMS communication. We use ActiveMQ as JMS vendor in our projects.

Once the big zip-file has been extracted on the EC2 nodes and the properties have been set, you can start the producer and the consumer from their Main classes. (Note the little dot in front of .:consumer... - it puts the current directory on the classpath, which is needed so that the properties-file next to the jar is found.)

java -cp .:consumer-1.0-SNAPSHOT-final.jar package.MessageReceiver
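
To illustrate why the dot on the classpath matters, here is a minimal Java sketch of how the consumer could read the properties-file. It is not the code from our project: the class name ConsumerConfig and the keys jms.host and jms.port are assumptions made up for this example. The only thing taken from above is the file name consumer.properties. The port 61616 is the ActiveMQ default for its tcp transport.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Hypothetical helper, not part of the original project.
class ConsumerConfig {

    // Looks the file up on the classpath. Because the file is excluded from
    // the uber-jar, it is only found if its directory (".") is on the classpath.
    static Properties load(String resourceName) throws IOException {
        try (InputStream in =
                 ConsumerConfig.class.getClassLoader().getResourceAsStream(resourceName)) {
            if (in == null) {
                throw new IOException(resourceName + " not found on the classpath");
            }
            Properties props = new Properties();
            props.load(in);
            return props;
        }
    }

    // Builds an ActiveMQ-style broker URL from the assumed keys
    // "jms.host" and "jms.port", falling back to localhost:61616.
    static String brokerUrl(Properties props) {
        String host = props.getProperty("jms.host", "localhost");
        String port = props.getProperty("jms.port", "61616");
        return "tcp://" + host + ":" + port;
    }
}
```

A caller would do something like ConsumerConfig.brokerUrl(ConsumerConfig.load("consumer.properties")). Without the dot in the -cp argument, load would fail because only the jar would be searched.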


For the packaging of the big zip-files, I use the Maven Assembly plugin during the package phase. The configuration looks like this:


<assembly xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.0 http://maven.apache.org/xsd/assembly-1.1.0.xsd">
<id>final</id>
<formats>
<format>jar</format>
</formats>
<includeBaseDirectory>false</includeBaseDirectory>
<dependencySets>
<dependencySet>
<unpack>true</unpack>
<scope>runtime</scope>
<useProjectArtifact>false</useProjectArtifact>
</dependencySet>
</dependencySets>
<fileSets>
<fileSet>
<directory>${project.build.outputDirectory}</directory>
<outputDirectory>/</outputDirectory>
<excludes>
<exclude>consumer.properties</exclude>
</excludes>
</fileSet>
</fileSets>
</assembly>
src/main/assembly/jar.xml

This explodes all third party dependency jar files and merges them into one big uber-jar file. The properties-file is excluded from the uber-jar (this name sounds hilarious if you are from Germany by the way).


<assembly xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.0 http://maven.apache.org/xsd/assembly-1.1.0.xsd">
<id>bin</id>
<formats>
<format>zip</format>
</formats>
<includeBaseDirectory>false</includeBaseDirectory>
<fileSets>
<fileSet>
<directory>${project.basedir}/src/main/resources</directory>
<outputDirectory/>
<includes>
<include>consumer.properties</include>
</includes>
</fileSet>
<fileSet>
<directory>${project.build.directory}</directory>
<outputDirectory/>
<includes>
<include>*-final.jar</include>
</includes>
</fileSet>
</fileSets>
</assembly>
src/main/assembly/zip.xml

This creates a zip-archive containing the "uber-jar" and the properties-file. Notice that the "uber-jar" has the suffix of "-final" equal to the id-attribute in the jar.xml file.


<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<configuration>
<archive>
<manifest>
<mainClass>package.MessageReceiver</mainClass>
</manifest>
</archive>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
<descriptors>
<descriptor>src/main/assembly/jar.xml</descriptor>
<descriptor>src/main/assembly/zip.xml</descriptor>
</descriptors>
</configuration>
<executions>
<execution>
<id>make-assembly</id>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
pom.xml

This code snippet runs the maven-assembly-plugin during the package phase.

Everything went well when we finished the producer application last week. Today I ran into some weird errors when I worked on the consumer end. Trying to start the MessageReceiver main class gave me the following error:


Caused by: org.xml.sax.SAXParseException: cvc-complex-type.2.4.c: The matching wildcard is strict, but no declaration can be found for element 'amq:broker'.


I was able to resolve this error with the help of the ActiveMQ XML reference page, only to stumble into the next problem:


Caused by: org.springframework.beans.factory.parsing.BeanDefinitionParsingException: Configuration problem: Unable to locate Spring NamespaceHandler for XML schema namespace http://www.springframework.org/schema/context


It was not obvious to me at first why this was happening. There are unit tests which load the Spring Context, and they run fine. Maven runs the tests as part of the package build, and they passed. So this was odd. After doing some research, I found out that the maven-assembly-plugin is responsible for this. Spring needs the spring.handlers and spring.schemas files to be present in the META-INF directory of the "uber-jar". Several of the unpacked dependency jars bring their own copy of these files under the same name, and when everything is merged into one jar only one copy of each survives, so the namespace handlers of the other jars get lost. A lot of other people had already hit the same problem before me. Some of them recommend the use of the maven-shade-plugin with the following setup:


<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>1.3.1</version>

<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<finalName>${project.artifactId}-${project.version}-final</finalName>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<mainClass>com.sabre.newgermanrail.dbitool.DbiTool</mainClass>
</transformer>
<transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
<resource>META-INF/spring.handlers</resource>
</transformer>
<transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
<resource>META-INF/spring.schemas</resource>
</transformer>
<transformer implementation="org.apache.maven.plugins.shade.resource.DontIncludeResourceTransformer">
<resource>consumer.properties</resource>
</transformer>
</transformers>
</configuration>
</execution>
</executions>
</plugin>
pom.xml

With the help of Transformers, the spring.handlers and spring.schemas files are added. If you are using Spring 3.x, there are many small spring-xyz.jar files and each of these comes with a spring.handlers and spring.schemas file. The maven-shade-plugin will append the content of each of these files using the AppendingTransformer. As you can see in the example above, I am again excluding my properties-file using the DontIncludeResourceTransformer. I decided to keep the zip-archiving part from the maven-assembly-plugin. Maybe this is something the maven-shade-plugin could do for me as well, not sure.

Interviewed at Google Pt.1

Today I want to blog about an on-site interview which I had at Google a while ago. Since I did not sign a non-disclosure agreement and only agreed not to take photos in their office, I hope it is okay to write about this interview. Being invited to an on-site interview is like reaching the next level: it happens if Google is satisfied with your previous telephone interview(s). Due to summer-time and a lot of vacations, I had my on-site interview more than 1.5 months after the initial phone interview. This was very good, because it gave me extra time to brush up on some stuff. I read some books about data structures and algorithms and practiced white-board coding a bit. I coded through a lot of sample questions from software interviews, which I was able to find on the net. However, all the questions during the interview were new to me, so I had to be spontaneous.

The job I was interviewed for was Software Engineer in Test, and the interview was held in Zürich. Usually the candidates fly to the interview location one day in advance. That did not work for me, because I was already on another flight a day earlier. Instead Google booked two flights on the same day for me, which was OK. I arrived in time and signed in at the reception. As stated, you have to agree not to take photos and you have to put a label identifying yourself on your shirt. Some minutes later I was picked up by my HR contact. She gave me a short office tour and set me up in one of the interview rooms. You could tell that Google is doing a lot for their employees. The Zürich office had a big fitness room with a personal coach. Free soft drinks everywhere, video gaming rooms, pool tables, all that kind of stuff. My interview room was very small, maybe 3 by 3 meters, and had a white-board. She handed me my interview schedule and I was surprised that it was six interviews, each 45 minutes long, with a lunch break in between.

After some minutes the first interviewer came and we got into action right away. This first interview went really well. We talked a bit about continuous integration, my contribution to the Hudson project and about testing in general. Then the interviewer switched to a question about bash in Linux. We talked about how piping works under the hood. We spoke about the case where the first command is a non-terminating one like "yes" or "tail -f". I learned that Linux does not execute the commands sequentially, as I had thought it would. Instead, all commands of the pipeline run at the same time and are connected through a buffer: the first command writes its output into the buffer and blocks when the buffer is full, while the following command reads from the buffer and blocks when it is empty. Time passed and the next two interviewers came.

For some of the interviews, there are actually two people in the room. One of them is just an observer who needs to learn how to interview candidates. For the second interview, the interviewer directly put up a matrix on the white-board.


-4 -1 4 5
-3 0 6 10
1 8 11 15
17 19 22 30


I did not realize at first that the matrix was set up in a way that each row and each column was ascending - from negative into positive numbers. My task then was to come up with an algorithm that, given any number, would return whether the number was contained in the matrix. The first idea I had was looking at the upper left number, reading the right and lower neighbor and selecting the neighbor which would bring me closer to the target number. Surprisingly this worked on the first examples, but the interviewer was able to find a counterexample. For the next attempts I tried starting from the lower right corner, starting from the middle element looking at all 4 neighbors, and even going row by row running binary search. The binary search algorithm would have worked, but the interviewer indicated that this was not the best solution and that I should try a different starting location for my first approach. After some discussion we agreed that it would be possible to start lower left, so that the values ascend along the row to the right and descend along the column upwards. If the number we were looking for was bigger than the current item in the matrix (lower left at the start), I would go to the right neighbor. If it was smaller, I would go to the upper neighbor. This neighbor becomes the next item, and the algorithm again checks right and upper neighbors until either the element is found or there is no way to go further. Since every step drops either a row or a column, this takes at most rows plus columns steps. Finally I wrote some Java code for this and the second interview finished.
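
I do not have the white-board code anymore, so the following is a reconstruction of the lower-left approach rather than what I actually wrote that day. The class and method names are made up for this post.

```java
// Reconstruction of the "start lower left" search, not the original white-board code.
class SortedMatrixSearch {

    // Returns true if target occurs in a matrix whose rows and columns
    // are both sorted in ascending order.
    static boolean contains(int[][] matrix, int target) {
        if (matrix.length == 0 || matrix[0].length == 0) {
            return false;
        }
        int row = matrix.length - 1; // start in the lower left corner
        int col = 0;
        while (row >= 0 && col < matrix[0].length) {
            int current = matrix[row][col];
            if (current == target) {
                return true;
            }
            if (target > current) {
                col++; // everything above in this column is even smaller
            } else {
                row--; // everything to the right in this row is even bigger
            }
        }
        return false; // walked off the matrix without finding the target
    }

    public static void main(String[] args) {
        int[][] matrix = {
            {-4, -1,  4,  5},
            {-3,  0,  6, 10},
            { 1,  8, 11, 15},
            {17, 19, 22, 30}
        };
        System.out.println(contains(matrix, 11)); // prints "true"
        System.out.println(contains(matrix, 7));  // prints "false"
    }
}
```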

It was lunchtime and someone from Brazil picked me up for a lunch-date. We went to the Google cafeteria and I got a free lunch. We talked about different things and it was nice to relax a little bit from coding exercises.

Subversion Changes to Archive File

Just a simple one-liner that I want to share with you. Do you know this problem: you have made changes to one of your projects, some files have changed, some images were added. Now you need to move this to your live server. Most often I moved the files one by one based on their change dates. Today, just before committing the changes into Subversion, I had the idea to use the Subversion changes for creating a Tarball which I could copy over to and extract on my server.


svn st | grep -v '^?' | awk '{ print $2 }' | xargs tar rvf changes.tar


svn st - show Subversion changes (assuming that you have done all your svn add stuff before)
grep -v '^?' - filter out files which are not versioned, e.g. IntelliJ project files
awk '{ print $2 }' - print only the file path and name (this assumes paths without spaces)
xargs tar rvf changes.tar - append these files to changes.tar (tar cannot append to a compressed archive, so there is no z flag here; run gzip on changes.tar afterwards if you want it compressed)

All that is left is to transfer the file via SCP or FTP and extract it in the right location.

Google Phone Interview

A couple of months ago, I was contacted by a recruiter working at Google. It was about a Software Engineer in Test position in Stockholm. I was a bit surprised because this position had been on the net for a long time and it was more than a year since I had sent them my CV. Even though I am not actively looking for a new job, you cannot say no if Google is asking. I replied that I was interested and she called me back two days later. Google's recruiting process is very different compared to other companies I applied at. First there will be one or more interviews over the telephone, and if you are good at those, you will be invited to an on-site interview. The on-site interview is a whole day where you will meet 5 to 7 interviewers, each for a 45 min interview. Each interviewer will then write up an evaluation of your interview. All of the evaluations are reviewed by the hiring committee, and if they decide to hire you, as one of the last steps, your documents including your CV are sent over to the US headquarters for the final go or no-go.

So when the recruiter called me, this was actually the first step in a long recruiting process. The only thing she was interested in during the first call was the length of my notice period and whether it was negotiable. After this she asked me two technical questions. The first question was about the worst case time complexity of Quicksort. The second question was something about Radixsort complexity, but she could not read the question, so she asked what the difference between a HashMap and a HashSet was. The recruiter is not really a technical person. She was probably given a list of questions and answers to filter out the very bad candidates. My answers were okay and she told me that she would set up a telephone interview.

A week or two passed. Since I did not hear from her, I wrote a mail asking for the status of the telephone interview. Due to vacations it took a bit longer to set up the interview. My recruiter told me that a confirmation would be sent soon. To prepare myself, I should be looking at the Google Testing blog. She also wrote that the interviewer would be interested in my testing background and that he would be asking questions relevant to my coding and problem solving skills, algorithms, core computer science concepts, OOP and data structures. He might also ask me to write a test plan, to explain how to test an arbitrary Google product like Maps, or how I would design a cache. We would also be talking about my projects, problems I had found, tested and fixed, and what I would do with my Google 20% time. It would also be good to know some stuff about the company, like the founders, the products, the business model etc.

The interview, however, went completely differently. Someone else sent me a confirmation mail including a link to a Google Docs document, which I could share with the interviewer. It was also suggested that I warm up a bit on topcoder.com in the "Software Competitions - Algorithms" arena, which I did. On the day of my phone interview, I was called exactly at 11am. The interviewer was very friendly and explained everything very carefully.

For the first task I was given two Lists of Integers, one sorted ascending, the other sorted descending. I was to write an algorithm (Java) in the Google Docs document which would take both Lists and return a combined List that is sorted. I remembered that in the merge phase of Mergesort you do a similar thing. Have a pointer on the first element of the first list and a pointer on the last element of the second list. Then copy the smaller of the two elements to the result and increase or decrease the appropriate pointer.
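
The Google Docs document is long gone, so here is a sketch of the idea from memory rather than my actual answer. The names are invented for this post, and I assume both lists allow fast access by index (like an ArrayList).

```java
import java.util.ArrayList;
import java.util.List;

// Sketch from memory, not the code written during the interview.
class OppositeOrderMerge {

    // Merges an ascending list and a descending list into one ascending list.
    static List<Integer> merge(List<Integer> ascending, List<Integer> descending) {
        List<Integer> result = new ArrayList<>(ascending.size() + descending.size());
        int i = 0;                     // walks the ascending list front to back
        int j = descending.size() - 1; // walks the descending list back to front
        while (i < ascending.size() && j >= 0) {
            if (ascending.get(i) <= descending.get(j)) {
                result.add(ascending.get(i++));
            } else {
                result.add(descending.get(j--));
            }
        }
        // One of the lists is exhausted, copy the rest of the other one.
        while (i < ascending.size()) {
            result.add(ascending.get(i++));
        }
        while (j >= 0) {
            result.add(descending.get(j--));
        }
        return result;
    }
}
```

For example, merging [1, 4, 9] with [8, 3, 2] gives [1, 2, 3, 4, 8, 9]. Each element is looked at once, so the running time is linear in the total number of elements.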

The second task was about anagrams. Given a list of words, I was asked for a data structure in which you could store anagrams. I suggested using a hash table. Each word is effectively a character array that can be sorted. So sort the characters of the word and use the result as the key in the hash table. That way, words having the same key are anagrams of each other.
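
In Java this could look roughly like the following sketch. It is only an illustration of the idea described above; AnagramIndex and its methods are names I made up, and it treats upper and lower case letters as different characters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only, the names are invented for this post.
class AnagramIndex {

    // Maps the sorted characters of a word to all stored words with those characters.
    private final Map<String, List<String>> groups = new HashMap<>();

    // "listen" and "silent" both become "eilnst".
    static String keyOf(String word) {
        char[] chars = word.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    void add(String word) {
        groups.computeIfAbsent(keyOf(word), key -> new ArrayList<>()).add(word);
    }

    // Returns the stored words that are anagrams of the given word.
    List<String> anagramsOf(String word) {
        return groups.getOrDefault(keyOf(word), Collections.emptyList());
    }
}
```

After adding "listen", "silent" and "google", a lookup with anagramsOf("enlist") returns "listen" and "silent".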

In the last task I was introduced to a Skip List, which I had never heard of before. My interviewer explained the Skip List a couple of times and I was asked to write an algorithm for finding an element in the List. I did okay I think, and then he asked me if I had any questions. I asked a bit about the Stockholm office, what he was working with at Google, if they used Git for version control (they use Perforce) and some other questions. I asked about feedback but he said he never gives feedback directly. Instead the interviewer said he would write an evaluation of the interview in the next couple of days. He also said that if I was later asked to do another phone interview, it would not mean I was good or bad in the first one.
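
For readers who have not seen a Skip List either: it is a sorted linked list with extra "express lanes" on top, so a search can skip over many elements at once. The sketch below is what I would write today, not what I produced on the phone. It is a simplified version under my own assumptions: int values only, at most 16 levels, and a coin flip to decide how many levels a new node gets. The insert method is only there so that the search has something to work on.

```java
import java.util.Random;

// Simplified sketch, not the code from the interview.
class SkipList {
    private static final int MAX_LEVEL = 16; // assumed upper bound for this example

    private static final class Node {
        final int value;
        final Node[] next; // next[i] is the successor on level i

        Node(int value, int levels) {
            this.value = value;
            this.next = new Node[levels];
        }
    }

    private final Node head = new Node(Integer.MIN_VALUE, MAX_LEVEL); // sentinel, value unused
    private final Random random = new Random();
    private int levels = 1; // number of levels currently in use

    // Search: start on the top level, move right while the next value is
    // smaller than the target, then drop down one level and repeat.
    boolean contains(int target) {
        Node current = head;
        for (int level = levels - 1; level >= 0; level--) {
            while (current.next[level] != null && current.next[level].value < target) {
                current = current.next[level];
            }
        }
        Node candidate = current.next[0];
        return candidate != null && candidate.value == target;
    }

    void insert(int value) {
        Node[] update = new Node[MAX_LEVEL]; // last node before the insert position, per level
        Node current = head;
        for (int level = levels - 1; level >= 0; level--) {
            while (current.next[level] != null && current.next[level].value < value) {
                current = current.next[level];
            }
            update[level] = current;
        }
        int nodeLevels = randomLevels();
        for (int level = levels; level < nodeLevels; level++) {
            update[level] = head; // levels that were not in use yet start at the head
        }
        levels = Math.max(levels, nodeLevels);
        Node node = new Node(value, nodeLevels);
        for (int level = 0; level < nodeLevels; level++) {
            node.next[level] = update[level].next[level];
            update[level].next[level] = node;
        }
    }

    // Each additional level is granted with probability 1/2.
    private int randomLevels() {
        int n = 1;
        while (n < MAX_LEVEL && random.nextBoolean()) {
            n++;
        }
        return n;
    }
}
```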

One week or so passed. My recruiter called me and said that they liked my interview and wanted to meet me for an on-site interview. Because the Stockholm office was not big enough and not so many test managers worked there, they would like to have the interview in Zürich. I would be given further details and date suggestions later on. Google pays for the flight upfront. The hotel must be paid by the candidate, but the money can be reimbursed by sending an expense form to Google in Poland. For those candidates staying overnight, Google also pays for food (30 Euro, US-Dollar or Pound depending on where the interview is held). Rental car fees can also be reimbursed. I decided to pay for everything myself, since I was not staying overnight. Sending some reimbursement form to Poland seemed complicated. I was given 4 days to choose from for the interview. I picked one but they set up the interview on another day anyway - weird. So I was flying on-site to Google in Switzerland, roughly two months after the initial contact.