Implementing a faceted search - Part One

First of all, I think I will continue this blog in English. I saw in the logs that a lot of users who find this blog through a search engine come from countries other than Germany, so it does not make sense to keep posting in German.

Today I will blog about implementing a faceted search in Java. I guess most of you are not so familiar with what a faceted search is all about. To be honest, I am not so familiar with all the nifty theory, linguistics and details myself. You can think of a faceted search as a way to browse through a huge list of items. Every item in the list has some metadata attached to it. This metadata is organized into facets (or taxonomies) and values (or headings). An example facet could be „Age“, and the headings for that facet could be „13+“, „15-18“ or „21“. A sample item, let's say a deck of poker cards, could carry the heading „13+“. I think the eToys website is a good example. You start by seeing all the different facets and headings. Selecting a heading within a facet narrows down your search result as well as the forward selections you can still take. Faceted search is a very user-friendly way of browsing a repository.
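To make the idea a bit more concrete, here is a minimal sketch of the concept in plain Java (9 or later). It is not how Facetmap works and it does not use the Facetmap API; the class FacetSketch, its methods and the sample items are all made up for illustration. Each item carries a map from facet to heading, a selection keeps only the items that match every chosen heading, and the remaining forward selections are simply counted from the items that are left.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class FacetSketch {

    /** A hypothetical item plus its metadata: facet name -> heading, e.g. "Age" -> "13+". */
    static final class Item {
        final String name;
        final Map<String, String> headings;

        Item(String name, Map<String, String> headings) {
            this.name = name;
            this.headings = headings;
        }
    }

    /** Keeps only the items that carry every selected facet/heading pair. */
    static List<Item> select(List<Item> items, Map<String, String> selection) {
        return items.stream()
                .filter(item -> selection.entrySet().stream()
                        .allMatch(e -> e.getValue().equals(item.headings.get(e.getKey()))))
                .collect(Collectors.toList());
    }

    /** Counts, per facet, how many of the given items fall under each heading. */
    static Map<String, Map<String, Integer>> forwardSelections(List<Item> items) {
        Map<String, Map<String, Integer>> counts = new TreeMap<>();
        for (Item item : items) {
            item.headings.forEach((facet, heading) ->
                    counts.computeIfAbsent(facet, f -> new TreeMap<>())
                          .merge(heading, 1, Integer::sum));
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample data, invented for this sketch.
        List<Item> items = List.of(
                new Item("Deck of poker cards", Map.of("Age", "13+", "Category", "Games")),
                new Item("Chess set", Map.of("Age", "13+", "Category", "Games")),
                new Item("Whisky tumbler", Map.of("Age", "21", "Category", "Household")));

        List<Item> result = select(items, Map.of("Age", "13+"));
        System.out.println(result.size() + " items match");   // prints "2 items match"
        System.out.println(forwardSelections(result));         // prints "{Age={13+=2}, Category={Games=2}}"
    }
}
```

After selecting „13+“, the „Household“ heading disappears from the forward selections because no remaining item carries it. A real library does a lot more than this (hierarchical headings, for example), so take it only as a picture of the idea.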

So how could you integrate a faceted search into your own website? I did this more than two years ago for one of our websites. Back then, I came across a Java library called Facetmap. It offered everything I needed to incorporate a faceted search into a Servlet-based web application. There were a lot of commercial tools out there, like Dieselpoint or Endeca. Unfortunately they are all very expensive. So I decided to buy a Gold license of Facetmap, which cost me around $700. After playing around with Facetmap for a while, I figured out that the free edition would have been totally sufficient for our website. But it was good to buy it anyway, as I got great implementation support and even access to the source code. Facetmap is not an open source library. Not even the Light version of Facetmap is.

Anyway, two years have passed now, so I thought it was time to try out the latest version of Facetmap, which is 2.1. Knowing that Facetmap Light was quite powerful and sufficient for most of my websites, I wrote a small sample web application that makes use of the latest Facetmap version. Facetmap can be downloaded as a trial version and ships with a .war file. The file facetmapgold-2.1.war contains an example web application. However, most of its code is based on good old JSP and some taglibs, so I decided to base my test application on Apache Wicket, Google Guice and Maven 2.

In the first part of this guide, we will build the project so that everyone is on the same page. I created a pom.xml for Maven 2 that does most of the work for you. Download my file and extract it into any directory; that directory becomes your new project root. After extraction you will find the pom.xml file in the root. Before you can use the project pom, however, there is one additional step you have to perform. Unfortunately the Facetmap libraries are not available in any Maven 2 repository I know of. I took the Facetmap Light 2.0 .jar file, fixed the name and provided a custom .pom file, so that I could use Facetmap from my own local repository. All you have to do is take this .zip file and extract it into the root of your local Maven 2 repository. Under Linux this would be the ~/.m2/repository directory. Under Windows I think it is C:/Documents and Settings//m2/repository.

Once you have extracted the .zip file to that location, double-check that you have the following jar file in your local Maven 2 repository: com/facetmap/facetmap/2.0/facetmap-2.0.jar
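If you prefer to let the computer do the double-checking, a few lines of Java will do. This is only a sketch: the class name CheckFacetmapJar is made up, and it assumes that your local repository lives in the default location below your home directory. If you have moved it, pass a different root to jarPath.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CheckFacetmapJar {

    /** Builds the expected jar location below a given local repository root. */
    static Path jarPath(Path repositoryRoot) {
        return repositoryRoot.resolve(
                Paths.get("com", "facetmap", "facetmap", "2.0", "facetmap-2.0.jar"));
    }

    public static void main(String[] args) {
        // Assumption: the repository sits in the default place, <user home>/.m2/repository.
        Path repository = Paths.get(System.getProperty("user.home"), ".m2", "repository");
        Path jar = jarPath(repository);
        System.out.println(jar + (Files.isRegularFile(jar) ? " found" : " is missing"));
    }
}
```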

Now that you have taken care of the prerequisites, it's time to build the project. Go to your project root and type mvn package. This will compile everything and create a .war file ready for use as a web application. My project depends on the following libraries (that are all handled through Maven):

facetmap light

To make it easier, I added the Jetty plugin to the Maven 2 build file. Just type mvn jetty:run to fire up a local Servlet container on port 8080 and deploy our new .war file to that container. You can access the web application by opening http://localhost:8080/web

Sorry for the poor layout. In the left column you will find the different facets and their headings. In the right column you will see all the items that match your current selection. When opening the page for the first time, you will see all the movies in the result list until you have made a first selection by clicking one of the links.

I will get to the implementation details soon, but let's create an Eclipse or IntelliJ project first. Stop Jetty by pressing Ctrl+C in the Maven console. Now type mvn idea:idea or mvn eclipse:eclipse to create the project files for either one of the two famous IDEs. You are now ready to open the project and look at some source code (which we will do in the next part).