Collective Intelligence in Action


I just finished one of the best books I have read in a long time. It is titled “Collective Intelligence in Action”, is published by Manning Publications and was written by Satnam Alag. We read it in the book circle that my company runs twice a year.

Collective intelligence is all about making web applications better by using intelligence gathered from user interactions and behavior. There are a lot of very successful Web 2.0 applications out there that harvest user intelligence and then use this data to improve the user experience. I liked this book because it was very different from other Java-related books I have read. It is not focused on one particular technology or framework. It is not too code-focused, even though it gets quite mathematical at times. While reading it, I came up with all sorts of new ideas for how I could make my own websites better. I will start to implement a little bit of collective intelligence in the next few months. I can really recommend this book if you are looking for inspiration for new, advanced features for your web applications.

The book starts off with a brief overview of Web 2.0 applications and collective intelligence. Then the author explains how users and items can be mapped to each other using either content-based or collaboration-based mapping. It gets a bit mathematical here, with some dot product computations and cosine-based similarity. Chapter 3 is all about tags, tagging and how to leverage tags in a web application. Chapters 4, 5 and 6 introduce some nifty tools that can be used to write an API-based blog searcher and a web crawler. I have done this kind of thing myself when building my web applications, so it was great to see alternative approaches. In the second and third parts of the book, the author introduces the different algorithms for making predictions and for clustering users and items. You will learn about classification, regression, clustering, etc. It gets very theoretical and is sometimes a bit hard to follow, but it is very, very interesting. Since all the code examples in the book are in Java, Alag uses WEKA and JDM (Java Data Mining) to implement the algorithms.
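
To give a feel for the dot product and cosine similarity math mentioned above, here is a minimal sketch in Java. It is not taken from the book: the class name, the method names and the term weights are all made up for illustration, and I am assuming that users and items are represented as sparse term vectors (term to weight).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not from the book: cosine similarity between
// two sparse term vectors stored as term -> weight maps.
public class CosineSimilaritySketch {

    // Dot product: sum of weight products over the terms both vectors share.
    public static double dot(Map<String, Double> a, Map<String, Double> b) {
        double sum = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                sum += e.getValue() * other;
            }
        }
        return sum;
    }

    // Euclidean length of a vector.
    public static double norm(Map<String, Double> v) {
        return Math.sqrt(dot(v, v));
    }

    // Cosine of the angle between the vectors; 0.0 if either one is empty.
    public static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double denom = norm(a) * norm(b);
        return denom == 0.0 ? 0.0 : dot(a, b) / denom;
    }

    public static void main(String[] args) {
        // Made-up weights: how strongly a user and an item relate to each term.
        Map<String, Double> user = new HashMap<>();
        user.put("java", 2.0);
        user.put("lucene", 1.0);

        Map<String, Double> item = new HashMap<>();
        item.put("java", 1.0);
        item.put("clustering", 1.0);

        System.out.println("similarity = " + cosine(user, item));
    }
}
```

With non-negative weights, a value close to 1 means the user and the item point in nearly the same direction in term space, while 0 means they share no terms at all. Dividing by the vector lengths is what keeps a long document from winning just because it contains more words.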

The reader will also learn a bit about Lucene, the popular text indexing and search framework, which is used here for content-based learning. The Lucene parts are pretty basic, though, and should be familiar to those of you who have worked with Lucene before. The book ends with a practical example of how to build a recommendation engine similar to Amazon's.

Who is the book for? I can recommend it to experienced Java developers who would like to try out some new things in their own web applications. There are tons of great ideas in this book. Having a website with a couple of hundred visitors per day is a plus if you want to put some collective intelligence into practice.