Tuesday, March 22, 2011

Firefox 4, Twitter and NoSQL Elasticsearch

Here's the main message of this blog post: Firefox 4 has been released, so get it now. It's an amazing piece of work, done by, and for, the community.


As members of the Mozilla Metrics team, we prepared a dashboard to monitor this release using several sources of information. One of them is Twitter: basically, "what are people saying about us?"


This obviously requires a different approach from normal BI: we need near-realtime data, blazing-fast responses and search capabilities. A perfect opportunity for a NoSQL approach.


We already had a very good experience with Solr, an enterprise search platform that runs on top of Lucene. This time we decided to use ElasticSearch, also an open source (Apache 2), distributed, RESTful search engine built on top of Lucene. While the principle is very much the same as Solr's, the fact that ES thinks in JSON, both for its query DSL and for indexing documents, adds more BI-like features to it.
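
To make the "thinks in JSON" point concrete, here's a minimal sketch of what a document and a query body can look like. This is not the code behind the dashboard: the field names (`user`, `text`, `created_at`) and both helper functions are assumptions made up for illustration, and the aggregation syntax shown is the one used by recent Elasticsearch releases (the 2011 version exposed the same idea through facets).

```python
import json


def build_tweet_document(user, text, created_at):
    """Return a tweet as a plain dict, ready to be indexed as a JSON document.

    The field names are hypothetical; use whatever your own mapping defines.
    """
    return {"user": user, "text": text, "created_at": created_at}


def build_mentions_query(phrase, interval="1h"):
    """Return a search body that counts tweets matching `phrase` per time bucket.

    "size": 0 asks for no individual hits, only the aggregated counts,
    which is the BI-like part: the search engine does the grouping.
    """
    return {
        "size": 0,
        "query": {"match_phrase": {"text": phrase}},
        "aggs": {
            "mentions_over_time": {
                "date_histogram": {
                    "field": "created_at",
                    "fixed_interval": interval,
                }
            }
        },
    }


if __name__ == "__main__":
    doc = build_tweet_document("someone", "Firefox 4 is out!", "2011-03-22T17:00:00Z")
    print(json.dumps(doc, indent=2))
    print(json.dumps(build_mentions_query("firefox 4"), indent=2))
```

Both structures are ordinary JSON, so the same body can be sent to a search endpoint by any HTTP client; nothing here talks to a server.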


The result? Nothing less than astonishing! Using regular CDA (Community Data Access) to connect to ElasticSearch, we were able to develop our CDE (Community Dashboard Editor) dashboard using the regular techniques. NoSQL datasource? SQL datasource? Excel? Don't care - it's all data.


Here's the screenshot as of... well, now (5pm, Europe/Lisbon time):




Important things to note:

  • At 5pm, Firefox 4 noise is already bigger than the IE9-related noise was on its release date
  • User feedback is great!



Here's a quick screencast of it working, courtesy of Daniel Einspanjer, master of Metrics. Amazing stuff, thanks Daniel:


2 comments:

  1. It would be really cool to see a tutorial on how to build something like this with elasticsearch.
