<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:blogger='http://schemas.google.com/blogger/2008' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-6644329693530300467</id><updated>2013-06-17T09:37:02.521+01:00</updated><category term='mobile'/><category term='pentaho ctools cde'/><category term='ctools cdc'/><category term='cdf'/><category term='cde ctools'/><category term='pentaho'/><category term='releases'/><category term='mdx'/><category term='dashboards'/><category term='elasticsearch'/><category term='community'/><category term='cbf'/><category term='ctools cdv'/><category term='ccc'/><category term='ctools cdg'/><category term='autoincludes'/><category term='general'/><category term='ctools'/><category term='book'/><category term='olap'/><category term='cda'/><category term='ctools infographic'/><category term='ommunity'/><category term='puppet'/><category term='kettle'/><category term='pentaho ctools marketplace'/><category term='saiku'/><category term='mondrian'/><category term='pdi'/><category term='mozilla'/><category term='bsh'/><category term='ctools releasecycle'/><category term='async'/><category term='cde'/><category term='reporting'/><category term='cst'/><title type='text'>Pedro Alves on Business Intelligence</title><subtitle type='html'>Eyes on the Pentaho planet, feet on the ground</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/'/><link rel='hub' 
href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default?start-index=26&amp;max-results=25'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>135</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-6477303478698969965</id><published>2013-06-14T15:20:00.001+01:00</published><updated>2013-06-14T15:20:36.147+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='ommunity'/><category scheme='http://www.blogger.com/atom/ns#' term='pentaho'/><title type='text'>PCM13: Pentaho Community Event 2013 - Let's start!</title><content type='html'>&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-6CKpaXMTFA4/Ubsl5eXxPoI/AAAAAAAAAkc/D73brr4drVE/s1600/GroupPicture.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/-6CKpaXMTFA4/Ubsl5eXxPoI/AAAAAAAAAkc/D73brr4drVE/s1600/GroupPicture.jpg" height="173" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;br /&gt;&lt;h3&gt;What&lt;/h3&gt;Let's get this going! &lt;br /&gt;&lt;br /&gt;#PCM13 is starting to get organized. For the ones that don't know, the  Pentaho Community Meetings have been happening since 2008 a bit all over  Europe. 
Last year was in &lt;a href="http://wiki.pentaho.com/display/COM/Community+Meetup+Amsterdam+2012" target="_blank"&gt;Amsterdam&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;It's one of my main objectives to put an end to the silly "Enterprise / Community" differentiation. This is an event for everyone with a connection to Pentaho - customers, users, hackers, developers, students. I'll try to bring more of the Pentaho people, but the main spirit will remain the same! Sun, fun, huge amounts of beer and lots of presentations. I'll personally slap hard anyone that even tries to change that!&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Where &lt;/h3&gt;This year we're going to break 2 rules: we usually don't repeat the event location, and the event is usually entirely sponsored by the community itself.&lt;br /&gt;&lt;br /&gt;When I accepted the role of SVP for Community I made it a condition that we'd sponsor this year's event, and that we'd repeat Portugal and make an even better event than the amazing one in &lt;a href="http://kjube.blogspot.de/2010/09/pentaho-community-gathering-live.html" target="_blank"&gt;Cascais&lt;/a&gt;. Location yet to be determined but probably in the amazing &lt;a href="http://en.wikipedia.org/wiki/Sintra" target="_blank"&gt;Sintra&lt;/a&gt; village.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;When&lt;/h3&gt;So let's start by setting the date in stone. There are 2 options on the table: 27/28 of September or 5/6 of October. I personally prefer the second, but can adapt - the community will choose in the end!&lt;br /&gt;&lt;br /&gt;So go to &lt;a href="http://forums.pentaho.com/showthread.php?144316-Pentaho-Community-Meetup-2013-Portugal!-Vote-for-the-date!" 
target="_blank"&gt;this thread on the Pentaho Forums&lt;/a&gt; and take your pick!&lt;br /&gt;</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/6477303478698969965/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/06/pcm13-pentaho-community-event-2013-lets.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/6477303478698969965'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/6477303478698969965'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/06/pcm13-pentaho-community-event-2013-lets.html' title='PCM13: Pentaho Community Event 2013 - Let&apos;s start!'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-6CKpaXMTFA4/Ubsl5eXxPoI/AAAAAAAAAkc/D73brr4drVE/s72-c/GroupPicture.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-5631340244744703695</id><published>2013-06-07T11:16:00.001+01:00</published><updated>2013-06-07T11:50:42.441+01:00</updated><title type='text'>Pentaho London User Group Meeting - Thursday 20th June, 6pm</title><content type='html'>&lt;div align="JUSTIFY" style="margin-bottom: 0.14in;"&gt;&lt;span style="font-size: medium;"&gt;Hello everyone!&lt;/span&gt;&lt;/div&gt;&lt;div align="JUSTIFY" style="margin-bottom: 0.14in;"&gt;&lt;span style="font-size: medium;"&gt;Exactly 2 months after and one ocean across 
the &lt;a href="http://www.pentahobrasil.com.br/eventos/pentahoday2013/" target="_blank"&gt;Pentaho Community event in Brazil&lt;/a&gt;, which had an all-time record of about 200 attendees, we're having a User Group meeting in London. Shared points between both events? A common topic, amazing people willing to share their experiences and learn from others, and the always fundamental beer and pizza! Unfortunately, the amazing Brazilian weather won't be the same. Most likely it will be cold, dark and wet! Welcome to London! :)&lt;/span&gt;&lt;/div&gt;&lt;div align="JUSTIFY" style="margin-bottom: 0.14in;"&gt;&lt;span style="font-size: medium;"&gt;The event will be held at the &lt;a href="http://skillsmatter.com/go/find-us"&gt;Skills Matter Exchange&lt;/a&gt; in the Clerkenwell area of London on Thursday, June 20, 2013 at 6.00 pm. We're targeting the Pentaho Community, which in my definition includes customers, users, developers, basically anyone that's willing to spend some time helping to make the product better. It's one of my main goals as Senior VP of Community to create conditions for that to happen, and I'm interested in hearing ideas and feedback that anyone is willing to share.&lt;/span&gt;&lt;/div&gt;&lt;div align="JUSTIFY" style="margin-bottom: 0.14in;"&gt;&lt;span style="font-size: medium;"&gt;Here's the current agenda:&lt;/span&gt;&lt;/div&gt;&lt;ul&gt;&lt;li&gt;&lt;div align="JUSTIFY" style="margin-bottom: 0.14in;"&gt;&lt;span style="font-size: medium;"&gt;&lt;b&gt;&lt;a href="https://twitter.com/mattcasters" target="_blank"&gt;Matt Casters&lt;/a&gt;, creator and architect of PDI / Kettle&lt;/b&gt;&lt;/span&gt;&lt;span style="font-size: medium;"&gt; will lead a demo and discussion on how Pentaho supports Hadoop and big data analytics. 
&lt;/span&gt; &lt;/div&gt;&lt;/li&gt;&lt;li&gt;&lt;div align="JUSTIFY" style="margin-bottom: 0.14in;"&gt;&lt;span style="font-size: medium;"&gt;&lt;b&gt;&lt;a href="https://twitter.com/tofusnafu" target="_blank"&gt;Dave Romano&lt;/a&gt; &lt;/b&gt;&lt;/span&gt;&lt;span style="font-size: medium;"&gt;will lead a talk on how big data start-up Causata has been using Pentaho, specifically covering its use of a custom repository, step plug-ins and embedding Kettle&lt;/span&gt; &lt;/div&gt;&lt;/li&gt;&lt;li&gt;&lt;div align="JUSTIFY" style="margin-bottom: 0.14in;"&gt;&lt;a href="https://twitter.com/pmalves" target="_blank"&gt;&lt;span style="font-size: medium;"&gt;&lt;b&gt;Pedro Alves&lt;/b&gt;&lt;/span&gt;&lt;/a&gt;&lt;span style="font-size: medium;"&gt; will present CPK, the Community Plugin Kickstarter, a tool that allows &lt;/span&gt;&lt;span style="font-size: medium;"&gt;&lt;span style="font-size: medium;"&gt;non-developers to create Pentaho Plugins&lt;/span&gt;&lt;/span&gt; &lt;/div&gt;&lt;/li&gt;&lt;li&gt;&lt;div style="margin-bottom: 0.14in; text-align: left;"&gt;&lt;a href="https://twitter.com/beardlazy" target="_blank"&gt;&lt;span style="font-size: medium;"&gt;&lt;b&gt;Simon Raybould&lt;/b&gt;&lt;/span&gt;&lt;/a&gt;&lt;span style="font-size: medium;"&gt; &lt;span style="font-size: small;"&gt;will&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size: medium;"&gt;describe the dashboard-centric implementation at Found, heavily centered around Ctools and Mondrian&lt;/span&gt;&lt;/div&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div align="JUSTIFY" style="margin-bottom: 0.14in;"&gt;&lt;br /&gt;&lt;span style="font-size: medium;"&gt;Please note that most of the presentations are technically oriented, mainly of interest to consultants and developers. 
We invite you to propose discussions, technical presentations, user stories and hosted Q&amp;amp;A sessions to educate and inspire other users.&lt;/span&gt;&lt;/div&gt;&lt;div align="JUSTIFY" style="margin-bottom: 0.14in;"&gt;&lt;span style="font-size: medium;"&gt;There will be plenty of time before and after the meetup for informal networking. For more information and to register, please &lt;a href="http://www.meetup.com/Pentaho-London-User-Group/events/119727912/"&gt;visit this link&lt;/a&gt;. On behalf of PLUG organiser &lt;a href="https://twitter.com/codek1" target="_blank"&gt;Dan Keeley&lt;/a&gt; and Pentaho, we hope to see you on June 20&lt;/span&gt;&lt;sup&gt;&lt;span style="font-size: medium;"&gt;th&lt;/span&gt;&lt;/sup&gt;&lt;span style="font-size: medium;"&gt;&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: medium;"&gt;&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div align="JUSTIFY" style="margin-bottom: 0.14in;"&gt;&lt;span style="font-size: medium;"&gt;&lt;a href="https://twitter.com/pmalves" target="_blank"&gt;Pedro Alves&lt;/a&gt;, Senior VP of Community, Pentaho&lt;/span&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/5631340244744703695/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/06/pentaho-london-user-group-meeting.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/5631340244744703695'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/5631340244744703695'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/06/pentaho-london-user-group-meeting.html' title='Pentaho London User Group Meeting - Thursday 20th June, 6pm'/><author><name>Pedro 
Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-4120772170636221189</id><published>2013-05-13T10:13:00.001+01:00</published><updated>2013-05-13T10:13:54.522+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='ctools'/><title type='text'>Ctools Icon Set</title><content type='html'>Our UX team, tired of trying to keep up with the speed at which we develop new ctools, decided that it would be a good idea to build an official icon set that you can get &lt;a href="http://www.webdetails.pt/icon_set/icon_set_48px.html" target="_blank"&gt;from our website&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Obviously, it's available to anyone, so feel free to (ab)use it!&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-NJoVwlDkNKM/UZCuxtr8Z2I/AAAAAAAAAi4/SXKX7zYemPA/s1600/ctoolsIcons.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/-NJoVwlDkNKM/UZCuxtr8Z2I/AAAAAAAAAi4/SXKX7zYemPA/s1600/ctoolsIcons.png" height="245" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/4120772170636221189/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/05/ctools-icon-set.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/4120772170636221189'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/6644329693530300467/posts/default/4120772170636221189'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/05/ctools-icon-set.html' title='Ctools Icon Set'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-NJoVwlDkNKM/UZCuxtr8Z2I/AAAAAAAAAi4/SXKX7zYemPA/s72-c/ctoolsIcons.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-7978491918395747169</id><published>2013-05-02T14:41:00.000+01:00</published><updated>2013-05-02T14:41:03.014+01:00</updated><title type='text'>Experimenting and sharing CCC2 charts</title><content type='html'>CCC, now at version 2, is a very powerful charting engine with a lot of customization options. As &lt;a href="http://www.on-reporting.com/" target="_blank"&gt;Thomas Morgner&lt;/a&gt; likes to quote, with great power comes great responsibility.&amp;nbsp; While a lot is possible, a lot is also not easy to do and requires some knowledge.&lt;br /&gt;&lt;h3&gt;&lt;br /&gt;&lt;/h3&gt;&lt;h3&gt;The CCC website&lt;/h3&gt;&lt;br /&gt;The &lt;a href="http://www.webdetails.pt/" target="_blank"&gt;team&lt;/a&gt; made a huge effort to document CCC2. 
The &lt;a href="http://www.webdetails.pt/ctools/ccc.html" target="_blank"&gt;website&lt;/a&gt; and the &lt;a href="http://www.webdetails.pt/ctools/charts/jsdoc/symbols/pvc.html" target="_blank"&gt;CCC jsdocs&lt;/a&gt;, complemented by &lt;a href="http://mbostock.github.io/protovis/jsdoc/" target="_blank"&gt;Protovis Jsdocs&lt;/a&gt;, provide a wealth of resources for working out how best to solve a problem.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-o3nmGoJe0qQ/UYJrPD-CE7I/AAAAAAAAAiM/vGcZEu3wak0/s1600/cccFiddle.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/-o3nmGoJe0qQ/UYJrPD-CE7I/AAAAAAAAAiM/vGcZEu3wak0/s1600/cccFiddle.png" height="226" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;JsFiddle&lt;/h3&gt;&lt;br /&gt;There's also a great, well-known resource for sharing experiments with JavaScript, &lt;a href="http://jsfiddle.net/" target="_blank"&gt;JsFiddle&lt;/a&gt;. 
We can also set up a CCC2 playground there that people can use and fork to play with.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://jsfiddle.net/duarteleao/7maGD/" target="_blank"&gt;Here's a fiddle&lt;/a&gt; with a sample CCC2 bar chart with annotations.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-YaWksRtEzx4/UYJsf6GVbeI/AAAAAAAAAiY/1_sFrsQJcAk/s1600/cccFiddle.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-YaWksRtEzx4/UYJsf6GVbeI/AAAAAAAAAiY/1_sFrsQJcAk/s1600/cccFiddle.png" height="226" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;br /&gt;</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/7978491918395747169/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/05/experimenting-and-sharing-ccc2-charts.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/7978491918395747169'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/7978491918395747169'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/05/experimenting-and-sharing-ccc2-charts.html' title='Experimenting and sharing CCC2 charts'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://4.bp.blogspot.com/-o3nmGoJe0qQ/UYJrPD-CE7I/AAAAAAAAAiM/vGcZEu3wak0/s72-c/cccFiddle.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-3038138574226109480</id><published>2013-04-22T01:29:00.004+01:00</published><updated>2013-04-22T13:40:07.818+01:00</updated><title type='text'>A new challenge - Webdetails joins Pentaho</title><content type='html'>&lt;h3&gt;The Announcement &lt;/h3&gt;Now here's a blog post that a while back I wouldn't even think about writing.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;I'm very happy to announce that &lt;a href="http://www.webdetails.pt/" target="_blank"&gt;Webdetails&lt;/a&gt; will join &lt;a href="http://www.pentaho.com/" target="_blank"&gt;Pentaho&lt;/a&gt;!&lt;/b&gt; Here's an excerpt from the press release:&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;style type="text/css"&gt;P { margin-bottom: 0.08in; direction: ltr; color: rgb(0, 0, 0); line-height: 115%; widows: 2; orphans: 2; }P.ctl {  }A:link { color: rgb(42, 51, 132); text-decoration: none; }A.western:link {  }A.cjk:link {  }A.ctl:link { font-family: "Times New Roman"; }&lt;/style&gt;  &lt;br /&gt;&lt;div class="western" style="margin-bottom: 0.14in;"&gt;&lt;span style="font-family: Calibri, serif;"&gt;&lt;b&gt;ORLANDO, FLA&lt;/b&gt;&lt;/span&gt;&lt;span style="font-family: Calibri, serif;"&gt; - April 22, 2013 – Delivering the &lt;a class="western" href="http://www.pentaho.com/explore/"&gt;future of analytics&lt;/a&gt;, Pentaho announced today that it has completed the acquisition of its Portugal-based consulting partner &lt;a class="western" href="http://www.webdetails.pt/"&gt;Webdetails&lt;/a&gt;. Pentaho will benefit from Webdetails’ visual interface development expertise and international consulting services provided by its 20-strong team. 
Webdetails’ founder Pedro Alves is a high-profile member of Pentaho’s open source community and will take on the new role of Senior VP, Community for Pentaho.&lt;/span&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;br /&gt;The whole team at Webdetails feels incredibly proud of the vote of confidence, and very excited to keep working as hard as we can, but now with the new goal of improving the Pentaho platform and the community ecosystem. Oh! - wait, we've been doing that for the past 5 years! :p&lt;br /&gt;&lt;br /&gt;The daily work of the now-Portuguese Pentaho crew won't change much - we'll still operate and provide services under the name Webdetails (now with an additional tag &lt;i&gt;a pentaho company&lt;/i&gt;), keep our UX team rocking and rolling and keep creating the ctools until we run out of letters in the alphabet. &lt;br /&gt;&lt;br /&gt;We'll just have the huge opportunity of doing these things at a much larger scale. The plan to conquer the world advances as planned! &lt;a href="http://www.youtube.com/watch?v=pVY1-v97Mic" target="_blank"&gt;&lt;i&gt;&amp;lt;insert evil laughter here&amp;gt;&lt;/i&gt;&lt;/a&gt;&lt;br /&gt;&lt;h3&gt;&amp;nbsp;&lt;/h3&gt;&lt;h3&gt;The Community&lt;/h3&gt;I had the pleasure of making the announcement first hand at &lt;a href="http://www.pentahobrasil.com.br/eventos/pentahoday2013/" target="_blank"&gt;Pentaho Day 2013&lt;/a&gt;, the amazing Brazilian community event where over 180 people chose to spend a Saturday in a room with a faulty A/C talking about a subject that unites them.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-Qd4mrTq6UIo/UXR8hCs9keI/AAAAAAAAAho/YPNWgAUCXpg/s1600/foto-oficial-pentaho-day-brasil-2013-small.png" imageanchor="1" style="margin-left: 
1em; margin-right: 1em;"&gt;&lt;img border="0" height="130" src="http://4.bp.blogspot.com/-Qd4mrTq6UIo/UXR8hCs9keI/AAAAAAAAAho/YPNWgAUCXpg/s1600/foto-oficial-pentaho-day-brasil-2013-small.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="western" style="margin-bottom: 0.14in;"&gt;&lt;b&gt;I'll take the new role of SVP of Community for Pentaho&lt;/b&gt;, where I'll try &lt;span style="font-family: Calibri, serif;"&gt;to be "&lt;i&gt;the chief advocate and interface to Pentaho’s active open source developer community&lt;/i&gt;". As you may have guessed, the bit in italics was copy-pasted directly from the aforementioned press release. I don't even know what that means! 
&lt;/span&gt;&lt;/div&gt;&lt;div class="western" style="margin-bottom: 0.14in;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="western" style="margin-bottom: 0.14in;"&gt;&lt;span style="font-family: Calibri, serif;"&gt;As I had the opportunity to tell the audience in Brazil, the guys at Pentaho who thought it would be a good idea to put me in that role have no idea what's going to happen!&lt;/span&gt;&lt;/div&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-cBbWq46VXAA/UXR8o6I1e3I/AAAAAAAAAh0/j1eZJSAnWKg/s1600/01-face.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="174" src="http://2.bp.blogspot.com/-cBbWq46VXAA/UXR8o6I1e3I/AAAAAAAAAh0/j1eZJSAnWKg/s1600/01-face.png" width="200" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;&lt;i&gt;This is not me&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;Building a community is not about going to events in tropical countries, drinking huge amounts of liquor and saying how cool Pentaho is (&lt;i&gt;wait, scratch the part about liquor and going to tropical countries, I do like that&lt;/i&gt;) &lt;br /&gt;&lt;br /&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-6E_V2VGOsUo/UXR8d_ROh1I/AAAAAAAAAhc/VtFDtTot1jw/s1600/03-devil.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="176" src="http://1.bp.blogspot.com/-6E_V2VGOsUo/UXR8d_ROh1I/AAAAAAAAAhc/VtFDtTot1jw/s1600/03-devil.png" width="200" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td 
class="tr-caption" style="text-align: center;"&gt;&lt;i&gt;This is me!&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;A community doesn't exist because a company wants it to. &lt;b&gt;A community is a reflection of the strategy of a company&lt;/b&gt;. People will only gather around a common cause - any cause - if they feel connected to it. &lt;br /&gt;&lt;br /&gt;I'll be doing a lot of evangelization work, but not externally. It's an inside job, where I'll try to share what the community is doing, feeling and saying, and fight to make that information reach the product and the strategy.&lt;br /&gt;&lt;br /&gt;I do believe there's a lot to be done, and it's not an easy task to try to achieve a good balance between the Enterprise and Community objectives, but I do believe there's lots of room for improvement!&lt;br /&gt;&lt;br /&gt;Please feel free to come to me with any suggestions, complaints, rants, whatever. Pentaho needs to do a good job of allowing people from the community to succeed. It's the only way this works. We're not a bunch of good guys. At least I'm not! 
Only when people feel they're getting something they'll be willing to give something back.&lt;br /&gt;&lt;br /&gt;Community is all about creating Opportunity!&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;-pedro</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/3038138574226109480/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/04/a-new-challenge-webdetails-joins-pentaho.html#comment-form' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/3038138574226109480'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/3038138574226109480'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/04/a-new-challenge-webdetails-joins-pentaho.html' title='A new challenge - Webdetails joins Pentaho'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-Qd4mrTq6UIo/UXR8hCs9keI/AAAAAAAAAho/YPNWgAUCXpg/s72-c/foto-oficial-pentaho-day-brasil-2013-small.png' height='72' width='72'/><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-6444348537791580072</id><published>2013-03-27T12:18:00.000Z</published><updated>2013-03-27T12:18:29.656Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='ctools'/><title type='text'>New Ctools Releases: 13.03.25</title><content type='html'>Easter release! 
&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;CDF&lt;/h3&gt;&lt;br /&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: x-small;"&gt;CCC V2 integration!&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0);"&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0);"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;Full changelog:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * CCC V2 integration&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Added capability to detect session timeout and request new credentials (instead of silently failing)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Feature: Support for extra options in select component &amp;nbsp; &amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Feature: &amp;nbsp;Add clickAction &amp;amp; expand selectors to table component&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Fixed Heatgrid Sample&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Fix: Multiselect component now checks for null values regardless of the plugin used.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: 
xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Fix [REDMINE-1822] - Tooltip not showing in select components&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Fix [REDMINE-309] - td.expandingClass now has a dummy click for it to work as intended.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Fix: Column format in table component now takes hidden columns into account. Ex: If column 0 is hidden: - Now: format for column 1 is colFormats[1] - Previously: format for column 1 was colFormats[0] &lt;/span&gt;&lt;br /&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;h3&gt;CDE&lt;/h3&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: x-small;"&gt;Support for CCC V2&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0);"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;Full changelog:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Support for CCC V2&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Update siteMap component to accept an url from where to fetch the siteMap json using an ajax call.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Implemented [Redmine-93] - &amp;nbsp;Close the Popup 
Component&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Fixed [Redmine-1840] - External resource editing is failing&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Fixed [Redmine-345] - Scrollbars in PopupComponent&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Fixed [Redmine-1874] - &amp;nbsp;Template import should now work as intended&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;h3&gt;CDA&lt;/h3&gt;&lt;span style="font-size: xx-small;"&gt;Full c&lt;span style="background-color: rgba(255, 255, 255, 0);"&gt;hangelog:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Fixed [REDMINE-1851] - Added conditions that protect the TableComponent from empty metadata queries (e.g. 
Select include mode with output options [1,0] and an empty query)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Fixed [REDMINE-1881] - IntegerArrays not working on sql queries&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Reintroduced deprecated SQLReportDataFactory.setQuery for pentaho 3.9 support&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;h3&gt;CGG&lt;/h3&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: x-small;"&gt;&amp;nbsp; &amp;nbsp; * Support for CCC v2&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: x-small;"&gt;&amp;nbsp; &amp;nbsp; * Internal architectural changes - refactored cgg to cgg-core and cgg-pentaho.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;h3&gt;CDV&lt;/h3&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;Full changelog:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="background-color: rgba(255, 255, 255, 0); font-size: xx-small;"&gt;*&amp;nbsp;Fixed some edge cases where parameters were not being correctly parsed in cdv's validation dashboard&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/6444348537791580072/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/03/new-ctools-releases-130325.html#comment-form' title='2 Comments'/><link 
rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/6444348537791580072'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/6444348537791580072'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/03/new-ctools-releases-130325.html' title='New Ctools Releases: 13.03.25'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-3891122705335918763</id><published>2013-03-15T11:33:00.002Z</published><updated>2013-03-15T11:33:56.734Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='community'/><title type='text'>Pentaho Day 2013 in Brazil, Fortaleza</title><content type='html'>I'm pretty excited about this. The &lt;a href="http://br.groups.yahoo.com/group/pentahobr/" target="_blank"&gt;Brazilian pentaho community&lt;/a&gt;, one of the most active ones I know of, is organizing the second community event in Brazil, this time in Fortaleza, April 20&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-ncUUzaLwHZI/UUME7f8YXNI/AAAAAAAAAg8/sO9DK85LkYc/s1600/poster.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-ncUUzaLwHZI/UUME7f8YXNI/AAAAAAAAAg8/sO9DK85LkYc/s1600/poster.jpg" height="320" width="237" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;It's gonna be as great as all the other ones. 
Registrations can be done &lt;a href="http://pentahoday2013-es2004.eventbrite.com/?rank=1" target="_blank"&gt;here&lt;/a&gt; and discussion space will be in the &lt;a href="https://www.facebook.com/pentahobrasil" target="_blank"&gt;pentahobrasil&lt;/a&gt; facebook page.&lt;br /&gt;&lt;br /&gt;The event agenda will be shared later, as we're giving the community the opportunity to talk about their own experiences.&lt;br /&gt;&lt;br /&gt;This is a joint organization of all pentaho consulting companies in Brazil in an amazing display of cooperation. I'll be there! :)</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/3891122705335918763/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/03/pentaho-day-2013-in-brazil-fortaleza.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/3891122705335918763'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/3891122705335918763'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/03/pentaho-day-2013-in-brazil-fortaleza.html' title='Pentaho Day 2013 in Brazil, Fortaleza'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-ncUUzaLwHZI/UUME7f8YXNI/AAAAAAAAAg8/sO9DK85LkYc/s72-c/poster.jpg' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-3152518414683646503</id><published>2013-02-27T10:52:00.000Z</published><updated>2013-02-27T10:52:39.909Z</updated><title type='text'>CPK - Community Plugin Kickstarter - Request For Comments</title><content type='html'>&lt;h1&gt;CPK&lt;/h1&gt;One more Ctool in the works. Once again I'm calling out for comments, suggestions, whatever. Let me know what you think, send suggestions for improvements, or tell me if you think this makes no sense at all.&lt;br /&gt;&lt;br /&gt; &lt;h2&gt;Motivation&lt;/h2&gt;&lt;a href="http://www.pentaho.com/"&gt;Pentaho&lt;/a&gt; is well known as very good Business Analytics software, but it is in fact much more than that: Pentaho is a great platform to build on top of. &lt;br /&gt; Using an easy analogy, I see Pentaho acting as an operating system that people can build applications on top of.&lt;br /&gt; &lt;h2&gt;Objective&lt;/h2&gt;The goal of CPK is to provide a simple and easy way to develop pentaho plugins that behave like packaged applications, simplifying their structure. &lt;br /&gt; The UI is built using &lt;a href="http://cde.webdetails.org/"&gt;CDE&lt;/a&gt;, with a simple methodology to create new dashboards / pages and a sitemap that provides simple navigation and a default template.&lt;br /&gt; There are 3 options for writing server-side code:&lt;br /&gt; &lt;ol&gt;&lt;li&gt;&lt;a href="http://kettle.pentaho.org/"&gt;Kettle&lt;/a&gt; transformations&lt;/li&gt;&lt;li&gt;Javascript server side code execution&lt;/li&gt;&lt;li&gt;Java classes&lt;/li&gt;&lt;/ol&gt;The first two are recommended, since new endpoints can be registered just by dropping code in a directory, with no compilation necessary.&lt;br /&gt; With this approach we not only expect to make it easier and faster to develop plugins, we also hope to lower the specific technical requirements to build them. 
The end goal is that business consultants are able to build new plugins without requiring specific Java knowledge.&lt;br /&gt; &lt;h2&gt;Structure&lt;/h2&gt;This is the resulting plugin structure. Ideally, no compilation is necessary, so everything except maybe the lib directory could be stored in a &lt;em&gt;VCS&lt;/em&gt;.&lt;br /&gt; This is the proposed stub configuration&lt;br /&gt; &lt;em&gt;to be completed&lt;/em&gt;&lt;br /&gt; &lt;pre&gt;&lt;code&gt;CPK_Plugin&lt;br /&gt;|-- conf&lt;br /&gt;|-- dashboards&lt;br /&gt;|-- endpoints&lt;br /&gt;|-- lib&lt;br /&gt;`-- plugin.xml&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;h2&gt;CPK administrative features&lt;/h2&gt;Besides providing the regular templating for creating new plugins, CPK can have an administrative UI with the following features:&lt;br /&gt; &lt;ul&gt;&lt;li&gt;List existing plugins&lt;/li&gt;&lt;li&gt;Detect if the plugins are up to date with the latest version of CPK&lt;/li&gt;&lt;li&gt;Allow the creation of new plugins&lt;/li&gt;&lt;li&gt;Allow changing plugin metadata&lt;/li&gt;&lt;li&gt;List and register new endpoints (UI and code)&lt;/li&gt;&lt;/ul&gt;Here's a list of stretch goals / nice-to-haves:&lt;br /&gt; &lt;ul&gt;&lt;li&gt;Import UI from existing dashboards in solution&lt;/li&gt;&lt;li&gt;Allow editing dashboards from this UI&lt;/li&gt;&lt;li&gt;Submit marketplace metadata to &lt;em&gt;Pentaho&lt;/em&gt;&lt;/li&gt;&lt;li&gt;Generate distribution zip package&lt;/li&gt;&lt;/ul&gt;&lt;h2&gt;Updates&lt;/h2&gt;As the CPK framework or any of its dependencies improves, the plugins themselves shouldn't stay outdated. Version information will be attached to the &lt;em&gt;CPK plugin version&lt;/em&gt; so that it's possible to upgrade to the latest version.&lt;br /&gt; &lt;h2&gt;Dependencies&lt;/h2&gt;CPK will have as little code as possible, making it as simple as possible to develop plugins. 
However, it will need a few dependencies:&lt;br /&gt; &lt;ul&gt;&lt;li&gt;Pentaho&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/webdetails/cpf"&gt;CPF&lt;/a&gt; - Community Plugin Framework, with   the common set of code for the plugins&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/webdetails/cde"&gt;CDE&lt;/a&gt; - Community Dashboard Editor&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/webdetails/cdf"&gt;CDF&lt;/a&gt; - Community Dashboard Framework&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/webdetails/cda"&gt;CDA&lt;/a&gt; - Community Data Access&lt;/li&gt;&lt;/ul&gt;&lt;h2&gt;Link with Pentaho Marketplace&lt;/h2&gt;Once a plugin is developed, and the authors think it's in a state that can be shared, CPK will be able to generate a packaged plugin and metadata information so it can be integrated into Pentaho's marketplace. &lt;br /&gt; Pentaho will then be able to categorize / approve the plugin so that it becomes available to other users through the marketplace&lt;br /&gt; &lt;h2&gt;License&lt;/h2&gt;This project uses &lt;a href="http://www.mozilla.org/MPL/2.0/"&gt;MPLv2&lt;/a&gt;</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/3152518414683646503/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/02/cpk-community-plugin-kickstarter.html#comment-form' title='13 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/3152518414683646503'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/3152518414683646503'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/02/cpk-community-plugin-kickstarter.html' title='CPK - Community Plugin Kickstarter - Request For Comments'/><author><name>Pedro 
Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><thr:total>13</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-8926585120396576887</id><published>2013-02-22T12:23:00.000Z</published><updated>2013-02-22T12:23:20.915Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='cbf'/><category scheme='http://www.blogger.com/atom/ns#' term='puppet'/><category scheme='http://www.blogger.com/atom/ns#' term='mozilla'/><title type='text'>CBF - Build RPMs automatically</title><content type='html'>&lt;h3&gt;Motivation&lt;/h3&gt;&lt;a href="http://cbf.webdetails.org/" target="_blank"&gt;CBF&lt;/a&gt; - Community Build Framework - is an amazing tool for maintaining several projects, compiling pentaho and even doing remote deploys using &lt;i&gt;rsync&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;However, there are cases where rsync is not the best way to maintain a server installation. At &lt;a href="http://www.mozilla.com/" target="_blank"&gt;Mozilla&lt;/a&gt;, IT uses &lt;a href="https://puppetlabs.com/" target="_blank"&gt;Puppet&lt;/a&gt; for everything related to automated server maintenance. 
The way CBF is structured allows us to define rules to automate solution checkouts, but we need to automate the installation of 3 extra bits:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Pentaho webapp&lt;/li&gt;&lt;li&gt;Pentaho solution (the &lt;i&gt;system/&lt;/i&gt; and &lt;i&gt;admin/&lt;/i&gt; directories)&lt;/li&gt;&lt;li&gt;Administration / Enterprise console &lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;Approach&lt;/h3&gt;The cluster uses &lt;a href="http://www.redhat.com/promo/Red_Hat_Enterprise_Linux6/" target="_blank"&gt;RHEL6&lt;/a&gt; distribution, so we chose to pack the distributions using &lt;a href="http://en.wikipedia.org/wiki/RPM_Package_Manager" target="_blank"&gt;rpm&lt;/a&gt; format. This should work for any rpm-based distribution. If someone wants to do the same for &lt;a href="http://en.wikipedia.org/wiki/Deb_%28file_format%29" target="_blank"&gt;deb&lt;/a&gt; or something else, just get in touch or simply &lt;a href="https://github.com/webdetails/cbf" target="_blank"&gt;fork CBF on github&lt;/a&gt; and send a pull request. &lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Building RPMs through CBF&lt;/h3&gt;CBF now has 2 extra options (be sure to grab the latest version from &lt;a href="https://github.com/webdetails/cbf" target="_blank"&gt;&lt;i&gt;github&lt;/i&gt;&lt;/a&gt;):&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;create-rpm-dist - Creates a tgz file with pentaho, pentaho-style and ROOT folders - to be used with a puppet script&lt;br /&gt;&amp;nbsp;create-rpms - Creates rpms based on the specs found on the config/rpm folder for the solution&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&amp;nbsp;In order for this to be usable, we need to create a directory called rpm under config, and place there the .spec files for the rpm's we're creating. 
We will also need a specific makefile.&lt;br /&gt;&lt;br /&gt;This is the structure:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;code&gt;&lt;/code&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;pre&gt;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;code&gt;project-metrics&lt;br /&gt;├── config&lt;br /&gt;│&amp;nbsp;&amp;nbsp; ├── build-ee-pedro.properties&lt;br /&gt;│&amp;nbsp;&amp;nbsp; ├── build-pedro.properties&lt;br /&gt;│&amp;nbsp;&amp;nbsp; ├── build.properties&lt;br /&gt;│&amp;nbsp;&amp;nbsp; └── rpm&lt;br /&gt;│&amp;nbsp;&amp;nbsp;     ├── makefile&lt;br /&gt;│&amp;nbsp;&amp;nbsp;     ├── webdetails-pentaho-adminconsole.spec&lt;br /&gt;│&amp;nbsp;&amp;nbsp;     ├── webdetails-pentaho-adminconsole-stage.spec&lt;br /&gt;│&amp;nbsp;&amp;nbsp;     ├── webdetails-pentaho-solution.spec&lt;br /&gt;│&amp;nbsp;&amp;nbsp;     ├── webdetails-pentaho-solution-stage.spec&lt;br /&gt;│&amp;nbsp;&amp;nbsp;     ├── webdetails-pentaho-webapp.spec&lt;br /&gt;│&amp;nbsp;&amp;nbsp;     └── webdetails-pentaho-webapp-stage.spec&lt;br /&gt;├── etl&lt;br /&gt;├── patches&lt;br /&gt;│&amp;nbsp;&amp;nbsp; ├── project-metrics&lt;br /&gt;│&amp;nbsp;&amp;nbsp; ├── target-build&lt;br /&gt;│&amp;nbsp;&amp;nbsp; └── target-dist&lt;br /&gt;├── patches-ee&lt;br /&gt;│&amp;nbsp;&amp;nbsp; ├── enterprise-console&lt;br /&gt;│&amp;nbsp;&amp;nbsp; ├── server&lt;br /&gt;│&amp;nbsp;&amp;nbsp; └── solution&lt;br /&gt;└── solution -&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;gt;&lt;/span&gt; ../project-metrics-solution&lt;br /&gt;&lt;/code&gt;&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;br 
/&gt;&lt;blockquote&gt;&lt;/blockquote&gt;&lt;/blockquote&gt;&lt;br /&gt;&amp;nbsp;You'll see that I'm building here 6 RPMs, for a production and stage environment.&lt;br /&gt;&lt;br /&gt;You can &lt;a href="http://www.webdetails.pt/ficheiros/rpm-sample.zip" target="_blank"&gt;download the entire set of files from here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Here's a sample of the file &lt;i&gt;webdetails-pentaho-solution.spec&lt;/i&gt;:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: xx-small;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;pre&gt;&lt;span style="font-size: xx-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;code&gt;&lt;br /&gt;Name:  Webdetails-Pentaho-Webapp&lt;br /&gt;Version: @VERSION@&lt;br /&gt;Release: 1%{?dist}&lt;br /&gt;Summary: Pentaho webapp customized for Mozilla metrics&lt;br /&gt;Group:  Applications/Engineering&lt;br /&gt;License: MPL-1.0&lt;br /&gt;Source:     webdetails-pentaho.tgz&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;%description&lt;br /&gt;&lt;br /&gt;%prep&lt;br /&gt;rm -rf %{buildroot}&lt;br /&gt;%build&lt;br /&gt;tar zxvf ../SOURCES/webdetails-pentaho.tgz &lt;br /&gt;&lt;br /&gt;%install&lt;br /&gt;mkdir %{buildroot}&lt;br /&gt;make DESTDIR=%{buildroot} install&lt;br /&gt;&lt;br /&gt;%clean&lt;br /&gt;rm -rf %{buildroot}&lt;br /&gt;&lt;br /&gt;%files&lt;br /&gt;%defattr(-,root,root,-)&lt;br /&gt;%doc&lt;br /&gt;&lt;br /&gt;/opt/pentaho/pentaho-server/server/webapps/pentaho&lt;br /&gt;/opt/pentaho/pentaho-server/server/webapps/pentaho-style&lt;br /&gt;/opt/pentaho/pentaho-server/server/webapps/ROOT&lt;br /&gt;&lt;br /&gt;%changelog&lt;br /&gt;&lt;/code&gt;&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/blockquote&gt;&lt;br /&gt;And here's the makefile that is called by the specs. 
I chose to install the pentaho server under &lt;i&gt;/opt/pentaho/&lt;/i&gt;, but this can be changed to any desired path:  &lt;span style="font-size: xx-small;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;pre&gt;&lt;span style="font-size: xx-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;code&gt;&lt;br /&gt;&lt;br /&gt;all: install&lt;br /&gt;    &lt;br /&gt;install:&lt;br /&gt; mkdir $(DESTDIR)/opt&lt;br /&gt; mkdir $(DESTDIR)/opt/pentaho&lt;br /&gt; mkdir $(DESTDIR)/opt/pentaho/pentaho-server&lt;br /&gt; mkdir $(DESTDIR)/opt/pentaho/pentaho-server/server/&lt;br /&gt; mkdir $(DESTDIR)/opt/pentaho/pentaho-server/server/webapps&lt;br /&gt; cp -r pentaho $(DESTDIR)/opt/pentaho/pentaho-server/server/webapps/pentaho&lt;br /&gt; cp -r pentaho-style $(DESTDIR)/opt/pentaho/pentaho-server/server/webapps/pentaho-style    &lt;br /&gt; cp -r ROOT $(DESTDIR)/opt/pentaho/pentaho-server/server/webapps/ROOT    &lt;br /&gt;&lt;br /&gt;install-solution:&lt;br /&gt; mkdir $(DESTDIR)/opt&lt;br /&gt; mkdir $(DESTDIR)/opt/pentaho&lt;br /&gt; mkdir $(DESTDIR)/opt/pentaho/pentaho-solution&lt;br /&gt; cp *.* $(DESTDIR)/opt/pentaho/pentaho-solution/&lt;br /&gt; cp -r system $(DESTDIR)/opt/pentaho/pentaho-solution/system&lt;br /&gt; cp -r admin $(DESTDIR)/opt/pentaho/pentaho-solution/admin&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;install-stage:&lt;br /&gt; mkdir $(DESTDIR)/opt&lt;br /&gt; mkdir $(DESTDIR)/opt/pentaho&lt;br /&gt; mkdir $(DESTDIR)/opt/pentaho/pentaho-server-stage&lt;br /&gt; mkdir $(DESTDIR)/opt/pentaho/pentaho-server-stage/server  &lt;br /&gt; mkdir $(DESTDIR)/opt/pentaho/pentaho-server-stage/server/webapps&lt;br /&gt; cp -r pentaho $(DESTDIR)/opt/pentaho/pentaho-server-stage/server/webapps/pentaho&lt;br /&gt; cp -r pentaho-style $(DESTDIR)/opt/pentaho/pentaho-server-stage/server/webapps/pentaho-style    &lt;br /&gt; cp -r ROOT $(DESTDIR)/opt/pentaho/pentaho-server-stage/server/webapps/ROOT    &lt;br /&gt;&lt;br 
/&gt;install-solution-stage:&lt;br /&gt; mkdir $(DESTDIR)/opt&lt;br /&gt; mkdir $(DESTDIR)/opt/pentaho&lt;br /&gt; mkdir $(DESTDIR)/opt/pentaho/pentaho-solution-stage&lt;br /&gt; cp *.* $(DESTDIR)/opt/pentaho/pentaho-solution-stage/ &lt;br /&gt; cp -r system $(DESTDIR)/opt/pentaho/pentaho-solution-stage/system&lt;br /&gt; cp -r admin $(DESTDIR)/opt/pentaho/pentaho-solution-stage/admin&lt;br /&gt;&lt;br /&gt;&lt;/code&gt;&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/blockquote&gt;&lt;h3&gt;Dependencies&amp;nbsp;&lt;/h3&gt;Note that this depends on having &lt;i&gt;rpmbuild&lt;/i&gt; installed on your system. It should work on Linux and Mac, and possibly even on Windows with &lt;i&gt;cygwin&lt;/i&gt;. &lt;br /&gt;&lt;br /&gt;After running the &lt;i&gt;create-rpms&lt;/i&gt; ant target, you should get the RPMs in the &lt;i&gt;dist/rpm/RPMS/&lt;/i&gt; directory. &lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Next steps&lt;/h3&gt;You probably know what you want to do with the RPMs. You can either install them directly, or if you're using &lt;i&gt;puppet&lt;/i&gt; you may want to set up a yum repository. There are lots of tutorials on how to do this; &lt;a href="http://www.techrepublic.com/blog/opensource/create-your-own-yum-repository/609" target="_blank"&gt;here's one&lt;/a&gt;. &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Packaging Enterprise Edition&lt;/h3&gt;&lt;br /&gt;By default, these targets know how to handle the CE version. 
We can change a few options and tell it to pack the Enterprise Edition, just by setting some extra options pointing to a different location where we unpacked the EE version:&lt;br /&gt;&lt;blockquote&gt;&lt;pre&gt;&lt;span style="font-size: xx-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;code&gt;&lt;br /&gt;&lt;br /&gt;######################################&lt;br /&gt;## EE BASED RPM BUILD&lt;br /&gt;######################################&lt;br /&gt;&lt;br /&gt;rpm.source.webapp = /home/pedro/tex/pentaho/ee/4.8/biserver-ee/tomcat/webapps/&lt;br /&gt;rpm.source.solution = /home/pedro/tex/pentaho/ee/4.8/biserver-ee/pentaho-solutions/&lt;br /&gt;rpm.source.administration-console = /home/pedro/tex/pentaho/ee/4.8/enterprise-console&lt;br /&gt;&lt;br /&gt;&lt;/code&gt;&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/blockquote&gt;</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/8926585120396576887/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/02/cbf-build-rpms-automatically.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/8926585120396576887'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/8926585120396576887'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/02/cbf-build-rpms-automatically.html' title='CBF - Build RPMs automatically'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-6131498777787728605</id><published>2013-02-12T18:37:00.000Z</published><updated>2013-02-12T18:37:38.759Z</updated><title type='text'>New Ctools releases - 13.02.07</title><content type='html'>Here it goes, a new release available:&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;CDF&lt;/h3&gt;&lt;div&gt;&lt;span style="font-size: x-small;"&gt;Major Change: Async behavior and whole new component lifecycle.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: x-small;"&gt;Full changelog:&lt;/span&gt;&lt;/div&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Introduced Async Behavior and new component life cycle&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Initial Storage fetch moved to server side&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Added a separate resource for all js shims&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Refactored CoreComponents into smaller pieces&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Fixed require.js dependencies. 
Had to reintroduce a resource called  CoreComponents so that EE 4.8 still finds it when resolving  dependencies&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Fix styling issues with widgets&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Upgraded jquery sparkline to 2.1&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Fix 'notNullMeasure' function in OlapUtils&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Added Jasmine testing library&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Added support for secondAxis minimum and maximum value (by Andrea Torre)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Add support for arbitrary user data in Views&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Set TableComponent to use async ajax&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;changed deps to newer pentaho version&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Implemented legacyProperty support&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Changed the way "earliestDate" and "latestDate" are retrieving  their value. 
They are now either set by a property or given a default value (no  Dashboards.getParameterValue)&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;&lt;span style="font-size: xx-small;"&gt;&lt;span style="font-size: small;"&gt;&amp;nbsp;CDE&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;&lt;div&gt;&lt;span style="font-size: x-small;"&gt;Major Change: Support for async components and their new lifecycle.&lt;/span&gt;&lt;/div&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Async behavior for components&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Add NewSelector component &lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Added javascript unit tests&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Exposing 4 properties (2 for setting date, 2 for setting offsets) under the date range input component. Removed the "getParameterValue" call for setting up the date, as it looked impractical.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Added method that allows file/folder deletion&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Added support for css url transformation for components from other plugins&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Allowed res to work with symlinks&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;&lt;span style="font-size: xx-small;"&gt;&lt;span style="font-size: small;"&gt;&amp;nbsp;CDA&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Added JSONP support to CDA&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Added binary exporter&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Implemented setMdxDataFactoryBaseConnectionProperties. 
MDX queries, when executed in 4.8, will pass the correct properties that are defined in olap/datasources.xml, namely UseContentChecksum=true;JdbcConnectionUuid=FOOBAR&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Support for blobs encoded as java.sql.Blob&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;CacheActivator: only commit if no exception, rollback otherwise&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Upgraded Hazelcast to version 2.4&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;h3&gt;&lt;span style="font-size: xx-small;"&gt;&lt;span style="font-size: small;"&gt;&amp;nbsp;CDC&lt;/span&gt;&lt;/span&gt;&lt;/h3&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Fixed timeToLive bug&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Updated Hazelcast to 2.4&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Can now properly persist cache contents with stock mondrian&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Added new settings: master, syncCacheOnStart&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: xx-small;"&gt;Fixed CDC after async and component lifecycle changes &lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/6131498777787728605/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/02/new-ctools-releases-130207.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/6131498777787728605'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/6644329693530300467/posts/default/6131498777787728605'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/02/new-ctools-releases-130207.html' title='New Ctools releases - 13.02.07'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-8685421564320026488</id><published>2013-02-01T20:16:00.000Z</published><updated>2013-02-01T20:16:21.357Z</updated><title type='text'>Pentaho Bigdata - 101 to a bit more</title><content type='html'>&lt;div class="renderbox"&gt;&lt;h1&gt;&lt;/h1&gt;&lt;h1&gt;Motivation&lt;/h1&gt;Do I need one? Haven't you read the news? It's bigdata, this will make us all rich! &lt;i&gt;&lt;/i&gt;&lt;/div&gt;&lt;br /&gt;I'm not one of the voices that claim that this is the best invention  since the wheel. There's a lot of hype out there regarding bigdata, and  vendors desperately seeking business opportunities on top of it  appear every day.&lt;br /&gt;&lt;br /&gt;Having said all that, hadoop is an amazing framework. It's one more  tool at hand to be chosen for a specific set of problems. It's getting  easier and easier for companies to adopt it, and it really increases the  power to do certain types of &lt;i&gt;ad-hoc&lt;/i&gt; analysis that would be impossible otherwise. 
But there are some things that should be said out loud:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Building and maintaining a hadoop cluster is expensive&lt;/li&gt;&lt;li&gt;You need a team that knows what it's doing - more than the cost of &lt;i&gt;big data&lt;/i&gt; is the cost of &lt;i&gt;bad data&lt;/i&gt;&lt;/li&gt;&lt;li&gt;Not everyone needs it&lt;/li&gt;&lt;/ul&gt;Before someone jumps at me: cost is obviously relative. We're talking about&amp;nbsp; &lt;br /&gt;&lt;h1&gt;&lt;i&gt;Hadoop&lt;/i&gt; or &lt;i&gt;BigData&lt;/i&gt;?&lt;/h1&gt;There's a huge difference between &lt;a href="http://hadoop.apache.org/" title="Welcome to Apache™ Hadoop®!"&gt;&lt;i&gt;Hadoop&lt;/i&gt;&lt;/a&gt; and &lt;i&gt;Big Data&lt;/i&gt;. Simply put, &lt;i&gt;Hadoop&lt;/i&gt; is a framework that provides reliable shared storage, through &lt;i&gt;HDFS&lt;/i&gt;, and a processing framework, &lt;i&gt;MapReduce&lt;/i&gt;. As we dig deeper there are other pieces of the puzzle that start to appear, but these are the fundamental ones.&lt;br /&gt;&lt;br /&gt;This is just engineering talk. &lt;i&gt;Big Data&lt;/i&gt; is what you do with it. And that makes all the difference. Write a check for a cluster, install hadoop on it and you'll end up with a bunch of noisy machines and 0 added value. The real challenge starts there - what you do with the data.&lt;br /&gt;&lt;h1&gt;The challenge&lt;/h1&gt;At &lt;a href="http://www.mozilla.org/"&gt;Mozilla&lt;/a&gt;, one of the Hadoop clusters the Metrics team uses has about 60 nodes, and it is used to store / process several different data sources, spread between &lt;i&gt;hdfs&lt;/i&gt;, &lt;i&gt;hbase&lt;/i&gt;, &lt;i&gt;hive&lt;/i&gt; and a bunch of foreign-sounding words that meant little to me. 
It was about time &lt;br /&gt;The initial goal sounded relatively easy.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;Analyze a bunch of weblogs that are stored in &lt;i&gt;hdfs&lt;/i&gt;, process them using geo localization and find out how many users per country saw the web pages&lt;/blockquote&gt;&lt;br /&gt;Now... where do I start? I know other engineers from the team that write their own &lt;i&gt;java&lt;/i&gt; code for the map reduce jobs. I'm too old for that. &lt;br /&gt;&lt;br /&gt;I heard that &lt;a href="https://pig.apache.org/" title="Welcome to Apache Pig!"&gt;Pig&lt;/a&gt; would also be an option. The last thing that I need right now is having  to learn yet another technology - unless absolutely necessary.&lt;br /&gt;&lt;br /&gt;I had heard about all the work &lt;a href="http://www.pentaho.com/big-data/"&gt;Pentaho has done with Big Data Analytics&lt;/a&gt; but never quite understood what all that was about. But the idea of being able to use an &lt;a href="http://kettle.pentaho.com/"&gt; extremely powerful ETL tool&lt;/a&gt; that me and my team have been using for ages with very good results is, to say the least, appealing.&lt;br /&gt;&lt;br /&gt;But the first step has nothing to do with it. For me is to answer the  question: What exactly is hadoop and how does it work anyway?&lt;br /&gt;&lt;h1&gt;Hadoop 101&lt;/h1&gt;&lt;a href="http://www.readability.com/m?url=http%3A%2F%2Fwww.ibm.com%2Fdeveloperworks%2Fdata%2Flibrary%2Ftecharticle%2Fdm-1209hadoopbigdata%2F" title="Open Source Big Data for the Impatient, Part 1: Hadoop tutorial: Hello World with Java, Pig, Hive, Flume, Fuse, Oozie, and Sqoop with Informix, DB2, and MySQL — www.ibm.com — Readability"&gt;This link&lt;/a&gt; proved to be a great resource to get me up and running. I have access  to the staging, research and if needed production cluster at Mozilla but  using it as an experimentation ground doesn't make me comfortable at  all. 
So I decided to install it locally and try to get it working.&lt;br /&gt;&lt;h2&gt;Basic concepts: Hadoop&lt;/h2&gt;Hadoop was created by &lt;a href="http://en.wikipedia.org/wiki/Doug_Cutting" title="Doug Cutting"&gt;Doug Cutting&lt;/a&gt; and Michael J. Cafarella and was originally developed to support distribution for the &lt;a href="http://en.wikipedia.org/wiki/Nutch" title="Nutch"&gt;Nutch&lt;/a&gt; search engine project.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://upload.wikimedia.org/wikipedia/en/2/2b/Hadoop_1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://upload.wikimedia.org/wikipedia/en/2/2b/Hadoop_1.png" height="248" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;There are lots of components on hadoop, but the core is divided into 2 main subprojects:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;i&gt;MapReduce&lt;/i&gt; - A framework that schedules and assigns jobs and tasks on the cluster&lt;/li&gt;&lt;li&gt;&lt;i&gt;HDFS&lt;/i&gt; -  A distributed file system that guarantees scalability and reliability&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;There are some important services running on the cluster. 
Mapreduce work is managed by the &lt;i&gt;Job Tracker&lt;/i&gt;, which runs on the master and hands the work over to the different &lt;i&gt;Task Trackers&lt;/i&gt; on the nodes.&lt;br /&gt;&lt;br /&gt;On the data side, the master runs a &lt;i&gt;Name Node&lt;/i&gt; that keeps a reference to every file and block in the file system, and talks with the different &lt;i&gt;Data Nodes&lt;/i&gt; throughout the slaves in the cluster.&lt;br /&gt;&lt;br /&gt;One of the big advantages of mapreduce over the generic concept of &lt;a href="http://en.wikipedia.org/wiki/Grid_computing"&gt;grid computing&lt;/a&gt; is its ability to process the data where it is stored: the planners try as much as possible to reduce bandwidth usage by processing local data.&lt;br /&gt;&lt;br /&gt;There's obviously a lot more around, but I'll stop here for the sake of simplicity.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Installing Hadoop&lt;/h2&gt;As happens with &lt;i&gt;Linux&lt;/i&gt;, even though the main project is driven by &lt;a href="http://hadoop.apache.org/"&gt;&lt;i&gt;Apache&lt;/i&gt;&lt;/a&gt;, there are several different distributions that ensure that all the independent hadoop subprojects are correctly configured and ready to talk to each other.&lt;br /&gt;&lt;br /&gt;The main providers are &lt;a href="http://hortonworks.com/technology/hortonworksdataplatform/" rel="nofollow"&gt;HortonWorks&lt;/a&gt;, &lt;a href="http://www.cloudera.com/products-services/" rel="nofollow"&gt;Cloudera&lt;/a&gt; and &lt;a href="http://www.mapr.com/products" rel="nofollow"&gt;MapR&lt;/a&gt;. 
At Mozilla we use Cloudera's &lt;a href="https://ccp.cloudera.com/display/SUPPORT/CDH+Downloads"&gt;CDH3&lt;/a&gt;, so that's what I chose to install.&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Downloading the Virtual Machine&lt;/h4&gt;&lt;br /&gt;I chose to install a pre-configured virtual machine.&amp;nbsp; I'm not very interested in the small details of configuration, so a one-node cluster is more than enough to get me started.&lt;br /&gt;&lt;br /&gt;Cloudera provides &lt;a href="https://ccp.cloudera.com/display/SUPPORT/CDH+Downloads"&gt;pre-built virtual machines&lt;/a&gt; you can use, in several formats. I use &lt;a href="https://www.virtualbox.org/"&gt;Virtual Box&lt;/a&gt;, so that's the format I picked. &lt;a href="http://www.ibm.com/developerworks/data/library/techarticle/dm-1209hadoopbigdata/"&gt;This excellent post&lt;/a&gt; provides details on how to install it, plus a great overview of hadoop.&lt;br /&gt;&lt;br /&gt;In the end, booting the VM will result in something like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-QHzubmh7MPc/UQqK2X2xdSI/AAAAAAAAAd0/LtHfmD_0b2k/s1600/vm.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-QHzubmh7MPc/UQqK2X2xdSI/AAAAAAAAAd0/LtHfmD_0b2k/s1600/vm.png" height="254" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Almost ready to start playing with your system. There are only a few extra network changes needed to ensure communication between host and client.&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Network configuration changes&lt;/h4&gt;&lt;br /&gt;There are some configuration changes that will prove to be important. I don't want to do everything from within the VM; I also want to be able to connect from my host machine to it, and run kettle connected to it.&lt;br /&gt;&lt;br /&gt;The first step is to configure the network interfaces in Virtual Box. 
For vmware or others the instructions may vary. I defined 2 network adapters, one with &lt;i&gt;Nat&lt;/i&gt; to allow for outside connections and one &lt;i&gt;Host-only&lt;/i&gt; adapter. This will allow a static ip connection between the host and the client:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-Z8kogelVz6M/UQqNOXf0naI/AAAAAAAAAeE/sYxPLrlZZyE/s1600/network.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/-Z8kogelVz6M/UQqNOXf0naI/AAAAAAAAAeE/sYxPLrlZZyE/s1600/network.png" height="244" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If this is correctly configured, you should see an extra interface in your host with ip &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;192.168.56.1&lt;/span&gt;, and your client would have &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;192.168.56.101&lt;/span&gt;. 
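&lt;br /&gt;&lt;br /&gt;For reference, the same two adapters can probably be set up from the command line with &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;VBoxManage&lt;/span&gt; instead of the GUI. This is only a sketch under a few assumptions: the VM name &lt;i&gt;cdh3-vm&lt;/i&gt; is a made-up placeholder, &lt;i&gt;vboxnet0&lt;/i&gt; is assumed to be the name of the host-only network on the host (the first command lists the real ones), and the VM has to be powered off while &lt;i&gt;modifyvm&lt;/i&gt; runs:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;# List the host-only networks defined on the host, with their IP addresses&lt;br /&gt;$ VBoxManage list hostonlyifs&lt;br /&gt;# Adapter 1: NAT, for outside connections (cdh3-vm is a hypothetical VM name)&lt;br /&gt;$ VBoxManage modifyvm "cdh3-vm" --nic1 nat&lt;br /&gt;# Adapter 2: host-only, attached to the assumed vboxnet0 network&lt;br /&gt;$ VBoxManage modifyvm "cdh3-vm" --nic2 hostonly --hostonlyadapter2 vboxnet0&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;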
For convenience, I chose &lt;i&gt;hadoop-pedro&lt;/i&gt; for my machine (lousy name, but my text, my name!), so I changed the following configuration files:&lt;br /&gt;&lt;br /&gt;Client: &lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;i&gt;/etc/sysconfig/network&lt;/i&gt; - Adding &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;HOSTNAME=hadoop-pedro&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;i&gt;/etc/hosts&lt;/i&gt;&amp;nbsp;&lt;/li&gt;&lt;/ul&gt;&lt;blockquote class="tr_bq"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$ cat /etc/hosts&lt;br /&gt;127.0.0.1&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; localhost.localdomain localhost&lt;br /&gt;192.168.56.101&amp;nbsp;&amp;nbsp; hadoop-pedro hadoop-pedro.local&lt;br /&gt;::1&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; localhost6.localdomain6 localhost6&lt;/span&gt;&lt;/blockquote&gt;Host:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;i&gt;/etc/hosts&lt;/i&gt; - Add the following line: &lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$ cat /etc/hosts | grep hadoop-pedro&lt;br /&gt;192.168.56.101&amp;nbsp; hadoop-pedro hadoop-pedro.local &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&amp;nbsp;&lt;/span&gt; This should ensure proper communication between host and VM. 
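&lt;br /&gt;&lt;br /&gt;A minimal way to check this from the host, assuming a Linux or Mac shell where &lt;i&gt;ping&lt;/i&gt; accepts &lt;i&gt;-c&lt;/i&gt; to limit the number of packets sent:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;# Run on the host: both names come from the /etc/hosts line added above&lt;br /&gt;$ ping -c 3 hadoop-pedro&lt;br /&gt;$ ping -c 3 hadoop-pedro.local&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;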
You should be able to &lt;i&gt;ping hadoop-pedro&lt;/i&gt; from the host and get replies.&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Hadoop configuration changes&lt;/h4&gt;CDH's configuration defaults to the local interfaces, and in order to guarantee that it works flawlessly when called from the host, I got better results by changing hadoop's configuration files to attach to the new hostname.&lt;br /&gt;&lt;br /&gt;Hadoop is installed in &lt;i&gt;/usr/lib/hadoop&lt;/i&gt;, and inside there's a &lt;i&gt;conf/&lt;/i&gt; directory that holds the configuration files.&lt;br /&gt;&lt;br /&gt;The default configuration files make hadoop's services listen on 0.0.0.0. I got better results by pointing them to the specific address. So here are the properties I changed:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;i&gt;core-site.xml&lt;/i&gt;: Change &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace; font-size: x-small;"&gt;fs.default.name&lt;/span&gt; to &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace; font-size: x-small;"&gt;hdfs://hadoop-pedro.local:8020&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;i&gt;mapred-site.xml&lt;/i&gt;: Change &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace; font-size: x-small;"&gt;mapred.job.tracker&lt;/span&gt; to &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace; font-size: x-small;"&gt;hadoop-pedro.local:8021&lt;/span&gt; and &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace; font-size: x-small;"&gt;jobtracker.thrift.address&lt;/span&gt; to &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace; font-size: x-small;"&gt;hadoop-pedro.local:9290&lt;/span&gt;. 
Also add the following properties: &lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;mapred.map.child.java.opts&lt;/span&gt;&lt;/span&gt; set to &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;span style="font-size: x-small;"&gt;-Xmx768m&lt;/span&gt;&lt;/span&gt; and &lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;mapred.reduce.child.java.opts&lt;/span&gt;&lt;/span&gt; set to &lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;-Xmx1536m&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;Reboot the VM for all the changes to take effect. Everything should then be ready to go. &lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Knowing our way around it&lt;/h2&gt;&lt;h3&gt;Services&lt;/h3&gt;I feel much more comfortable when I know what happens in the system: what is running, how to restart it, and how to find out what's happening. 
If you want to start the services manually, here's what should be run:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$ /etc/init.d/hadoop-0.20-namenode start&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$ /etc/init.d/hadoop-0.20-secondarynamenode start&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$ /etc/init.d/hadoop-0.20-datanode start&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$ /etc/init.d/hadoop-0.20-jobtracker start&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$ /etc/init.d/hadoop-0.20-tasktracker start&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;You'll probably notice this matches the description of the different hadoop components given earlier. There are a few others, &lt;a href="http://zookeeper.apache.org/"&gt;&lt;i&gt;zookeeper&lt;/i&gt;&lt;/a&gt; and &lt;a href="http://hbase.apache.org/"&gt;&lt;i&gt;hbase&lt;/i&gt;&lt;/a&gt; for example, that we won't need for now.&lt;br /&gt;&lt;br /&gt;If you want to stop the services... just run the same scripts the opposite way, with &lt;i&gt;stop&lt;/i&gt; instead of &lt;i&gt;start&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;The logs are under &lt;i&gt;/var/log/hadoop/&lt;/i&gt;. To know what's going on, simply follow them:&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$ tail -F /var/log/hadoop/*&lt;/span&gt;&lt;/blockquote&gt;&lt;h3&gt;&amp;nbsp;&lt;/h3&gt;&lt;h3&gt;Command line utils&lt;/h3&gt;Hadoop comes with a command line executable to interact with the system. 
You'll find the command &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;hadoop&lt;/span&gt; on path (or under the &lt;i&gt;bin/&lt;/i&gt; directory of the hadoop distribution). Execute it without arguments to see how it works. The ones I use more often are &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;hadoop fs&lt;/span&gt; to interact with hdfs and more infrequently &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;hadoop job&lt;/span&gt; to query job execution.&lt;br /&gt;&lt;h3&gt;&amp;nbsp;&lt;/h3&gt;&lt;h3&gt;Web utility ports&lt;/h3&gt;There are some important ports to look for:&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Namenode / DFS status&lt;i&gt;: http://hadoop-pedro:50070/&lt;/i&gt;&lt;/h4&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-2rn8TOjuC0s/UQq0xax4vvI/AAAAAAAAAeY/L7y8uJd2PHg/s1600/nodename.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-2rn8TOjuC0s/UQq0xax4vvI/AAAAAAAAAeY/L7y8uJd2PHg/s1600/nodename.png" height="196" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Information about the status of our filesystem cluster&lt;br /&gt;&lt;h4&gt;Job Tracker&lt;i&gt;: http://hadoop-pedro:50030/&lt;/i&gt;&lt;/h4&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-IUuOgKBA4fA/UQq0xBBcQ0I/AAAAAAAAAeU/z0kc6FCmOgM/s1600/jobtracker.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-IUuOgKBA4fA/UQq0xBBcQ0I/AAAAAAAAAeU/z0kc6FCmOgM/s1600/jobtracker.png" height="196" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;One of the most useful ones. 
Displays information about running jobs and it's where we can inspect the output of the individual tasks running on the nodes.&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Task Tracker&lt;i&gt;: http://hadoop-pedro:50060/&lt;/i&gt;&lt;/h4&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-Q6Vv-unib-c/UQq0xFOzszI/AAAAAAAAAec/VoFdYGuA4_Y/s1600/tasktracker.png" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-Q6Vv-unib-c/UQq0xFOzszI/AAAAAAAAAec/VoFdYGuA4_Y/s1600/tasktracker.png" height="196" width="320" /&gt;&amp;nbsp;&lt;/a&gt; &lt;/div&gt;&lt;h4&gt;&lt;/h4&gt;Displays the status of individual tasks.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Hdfs&lt;/h3&gt;It's fundamental to know how to interact with hdfs. I use the command line &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$ hadoop fs&lt;/span&gt; . Once again, run without arguments to know what are the different options.&lt;br /&gt;&lt;br /&gt;We can either run locally without specifying the hdfs server or remotely specifying the full &lt;a href="http://commons.apache.org/vfs/"&gt;VFS&lt;/a&gt; path:&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;$ &lt;/span&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;hadoop fs -ls /&lt;/span&gt; &lt;/blockquote&gt;&lt;blockquote class="tr_bq"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;$ hadoop fs -ls hdfs://hadoop-pedro:8020/&lt;/span&gt;&lt;/blockquote&gt;If not present yet, I recommend creating a home directory for your user on hadoop, on my case... surprise... 
&lt;i&gt;pedro&lt;/i&gt;.&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;hadoop fs -mkdir hdfs://hadoop-pedro:8020/user/pedro&lt;/span&gt;&lt;/blockquote&gt;&lt;br /&gt;The most commonly used commands are:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$ hadoop fs -ls :&lt;/span&gt; List files&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$ hadoop fs -mkdir :&lt;/span&gt; Make directory&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$ hadoop fs -put :&lt;/span&gt; Put local files into hdfs&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$ hadoop fs -get :&lt;/span&gt; Get files from hdfs&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$ hadoop fs -cat :&lt;/span&gt; Show the contents of a file in hdfs &lt;/li&gt;&lt;li&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$ hadoop fs -rm :&lt;/span&gt; Remove a file&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$ hadoop fs -rmr :&lt;/span&gt; Recursively remove a directory&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;h1&gt;Pentaho Bigdata&lt;/h1&gt;Once I started to get familiarized with the hadoop infrastructure and  starting to look at kettle, I was surprised about the level of  documentation of &lt;a href="http://wiki.pentaho.com/display/BAD/Pentaho+Big+Data+Community+Home" title="Pentaho Big Data Community Home - Pentaho Big Data - Pentaho Wiki"&gt;Pentaho's Big Data plugin&lt;/a&gt;. This is not an easy concept. It's hard to use, hard to debug, lots of stuff to know. 
So having a &lt;i&gt;Wiki&lt;/i&gt; with a good set of documentation aimed more at concrete examples is very good.&lt;br /&gt;&lt;br /&gt;My first question was obviously &lt;i&gt;"How do I start? What do I download?"&lt;/i&gt;. The &lt;a href="http://wiki.pentaho.com/display/BAD/Pentaho+Big+Data+Community+Home" title="Pentaho Big Data Community Home - Pentaho Big Data - Pentaho Wiki"&gt;wiki&lt;/a&gt; suggests downloading a &lt;a href="http://wiki.pentaho.com/display/BAD/Downloads" title="Downloads - Pentaho Big Data - Pentaho Wiki"&gt;stable kettle version&lt;/a&gt;, and you'd get up and running in no time. But that would be too easy, and we wouldn't understand what was happening under the hood.&lt;br /&gt;&lt;h2&gt;Compiling kettle&lt;/h2&gt;I always compile kettle from source. Everyone does that, right? :)&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;$ svn co svn://source.pentaho.org/svnkettleroot/Kettle/branches/4.4.1 kettle-4.4.1&lt;br /&gt;$ cd kettle-4.4.1&lt;br /&gt;$ ant clean distrib&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;Please note that I'm using the 4.4.1 branch. This is always changing. I don't yet feel confident about using 5.0, so pay attention to the one you should be using. &lt;br /&gt;&lt;br /&gt;In the end, we'll get a ready-to-run kettle in the &lt;i&gt;distrib&lt;/i&gt; directory. This doesn't have the bigdata plugin.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Bigdata plugin&lt;/h2&gt;&lt;h3&gt;Compiling&lt;/h3&gt;&lt;br /&gt;The next step is to compile the &lt;i&gt;bigdata&lt;/i&gt; plugin. Fortunately this one's already on &lt;i&gt;git&lt;/i&gt;.&amp;nbsp; &lt;br /&gt;&lt;pre&gt;&lt;code&gt;$ git clone https://github.com/pentaho/big-data-plugin.git&lt;br /&gt;$ cd big-data-plugin&lt;br /&gt;$ ant&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;There's an important detail that made me lose a lot of time and is not obvious at all. 
I'll describe the details later, but the bigdata plugin prepares a zip of a bunch of jars and dependencies to copy to hadoop, and that's a static bundle. By default, the build points to &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;TRUNK-SNAPSHOT&lt;/span&gt;, which means it will download the latest version of kettle, possibly causing incompatibilities with the kettle version we chose before. &lt;br /&gt;&lt;br /&gt;You can edit the file &lt;i&gt;build.properties&lt;/i&gt; and change the following line:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;dependency.kettle.revision=4.4.0-stable&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;I'm not aware of any artifact that points to a continuous build of 4.x, so I chose the closest version available.&lt;br /&gt;&lt;br /&gt;If you compile again you'll get a plugin version ready to use under the &lt;i&gt;dist&lt;/i&gt; directory. Unpack it in the plugins directory. &lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;$ tar -xzf dist/pentaho-big-data-plugin-TRUNK-SNAPSHOT.tar.gz -C ~/.kettle/plugins&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;h3&gt;Configuring&lt;/h3&gt;After installing the plugin, we need to configure it properly. 
There's an important file that needs to be changed.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;$ vim pentaho-big-data-plugin/plugin.properties&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;You need to change the following properties:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;active.hadoop.configuration = cdh3u4&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;pmr.kettle.dfs.install.dir = /user/&amp;lt;username&amp;gt;/pentaho/mapreduce&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;pmr.kettle.additional.plugins = steps/maxmind&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;Like I mentioned before, there are several hadoop distributions and each of them has made the modifications it considered necessary in order to ensure everything works well. This is a good thing. The bad thing is that 3rd party integrators have to comply with all the variants.&lt;br /&gt;&lt;br /&gt;Pentaho developers took a great approach to minimize this problem, at least to a certain extent. They developed a &lt;a href="http://en.wikipedia.org/wiki/Shim_%28computing%29"&gt;shim&lt;/a&gt; around the common hadoop code (if you like to mess with source code, you'll find it under the package &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;org.pentaho.hadoop.shim.common&lt;/span&gt;) to comply with the variants.&lt;br /&gt;&lt;br /&gt;Like I mentioned, I'm using Cloudera's &lt;i&gt;CDH3u4&lt;/i&gt;. Luckily it's one of the supported versions. 
You can see the possible values by looking at the directory &lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;pentaho-big-data-plugin/hadoop-configurations/&lt;/span&gt;&lt;/span&gt;. Currently the supported versions are:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;cdh3u4&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;cdh4&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;hadoop-20&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;mapr&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;I'm sure this list will increase with time and relevance. &lt;br /&gt;&lt;br /&gt;The second fundamental property is &lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;pmr.kettle.dfs.install.dir&lt;/span&gt;&lt;/span&gt;. This is where kettle will be copied to in hdfs, so that mapreduce is able to find all the dependencies of our jobs/transformations. Due to the way permissions are set up on Mozilla's cluster, I have to use my remote username. So I pointed it to my home dir in &lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;/user/pedroalves/pentaho/mapreduce&lt;/span&gt;&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;The third option is a fundamental one in my case. The bundle file that gets copied to hdfs and run on hadoop has only the core transformations and steps (and the bigdata plugin, obviously). 
On my case I wanted to add another one. The format is relative to the kettle directory, and my geoip plugin is under &lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;steps/maxmind&lt;/span&gt;&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;Everything is ready to start using. If we now launch spoon, we should see the bigdata steps:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-HlDPbT5-a3Q/UQuwzioAKKI/AAAAAAAAAe0/IfjVEZmstOE/s1600/bigdataplugin.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/-HlDPbT5-a3Q/UQuwzioAKKI/AAAAAAAAAe0/IfjVEZmstOE/s1600/bigdataplugin.png" height="205" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;Running mapreduce tasks&lt;/h2&gt;&lt;h3&gt;Setting up the environment&lt;/h3&gt;Back to my initial challenge: Parse and geolocate weblogs. On my case, I wanted to know how many and which &lt;a href="https://wiki.mozilla.org/Websites/Snippets"&gt;snippets&lt;/a&gt; were seen on a daily basis by country. Pentaho bigdata wiki has a &lt;a href="http://wiki.pentaho.com/display/BAD/Using+Pentaho+MapReduce+to+Parse+Weblog+Data"&gt;very detailed example&lt;/a&gt; on how to achieve the majority of this, but lacked the geolocation step. &lt;br /&gt;&lt;br /&gt;The files are stored in the main cluster, in a hdfs directory. 
I started by copying a sample of those files to my local vm, simulating the real environment:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;$ hadoop fs -get hdfs://&amp;lt;cluster&amp;gt;:8020/www_weblogs/dir/part-r-00000.gz .&lt;br /&gt;$ hadoop fs -mkdir hdfs://hadoop-pedro:8020/www_weblogs/dir/&lt;br /&gt;$ hadoop fs -put part-r-00000.gz hdfs://hadoop-pedro:8020/www_weblogs/dir/&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;h3&gt;Preparing the job and the transformation&lt;/h3&gt;My job is pretty simple, almost a direct call to the pentaho mapreduce step:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-b8NbVE_WGCo/UQvrGfNaK8I/AAAAAAAAAfE/mFoa9LjEzQY/s1600/job.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-b8NbVE_WGCo/UQvrGfNaK8I/AAAAAAAAAfE/mFoa9LjEzQY/s1600/job.png" height="199" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;We need to fill in some information related to this step:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Cluster information&lt;/li&gt;&lt;li&gt;Map transformation&lt;/li&gt;&lt;li&gt;Reduce transformation (if needed)&lt;/li&gt;&lt;li&gt;Combiner transformation (if needed)&lt;/li&gt;&lt;li&gt;Information about input and output&lt;/li&gt;&lt;/ul&gt;As best practices recommend, I used variables as much as possible. 
Here are the ones that I'm using; they should be self-explanatory:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;span style="font-size: x-small;"&gt;# local&lt;br /&gt;SNIPPET_HDFS_HOST=hadoop-pedro.local&lt;br /&gt;SNIPPET_HDFS_PORT=8020&lt;br /&gt;SNIPPET_JT_HOST=hadoop-pedro.local&lt;br /&gt;SNIPPET_JT_PORT=8021&lt;br /&gt;SNIPPET_HDFS_INPUT_PATH=/www_weblogs/snippets-stats.mozilla.org/dir/&lt;br /&gt;SNIPPET_HDFS_OUTPUT_PATH=/user/pedro/tests/snippets&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The input and output formats are &lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;org.apache.hadoop.mapred.TextInputFormat&lt;/span&gt;&lt;/span&gt; and &lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;org.apache.hadoop.mapred.TextOutputFormat&lt;/span&gt;&lt;/span&gt;. The &lt;a href="http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/InputFormat.html"&gt;hadoop documentation&lt;/a&gt; lists the possible input formats to put here, and you can always write your own. 
Same for &lt;a href="http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/OutputFormat.html"&gt;output formats&lt;/a&gt;. &lt;br /&gt;&lt;br /&gt;This is my transformation, ready to be executed:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-XjqGfekn4ts/UQvv2Hus_pI/AAAAAAAAAfY/d0UWOI2fafg/s1600/transformation1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-XjqGfekn4ts/UQvv2Hus_pI/AAAAAAAAAfY/d0UWOI2fafg/s1600/transformation1.png" height="200" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Running the job &lt;/h3&gt;When I run the job, I immediately see in the logs the following lines:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;code&gt;&lt;span style="font-size: x-small;"&gt;&lt;code&gt;&lt;code&gt;&lt;span style="font-size: x-small;"&gt;&lt;code&gt;&lt;code&gt;&lt;/code&gt;&lt;/code&gt;&lt;/span&gt;&lt;/code&gt;&lt;/code&gt;&lt;/span&gt;&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;&lt;pre&gt;&lt;code&gt;&lt;code&gt;&lt;span style="font-size: x-small;"&gt;&lt;code&gt;&lt;code&gt;&lt;span style="font-size: x-small;"&gt;&lt;code&gt;&lt;code&gt;&lt;span style="font-size: x-small;"&gt;INFO  01-02 16:49:52,665 - Spoon - Starting job...&lt;br /&gt;INFO  01-02 16:49:52,666 - test_mapreduce_job - Start of job execution&lt;br /&gt;INFO  01-02 16:49:52,668 - test_mapreduce_job - Starting entry [Pentaho MapReduce]&lt;br /&gt;INFO  01-02 16:49:52,708 - test_mapper_with_geoip - Dispatching started for transformation [test_mapper_with_geoip]&lt;br /&gt;INFO  01-02 16:49:52,817 - test_reducer - Dispatching started for transformation [test_reducer]&lt;br /&gt;INFO  01-02 16:49:52,836 - Pentaho MapReduce - Cleaning output path: hdfs://hadoop-pedro.local:8020/user/pedro/tests/snippets&lt;br /&gt;INFO  01-02 16:49:52,841 - Pentaho MapReduce - Installing Kettle to 
/user/pedroalves/pentaho/mapreduce/4.4.0-TRUNK-SNAPSHOT-cdh3u4&lt;/span&gt;&lt;/code&gt;&lt;/code&gt;&lt;/span&gt;&lt;/code&gt;&lt;/code&gt;&lt;/span&gt;&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This looks good. However, a few moments later (or minutes, depending on where the cluster is), I get a few less motivating messages:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;code&gt;&lt;span style="font-size: x-small;"&gt;&lt;code&gt;&lt;code&gt;&lt;span style="font-size: x-small;"&gt;&lt;code&gt;&lt;code&gt;&lt;span style="font-size: x-small;"&gt;INFO  01-02 16:50:01,687 - Total input paths to process : 1&lt;br /&gt;INFO  01-02 16:50:01,843 - Pentaho MapReduce - Setup Complete: 0.0 Mapper Completion: 0.0 Reducer Completion: 0.0&lt;br /&gt;INFO  01-02 16:50:06,844 - Pentaho MapReduce - Setup Complete: 0.0 Mapper Completion: 0.0 Reducer Completion: 0.0&lt;br /&gt;INFO  01-02 16:50:11,857 - Pentaho MapReduce - Setup Complete: 100.0 Mapper Completion: 0.0 Reducer Completion: 0.0&lt;br /&gt;INFO  01-02 16:50:16,861 - Pentaho MapReduce - Setup Complete: 100.0 Mapper Completion: 0.0 Reducer Completion: 0.0&lt;br /&gt;INFO  01-02 16:50:21,878 - Pentaho MapReduce - Setup Complete: 100.0 Mapper Completion: 0.0 Reducer Completion: 0.0&lt;br /&gt;ERROR 01-02 16:50:21,920 - Pentaho MapReduce - [FAILED] -- Task: attempt_201301301222_0006_m_000000_0  Attempt: attempt_201301301222_0006_m_000000_0  Event: 1&lt;br /&gt;java.io.IOException: org.pentaho.di.core.exception.KettleException:&lt;br /&gt;We failed to initialize at least one step.  
Execution can not begin!&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;        at org.pentaho.hadoop.mapreduce.PentahoMapRunnable.run(PentahoMapRunnable.java:467)&lt;br /&gt;        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)&lt;br /&gt;        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)&lt;br /&gt;        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)&lt;br /&gt;        at java.security.AccessController.doPrivileged(Native Method)&lt;br /&gt;        at javax.security.auth.Subject.doAs(Subject.java:396)&lt;br /&gt;        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)&lt;br /&gt;        at org.apache.hadoop.mapred.Child.main(Child.java:264)&lt;br /&gt;Caused by: org.pentaho.di.core.exception.KettleException:&lt;br /&gt;We failed to initialize at least one step.  Execution can not begin!&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;        at org.pentaho.di.trans.Trans.prepareExecution(Trans.java:932)&lt;br /&gt;        at org.pentaho.hadoop.mapreduce.PentahoMapRunnable.run(PentahoMapRunnable.java:354)&lt;br /&gt;        ... 7 more&lt;br /&gt;&lt;/span&gt;&lt;/code&gt;&lt;/code&gt;&lt;/span&gt;&lt;/code&gt;&lt;/code&gt;&lt;/span&gt;&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;And it loops until I stop the job. From the log messages alone I had absolutely no idea what was going on.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Under the hood&lt;/h3&gt;I had no option but to dig deeper into what happens under the hood. To do that I had to go to the source of the information... &lt;a href="https://github.com/pentaho/big-data-plugin/blob/master/src/org/pentaho/di/job/entries/hadooptransjobexecutor/JobEntryHadoopTransJobExecutor.java"&gt;literally&lt;/a&gt;. &lt;br /&gt;&lt;br /&gt;The approach is actually pretty simple, and follows the instructions in practically every hadoop book, but tweaked so that we can execute transformations without the hassle of writing pure java code. 
Here's the sequence:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Detect the shim we chose. This guarantees later on that the specifics of each distribution are respected&lt;/li&gt;&lt;li&gt;The mapreduce step is processed to get:&lt;/li&gt;&lt;ol&gt;&lt;li&gt;The configuration for the mapper&lt;/li&gt;&lt;li&gt;Configurations for the combiner&lt;/li&gt;&lt;li&gt;Configurations for the reducer&lt;/li&gt;&lt;li&gt;Input and Output formats&lt;/li&gt;&lt;li&gt;Cluster information&lt;/li&gt;&lt;li&gt;Input paths&lt;/li&gt;&lt;li&gt;Output paths&lt;/li&gt;&lt;li&gt;User defined configurations&lt;/li&gt;&lt;li&gt;Number of map and reduce tasks&lt;/li&gt;&lt;/ol&gt;&lt;li&gt;Our set of kettle variables is passed to the hadoop configuration, ensuring the whole environment stays the same&lt;/li&gt;&lt;li&gt;The output path is deleted, if that was the chosen option&lt;/li&gt;&lt;li&gt;Bigdata plugin properties are read to determine the kettle installation directory. This depends on the kettle version, so a single cluster supports the usage of different versions at the same time&lt;/li&gt;&lt;li&gt;It checks whether kettle is already installed in hdfs. It does that by seeing if the chosen hdfs directory exists (in my case it evaluated to: &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;span style="font-size: x-small;"&gt;/user/pedroalves/pentaho/mapreduce/4.4.0-TRUNK-SNAPSHOT-cdh3u4&lt;/span&gt;&lt;/span&gt; ) and if it has the subdirectories &lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;lib&lt;/span&gt;&lt;/span&gt; and &lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;plugins&lt;/span&gt;&lt;/span&gt;. 
This is bound to change in the future, as it is clearly inefficient and unable to detect changes to the content of those directories&lt;/li&gt;&lt;li&gt;If not, the kettle archive (&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;pentaho-big-data-plugin/pentaho-mapreduce-libraries.zip&lt;/span&gt;&lt;/span&gt;), the bigdata plugin and the extra plugins we specified are copied there&lt;/li&gt;&lt;li&gt;Everything is registered in hadoop's &lt;a href="http://hadoop.apache.org/docs/current/api/org/apache/hadoop/filecache/DistributedCache.html"&gt;DistributedCache&lt;/a&gt;, for local file access and classpath registration&lt;/li&gt;&lt;li&gt;The job is finally submitted for execution&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;h3&gt;Debugging the transformation&lt;/h3&gt;&lt;br /&gt;Once the job is submitted, we will be able to track its execution in hadoop's &lt;i&gt;Job Tracker&lt;/i&gt; at &lt;i&gt;http://hadoop-pedro:50030/&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-RdecFlyzS9Y/UQv8WB962cI/AAAAAAAAAfs/Gu2N_ARYunE/s1600/jobRunning.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-RdecFlyzS9Y/UQv8WB962cI/AAAAAAAAAfs/Gu2N_ARYunE/s1600/jobRunning.png" height="258" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If you follow the link to the running job, you'll be able to get details on the specifics of the job configuration and the specified tasks. 
In my case, following the link on the map task I'm able to see the exception thrown by the mapper transformation&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-lIXb8i_Ka-A/UQv9KBy6HqI/AAAAAAAAAf0/OFDLgJwyiO8/s1600/taskdetails1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-lIXb8i_Ka-A/UQv9KBy6HqI/AAAAAAAAAf0/OFDLgJwyiO8/s1600/taskdetails1.png" height="258" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;If we click on one of the tasks, we'll be able to see all the task attempts that have been made, and individually access the task logs. And there is the very familiar kettle output, with a line that clearly states what's going on:&lt;br /&gt;&amp;nbsp;&lt;pre&gt;&lt;code&gt;&lt;span style="font-size: x-small;"&gt;ERROR 31-01 03:10:07,458 - Lookup Country - Error initializing max mind database file location '/usr/local/share/GeoIP/GeoIPCity.dat'&lt;br /&gt;ERROR 31-01 03:10:07,458 - Lookup Country - org.pentaho.di.core.exception.KettleStepException: &lt;br /&gt;Unable to set up MaxMind database '/usr/local/share/GeoIP/GeoIPCity.dat'&lt;br /&gt;/usr/local/share/GeoIP/GeoIPCity.dat (No such file or directory)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;I will spare you all the pain that we had to go through to fix this. 
This seems simple but it's not; it was a very, very hard task to ensure that the &lt;i&gt;.dat&lt;/i&gt; files were available on all the nodes. In the end, &lt;a href="https://twitter.com/mattcasters/status/296282749348745217"&gt;Matt Casters&lt;/a&gt; and I completely rewrote the Maxmind plugin step, &lt;a href="https://github.com/mattcasters/MaxMindGeoIPLookup"&gt;which is now also on github&lt;/a&gt;, to support VFS.&lt;br /&gt;&lt;br /&gt;Pro tip: everything happened to me: apparent thread locks, the system totally hanging with 100% cpu usage and no log output anywhere, all of which I eventually traced down to memory usage. One trick that I found very useful was to send a &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;span style="font-size: x-small;"&gt;QUIT&lt;/span&gt;&lt;/span&gt; signal to the task process (with &lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;kill -QUIT &amp;lt;pid&amp;gt;&lt;/span&gt;&lt;/span&gt;). Despite the scary name, this will cause the &lt;i&gt;JVM&lt;/i&gt; to do a thread dump, allowing us to spy on what it's doing. This tip works for any java program.&lt;br /&gt;&lt;br /&gt;After several days changing the plugin and debugging the origin of the problem, I finally discovered that by default mapreduce tasks run with a maximum memory of &lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;-Xmx200m&lt;/span&gt;&lt;/span&gt;. Since I was using the city level geolocation, that value was clearly insufficient to run the transformation, which ran into OOM/GC issues that didn't happen when I used geolocation only at country level. So do yourself a favor - increase the available memory on the cluster.&lt;br /&gt;&lt;br /&gt;Pentaho still needs to improve the debugging abilities of the pentaho bigdata plugin. 
Like I wrote &lt;a href="http://pedroalves-bi.blogspot.pt/2013/01/debugging-kettle-tasks-in-mapreduce.html"&gt;on my last post&lt;/a&gt;, I ended up developing a change to the WriteToLog step to allow displaying only the top N rows of the dataset. It helps a bit until they allow us to do &lt;a href="http://jira.pentaho.com/browse/PDI-9148"&gt;proper debugging from within spoon&lt;/a&gt;, like with any regular transformation.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-iGQWdiv24eI/UQwGAF53XjI/AAAAAAAAAgI/U1hxMcWxvIo/s1600/transformation2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/-iGQWdiv24eI/UQwGAF53XjI/AAAAAAAAAgI/U1hxMcWxvIo/s1600/transformation2.png" height="199" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;After all the changes to the maxmind step and increasing the cluster memory, I ended up copying the GeoIP files to my hdfs user directory and specifying their location using the following variables (the step also supports variable substitution - thanks Matt!):&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;span style="font-size: x-small;"&gt;maxmind.geoip.path = hdfs://hadoop-pedro:8020/user/pedro/geoip/GeoIP.dat&lt;br /&gt;maxmind.geoipcity.path = hdfs://hadoop-pedro:8020/user/pedro/geoip/GeoIPCity.dat&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Running the transformation again results in a successful run of both map and reduce tasks!&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a 
href="http://2.bp.blogspot.com/-e74dZ94N0BQ/UQwHomHHxII/AAAAAAAAAgQ/WSF7wAKAY54/s1600/jobtrackerSucces.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-e74dZ94N0BQ/UQwHomHHxII/AAAAAAAAAgQ/WSF7wAKAY54/s1600/jobtrackerSucces.png" height="259" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;And we have access to the output:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;span style="font-size: x-small;"&gt;$ hadoop fs -cat hdfs://hadoop-pedro:8020/user/pedro/tests/snippets/part-00000 | head -n 40&lt;br /&gt;13/Aug/2012|Afghanistan|1234|31&lt;br /&gt;13/Aug/2012|Afghanistan|2345|7&lt;br /&gt;13/Aug/2012|Aland Islands|3456|16&lt;br /&gt;13/Aug/2012|Aland Islands|4567|1&lt;br /&gt;13/Aug/2012|Albania|5678|7&lt;br /&gt;13/Aug/2012|Albania|7890|5&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;pre&gt;&lt;code&gt;&lt;span style="font-size: x-small;"&gt;# get all the files in case there were multiple reducers running&lt;br /&gt;$ hadoop fs -getmerge hdfs://hadoop-pedro:8020/user/pedro/tests/snippets/ result.txt&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;h2&gt;Final Remarks and Credits&lt;/h2&gt;&lt;br /&gt;It was a very tough week, but absolutely fundamental for me to understand how things work. Hadoop is an amazing framework, and being able to take full advantage of kettle to run our map reduce analysis is a huge bonus. There are still a lot of user experience improvements to be made around these steps, but considering the alternative is to write java code manually or learn other new languages, this is a great start.&lt;br /&gt;&lt;br /&gt;This blog post was never meant to be a full pentaho bigdata tutorial. 
What I hope is that from this point on, understanding in detail how things work and what happens when we press the "run" button will allow me to do further development much faster, since I know exactly where to look.&lt;br /&gt;&lt;br /&gt;I also got the chance to understand the very basics of how hadoop works, and to learn what each of the components does. The next step will be digging into &lt;i&gt;hbase&lt;/i&gt; and &lt;i&gt;hive&lt;/i&gt;. &lt;br /&gt;&lt;br /&gt;I was also a bit suspicious of the performance and overhead of executing kettle transformations on hadoop, and how it would compare with &lt;a href="http://pig.apache.org/"&gt;pig&lt;/a&gt;. Having seen the code and how lightweight the wrapper around kettle is, I have no doubt that using kettle instead of learning new stuff or approaches is indeed an astonishingly efficient way to run map reduce jobs.&lt;br /&gt;&lt;br /&gt;I need to credit a bunch of people who helped me throughout this last week: Mark Reid, Xavier Stevens and Daniel Einspanjer from Mozilla, Doug Moran, Matt Casters and Matt Burgess from Pentaho and Maria Roldan from &lt;a href="http://www.webdetails.pt/"&gt;webdetails&lt;/a&gt;. 
I'm aware I was a royal PITA the last few days :)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/8685421564320026488/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/02/pentaho-bigdata-101-to-bit-more.html#comment-form' title='13 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/8685421564320026488'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/8685421564320026488'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/02/pentaho-bigdata-101-to-bit-more.html' title='Pentaho Bigdata - 101 to a bit more'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-QHzubmh7MPc/UQqK2X2xdSI/AAAAAAAAAd0/LtHfmD_0b2k/s72-c/vm.png' height='72' width='72'/><thr:total>13</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-1054695403821008050</id><published>2013-01-23T15:48:00.000Z</published><updated>2013-01-23T16:57:17.695Z</updated><title type='text'>Debugging kettle tasks in MapReduce - Sane WriteToLog</title><content type='html'>Finally started to play with &lt;a href="http://wiki.pentaho.com/display/BAD/Pentaho+Big+Data+Community+Home"&gt;bigdata and pentaho&lt;/a&gt;. On my specific case, &lt;a href="https://ccp.cloudera.com/display/SUPPORT/CDH3u4+Downloadable+Tarballs"&gt;Cloudera CDH3u4&lt;/a&gt;. 
At &lt;a href="https://www.mozilla.org/"&gt;Mozilla&lt;/a&gt; we have a few clusters of over 80 machines that we're using to back up a bunch of services.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Debugging mapreduce tasks&lt;/h3&gt;&lt;br /&gt;It took me a while to get my head around the concepts of how kettle integrates with the mapreduce tasks. When I did, the first thing I noticed was how hard it is to know what's happening. Until &lt;a href="http://www.ibridge.be/"&gt;Matt Casters&lt;/a&gt; and &lt;a href="http://kettle.pentaho.com/"&gt;friends&lt;/a&gt; get the chance to implement &lt;a href="http://jira.pentaho.com/browse/PDI-9148"&gt;PDI-9148&lt;/a&gt;, we need to do things manually - as in inspecting logs, etc.&lt;br /&gt;&lt;br /&gt;My first approach was writing to text files. I tested direct output to &lt;a href="http://hadoop.apache.org/docs/hdfs/current/hdfs_design.html"&gt;hdfs&lt;/a&gt;, but for some reason it didn't work. Writing directly to the local file system means that the output will be spread across all the cluster nodes. This approach generally sucks.&lt;br /&gt;&lt;br /&gt;I also thought about using some hand-made logic in a javascript step, but then looked at the &lt;a href="http://wiki.pentaho.com/display/EAI/Write+to+log"&gt;&lt;i&gt;WriteToLog&lt;/i&gt;&lt;/a&gt; step. This step generally works, but it has one great flaw: there is no way to limit its output. 
If we have millions of rows, we'll have a huge log generated - and that's not good.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;An improved &lt;i&gt;Write To Log &lt;/i&gt;step&lt;/h3&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-HOGON9lUJqk/UQAB62qkdBI/AAAAAAAAAdU/-90ySh9E_48/s1600/writeToLog.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-HOGON9lUJqk/UQAB62qkdBI/AAAAAAAAAdU/-90ySh9E_48/s1600/writeToLog.png" height="189" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;i&gt;If it's not there, just do it yourself, the code is open&lt;/i&gt;. So I did. I added the ability to specify a limit on the output of the step. This is very useful for inspecting how the dataset looks inside a map or reduce task. Once I deployed this change to my cluster, this is what my tasktracker log looks like (I ran this with a previous writeToLog version and ended up with a crashed browser and almost half a gigabyte of log files). This shows the first 5 lines of our dataset, with their key and value:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-_ubwLn8jxHc/UQADVoCduVI/AAAAAAAAAdk/97m52SlAj74/s1600/writeToLog2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-_ubwLn8jxHc/UQADVoCduVI/AAAAAAAAAdk/97m52SlAj74/s1600/writeToLog2.png" height="247" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;I'll work with the kettle team to get this into the main code line; hopefully it will be in 4.4.1 and 5.0. 
This is &lt;a href="http://jira.pentaho.com/browse/PDI-9195"&gt;PDI-9195 &lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/1054695403821008050/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/01/debugging-kettle-tasks-in-mapreduce.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/1054695403821008050'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/1054695403821008050'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/01/debugging-kettle-tasks-in-mapreduce.html' title='Debugging kettle tasks in MapReduce - Sane WriteToLog'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-HOGON9lUJqk/UQAB62qkdBI/AAAAAAAAAdU/-90ySh9E_48/s72-c/writeToLog.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-6257078989156960788</id><published>2013-01-08T13:06:00.000Z</published><updated>2013-01-08T15:07:08.970Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='ctools'/><title type='text'>CDE / CDF components and templates in Pentaho solution directory</title><content type='html'>Someone brought to my attention that this is a very useful, though undocumented feature. 
&lt;i&gt;TL;DR&lt;/i&gt;:&amp;nbsp;you can keep templates and components in the solution directory instead of the plugin directory, which gets wiped on upgrades.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;It's a common requirement to have to develop templates or components to extend the capabilities of the &lt;a href="http://ctools.webdetails.org/"&gt;Ctools&lt;/a&gt;. The normal place to put them is the &lt;i&gt;solution/system/plugin&lt;/i&gt; directory, alongside the others.&lt;br /&gt;&lt;br /&gt;However, that has a huge inconvenience - whenever we upgrade / reinstall the plugin we need to copy the resources back, not forgetting to do a backup beforehand.&lt;br /&gt;&lt;br /&gt;That hasn't actually been needed for a while. If we put the resources directly under &lt;i&gt;pentaho-solution/&amp;lt;plugin-name&amp;gt;/&lt;/i&gt;, all the Ctools will know what to do.&lt;br /&gt;&lt;br /&gt;Here's a real life example that can serve as reference:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;&lt;span style="font-size: xx-small;"&gt;cde&lt;br /&gt;├── components&lt;br /&gt;│&amp;nbsp;&amp;nbsp; ├── EmailPrpt&lt;br /&gt;│&amp;nbsp;&amp;nbsp; │&amp;nbsp;&amp;nbsp; ├── component.xml&lt;br /&gt;│&amp;nbsp;&amp;nbsp; │&amp;nbsp;&amp;nbsp; ├── emailPrpt-implementation.js&lt;br /&gt;│&amp;nbsp;&amp;nbsp; │&amp;nbsp;&amp;nbsp; └── emailPrpt.xaction&lt;br /&gt;│&amp;nbsp;&amp;nbsp; └── VideoGallery&lt;br /&gt;│&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ├── ceebox-implementation.js&lt;br /&gt;│&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ├── component.xml&lt;br /&gt;│&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ├── css&lt;br /&gt;│&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; │&amp;nbsp;&amp;nbsp; └── ceebox.css&lt;br /&gt;│&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ├── images&lt;br 
/&gt;│&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; │&amp;nbsp;&amp;nbsp; ├── cee-close-btn.png&lt;br /&gt;│&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; │&amp;nbsp;&amp;nbsp; ├── cee-next-btn.gif&lt;br /&gt;│&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; │&amp;nbsp;&amp;nbsp; ├── cee-next-btn.png&lt;br /&gt;│&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; │&amp;nbsp;&amp;nbsp; ├── cee-prev-btn.gif&lt;br /&gt;│&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; │&amp;nbsp;&amp;nbsp; ├── cee-prev-btn.png&lt;br /&gt;│&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; │&amp;nbsp;&amp;nbsp; └── loader.gif&lt;br /&gt;│&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ├── jquery.ceebox.js&lt;br /&gt;│&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; └── jquery.swfobject.js&lt;br /&gt;├── styles&lt;br /&gt;│&amp;nbsp;&amp;nbsp; └── Clean.html&lt;br /&gt;├── templates&lt;br /&gt;│&amp;nbsp;&amp;nbsp; └── index.xml&lt;br /&gt;└── widgets&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; ├── IncomeStatementDetailTable.cdfde&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; ├── IncomeStatementDetailTable.component.xml&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; ├── IncomeStatementDetailTable.wcdf&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; ├── index.xml&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; ├── sample.cda&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; ├── sample.cdfde&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; └── sample.wcdf&lt;br /&gt;cdf&lt;br /&gt;├── components&lt;br /&gt;│&amp;nbsp;&amp;nbsp; ├── jfreechart-cda.xaction&lt;br /&gt;│&amp;nbsp;&amp;nbsp; └── traffic.xaction&lt;br /&gt;├── includes&lt;br /&gt;│&amp;nbsp;&amp;nbsp; ├── index.xml&lt;br /&gt;│&amp;nbsp;&amp;nbsp; └── Operations&lt;br /&gt;│&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ├── facilityAccount.cda&lt;br /&gt;│&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ├── facilityAccount.cdfde&lt;br 
/&gt;│&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ├── facilityAccount.wcdf&lt;br /&gt;│&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; └── index.xml&lt;br /&gt;└── index.xml&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/6257078989156960788/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/01/cde-cdf-components-and-templates-in.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/6257078989156960788'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/6257078989156960788'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/01/cde-cdf-components-and-templates-in.html' title='CDE / CDF components and templates in Pentaho solution directory'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-1957718025195470369</id><published>2013-01-04T17:46:00.003Z</published><updated>2013-01-04T17:59:13.316Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='async'/><category scheme='http://www.blogger.com/atom/ns#' term='cdf'/><title type='text'>CDF Async Support</title><content type='html'>&lt;h2 style="vertical-align: top;"&gt;Introduction &lt;/h2&gt;This is a huge change! Since the beginning, CDF behaved in a synchronous way. The problem is that with several components, performance suffers with it. 
This change represents a major overhaul of the main CDF code in order to change that. Now, if a priority is specified, all components are executed simultaneously, speeding up the rendering of a dashboard.&lt;br /&gt;&lt;br /&gt;We tried to maintain backward compatibility. Since the new async behavior requires a new - and simpler - way of defining component interaction, by default old dashboards will still render in a "fake synchronous" mode, by applying a specific heuristic where sequential sets of priorities are assigned to components, emulating old behavior.&lt;br /&gt;&lt;br /&gt;This blog post (whose contents are also available in CDF's documentation) is a guide to converting old components and dashboards to the new async style, and developing new ones based on asynchronous querying.&lt;br /&gt;&lt;br /&gt;This is currently in&lt;i&gt; dev&lt;/i&gt; and will soon make its way into stable releases. CDE support is obviously included.&lt;br /&gt;&lt;div class="webdetailsBoxShadow"&gt;&lt;h2 style="vertical-align: bottom;"&gt;Rationale&lt;/h2&gt;The first step to understanding the changes in the async patch is understanding the CDF component lifecycle. When a component is updated, the basic update lifecycle looks like this:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;preExecution -&amp;gt; update -&amp;gt; postExecution&lt;br /&gt; &lt;/code&gt;&lt;/pre&gt;Usually, though, there will be a call to a data source, with a subsequent call to postFetch, and only then is the component rendered:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;preExecution -&amp;gt; update -&amp;gt; query -&amp;gt; postFetch -&amp;gt; redraw -&amp;gt; postExecution&lt;br /&gt; &lt;/code&gt;&lt;/pre&gt;This is a more typical lifecycle, and one that has some important limitations. First, preExecution and postExecution are entirely the responsibility of CDF itself, rather than the component. 
Because CDF has no control over the contents of the update method, it has no way of ensuring that, should the component execute an asynchronous query, postExecution only runs after redraw. In this case, you're likely to see this instead:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;preExecution -&amp;gt; update -&amp;gt; postExecution -&amp;gt; query -&amp;gt; postFetch -&amp;gt; redraw&lt;br /&gt; &lt;/code&gt;&lt;/pre&gt;This breaks the contract that postExecution runs after the component is done updating. The solution here is that the component itself must take control of postExecution, while keeping the burden of implementing the lifecycle in CDF rather than passing it to the component developer. On a related topic, postFetch has become a de facto standard part of the lifecycle, yet its implementation was left to the component implementers, which leads to a fairly large amount of boilerplate code.&lt;br /&gt;Our objective here was to retool the base component so as to deal with both of these issues, thus allowing queries to be performed asynchronously while reducing the developer effort involved in creating a component.&lt;/div&gt;&lt;div class="webdetailsBoxShadow"&gt;&lt;h2 style="vertical-align: top;"&gt;Component execution order and Priority&lt;/h2&gt;There are no major changes in the way components behave. There is, however, an important caveat - since all components (that have been converted) will be executed simultaneously, we can no longer rely on the order of execution. &lt;br /&gt;There's now an additional property named &lt;i&gt;priority&lt;/i&gt;, which sets the priority of component execution and defaults to 5. The lower the number, the higher priority the component has. Components with the same priority will be executed simultaneously. 
This is useful in places where we need to give higher priority to filters or other components that need to be executed before other components.&lt;br /&gt;This way there's no longer the need to use dummy parameters and postChange tricks to do, for instance, cascade prompts.&lt;/div&gt;&lt;div class="webdetailsBoxShadow"&gt;&lt;h2 style="vertical-align: bottom;"&gt;Backward compatibility and changes&lt;/h2&gt;We made a big effort to maintain backward compatibility, but some care has to be taken. If components have no priority, we give them a sequential value, trying to emulate the old behavior. It's recommended that proper priorities are set in order to take advantage of the new improvements.&lt;br /&gt;If using &lt;i&gt;CDE&lt;/i&gt;, please note that if you edit a dashboard and save it, &lt;b&gt;all components will have a default priority of 5&lt;/b&gt;. This may break the old behavior. If you need to change a dashboard, make sure you tweak the priorities, if needed.&lt;/div&gt;&lt;div class="webdetailsBoxShadow"&gt;&lt;h2 style="vertical-align: top;"&gt;&lt;span class="h2Num h2NumEven"&gt;&lt;/span&gt;Developing Components&lt;/h2&gt;Components desiring to use asynchronous queries should inherit from the new UnmanagedComponent, instead of BaseComponent. 
The UnmanagedComponent base class       provides pre-composed methods that implement the core lifecycle, for a variety       of different scenarios:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;code&gt;synchronous&lt;/code&gt; implements a synchronous lifecycle identical to the core         CDF lifecycle.&lt;/li&gt;&lt;li&gt;&lt;code&gt;triggerQuery&lt;/code&gt; implements a simple interface to a lifecycle built around         Query objects.&lt;/li&gt;&lt;li&gt;&lt;code&gt;triggerAjax&lt;/code&gt; implements a simple interface to a lifecycle built around         AJAX calls.&lt;/li&gt;&lt;/ul&gt;Since all these lifecycle methods expect a callback that handles the actual       component rendering, it's conventional style to have that callback as a method       of the Component, called &lt;code&gt;redraw&lt;/code&gt;. It's also considered standard practice to       use &lt;code&gt;Function#bind&lt;/code&gt; or &lt;code&gt;_.bind&lt;/code&gt; to ensure that, inside the &lt;code&gt;redraw&lt;/code&gt; callback,       &lt;code&gt;this&lt;/code&gt; points to the component itself.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Use synchronous if Your Component Doesn't Use External Data&lt;/h3&gt;Components that don't use any external data at all can continue subclassing       BaseComponent without any change of functionality. 
However, for the sake of consistency (or because you want querying to be optional -- see the section on alternating between static and query-based data for details), you can subclass UnmanagedComponent and use the &lt;code&gt;synchronous&lt;/code&gt; lifecycle method to emulate BaseComponent's behaviour:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;update: function() {&lt;br /&gt;  this.synchronous(this.redraw);&lt;br /&gt;}&lt;br /&gt; &lt;/code&gt;&lt;/pre&gt;If you want to pass parameters to &lt;code&gt;redraw&lt;/code&gt;, you can pass them as an array to &lt;code&gt;synchronous&lt;/code&gt;:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;update: function() {&lt;br /&gt;  /* Will call this.redraw(1,2,3) */&lt;br /&gt;  this.synchronous(this.redraw, [1,2,3]);&lt;br /&gt;}&amp;nbsp;&lt;/code&gt;&lt;/pre&gt;&lt;h3&gt;Use triggerQuery when You Want Your Component To Use CDA/Query Objects&lt;/h3&gt;If you're using a CDA data source, you probably want to use &lt;code&gt;triggerQuery&lt;/code&gt; to handle the component lifecycle for you. &lt;code&gt;triggerQuery&lt;/code&gt; expects at a minimum a query definition and a redraw callback to process the query results. 
The query definition is an object of the form:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;{&lt;br /&gt;  dataAccessId: 'myQuery',&lt;br /&gt;  file: '/path/to/my/datasourceDefinition.cda'&lt;br /&gt;}&lt;br /&gt; &lt;/code&gt;&lt;/pre&gt;Typically, if you're using CDE, these properties will be added to either &lt;code&gt;this.queryDefinition&lt;/code&gt; or &lt;code&gt;this.chartDefinition&lt;/code&gt; so you can just use this pattern:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;update: function() {&lt;br /&gt;  var redraw = _.bind(this.redraw,this);&lt;br /&gt;  this.triggerQuery(this.queryDefinition, redraw);&lt;br /&gt;}&lt;/code&gt;&lt;/pre&gt;&lt;h3&gt;Alternating Between Static And Query-Based Data&lt;/h3&gt;As the lifecycle methods are completely self-contained, you can switch between them at will, deciding on an appropriate lifecycle at runtime. A common pattern (used e.g. in SelectComponent, and the CccComponent family) is exposing a &lt;code&gt;valuesArray&lt;/code&gt; property, and using static data if &lt;code&gt;valuesArray&lt;/code&gt; is provided, or a query if it is not. Using UnmanagedComponent, this convention would look like this:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;update: function() {&lt;br /&gt;  var redraw = _.bind(this.redraw,this);&lt;br /&gt;  if(this.valuesArray &amp;amp;&amp;amp; this.valuesArray.length &amp;gt; 0) {&lt;br /&gt;    this.synchronous(redraw,this.valuesArray);&lt;br /&gt;  } else {&lt;br /&gt;    this.triggerQuery(this.queryDefinition,redraw);&lt;br /&gt;  }&lt;br /&gt;}&lt;/code&gt;&lt;/pre&gt;&lt;pre&gt;&lt;code&gt;&amp;nbsp;&lt;/code&gt;&lt;/pre&gt;&lt;h3&gt;Rolling Your Own&lt;/h3&gt;If you prefer having absolute control over your component, you can eschew the use of any of the lifecycle methods. 
Instead, you're expected to follow these       guidelines:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Call &lt;code&gt;this.preExec()&lt;/code&gt; as soon as possible, and bail out if it returns false.&lt;/li&gt;&lt;li&gt;If &lt;code&gt;this.preExec()&lt;/code&gt; returned true, call &lt;code&gt;this.block()&lt;/code&gt; before any meaningful         amount of work is done.&lt;/li&gt;&lt;li&gt;If you called &lt;code&gt;this.block()&lt;/code&gt;, make sure to always call &lt;code&gt;this.unblock()&lt;/code&gt; as         well once all relevant work is done.&lt;/li&gt;&lt;li&gt;If you want to use any sort of AJAX, consider using &lt;code&gt;triggerAjax()&lt;/code&gt;&lt;/li&gt;&lt;li&gt;Call &lt;code&gt;this.postExec()&lt;/code&gt; once all processing is done&lt;/li&gt;&lt;li&gt;You can override &lt;code&gt;this.block&lt;/code&gt; and &lt;code&gt;this.unblock&lt;/code&gt; to implement component         specific UI blocking. If you override either, you &lt;i&gt;must&lt;/i&gt; override the other         as well.&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div class="webdetailsBoxShadow"&gt;&lt;h2 style="vertical-align: bottom;"&gt;New and Changed Features&lt;/h2&gt;&lt;h3&gt;Component Cloning&lt;/h3&gt;If your component holds any references to other components, you need to override       the &lt;code&gt;clone&lt;/code&gt; method so as to ensure that you don't accidentally clone the target       component. 
For example, if your component has a property named &lt;code&gt;otherComponent&lt;/code&gt; pointing at another component, you should override &lt;code&gt;clone&lt;/code&gt; using this general template:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;clone: function(parameterRemap,componentRemap,htmlRemap) {&lt;br /&gt;  /* Detach the reference so the base clone doesn't copy the other component */&lt;br /&gt;  var other = this.otherComponent;&lt;br /&gt;  delete this.otherComponent;&lt;br /&gt;  var that = this.base(parameterRemap,componentRemap,htmlRemap);&lt;br /&gt;  /* Restore the reference on both the original and the clone */&lt;br /&gt;  this.otherComponent = that.otherComponent = other;&lt;br /&gt;  return that;&lt;br /&gt;}&lt;/code&gt;&lt;/pre&gt;&lt;pre&gt;&lt;code&gt;&amp;nbsp;&lt;/code&gt;&lt;/pre&gt;&lt;h3&gt;New Base Component Class: UnmanagedComponent&lt;/h3&gt;UnmanagedComponent is a new base class for components. It provides the base on which all asynchronous components should be built.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Per-Component &lt;code&gt;isManaged&lt;/code&gt; Flag&lt;/h3&gt;Each component should have a member property named &lt;code&gt;isManaged&lt;/code&gt;, indicating whether CDF should manage the component's lifecycle. Components where &lt;code&gt;isManaged&lt;/code&gt; is false need to implement their own calls to the lifecycle.&lt;br /&gt;&lt;br /&gt;&lt;h2 style="vertical-align: top;"&gt;Component Stub&lt;/h2&gt;Here's an example of a stub to be used whenever you need to use a new component. 
Just override the redraw function to do what you need and define the component as you usually would.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;ExampleComponent = UnmanagedComponent.extend({&lt;br /&gt;&lt;br /&gt;  update: function() {&lt;br /&gt;    var redraw = _.bind(this.redraw,this);&lt;br /&gt;    if(this.valuesArray &amp;amp;&amp;amp; this.valuesArray.length &amp;gt; 0) {&lt;br /&gt;      this.synchronous(redraw,{resultset: this.valuesArray});&lt;br /&gt;    } else {&lt;br /&gt;      this.triggerQuery(this.queryDefinition,redraw);&lt;br /&gt;    }&lt;br /&gt;  },&lt;br /&gt;&lt;br /&gt;  redraw: function(data){&lt;br /&gt;&lt;br /&gt;    /* Specific code goes here */&lt;br /&gt;    if(!this.isInitialized){&lt;br /&gt;      this.compiledStr = Mustache.compile("Got a result set with {{nrRows}} rows and {{nrCols}} columns");&lt;br /&gt;      this.isInitialized = true;&lt;br /&gt;    }&lt;br /&gt;    $("#"+this.htmlObject).html(this.compiledStr({&lt;br /&gt;      nrRows: data.resultset.length||0,&lt;br /&gt;      nrCols: data.resultset[0]?data.resultset[0].length || 0:0&lt;br /&gt;    }));&lt;br /&gt;&lt;br /&gt;  }&lt;br /&gt;&lt;br /&gt;});&lt;br /&gt;&lt;br /&gt;customComponent = {&lt;br /&gt;  name: "regionSelector",&lt;br /&gt;  type: "example",&lt;br /&gt;  parameters:[],&lt;br /&gt;  valuesArray:[["1","Lisbon"],["2","Dusseldorf"]],&lt;br /&gt;  priority: 5,&lt;br /&gt;  htmlObject: "sampleObject",&lt;br /&gt;  executeAtStart: true&lt;br /&gt;};&lt;/code&gt;&lt;/pre&gt;&lt;h2 style="vertical-align: top;"&gt;Debugging lifecycle&lt;/h2&gt;&lt;div style="vertical-align: top;"&gt;In order to be able to more easily track the lifecycle of CDF, we added some extra debugging features. 
If you use &lt;a href="http://www.mozilla.org/en-US/firefox"&gt;Firefox&lt;/a&gt;'s &lt;a href="https://getfirebug.com/"&gt;Firebug&lt;/a&gt; or a recent version of &lt;a href="http://www.google.com/chrome"&gt;Chrome&lt;/a&gt; (I believe &amp;gt;= 24) you'll be able to track the execution in a nice way:&lt;/div&gt;&lt;div style="vertical-align: top;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="vertical-align: top;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-4WGt1msNoBM/UOcSZGOCLKI/AAAAAAAAAc4/e96KEuk-Ths/s1600/lifecycle.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/-4WGt1msNoBM/UOcSZGOCLKI/AAAAAAAAAc4/e96KEuk-Ths/s1600/lifecycle.png" height="123" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/1957718025195470369/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/01/cdf-async-support.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/1957718025195470369'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/1957718025195470369'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2013/01/cdf-async-support.html' title='CDF Async Support'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://4.bp.blogspot.com/-4WGt1msNoBM/UOcSZGOCLKI/AAAAAAAAAc4/e96KEuk-Ths/s72-c/lifecycle.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-6421438005087105279</id><published>2012-12-20T11:36:00.002Z</published><updated>2012-12-20T11:36:35.793Z</updated><title type='text'>Chartmas gift: CCC 2 released</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-LJizPVoIPGs/UNLzTC5MM1I/AAAAAAAAAco/e08MYZIndzY/s1600/ccc2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/-LJizPVoIPGs/UNLzTC5MM1I/AAAAAAAAAco/e08MYZIndzY/s1600/ccc2.png" height="121" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;I admit we're really bad at Season's Greetings. I also assume that the worst thing I could do was to start wishing peace on earth with a Jingle Bells, as I surely would end up listed in a &lt;a href="http://www.buzzfeed.com/hgrant/26-of-the-best-of-the-worst-family-holiday-cards"&gt;website like this&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;So we decided to do what we do best - deliver software!&lt;br /&gt;&lt;br /&gt;The team worked really hard to be able to release, still during 2012, the new version of CCC, our Community Charting Components. As usual, totally open source and free of charge - the only requirement is that you enjoy it!&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;If you want to see a little bit of all the changes around it, feel free to browse the &lt;a href="http://www.webdetails.pt/ccc2/"&gt;demonstration site&lt;/a&gt; we put up for you:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.webdetails.pt/ccc2/"&gt;http://www.webdetails.pt/ccc2/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Since this requires a lot of testing, we created a separate branch for CDE with CCC2. 
During January we'll merge it to the main code line, as we feel confident about backward compatibility.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Have fun, and a great 2013!&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;-pedro</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/6421438005087105279/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/12/chartmas-gift-ccc-2-released.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/6421438005087105279'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/6421438005087105279'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/12/chartmas-gift-ccc-2-released.html' title='Chartmas gift: CCC 2 released'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-LJizPVoIPGs/UNLzTC5MM1I/AAAAAAAAAco/e08MYZIndzY/s72-c/ccc2.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-8515534362033578288</id><published>2012-12-14T18:21:00.000Z</published><updated>2012-12-14T18:47:45.118Z</updated><title type='text'>Sharing Mondrian cache in Pentaho</title><content type='html'>&lt;h3&gt;Problem &lt;/h3&gt;Conceptually, this should be very simple. You have a mondrian cube, &lt;i&gt;Sales&lt;/i&gt;, for instance. You build analysis, reports, dashboards, etc, on it. 
You would expect that issuing the same query through all of them would result in Mondrian getting the cells from cache, right?&lt;br /&gt;&lt;br /&gt;Wrong.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Why?&lt;/h3&gt;The problem is that every tool accesses Mondrian through its own type of connection. Some examples:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Xactions and Analyzer use &lt;a href="http://source.pentaho.org/svnroot/bi-platform-v2/tags/4.8.0-stable/bi-platform-plugin-services/connections/src/org/pentaho/platform/plugin/services/connections/mondrian/MDXConnection.java"&gt;MDXConnection&lt;/a&gt;&lt;/li&gt;&lt;li&gt;PRD builds its own connections&lt;/li&gt;&lt;li&gt;CDA uses PRD&lt;/li&gt;&lt;li&gt;Saiku uses XMLA Servlet&lt;/li&gt;&lt;li&gt;Jpivot uses 2 different methods for a single query&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;If your data warehouse is small, you may opt to ignore all this. But for large chunks of data this gets important, as all the queries are done several times, impacting performance and memory usage.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Approach &lt;/h3&gt;The root of the problem is in Mondrian, and the methodology it uses to determine if a connection is new. Basically it builds an internal composite key made of 2 distinct pieces of information:&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;i&gt;Key = &amp;lt;is it the same schema?&amp;gt; + &amp;lt;is it the same connection?&amp;gt;&lt;/i&gt;&lt;/blockquote&gt;&lt;br /&gt;Depending on the origin of the query, each of these pieces of information is different. There's already a way to solve the schema part, by &lt;a href="http://mondrian.pentaho.com/documentation/configuration.php"&gt;telling mondrian&lt;/a&gt; to use the content checksum through the connection flag &lt;i&gt;UseContentChecksum&lt;/i&gt;. 
This may not be the best option if we use &lt;i&gt;Dynamic Schema Processors&lt;/i&gt; but it's the best we've got.&lt;br /&gt;&lt;br /&gt;The second part generated lots of discussion. Finally, &lt;a href="http://julianhyde.blogspot.com/"&gt;Julian Hyde&lt;/a&gt; unleashed his hammer of Mondrian architect and decided that the best approach was to implement another property called &lt;i&gt;JdbcConnectionUuid&lt;/i&gt; that would uniquely identify that connection. From that point on, the burden of ensuring the connection is the same falls on whoever configures the system. You can follow all the details in this &lt;a href="http://jira.pentaho.com/browse/MONDRIAN-1231"&gt;jira issue&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Implementation&lt;/h3&gt;The problem with this solution is that we (by we I mean a mixed team of Pentaho people and community members) had not only to change Mondrian itself, but all the pieces around it that connect to Mondrian. Some of the jiras involved:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://jira.pentaho.com/browse/MONDRIAN-1231"&gt;MONDRIAN-1231&lt;/a&gt;: Mondrian changes&lt;/li&gt;&lt;li&gt;&lt;a href="http://jira.pentaho.com/browse/PRD-4003?"&gt;PRD-4003&lt;/a&gt;: PRD changes&lt;/li&gt;&lt;li&gt;&lt;a href="http://jira.pentaho.com/browse/BISERVER-7429"&gt;BISERVER-7429&lt;/a&gt;: BI server related changes to ensure properties are passed correctly&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/webdetails/cda/commit/0ab2751b59a9dfe953e24f91cd6dce0a2c6c86a8"&gt;CDA changes&lt;/a&gt; to allow it to transparently read info from &lt;i&gt;olap/datasources.xml&lt;/i&gt;&lt;/li&gt;&lt;/ul&gt;All of this has been fixed and implemented in &lt;i&gt;Pentaho 4.8&lt;/i&gt; (which includes &lt;i&gt;Mondrian 3.5.0&lt;/i&gt;).&lt;br /&gt;&lt;h3&gt;How to use it&lt;/h3&gt;In order to use this new approach, there are some things that need to be done.&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;BI server&lt;/h4&gt;In 
&lt;i&gt;solution/system/olap/datasources.xml&lt;/i&gt; you need to add the 2 properties mentioned above to each catalog's &lt;i&gt;DatasourceInfo&lt;/i&gt;, ending with a line like this (example taken from the &lt;i&gt;SteelWheels&lt;/i&gt; catalog):&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;br /&gt;&lt;span style="font-size: x-small;"&gt;Provider=mondrian;DataSource=SampleData;EnableXmla=true;&lt;b&gt;UseContentChecksum&lt;/b&gt;=true;&lt;b&gt;JdbcConnectionUuid&lt;/b&gt;=SteelWheels&lt;/span&gt;&lt;/blockquote&gt;&lt;br /&gt;In summary, one needs to add &lt;i&gt;UseContentChecksum=true&lt;/i&gt; to all catalogs and a unique string to each of the catalog's &lt;i&gt;JdbcConnectionUuid=foo&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;Also - please vote on &lt;a href="http://jira.pentaho.com/browse/BISERVER-7641"&gt;BISERVER-7641&lt;/a&gt;. I think this should be the default behavior. &lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Analyzer and Xactions&lt;/h4&gt;Just works&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Report Designer&lt;/h4&gt;Unfortunately,  &lt;a href="http://jira.pentaho.com/browse/PRD-4004"&gt;PRD-4004&lt;/a&gt; wasn't resolved. It still ignores the configuration we have in &lt;i&gt;datasources.xml&lt;/i&gt;, but we have a way to explicitly set the options on the report. 
When editing a mondrian datasource in PRD there's a &lt;i&gt;Properties&lt;/i&gt; button where we can add them.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-dAjjHJVdRJM/UMtm_2rW65I/AAAAAAAAAcY/InOO_R2iBG0/s1600/PRD.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/-dAjjHJVdRJM/UMtm_2rW65I/AAAAAAAAAcY/InOO_R2iBG0/s1600/PRD.png" height="204" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;CDA&lt;/h4&gt;Since &lt;a href="https://github.com/webdetails/cda/commit/0ab2751b59a9dfe953e24f91cd6dce0a2c6c86a8"&gt;this commit&lt;/a&gt;, CDA is able to read &lt;i&gt;datasources.xml&lt;/i&gt;. It's now on trunk and should be included in the first stable release of 2013.&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Saiku&lt;/h4&gt;&lt;br /&gt;We need to make some changes in Saiku, since it ships its own mondrian version. These are the steps that need to be followed:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Edit &lt;i&gt;saiku/plugin.spring.xml&lt;/i&gt; and comment out the lines referring to the properties &lt;i&gt;datasourceResolverClass&lt;/i&gt; and &lt;i&gt;saikuDatasourceProcessor&lt;/i&gt;&lt;/li&gt;&lt;li&gt;Delete mondrian jars from &lt;i&gt;saiku/lib/&lt;/i&gt;. (&lt;i&gt;eg: rm -f saiku/lib/mondrian* saiku/lib/olap4j* saiku/lib/eigenbase*&lt;/i&gt;)&lt;/li&gt;&lt;/ol&gt;We're working with the guys from &lt;a href="http://analytical-labs.com/"&gt;Analytical Labs&lt;/a&gt; in order to include in saiku a script called &lt;i&gt;saiku-shareMondrian.sh&lt;/i&gt; that performs these steps; it should be available in the coming days.&lt;br /&gt;&lt;h4&gt;Jpivot&lt;/h4&gt;This thing is dead. Hacks after hacks to make the connections to mondrian. 
Use &lt;i&gt;Analyzer&lt;/i&gt; or &lt;i&gt;Saiku&lt;/i&gt;.</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/8515534362033578288/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/12/sharing-mondrian-cache-in-pentaho.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/8515534362033578288'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/8515534362033578288'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/12/sharing-mondrian-cache-in-pentaho.html' title='Sharing Mondrian cache in Pentaho'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-dAjjHJVdRJM/UMtm_2rW65I/AAAAAAAAAcY/InOO_R2iBG0/s72-c/PRD.png' height='72' width='72'/><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-896832654845185853</id><published>2012-12-10T13:15:00.002Z</published><updated>2012-12-13T11:36:44.382Z</updated><title type='text'>Multiple provider authentication in pentaho</title><content type='html'>&lt;h1&gt;&lt;/h1&gt;&lt;h2&gt;Objective&lt;/h2&gt;We needed to add support for multiple authentication in &lt;a href="http://www.pentaho.com/"&gt;Pentaho&lt;/a&gt;. 
Currently, there are several providers that can be used, like LDAP, hibernate and jdbc, but it's not possible to use several at the same time.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Pentaho doesn't allow for that out of the box, but it's simple enough, so we created a project to do just that, which you can find on &lt;a href="https://github.com/webdetails/bi-platform-engine-security-multipleuserrole"&gt;github&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;This project allows for that by implementing a bean that will cycle through all the desired providers.&lt;br /&gt;&amp;nbsp; &lt;br /&gt;&lt;h2&gt;Source code&lt;/h2&gt;This project is hosted at &lt;a href="https://github.com/webdetails/bi-platform-engine-security-multipleuserrole"&gt;https://github.com/webdetails/bi-platform-engine-security-multipleuserrole&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Also, please vote in this &lt;a href="http://jira.pentaho.com/browse/BISERVER-7829"&gt;jira issue&lt;/a&gt; to get this in the main pentaho source.&lt;br /&gt;&lt;br /&gt;&lt;h2&gt;How to use&lt;/h2&gt;Following these steps should get you going:&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Compile the project&lt;/h3&gt;Just run &lt;i&gt;ant&lt;/i&gt; and you should be all set.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Deploy the jar in Pentaho&lt;/h3&gt;Copy the resulting file to pentaho's lib dir (eg: &lt;i&gt;/opt/pentaho/server/webapps/pentaho/WEB-INF/lib/&lt;/i&gt;).&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Copy the multiple provider spring config files to &lt;i&gt;solution/system&lt;/i&gt;&lt;/h3&gt;Copy the following files from &lt;i&gt;resources/&lt;/i&gt; to &lt;i&gt;pentaho-solutions/system/&lt;/i&gt;:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;applicationContext-pentaho-security-multiple.xml&lt;/li&gt;&lt;li&gt;applicationContext-spring-security-multiple.xml&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;h3&gt;Change &lt;i&gt;pentaho-spring-beans.xml&lt;/i&gt; to load the new files&lt;/h3&gt;Instead of loading one of the individual files (defaults to hibernate authentication), tell 
pentaho to instead load the new configuration files. &lt;br /&gt;&lt;i&gt;pentaho-spring-beans.xml&lt;/i&gt; should then look something like:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&amp;lt;beans&amp;gt;&lt;br /&gt;  &amp;lt;import resource="pentahoSystemConfig.xml" /&amp;gt;&lt;br /&gt;  &amp;lt;import resource="adminPlugins.xml" /&amp;gt;&lt;br /&gt;  &amp;lt;import resource="systemListeners.xml" /&amp;gt;&lt;br /&gt;  &amp;lt;import resource="sessionStartupActions.xml" /&amp;gt;&lt;br /&gt;  &amp;lt;import resource="applicationContext-spring-security.xml" /&amp;gt;&lt;br /&gt;  &amp;lt;import resource="applicationContext-common-authorization.xml" /&amp;gt;&lt;br /&gt;  &amp;lt;import resource="applicationContext-spring-security-multiple.xml" /&amp;gt;&lt;br /&gt;  &amp;lt;import resource="applicationContext-pentaho-security-multiple.xml" /&amp;gt;&lt;br /&gt;  &amp;lt;import resource="pentahoObjects.spring.xml" /&amp;gt;&lt;br /&gt;&amp;lt;/beans&amp;gt;&lt;/code&gt;&lt;/pre&gt;&lt;pre&gt;&lt;code&gt;&amp;nbsp;&lt;/code&gt;&lt;/pre&gt;&lt;i&gt;Note: This snippet is taken from pentaho 4.8; different versions may have files with different content.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Change the list of providers in &lt;i&gt;applicationContext-spring-security.xml&lt;/i&gt;&lt;/h3&gt;In &lt;i&gt;applicationContext-spring-security.xml&lt;/i&gt;, look for a bean named &lt;i&gt;authenticationManager&lt;/i&gt;, and add the providers you want. If you're using the sample file &lt;i&gt;applicationContext-spring-security-multiple.xml&lt;/i&gt;, the 2 referenced beans are called &lt;i&gt;daoAuthenticationProvider&lt;/i&gt; and &lt;i&gt;daoAuthenticationProvider2&lt;/i&gt;. 
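&lt;br /&gt;As a rough sketch only, the &lt;i&gt;authenticationManager&lt;/i&gt; bean might end up looking something like the fragment below. The class name assumes the Spring Security 2 &lt;i&gt;ProviderManager&lt;/i&gt; that Pentaho 4.8 appears to ship, and the comment marks where entries already present in your own file would stay; treat both as assumptions and check them against your installation rather than copying this verbatim.&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&amp;lt;!-- Illustrative only: class name and surrounding entries are assumptions --&amp;gt;&lt;br /&gt;&amp;lt;bean id="authenticationManager" class="org.springframework.security.providers.ProviderManager"&amp;gt;&lt;br /&gt;  &amp;lt;property name="providers"&amp;gt;&lt;br /&gt;    &amp;lt;list&amp;gt;&lt;br /&gt;      &amp;lt;!-- providers are consulted in the order they are listed --&amp;gt;&lt;br /&gt;      &amp;lt;ref bean="daoAuthenticationProvider" /&amp;gt;&lt;br /&gt;      &amp;lt;ref bean="daoAuthenticationProvider2" /&amp;gt;&lt;br /&gt;      &amp;lt;!-- keep any other entries your file already lists here --&amp;gt;&lt;br /&gt;    &amp;lt;/list&amp;gt;&lt;br /&gt;  &amp;lt;/property&amp;gt;&lt;br /&gt;&amp;lt;/bean&amp;gt;&lt;/code&gt;&lt;/pre&gt;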
You're not limited to just 2 providers.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Edit &lt;i&gt;applicationContext-spring-security-multiple.xml&lt;/i&gt; and &lt;i&gt;applicationContext-pentaho-security-multiple.xml&lt;/i&gt;&lt;/h3&gt;Edit &lt;i&gt;applicationContext-spring-security-multiple.xml&lt;/i&gt; and &lt;i&gt;applicationContext-pentaho-security-multiple.xml&lt;/i&gt; for your case.&lt;br /&gt;&lt;br /&gt;This is the part where you configure the types of authentication you want. Even if it seems complicated at first, you'll notice that the 2 configuration files for multiple authentication are simply a concatenation of the individual files provided by pentaho, making sure the bean names don't collide. &lt;br /&gt;In this example we have hibernate and memory authentication joined together.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Launch the bi-server&lt;/h3&gt;Launch the BI server and &lt;i&gt;hopefully&lt;/i&gt; you're all set.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Troubleshooting&lt;/h3&gt;Of course this probably won't work on the first try. Pay close attention to the logs. 
One of the most common errors is bean id collision, which is reported in the pentaho logs.&lt;br /&gt;One other option is setting the &lt;i&gt;spring&lt;/i&gt; logs to debug in &lt;i&gt;log4j.xml&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Questions / doubts&lt;/h3&gt;For any additional questions, feel free to get in touch: ctools &lt;i&gt;at&lt;/i&gt; webdetails.pt</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/896832654845185853/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/12/multiple-provider-authentication-in.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/896832654845185853'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/896832654845185853'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/12/multiple-provider-authentication-in.html' title='Multiple provider authentication in pentaho'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-1994840351187533943</id><published>2012-12-05T19:12:00.001Z</published><updated>2013-01-04T14:29:42.587Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='pentaho ctools marketplace'/><title type='text'>Pentaho Marketplace is here</title><content type='html'>This is a huge milestone. 
&lt;a href="http://www.webdetails.pt/"&gt;Webdetails&lt;/a&gt; and &lt;a href="http://www.pentaho.com/"&gt;Pentaho &lt;/a&gt;worked together to implement a Marketplace to the BI server. Will Gorman, VP of Engineering of Pentaho already had the chance to &lt;a href="http://www.willgorman.com/?p=55"&gt;blog about this&lt;/a&gt;, but it's such a big deal that we want to do as much noise as possible.&lt;br /&gt;&lt;br /&gt;There's also an equivalent effort for the kettle marketplace, but I'll focus on the platform side. The marketplace comes with the released version of Pentaho CE 4.8, available from &lt;a href="http://sourceforge.net/projects/pentaho/files/Business%20Intelligence%20Server/4.8.0-stable/"&gt;sourceforge&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Objectives&lt;/h3&gt;The goals are simple:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Allow pentaho users to get in contact to what plugins exist&lt;/li&gt;&lt;li&gt;Provide a very simple way to install/maintain plugins&lt;/li&gt;&lt;li&gt;Get more users / feedback around existing plugins&lt;/li&gt;&lt;li&gt;Motivate new community contributions&lt;/li&gt;&lt;/ul&gt;For someone that is not familiar with the concept of &lt;a href="http://wiki.pentaho.com/display/ServerDoc2x/Developing+Plugins"&gt;pentaho plugins&lt;/a&gt;, he'll be surprised about the quantity - and quality - of some of them.&lt;br /&gt;&lt;br /&gt;My hope is even that the marketplace can render &lt;a href="http://pedroalves-bi.blogspot.de/2011/06/ctools-installer-making-things-fast.html"&gt;ctools-installer&lt;/a&gt; obsolete,&amp;nbsp; even if we absolutely plan to maintain it.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;How to use it&lt;/h3&gt;We've put a lot of effort to make it a straightforward task. 
Basically run through the UI, and the tasks are what you'd expect&lt;br /&gt;&lt;br /&gt;&lt;h4&gt;Launch the Marketplace&lt;/h4&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-xs4k7s9scHk/UL-ZLgHD8rI/AAAAAAAAAbk/N17qiMWbsPc/s1600/marketplace1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="212" src="http://1.bp.blogspot.com/-xs4k7s9scHk/UL-ZLgHD8rI/AAAAAAAAAbk/N17qiMWbsPc/s1600/marketplace1.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;h4&gt;See the list of available/installed plugins&lt;/h4&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-n6c5M9uWcVg/UL-ZMeX1aII/AAAAAAAAAbo/FSxOMFom4HI/s1600/marketplace2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="212" src="http://1.bp.blogspot.com/-n6c5M9uWcVg/UL-ZMeX1aII/AAAAAAAAAbo/FSxOMFom4HI/s1600/marketplace2.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;h4&gt;Check details on specific plugins&lt;/h4&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-1z1vgeO36-o/UL-ZNK8o8jI/AAAAAAAAAbw/jUVh_meFiB0/s1600/marketplace3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="209" src="http://4.bp.blogspot.com/-1z1vgeO36-o/UL-ZNK8o8jI/AAAAAAAAAbw/jUVh_meFiB0/s1600/marketplace3.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;h4&gt;Install it&lt;/h4&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-Y72oZ2OJSPk/UL-ZN4MXCdI/AAAAAAAAAb8/4hqzTYjMkIU/s1600/marketplace4.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="210" src="http://3.bp.blogspot.com/-Y72oZ2OJSPk/UL-ZN4MXCdI/AAAAAAAAAb8/4hqzTYjMkIU/s1600/marketplace4.png" width="320" 
/&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;h4&gt;Reboot, and you're done&lt;/h4&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-RvCE4yJFAiQ/UL-ZPJna6xI/AAAAAAAAAcA/nmLWb23E0D0/s1600/marketplace5.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="210" src="http://1.bp.blogspot.com/-RvCE4yJFAiQ/UL-ZPJna6xI/AAAAAAAAAcA/nmLWb23E0D0/s1600/marketplace5.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;h4&gt;Report any issues you find&lt;/h4&gt;Can't get any simpler. If you find any issues, &lt;a href="http://jira.pentaho.com/browse/MARKET"&gt;please report them&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Do you want to contribute your own plugin?&lt;/h3&gt;Please do! See the instructions on Github for the &lt;a href="https://github.com/pentaho/marketplace-metadata"&gt;marketplace-metadata&lt;/a&gt; project. As a side note, the Marketplace itself is a plugin, and its &lt;a href="https://github.com/pentaho/marketplace"&gt;code is available on github&lt;/a&gt; too. You can also get in contact with me or anyone in the vibrant community that's building these tools.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Installing the Marketplace in an older Pentaho version&lt;/h3&gt;Even if you don't have the latest and greatest version of pentaho, you can still benefit from the Marketplace. 
Just &lt;a href="http://ci.analytical-labs.com/view/Webdetails/job/Webdetails-Marketplace/"&gt;manually download the plugin&lt;/a&gt; from CI (choose your favorite flavor, &lt;i&gt;tgz&lt;/i&gt; or &lt;i&gt;zip&lt;/i&gt;) and drop it in your pentaho-solutions/system directory.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;i&gt;ps: please don't use the comments section for posting issues, use &lt;a href="http://forums.pentaho.com/"&gt;Pentaho forums&lt;/a&gt;&lt;/i&gt;</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/1994840351187533943/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/12/pentaho-marketplace-is-here.html#comment-form' title='10 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/1994840351187533943'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/1994840351187533943'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/12/pentaho-marketplace-is-here.html' title='Pentaho Marketplace is here'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-xs4k7s9scHk/UL-ZLgHD8rI/AAAAAAAAAbk/N17qiMWbsPc/s72-c/marketplace1.png' height='72' width='72'/><thr:total>10</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-6986404007005184952</id><published>2012-11-19T15:25:00.000Z</published><updated>2012-11-19T15:25:41.318Z</updated><title type='text'>Making CDF calls asynchronous</title><content 
type='html'>&lt;h3&gt;Motivation&lt;/h3&gt;Since the beginning, &lt;a href="http://cdf.webdetails.org/"&gt;CDF&lt;/a&gt; has always worked in sync mode. I'm really not sure whether that was a decision based on simplicity or simply lack of skill. It's time to change that, though.&lt;br /&gt;&lt;br /&gt;Having CDF components make synchronous requests simplifies the lifecycle. However, it has a performance penalty, as the image shows:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-U1LhYaUqb20/UKoRlsFYzDI/AAAAAAAAAbQ/GRTwzE1TwcM/s1600/cdfAsync.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-U1LhYaUqb20/UKoRlsFYzDI/AAAAAAAAAbQ/GRTwzE1TwcM/s1600/cdfAsync.png" height="320" width="240" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;We can very easily end up with a large number of components. 6 selects, 3 charts and a table add up to 10 requests. In the synchronous world, this means each request is only made after the answer to the previous one comes back. In asynchronous mode, &lt;i&gt;all requests are done at the same time&lt;/i&gt;* and then the dashboard will render much, much faster. There's also another problem: some browsers lock some of their functionality while the request is being made. In Firefox, for instance, you can't even change tabs. That's bad. &lt;br /&gt;&lt;br /&gt;CDF and &lt;a href="http://cde.webdetails.org/"&gt;CDE&lt;/a&gt; have tons of very important performance tweaks. All files are minified and concatenated into one, all caching information is set up so that as few requests as possible are made, and &lt;a href="http://cda.webdetails.org/"&gt;CDA&lt;/a&gt; has a very strong query cache layer, optionally backed up by &lt;a href="http://cdc.webdetails.org/"&gt;CDC&lt;/a&gt;. 
Making the lifecycle asynchronous is the next step.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Challenges&lt;/h3&gt;This is a big change, hopefully only for the CDF core and not for end users. Basically, in async mode, the concept of "&lt;i&gt;postExecution&lt;/i&gt;" changes: instead of being executed after the component &lt;i&gt;executes&lt;/i&gt; itself, it has to be called after the component &lt;i&gt;renders&lt;/i&gt; itself. So all components will have to be changed, one by one, to conform to the new lifecycle.&lt;br /&gt;&lt;br /&gt;Currently all components inherit from &lt;i&gt;BaseComponent&lt;/i&gt;. That's where the core lifecycle is. Now we're implementing a new one, called &lt;i&gt;UnmanagedComponent&lt;/i&gt;. &lt;br /&gt;&lt;br /&gt;We're not converting all the components now. It would be a very big effort. It's something that will happen over time. Our focus right now is on the following ones:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Selectors&lt;/li&gt;&lt;li&gt;CCC Charts&lt;/li&gt;&lt;li&gt;Query component&lt;/li&gt;&lt;li&gt;Table component&lt;/li&gt;&lt;/ul&gt;This should give us a good performance boost to start with.&lt;br /&gt;&lt;br /&gt;All this work is being developed as we speak, and you can follow the progress by monitoring the &lt;a href="https://github.com/webdetails/cdf/tree/async"&gt;async github&lt;/a&gt; branch.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Examples&lt;/h3&gt;&lt;br /&gt;Here are some samples of the old and new way of implementing components. 
This is the classic way of displaying a &lt;i&gt;Hello World&lt;/i&gt; component&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;pre&gt;HelloBaseComponent = BaseComponent.extend({&lt;br /&gt;  update: function() {&lt;br /&gt;    $("#" + this.htmlObject).text("Hello World!");&lt;br /&gt;  }&lt;br /&gt;});&lt;br /&gt;&lt;/pre&gt;&lt;/blockquote&gt;&lt;br /&gt;This is the new, and soon to be recommended, approach:  &lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;pre&gt;HelloUnmanagedComponent = UnmanagedComponent.extend({&lt;br /&gt;  update: function() {&lt;br /&gt;    var render = _.bind(this.render, this);&lt;br /&gt;    this.synchronous(render);&lt;br /&gt;  },&lt;br /&gt; &lt;br /&gt;  render: function() {&lt;br /&gt;    $("#" + this.htmlObject).text("Hello World!");&lt;br /&gt;  }&lt;br /&gt;});&lt;br /&gt;&lt;/pre&gt;&lt;/blockquote&gt;This may not seem a big gain. However, the gain is clearer when we need data from datasources. With the classic approach, the component has to run the query itself:  &lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;pre&gt;HelloQueryBaseComponent = BaseComponent.extend({&lt;br /&gt;  update: function() {&lt;br /&gt;    var myself = this;&lt;br /&gt;    var query = new Query(myself.queryDefinition);&lt;br /&gt;    query.fetchData(myself.parameters, function(values) {&lt;br /&gt;      var changedValues = undefined;&lt;br /&gt;      if (typeof(myself.postFetch) == 'function') {&lt;br /&gt;        changedValues = myself.postFetch(values);&lt;br /&gt;      }&lt;br /&gt;      if (changedValues !== undefined) {&lt;br /&gt;        values = changedValues;&lt;br /&gt;      }&lt;br /&gt;      myself.render(values);&lt;br /&gt;    });&lt;br /&gt;  },&lt;br /&gt; &lt;br /&gt;  render: function(data) {&lt;br /&gt;    $("#" + this.htmlObject).text(JSON.stringify(data));&lt;br /&gt;  }&lt;br /&gt;});&lt;br /&gt;&lt;br /&gt;&lt;/pre&gt;&lt;/blockquote&gt;&lt;br /&gt;In the new format, you'll just have to call the &lt;i&gt;triggerQuery&lt;/i&gt; method  &lt;br 
/&gt;&lt;blockquote class="tr_bq"&gt;&lt;pre&gt;HelloQueryUnmanagedComponent = UnmanagedComponent.extend({&lt;br /&gt;  update: function() {&lt;br /&gt;    var render = _.bind(this.render,this);&lt;br /&gt;    this.triggerQuery(this.queryDefinition, render);&lt;br /&gt;  },&lt;br /&gt; &lt;br /&gt;  render: function(data) {&lt;br /&gt;    $("#" + this.htmlObject).text(JSON.stringify(data));&lt;br /&gt;  }&lt;br /&gt;});&lt;br /&gt;&lt;br /&gt;&lt;/pre&gt;&lt;/blockquote&gt;&lt;br /&gt;Besides &lt;i&gt;triggerQuery&lt;/i&gt; there are other useful calls: &lt;i&gt;triggerAjax&lt;/i&gt; and &lt;i&gt;synchronous&lt;/i&gt;, the first one for standard ajax calls and the second for synchronized calls, like used in the first example&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;*Caveat&lt;/h3&gt;I wrote that "In asynchronous mode, &lt;i&gt;all requests are done at the same time&lt;/i&gt;". That's not exactly true as browsers limit the amount of concurrent requests. Depends on the browser, but it's something around 4 simultaneous requests. 
If we have 10 components, the requests will still be split into 2 or 3 batches.&lt;br /&gt;&lt;br /&gt;Just to let you know that we're also thinking about this, but that's for another blog post ;)</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/6986404007005184952/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/11/making-cdf-calls-asynchronous.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/6986404007005184952'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/6986404007005184952'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/11/making-cdf-calls-asynchronous.html' title='Making CDF calls asynchronous'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-U1LhYaUqb20/UKoRlsFYzDI/AAAAAAAAAbQ/GRTwzE1TwcM/s72-c/cdfAsync.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-860346944914465875</id><published>2012-11-12T14:18:00.002Z</published><updated>2012-11-12T14:18:48.704Z</updated><title type='text'>New CDF Release 12.11.09 - Compatibility release with 4.8</title><content type='html'>Pentaho 4.8 EE &lt;a href="http://www.pentaho.com/48/"&gt;is out&lt;/a&gt;! Lot's of cool stuff there.&lt;br /&gt;&lt;br /&gt;And obviously we had to make sure our Ctools were compatible with it. 
(Ok, I'm lying - one of our users tested it and reported some incompatibility :p ).&lt;br /&gt;&lt;br /&gt;That incompatibility has been fixed and now there's a new CDF release (the only that had to be changed), that you can get in the &lt;a href="http://www.webdetails.pt/ctools.html#tabcdf"&gt;usual place&lt;/a&gt;.</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/860346944914465875/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/11/new-cdf-release-121109-compatibility.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/860346944914465875'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/860346944914465875'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/11/new-cdf-release-121109-compatibility.html' title='New CDF Release 12.11.09 - Compatibility release with 4.8'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-7972674945719890802</id><published>2012-10-18T01:27:00.000+01:00</published><updated>2012-10-18T01:27:13.431+01:00</updated><title type='text'>Ctools Releases: 12.10.17</title><content type='html'>&lt;div&gt;&lt;b&gt;CDE:&lt;/b&gt;&lt;span style="font-size: xx-small;"&gt; &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;Changelog:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; * 
Added google analytics componente developed by Sinn Tecnologia&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Added about &amp;amp; documentation buttons to CDE.&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Added new features to CGG export window&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * General image update&lt;/span&gt;&lt;/div&gt;&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;b&gt;CDF:&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: x-small;"&gt;Main features:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-size: x-small;"&gt;&amp;nbsp;&amp;nbsp; &lt;/span&gt;* Patches for CCC&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-size: x-small;"&gt;&amp;nbsp;&amp;nbsp; &lt;/span&gt;* General image refresh&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;Full changelog:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * [FIX] Fixed linear scale calculation for second axis when all values are null.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * General image Update&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Added pv.Rule#strokeDasharray (with support for SVG only, not IE)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * [FIX] Fixed linear scale calculation for second axis when all values are equal.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * AMD compatibility with current Pentaho release&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;br /&gt;&lt;b&gt;CDA:&lt;/b&gt;&lt;br /&gt;&lt;div&gt;&lt;span 
style="font-size: x-small;"&gt;Major upgrades:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * &lt;/span&gt;Nasty bugs fixed!&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;Full Changelog:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Fixed [REDMINE 1159] - Cache Manager does not work when bi-server is in file based repository&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Fixed [REDMINE 1087] - json output problem when number is "infinity"&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Fixed [REDMINE 1217] - JSON infinity workaround&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Fix some instantiation issues with standalone mode&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Update build to include runtime-lib as a variable rather than a hardcoded value&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Image update&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * add datatable filter on post-process; stateless TableModelUtils static instead of singleton&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * AbstractKettleExporter: inherit from AbstractExporter&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * NaturalOrderComparator: lic header, clean&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * sort table model: cast where reflection is used&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; 
&amp;nbsp; * CdaContentGenerator: bad outputIndexId not fatal, don't check null name for extension , bool parse&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&amp;nbsp; &lt;/div&gt;&lt;div&gt;&lt;b&gt;CGG: &lt;/b&gt;&lt;div&gt;&lt;span style="font-size: x-small;"&gt;Main features:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: x-small;"&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;* Mainly bug fixes&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;Full changelog:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp;&amp;nbsp;* Sync CCC with that of the CDF branch&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Updated platform dependency to 4.5 stable&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Added missing copyright for jquery bits&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Added missing output type declaration when no filename was specified&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-size: xx-small;"&gt;&amp;nbsp; &amp;nbsp; * Workaround for changes in detecting method name in draw &lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&amp;nbsp;&lt;/div&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/7972674945719890802/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/10/ctools-releases-121017.html#comment-form' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/7972674945719890802'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/7972674945719890802'/><link rel='alternate' type='text/html' 
href='http://pedroalves-bi.blogspot.com/2012/10/ctools-releases-121017.html' title='Ctools Releases: 12.10.17'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-6724041283722844445</id><published>2012-10-15T23:22:00.001+01:00</published><updated>2012-10-15T23:22:37.554+01:00</updated><title type='text'>Data Services Pyramid</title><content type='html'>&lt;div class="moz-text-html" lang="x-unicode"&gt;&lt;div&gt;&lt;h1&gt;Data Services Pyramid&lt;/h1&gt;While delivering services on &lt;i&gt;Business Intelligence&lt;/i&gt; - or any of the other words you want to make up for it, &lt;i&gt;Business Analytics&lt;/i&gt;, &lt;i&gt;Data Science&lt;/i&gt;, &lt;i&gt;whatever&lt;/i&gt; (I’ve always been more focused on actual content than on the wrapping  some people like to put around it) - one very important part that  sometimes ends up being a bit overlooked is &lt;i&gt;how&lt;/i&gt; you hand out the information.&lt;br /&gt;&lt;br /&gt;In the end it’s all about it. All the discussions carried out about  technology choices, implementation, hardware, all ends up the same way -  giving the information back to the consumers.&lt;br /&gt;&lt;br /&gt;This may seem obvious. It’s not, and has a lot of implications  especially when we have at our reach a huge amount of technologies to  choose from.&lt;br /&gt;&lt;br /&gt;This is written from the perspective of someone internal to a  company’s metrics / data engineer team. 
I’m wearing my Mozilla metrics  hat now, thinking about all the challenges we face on a daily basis.&lt;br /&gt;&lt;h2&gt;Information delivery&lt;/h2&gt;One may be tempted to think that the best way to deliver information  is the one where the user has the most degrees of freedom when it  comes to interacting with that data. It’s not. With great power comes great responsibility, and sometimes &lt;i&gt;less is better&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;A metrics person has to guarantee not only that the data he delivers is  correct, but also that there’s no possibility of that data being  misunderstood. I find the second part harder than the first, especially  when we’re dealing with people that  have been in a specific line of business for a long time.&lt;br /&gt;&lt;h2&gt;Ground zero - Raw data&lt;/h2&gt;This is the example taken to the extreme. Raw data is, by far, the  most valuable source of information, and the one we can draw the most  information from. Obviously, it makes no sense to hand out this  source to information consumers. Very few hold the &lt;i&gt;keys to the dungeon&lt;/i&gt;, and for good reasons.&lt;br /&gt;You need to go up in the stack, progressively converting data  language, the zeros and ones, into business terminology, to build your &lt;b&gt;Data Services Pyramid&lt;/b&gt;&lt;br /&gt;&lt;h2&gt;Data Services Pyramid &lt;/h2&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-h20I6JpLs2E/UHxaaupoUYI/AAAAAAAAAas/l-Q82z4ocns/s1600/Data_Services_Pyramid2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="273" src="http://4.bp.blogspot.com/-h20I6JpLs2E/UHxaaupoUYI/AAAAAAAAAas/l-Q82z4ocns/s1600/Data_Services_Pyramid2.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;While growing our information system, we’ll inevitably want to end up  with something like this. 
But instead of a structured stack, it’s very  easy to end up with a disjointed mess of technology and services.&lt;br /&gt;&lt;br /&gt;It should be easy to understand what kind of information deliverables  sit on each level. They obviously depend on the tools in use, but some  examples:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Raw data  &lt;ul&gt;&lt;li&gt;Handing out files&lt;/li&gt;&lt;li&gt;Direct file system access&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li&gt;Storage area  &lt;ul&gt;&lt;li&gt;SQL&lt;/li&gt;&lt;li&gt;NoSQL endpoints&lt;/li&gt;&lt;li&gt;Hadoop / hive access&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li&gt;Ad-hoc layer  &lt;ul&gt;&lt;li&gt;Metadata based tools&lt;/li&gt;&lt;li&gt;OLAP clients&lt;/li&gt;&lt;li&gt;MDX&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li&gt;Preformatted deliverables  &lt;ul&gt;&lt;li&gt;Dashboards&lt;/li&gt;&lt;li&gt;Reports&lt;/li&gt;&lt;li&gt;Csv/xls exports&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;&lt;h2&gt;The Services challenge&lt;/h2&gt;We must have a very clear objective in mind - put users as far up the stack as possible. The goal is not to prevent them from accessing the  data. Much more important, it’s preventing them from &lt;i&gt;misinterpreting&lt;/i&gt; the data. Every time I see someone asking for access to a database or hadoop, I start trembling - I know trouble’s coming.&lt;br /&gt;&lt;br /&gt;As we move up in the stack, we’re converting data language into  business language. This is a crucial point. As we work on this  translation, we need to set in stone our final language as a shared set  of &lt;i&gt;dimensions&lt;/i&gt;, well documented terminology, that poses no questions to anyone whenever they’re used.&lt;br /&gt;&lt;br /&gt;And this translation is &lt;i&gt;hard&lt;/i&gt;, moving from data terms to  business terms. The fewer people involved here, the better. 
That’s why  I’ve identified, in the Data Services Pyramid, a danger zone that we  should be very cautious with, mostly because the aforementioned translation  isn’t done yet.&lt;br /&gt;&lt;h2&gt;The Technology challenge&lt;/h2&gt;I once &lt;a href="https://plus.google.com/112678702228711889851/posts/eVeouesvaVX"&gt;read a post&lt;/a&gt; about the differences between Google and  Amazon when it came to inter-department interaction, and how &lt;a href="http://en.wikipedia.org/wiki/Jeff_Bezos"&gt;Bezos&lt;/a&gt;  always insisted that such interaction &lt;i&gt;had&lt;/i&gt; to be done by strict  APIs and never in an unstructured way. Although it initially seemed a huge  overhead and caused a lot of despair, the services oriented approach  soon turned out to be Amazon’s greatest strength. After fine tuning all  inter-department relations, wrapping those up as  offerings to the  outside world was easy - and that was the origin of the cloud  services offering.&lt;br /&gt;&lt;br /&gt;For some reason this keeps popping into my mind, and I’m the kind of person that can’t remember what I had for lunch.&lt;br /&gt;&lt;br /&gt;This principle should also apply to the technology stack. We’re in a  golden era for IT junkies. Things are moving at the speed of light,  every year or less there’s a new best thing in town and it’s getting  harder and harder to keep up with everything and separate the hype from  the real deal.&lt;br /&gt;&lt;br /&gt;Staying frozen in time is just not an option. The complete opposite -  always trying new things, new technologies and approaches - is equally  dangerous. It gets to a point where too few people are familiar with the  systems, they can easily get deprecated internally, the quality of the  information stored is progressively harder to validate and it all comes with  associated hardware costs, probably the cheapest cost of all.&lt;br /&gt;&lt;br /&gt;This has to become a two step process. 
Inside a metrics team, every new technology has to pass through an approval stage. The goal is simple - there has to be a limited set of approved tools and technologies involved. Internally, some may prefer to use R, others perl, python, excel, etc. That’s perfectly fine and recommended, but when it comes to the &lt;i&gt;official&lt;/i&gt; set of tools, everyone must be familiar with them. And make the list short, as there’s only so much one can know.&lt;br /&gt;&lt;br /&gt;The second step is the link with my initial story about Amazon. If we apply the same principles to the technologies, we’re much less dependent on them, and it gets much easier to swap out and optimize individual pieces.&lt;br /&gt;&lt;br /&gt;If you think about the technologies and tools you already use and the ones you’re evaluating, they most likely fit into a very specific place inside the data services pyramid. This is where the services approach kicks in. Even though we can have more than one choice sitting at each level of the stack, it’s &lt;i&gt;crucial&lt;/i&gt; that each one talks with the layer immediately below and is able to provide end points for the layers above to connect to. All the data translations that I mentioned before must happen only once, at a very well determined place. Data and overall service integrity is at stake here.&lt;br /&gt;It’s now easy to see how changing specific bits gets less painful and error prone - as long as they maintain the same API interface to the enclosing layers.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;br /&gt;--------------------------&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Feedback appreciated. 
This is the result of my experience and would be great to have the ability to brainstorm on this issue.&lt;/div&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/6724041283722844445/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/10/data-services-pyramid.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/6724041283722844445'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/6724041283722844445'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/10/data-services-pyramid.html' title='Data Services Pyramid'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-h20I6JpLs2E/UHxaaupoUYI/AAAAAAAAAas/l-Q82z4ocns/s72-c/Data_Services_Pyramid2.png' height='72' width='72'/><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-1526610460349119543</id><published>2012-10-03T11:35:00.003+01:00</published><updated>2012-10-03T11:35:48.914+01:00</updated><title type='text'>CBF meets Ctools Installer</title><content type='html'>&lt;a href="http://pedroalves-bi.blogspot.pt/2011/06/ctools-installer-making-things-fast.html"&gt;Ctools-installer&lt;/a&gt; and &lt;a href="http://cbf.webdetails.org/"&gt;CBF&lt;/a&gt; are very useful projects that usually go hand in hand. 
And if you've been using them for a while you probably thought "why can't CBF just call ctools installer instead of having to pass the parameters manually?"&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Well, &lt;a href="https://twitter.com/AndtorG"&gt;Andrea Torre&lt;/a&gt; went ahead and sent us a patch to allow exactly that.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;There are a few new targets in CBF:&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;pre&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;br /&gt;ctools-installer        Install ctools (prompts for modules)&lt;br /&gt;ctools-installer-auto   Install ctools silently&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;&lt;/blockquote&gt;&lt;br /&gt;The difference between them is obvious - one does an automated install (passing the &lt;i&gt;-y&lt;/i&gt; flag) while the other prompts for which plugins to install.&lt;br /&gt;&lt;br /&gt;All you need to do is call one of these targets. Here's an example:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;pre&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;br /&gt;ant -Dproject=demo -Denv=pedro ctools-installer&lt;/span&gt;&lt;/pre&gt;&lt;/blockquote&gt;&lt;br /&gt;We can control the behavior of this feature. It's possible to specify which branch to install (defaults to the stable branch) and whether you want ctools-installer to run after a CBF build:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;pre&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;br /&gt;ctools.install = true&lt;br /&gt;ctools.branch = dev&lt;br /&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/blockquote&gt;&lt;br /&gt;Try it and let us know. 
You need to upgrade your ctools-installer script (just run it and it will auto upgrade)&lt;br /&gt;&lt;br /&gt;On an also important note, CBF is now &lt;a href="https://github.com/webdetails/cbf"&gt;also on github&lt;/a&gt;.</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/1526610460349119543/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/10/cbf-meets-ctools-installer.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/1526610460349119543'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/1526610460349119543'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/10/cbf-meets-ctools-installer.html' title='CBF meets Ctools Installer'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-8951200115717292444</id><published>2012-09-21T12:16:00.001+01:00</published><updated>2012-09-22T14:25:01.631+01:00</updated><title type='text'>CBF and Versioning - How to develop Pentaho solutions in a team</title><content type='html'>&lt;span style="font-size: x-small;"&gt;&lt;i&gt;This blog post is sponsored by the &lt;a href="http://www.antoniusintelligence.nl/" target="_blank"&gt;Antonius&lt;/a&gt; BI team&lt;/i&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Problem&lt;/h3&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;We're struggling with versioning and deployment. 
We need some help in managing the development and (especially) deployment process&lt;/blockquote&gt;&lt;br /&gt;This is a very well known problem. While versioning is a relevant issue in all areas - and not only development - the correct way to approach it changes drastically from one area to the next, and &lt;a href="http://www.pentaho.com/" target="_blank"&gt;Pentaho&lt;/a&gt; is no different.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.webdetails.pt/" target="_blank"&gt;We've&lt;/a&gt; been working on Pentaho projects for over 5 years and, since day 1, in scenarios that involved:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Working on different projects locally&lt;/li&gt;&lt;li&gt;Working with multiple people on the same project&lt;/li&gt;&lt;li&gt;Managing several environments - development, staging, upgrades&lt;/li&gt;&lt;li&gt;Managing platform upgrades&lt;/li&gt;&lt;/ul&gt;There are more reasons, but these are probably the most pressing ones. It's not up to me to convince &lt;i&gt;you&lt;/i&gt; that you need a good approach to these problems: you'll know you need it!&lt;br /&gt;&lt;br /&gt;I'll write up a collection of my experiences regarding this issue as of today. It can obviously change since we're always trying to optimize our processes.&lt;br /&gt;&lt;br /&gt;I've been asked a specific set of questions, which I'll introduce while laying out the big picture.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Versioning &lt;/h3&gt;There are 2 ways to approach this problem:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Simple &lt;i&gt;pentaho-solutions&lt;/i&gt; versioning&lt;/li&gt;&lt;li&gt;&lt;a href="http://cbf.webdetails.org/" target="_blank"&gt;CBF&lt;/a&gt; (Community Build Framework) setup &lt;/li&gt;&lt;/ol&gt;The first one is a subset of the second. There's no easy way to say it: the first one is a must have. If you're not doing it, you are asking for trouble.&lt;br /&gt;&lt;br /&gt;As for the CBF... 
well, this is something Paul Stoellberger, &lt;a href="http://analytical-labs.com/" target="_blank"&gt;Saiku&lt;/a&gt; author and general BI guru, said on the IRC channel:&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;br /&gt;&amp;lt; pstoellberger&amp;gt; i really need to get my cbf out again&lt;br /&gt;&amp;lt; pstoellberger&amp;gt; just done hackish installations recently&lt;/blockquote&gt;&lt;br /&gt;Everyone doing something serious on Pentaho uses CBF. You may &lt;i&gt;think&lt;/i&gt; you don't need it. You may think it's complicated and not worth the effort (that's not a valid argument, since we've put up a &lt;a href="http://cbf.webdetails.org/?q=content/quickstart" target="_blank"&gt;quickstart bundle&lt;/a&gt; for you). You're wrong, and you'll know it once you start using it.&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;VCS infrastructure&lt;/h3&gt;But one thing at a time. Before going through the specific workflows, you need to choose your &lt;i&gt;VCS&lt;/i&gt; (Version Control System) tool. Once again there are 2 options:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://subversion.apache.org/" target="_blank"&gt;SVN&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://git-scm.com/" target="_blank"&gt;GIT&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;i&gt;Note: if you think I forgot to include &lt;a href="http://www.nongnu.org/cvs/" target="_blank"&gt;CVS&lt;/a&gt; on this list, press Alt-F4 to close your &lt;a href="http://www.iesucks.info/" target="_blank"&gt;Internet Explorer&lt;/a&gt; browser and go back to 1998, we don't want you here!&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;I'll skip the long arguments about those two. Use Git. It's amazing, handles branches and tags in a very efficient way, allows multiple remote repositories and has great UI tools that will be very handy.&lt;br /&gt;&lt;br /&gt;Also bear in mind that Git is not &lt;a href="http://www.github.com/" target="_blank"&gt;Github&lt;/a&gt;. 
While you can definitely host your solutions in there, you're not forced to. I'd even say most of us would rather keep our files and implementation very securely locked.&lt;br /&gt;&lt;br /&gt;So, starting with infrastructure: Git doesn't even need a "server". Any shared directory could be used as the central repository. I've even used the &lt;i&gt;"poor man's git server"&lt;/i&gt;, initializing a repository ( &lt;i&gt;git init --bare myproject/&lt;/i&gt; ) in a &lt;a href="http://www.dropbox.com/" target="_blank"&gt;dropbox&lt;/a&gt; folder. That has proven to be a very error prone approach, since there's no way to guarantee that our repository won't get damaged by the dropbox synchronization. So use a proper system. There are 2 options (this is getting a bit repetitive):&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Use an existing one: &lt;a href="http://www.github.com/" target="_blank"&gt;Github&lt;/a&gt; (or something)&lt;/li&gt;&lt;li&gt;Install your own: &lt;a href="https://github.com/sitaramc/gitolite/wiki" target="_blank"&gt;Gitolite&lt;/a&gt; (or something)&lt;/li&gt;&lt;/ul&gt;Choosing github is a very valid and logical option. It doesn't mean your solutions have to be public, as you can have a paid account that will allow you to have private repositories. There should also be other alternatives along these lines.&lt;br /&gt;&lt;br /&gt;We use &lt;a href="https://github.com/sitaramc/gitolite/wiki" target="_blank"&gt;Gitolite&lt;/a&gt;. 
Once installed, it's very easy to administer (creating repositories, adding / managing people and permissions) and very secure, as it uses &lt;a href="http://en.wikipedia.org/wiki/Secure_Shell" target="_blank"&gt;ssh connections&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Regardless of what you chose, I'll now assume you have a proper VCS server available.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Versioning &lt;i&gt;pentaho-solutions&lt;/i&gt; directory&lt;/h3&gt;Originally we included the entire CBF structure in the same repository, as described in the &lt;a href="http://cbf.webdetails.org/?q=content/documentation-file-structure" target="_blank"&gt;documentation&lt;/a&gt;. SVN allows us to checkout a subdirectory of the repository, so we could checkout only the &lt;i&gt;project-client/solution/&lt;/i&gt; folder. In &lt;i&gt;git&lt;/i&gt; that's not possible and I could never get my head around &lt;a href="http://git-scm.com/book/en/Git-Tools-Submodules" target="_blank"&gt;submodules&lt;/a&gt;, so I simply have 2 different repositories (notice a trend here with the number 2?):&lt;br /&gt;&lt;i&gt;&lt;br /&gt;&lt;/i&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;i&gt;project-client&lt;/i&gt;&lt;/li&gt;&lt;li&gt;&lt;i&gt;project-client-solution&lt;/i&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;The second has all the BI server solution files. The first one has... everything else, from CBF specific structure to ETL. 
After we clone the project-client repository we either link to the &lt;i&gt;project-client-solution&lt;/i&gt; or clone that one inside the &lt;i&gt;project-client&lt;/i&gt; directory.&lt;br /&gt;&lt;br /&gt;Here's a real world example of one of our projects:&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;pre&gt;&lt;span style="font-size: x-small;"&gt;pedro@arpeggio:~/tex/pentaho/project-client (master)$ d&lt;br /&gt;total 68&lt;br /&gt;24 -rw-r--r-- 1 pedro pedro 20859 Feb 21  2011 build.xml.cbf-3.7&lt;br /&gt; 4 drwxr-xr-x 2 pedro pedro  4096 May 21 16:41 config/&lt;br /&gt; 4 drwxr-xr-x 2 pedro pedro  4096 Feb 17  2011 etl/&lt;br /&gt; 4 -rwxr-xr-x 1 pedro pedro   138 Apr 27  2011 importCache.sh*&lt;br /&gt; 4 -rw-rw-r-- 1 pedro pedro  1057 May 21 14:46 kettle.properties.diogo&lt;br /&gt; 4 -rw-rw-r-- 1 pedro pedro  1118 May 21 14:46 kettle.properties.remote&lt;br /&gt; 8 -rw-rw-r-- 1 pedro pedro  4534 May 21 14:46 kettle.properties.server&lt;br /&gt; 4 drwxr-xr-x 4 pedro pedro  4096 Apr 19 12:57 patches/&lt;br /&gt; 4 drwxr-xr-x 5 pedro pedro  4096 Feb 17  2011 patches-ee/&lt;br /&gt; 4 -rwxrwxr-x 1 pedro pedro   261 May 21 14:46 remote_in.sh*&lt;br /&gt; 0 lrwxrwxrwx 1 pedro pedro    29 Sep  6  2011 solution -&amp;gt; ../project-client-solution/&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;&lt;/blockquote&gt;&lt;br /&gt;The &lt;i&gt;project-client-solution&lt;/i&gt; is simply the pentaho solution folder &lt;b&gt;without&lt;/b&gt; the system specific folder, &lt;i&gt;admin, bi-developers, plugin-samples, &lt;/i&gt;&lt;i&gt;&lt;i&gt;steel-wheels,&lt;/i&gt; system&lt;/i&gt;. Tune this exclusion list at will in a file called &lt;a href="http://www.kernel.org/pub/software/scm/git/docs/gitignore.html"&gt;.gitignore&lt;/a&gt;. 
Here's mine for the &lt;i&gt;project-client-solution&lt;/i&gt;:&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;span style="font-size: x-small;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;pre&gt;&lt;span style="font-size: x-small;"&gt;pedro@arpeggio:~/tex/pentaho/project-stonegate-solution (master)$ cat .gitignore &lt;br /&gt;admin/&lt;br /&gt;system/&lt;br /&gt;steel-wheels/&lt;br /&gt;bi-developers/&lt;br /&gt;*_tmp*&lt;br /&gt;index*.properties&lt;br /&gt;cde_sample/&lt;br /&gt;.project&lt;br /&gt;plugin-samples/&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/blockquote&gt;From this point on we can use the generic VCS techniques. Git can take a while to get used to, but the list of commands we need are very simple. I won't focus a lot on the &lt;i&gt;project-client&lt;/i&gt; CBF structure, as there's lots of documentation on the &lt;a href="http://cbf.webdetails.org/"&gt;CBF website&lt;/a&gt;, but everything still applies.&lt;br /&gt;&lt;br /&gt;Moving on to the list of questions&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;FAQ: Frequently Asked Questions&lt;/h3&gt;&lt;blockquote class="tr_bq"&gt;Q: How do I checkout a project with git?&lt;/blockquote&gt;A: &lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$ git clone git@yourserver.com:project-client-solution&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;This can be done not only for the development sandboxes (both &lt;i&gt;project-client&lt;/i&gt; and &lt;i&gt;project-client-solution&lt;/i&gt;) but also, for the latter, on the production and staging machines. That will allow us to manage versioning on the server too.&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Q: Should several developers be working on the same development box? How to avoid conflicts?&lt;/blockquote&gt;A: I do not recommend this. It's always a good idea to have a local development sandbox. 
It's doable if the developers are working on different areas, but you'll get into conflicts that will be harder to isolate.&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Q: How to check in/check out units of work?&lt;/blockquote&gt;A: Once we have our local repository, we can jump to the most up to date version with the command:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;span style="font-size: x-small;"&gt;$ git pull &lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Q: How to check in my work?&lt;/blockquote&gt;&lt;br /&gt;Unlike svn, when you commit work it doesn't get pushed to the remote repository. You should commit early and often (you don't even need internet for it) and then push to the central server. You do that with the following:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;$ git commit -m 'message' &amp;lt;files&amp;gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;$ git push &lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Here's where a visual tool gets useful. Mac users have &lt;a href="http://gitx.frim.nl/"&gt;GitX&lt;/a&gt;, linux users have &lt;a href="https://github.com/jessevdk/gitg"&gt;GitG&lt;/a&gt;, everyone has &lt;a href="http://gitk.sourceforge.net/"&gt;gitk&lt;/a&gt; and &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;git gui&lt;/span&gt;. 
So for the commit part I usually use the visual tool.&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Q: Guidelines on how to package a new version of a dashboard and migrate it from development to test, and from test to production&lt;/blockquote&gt;&lt;br /&gt;The development happens on the main branch, called the &lt;i&gt;master&lt;/i&gt; branch. When we're ready to release a certain version, we create a new branch with the name of that version (obviously, feel free to choose what you want). In this example, I'll call it &lt;i&gt;v1&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-AYGky0VxrNI/UF22-WsppLI/AAAAAAAAAaE/Ik5njUUdHuY/s1600/git-branch.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/-AYGky0VxrNI/UF22-WsppLI/AAAAAAAAAaE/Ik5njUUdHuY/s1600/git-branch.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;That branch can then be checked out on the QA server (the push below assumes the remote has the default name, &lt;i&gt;origin&lt;/i&gt;).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;&lt;span style="font-size: x-small;"&gt;dev$ git branch v1&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;&lt;span style="font-size: x-small;"&gt;dev$ git push origin v1&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;&lt;span style="font-size: x-small;"&gt;qa$ git pull&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;&lt;span style="font-size: x-small;"&gt;qa$ git checkout v1&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Next step is testing it. 
If we find a bug that needs fixing, we can fix it on that branch. If appropriate, we can merge the fix back into the development branch (again assuming the remote is called &lt;i&gt;origin&lt;/i&gt;).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;&lt;span style="font-size: x-small;"&gt;qa$ git commit -m 'fixed bug on v1'&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;&lt;span style="font-size: x-small;"&gt;qa$ git push&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;&lt;span style="font-size: x-small;"&gt;dev$ git checkout master # be sure we change back to master&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;&lt;span style="font-size: x-small;"&gt;dev$ git fetch # get the fix that QA pushed&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;&lt;span style="font-size: x-small;"&gt;dev$ git merge origin/v1 # pull the bug fix&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;&lt;span style="font-size: x-small;"&gt;dev$ git push&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; # fix integrated&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;After we're happy 
with the solution and ready to go to production, tags come to good use. We can create a tag on it and push that information. &lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;&lt;span style="font-size: x-small;"&gt;qa$ git tag v1.0&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;&lt;span style="font-size: x-small;"&gt;qa$ git push --tags&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;&lt;span style="font-size: x-small;"&gt;prod$ git pull&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;&lt;span style="font-size: x-small;"&gt;prod$ git checkout v1.0&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;That would put you on the correct version. 
Don't forget to update the solution repository.&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Q: I was playing with the solution but I don't want to commit any change, just want to wipe the entire thing and get back to the clean state&lt;/blockquote&gt;&lt;br /&gt;&amp;nbsp;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;$ git reset --hard HEAD&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Q: I changed a single file and I want to have it back / reverted to the last state&lt;/blockquote&gt;&lt;br /&gt;&amp;nbsp;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;$ git checkout &amp;lt;file&amp;gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;Q: How to avoid overwriting each other's work (which happens now if we're not careful)&lt;/blockquote&gt;&lt;br /&gt;No git command for this. It basically comes for free. However, if we're working on a bigger change, it's recommended that you create a new branch for it. 
That way you have the guarantee that a specific feature can be developed independently.&lt;br /&gt;&lt;br /&gt;Here's a schematic of how it conceptually works:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-Kz_ayPvKD8M/UFw1Y0JQpQI/AAAAAAAAAZw/Xrqutp151mk/s1600/sd_merge.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="112" src="http://4.bp.blogspot.com/-Kz_ayPvKD8M/UFw1Y0JQpQI/AAAAAAAAAZw/Xrqutp151mk/s1600/sd_merge.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;The idea is to isolate a specific feature. It's very easy to start, creating the branch and switching to it in one go:&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;$ git checkout -b featureX&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This will start an isolated development. You can do regular commits, pushes, etc. When finished, you can merge back to the main branch. 
You do that by:&lt;br /&gt;&lt;br /&gt;1) Switching to the main branch, usually called master:&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;$ git checkout master&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;2) Merging feature X back into the main branch:&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;, Courier, monospace;"&gt;$ git merge featureX&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;3) Resolving any conflicts and committing the result. After a push, feature X will then be available.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Conclusion&lt;/h3&gt;&lt;br /&gt;This doesn't aim to be a full tutorial about git. There is plenty of great documentation, and it's really a powerful tool. 
But should provide some best approaches on how to best handle a pentaho implementation.&lt;br /&gt;&lt;br /&gt;Any extra questions, just email / comment here and I'll add them&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/8951200115717292444/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/09/cbf-and-versioning-how-to-develop.html#comment-form' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/8951200115717292444'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/8951200115717292444'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/09/cbf-and-versioning-how-to-develop.html' title='CBF and Versioning - How to develop Pentaho solutions in a team'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-AYGky0VxrNI/UF22-WsppLI/AAAAAAAAAaE/Ik5njUUdHuY/s72-c/git-branch.png' height='72' width='72'/><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6644329693530300467.post-7378370602393441676</id><published>2012-09-13T14:32:00.000+01:00</published><updated>2012-09-13T14:43:39.879+01:00</updated><title type='text'>CGG - Putting CCC charts in Pentaho reporting / other tools</title><content type='html'>This has been a long standing blog post. 
CGG has been around for ages now, it's even in &lt;a href="http://www.pentaho.com/" target="_blank"&gt;Pentaho&lt;/a&gt; platform core, and only a few know about it.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;CGG stands for Community Graphics Generator. Although it even has its own &lt;a href="http://cgg.webdetails.org/" target="_blank"&gt;homepage&lt;/a&gt;, I never had the chance to blog about it. It's a somewhat hardcore plugin: it's basically able to execute custom scripts (java / javascript) on the server side that output images that can be used in external systems.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;One of the most useful use cases is exporting &lt;a href="http://ccc.webdetails.org/" target="_blank"&gt;CCC&lt;/a&gt; charts to images (&lt;i&gt;png&lt;/i&gt; or &lt;i&gt;svg&lt;/i&gt;) - either to let a user download them, or to include them in &lt;a href="http://reporting.pentaho.com/" target="_blank"&gt;Pentaho Reporting&lt;/a&gt; or any similar tool.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Here's an example of how it works. 
Imagine you develop for your users a great looking dashboard, &lt;i&gt;almost&lt;/i&gt; as good looking as one of the UIs &lt;a href="http://www.webdetails.pt/" target="_blank"&gt;we&lt;/a&gt; develop :) :&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-1X-PDBqVRuI/UFG8ooxY4pI/AAAAAAAAAYY/9JXmzluBWls/s1600/cgg_1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="293" src="http://2.bp.blogspot.com/-1X-PDBqVRuI/UFG8ooxY4pI/AAAAAAAAAYY/9JXmzluBWls/s320/cgg_1.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;i&gt;&amp;nbsp;(This was the dashboard we developed in a recent &lt;a href="http://ctools.webdetails.org/" target="_blank"&gt;Ctools&lt;/a&gt; training course in Orlando, Pentaho headquarters)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;As you can see, we went to some lengths, exploring CCC capabilities, to fine tune the charts. Now we want to be able to use that line chart in PRD. Even though CGG does not have a UI and doesn't aim to be user friendly, &lt;a href="http://cde.webdetails.org/" target="_blank"&gt;CDE&lt;/a&gt; has a way to bridge to it, using some hidden features.&lt;br /&gt;&lt;br /&gt;Let's go straight to the point. In CDE, press &lt;i&gt;shift-G&lt;/i&gt;. 
That will open the CGG window (as a side note, press &lt;i&gt;shift-?&lt;/i&gt; to see other very useful keyboard shortcuts):&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-y9ibP7G7fJY/UFG8pa4lzCI/AAAAAAAAAYg/7aK-8ks2U-E/s1600/cgg_2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="189" src="http://2.bp.blogspot.com/-y9ibP7G7fJY/UFG8pa4lzCI/AAAAAAAAAYg/7aK-8ks2U-E/s320/cgg_2.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&amp;nbsp;If you've ever used this feature before, you'll notice this screen is a little different. We just added some more features, namely the ability to automatically get the url that generates that image from the outside - even I always struggled to find the right url. This is already available in the &lt;i&gt;dev&lt;/i&gt; builds and will be released in the next stable version.&lt;br /&gt;&lt;br /&gt;You'll see that you can set some options there, namely the &lt;i&gt;outputType&lt;/i&gt;, which currently can be either &lt;i&gt;png&lt;/i&gt; or &lt;i&gt;svg&lt;/i&gt;. You may also need to change the server url if you are developing in a sandbox and want to publish it to a server.&amp;nbsp; I actually thought &lt;i&gt;svg&lt;/i&gt; support was broken in PRD but just tested it and that seems to be fixed, so take advantage of that feature.&lt;br /&gt;&lt;br /&gt;One other thing to note is that you &lt;b&gt;must&lt;/b&gt; take care of authentication. 
Either you pass the extra arguments &lt;i&gt;&amp;amp;userid=joe&amp;amp;password=password&lt;/i&gt;, or you allow that url to be accessible with no password, or whatever else fits your setup.&lt;br /&gt;&lt;br /&gt;If you save your dashboard and try that url, this is what you'll get:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-IAMzNdU7xhw/UFG8qIfocNI/AAAAAAAAAYs/UMYLxw1X_rU/s1600/cgg_3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="220" src="http://2.bp.blogspot.com/-IAMzNdU7xhw/UFG8qIfocNI/AAAAAAAAAYs/UMYLxw1X_rU/s320/cgg_3.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;This is what CGG is all about, and there's tons of engineering work underneath to allow this "simple step" to work with just one keystroke.&lt;br /&gt;&lt;br /&gt;Now it gets very obvious what to do. Open PRD, add an &lt;i&gt;image&lt;/i&gt; component, and put that url in it (don't forget authentication).&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-wzLWZc3rXFo/UFG8rMs2f1I/AAAAAAAAAY0/da9SwOSZh44/s1600/cgg_4.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="156" src="http://3.bp.blogspot.com/-wzLWZc3rXFo/UFG8rMs2f1I/AAAAAAAAAY0/da9SwOSZh44/s320/cgg_4.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;You'll also want to check the blog post I did a while back about using &lt;a href="http://pedroalves-bi.blogspot.de/2011/03/cda-datasources-in-prd.html" target="_blank"&gt;CDA datasources in PRD&lt;/a&gt;. 
In the meantime, in recent versions of PRD (4.5 and above) you no longer need to download the CDA datasources; all you need to do is enable the experimental features in Edit -&amp;gt; Preferences -&amp;gt; General -&amp;gt; Enable Experimental Features.&lt;br /&gt;&lt;br /&gt;To render this report from the server, you'll need to copy to &lt;i&gt;pentaho/WEB-INF/lib/&lt;/i&gt; the file &lt;i&gt;pentaho-reporting-engine-classic-extensions-cda-*.jar&lt;/i&gt; that you'll find in the PRD library directory.&lt;br /&gt;&lt;br /&gt;There's another feature of CGG. You can pass parameters to the query by adding &lt;i&gt;&amp;amp;paramParameterName=ParameterValue&lt;/i&gt;&amp;nbsp; to the url. And that can be exposed from PRD too.&lt;br /&gt;&lt;br /&gt;Just create a prompt the usual way. In my sample, the parameter is called &lt;i&gt;month&lt;/i&gt;, and since the query is already on the dashboard too, I just need to select it to build the prompt in the prpt.&lt;br /&gt;&lt;br /&gt;However, in order to make the call with the new parameter, we can't use the &lt;i&gt;image&lt;/i&gt; component anymore, and have to use the &lt;i&gt;image-field&lt;/i&gt; instead.&lt;br /&gt;&lt;br /&gt;For that, we need to create a formula that will build that url. 
Here's the sample formula I used:&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;span style="font-size: x-small;"&gt;="http://127.0.0.1:8080/pentaho/content/cgg/Draw?script=/SyncOrlando/Lab12/lineChart.js&amp;amp;outputType=png&amp;amp;userid=joe&amp;amp;password=password&amp;amp;parammonthParameter="&amp;amp;URLENCODE([month])&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-kO8SQa9ciwE/UFG8sMReoOI/AAAAAAAAAY8/eousWw6QXSw/s1600/cgg_5.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="198" src="http://2.bp.blogspot.com/-kO8SQa9ciwE/UFG8sMReoOI/AAAAAAAAAY8/eousWw6QXSw/s320/cgg_5.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&amp;nbsp;The &lt;i&gt;URLENCODE&lt;/i&gt; function allows us to be sure that our parameters will reach the server properly. In the end, we should have a report looking like this:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-xxLHdTAGgc0/UFG8tHPyd1I/AAAAAAAAAZA/RsApBlBs6GQ/s1600/cgg_6.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="198" src="http://2.bp.blogspot.com/-xxLHdTAGgc0/UFG8tHPyd1I/AAAAAAAAAZA/RsApBlBs6GQ/s320/cgg_6.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&amp;nbsp;Advanced topic: If you make reports with a lot of charts (you can even have cgg rendering a chart per line) you'll soon find out that your report starts to take a really long time to render. There's an explanation for it: PRD does a lot of passes to better determine the final layout. Since CGG doesn't support the HEAD request method, PRD won't get the appropriate info regarding cache, resulting in a bunch of requests for the image. 
Fortunately, &lt;a href="http://www.sherito.org/" target="_blank"&gt;Thomas Morgner&lt;/a&gt; provided a workaround for this issue by changing a behavior in &lt;i&gt;libloader&lt;/i&gt;. In your &lt;i&gt;loader.properties&lt;/i&gt; file (located in &lt;i&gt;WEB-INF/classes&lt;/i&gt; for the server; for the report designer, create it under the &lt;i&gt;prd/resources&lt;/i&gt; dir) add the following lines:&lt;br /&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;# Controls the minimum time between HEAD requests regardless of&lt;br /&gt;# the date -headers given by the response object.&lt;br /&gt;org.pentaho.reporting.libraries.resourceloader.config.url.FixedCacheDelay=500000&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;blockquote class="tr_bq"&gt;&lt;span style="font-size: x-small;"&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;# Fixes the date headers by simply using Date.now() as mod-date.&lt;br /&gt;# This will break the HTTP specs and thus it is disabled by default.&lt;br /&gt;org.pentaho.reporting.libraries.resourceloader.config.url.FixBrokenWebServiceDateHeader=true&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;This particular feature will be available in Pentaho Reporting 3.9.1 (or you'll have to compile your own).&lt;br /&gt;&lt;br /&gt;And this is what it looks like from the Pentaho BI server:&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-2oMUgudCt3E/UFHewwHCfLI/AAAAAAAAAZY/N3tKjsfokwU/s1600/cgg_7.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="236" src="http://1.bp.blogspot.com/-2oMUgudCt3E/UFHewwHCfLI/AAAAAAAAAZY/N3tKjsfokwU/s320/cgg_7.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Cheers&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;-pedro&lt;br /&gt;&lt;div 
class="" style="clear: both; text-align: center;"&gt;&amp;nbsp;&amp;nbsp; &lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;</content><link rel='replies' type='application/atom+xml' href='http://pedroalves-bi.blogspot.com/feeds/7378370602393441676/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/09/cgg-putting-ccc-charts-in-pentaho.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/7378370602393441676'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6644329693530300467/posts/default/7378370602393441676'/><link rel='alternate' type='text/html' href='http://pedroalves-bi.blogspot.com/2012/09/cgg-putting-ccc-charts-in-pentaho.html' title='CGG - Putting CCC charts in Pentaho reporting / other tools'/><author><name>Pedro Alves</name><uri>https://plus.google.com/103084425334681885234</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh6.googleusercontent.com/-bZghMs_RUbY/AAAAAAAAAAI/AAAAAAAAAjs/OQM_Ot1-jgo/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-1X-PDBqVRuI/UFG8ooxY4pI/AAAAAAAAAYY/9JXmzluBWls/s72-c/cgg_1.png' height='72' width='72'/><thr:total>5</thr:total></entry></feed>