Thursday, December 15, 2011

Back to basics: Step by step Pentaho + Ctools installation

Let's get back to the basics. This entry is a step-by-step tutorial on how to install Pentaho (http://www.pentaho.com/), the Ctools and Saiku.

Pentaho:
  • Download Pentaho Server from sourceforge. Choose zip or tar.gz according to your preference
  • Create a folder "pentahoServer" and decompress the downloaded package inside it
Ctools and Saiku:
  • Download CTools installer
  • Run 'ctools-installer.sh'. It will display the usage, but you'll probably want to do something like: ./ctools-installer.sh -s pentahoServer/biserver-ce/pentaho-solutions -w pentahoServer/biserver-ce/tomcat/webapps/pentaho
  • Choose the packages you want to install. Try them all for some kick ass software ;)
Start it:
  • Run 'start-pentaho.sh', under biserver-ce. Pentaho's up and running
  • Go to http://127.0.0.1:8080/pentaho (or whatever server you installed it on) and log in with the almighty joe/password
  • Have fun!

Note: Ctools-installer is a bash script that works on *nix/mac out of the box. There is a separate tutorial on how to run it on Windows through Cygwin.

Monday, December 12, 2011

Substance AND Style

Just read the title of an interview with Quentin Gallivan: "BI visualization is just eye-candy".


Needless to say, I couldn't disagree more. After reading the actual interview I understand that the statement has to be read in the context of previous declarations by Christian Chabot, CEO and co-founder of Tableau Software, who said that visualization is key when presenting data to an end user. But it's still a very, very wrong headline.


While reading these two articles, I found some passages where I could almost directly quote Tableau's or QlikTech's words. But I work only with Pentaho, so what's the catch? A few things, actually.

Self Service BI is a Myth

No matter what vendors try to tell you, this is a myth, for the simple reason that BI is a very vague term. You can definitely have "self service analysis" on a well-defined data warehouse, "self service reporting" on a predefined metadata model, and even go a bit further in a vertical market scenario, but jumping to the generic term is wrong. And this is pretty obvious to anyone who has actually worked on implementing a real BI project.

You can't streamline good Visualization

Both Tableau and QlikView have great features in terms of usability. They're easy to work with, and you can do some interesting things with them. But on its own a tool does nothing, and you can't just give users a bunch of dials, bullet charts, maps and 3D charts and call the result "good visualization".

It's all about the consumer

This is the million dollar question. Who's gonna use it? I divide BI "consumers" into two categories:
  • Analysts
  • End users (CEOs, CFOs, operational users, everyone else except the first bullet)
In my opinion and experience, most BI tools focus on the first bullet, and it's the second one that needs more love and attention. It's extremely hard to make sure users understand what they are seeing, and sometimes anything more than one simple dropdown and one single table is too much. Everything depends on the scenario and on the users that will consume the information. Those users have never heard of Kimball, of dimensions, members, etc, and they don't have to. They need to know their business. Period. Analysts, on the other hand, are pretty experienced users. In these cases visualization comes second to the liberty to analyze the data.


It's all about the implementation

No tool does anything without a full-blown implementation. This is what makes a BI project expensive. And in order to do a good implementation (substance and style) we need the flexibility to do what the customer asks, and he'll ask for *a lot*. This is why we're working with Pentaho over anything else. It's an amazing platform (great to see Quentin's not changing that strategy) on top of which we were able to develop the Ctools and, so far, we've been able to fulfill nearly all the requirements customers have asked of us. And in terms of final product and visualization, when you compare one implementation done with [insert BI tool here], well.... see for yourself ;)


Wednesday, December 7, 2011

Introducing Add-Ins in CTools

We just added Add-Ins support to CDF / CDE. This will have a great impact both on the development of the ctools and on their usage.




AddIns reference 
 
AddIns are CDF's extension points. They can be used in any component to fine-tune that component and extend it in a very simple way.
The use case that motivated this concept is TableComponent's colType. Tables are a fundamental piece of dashboards / visualizations, and there need to be simple ways to extend how they are rendered.
AddIn Implementation
In order to implement an AddIn, you need to create an object like the following:
  var sparkline = {
    name: "sparkline",
    label: "Sparkline",
    defaults: {
      type: 'line'
    },
    init: function(){

      // Register this for datatables sort
      var myself = this;
      $.fn.dataTableExt.oSort[this.name+'-asc'] = function(a,b){
        return myself.sort(a,b)
      };
      $.fn.dataTableExt.oSort[this.name+'-desc'] = function(a,b){
        return myself.sort(b,a)
      };
        
    },
    
    sort: function(a,b){
      return this.sumStrArray(a) - this.sumStrArray(b);
    },
    
    // Sum a comma-separated string of numbers, e.g. "1,2,3" -> 6.
    // Starting from 0 keeps the result numeric even for a single value.
    sumStrArray: function(arr){
      return arr.split(',').reduce(function(prev, curr){
        return prev + parseFloat(curr);
      }, 0);
    },
    implementation: function (tgt, st, opt) {
      var t = $(tgt);
      t.sparkline(st.value.split(/,/),opt);
      t.removeClass("sparkline");
    }
  };

Dashboards.registerAddIn("Table", "colType", new AddIn(sparkline));
name, label, defaults, init and implementation are the important bits here. This specific code implements a sparkline using the jQuery sparkline plugin.
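
To see what the sort helper does outside of a dashboard, here is a minimal standalone sketch in plain JavaScript. It assumes each sparkline cell holds a comma-separated string of numbers, as in the AddIn above, and orders cells by the sum of their values. sumCsv and compareCells are illustrative names, not part of CDF.

```javascript
// Sum a comma-separated string of numbers, e.g. "1,2,3" -> 6.
// The initial value 0 keeps the result numeric even for a single entry.
function sumCsv(str) {
  return str.split(',').reduce(function (total, item) {
    return total + parseFloat(item);
  }, 0);
}

// Comparator in the usual style: negative, zero or positive.
function compareCells(a, b) {
  return sumCsv(a) - sumCsv(b);
}

// Made-up cell contents, sorted ascending by their sums (15, 6 and 10).
var cells = ["5,5,5", "1,2,3", "10"];
var ascending = cells.slice().sort(compareCells);
console.log(ascending);
```

Swapping the arguments of compareCells gives the descending order, which is what the '-desc' registration in init does.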


Setting options
Options are passed on a per-component basis, usually in the component's preExecution function. They can be either a static list of options that will be merged with the defaults, or a function that returns options according to the state:
function f(){
 
    // Option 1: static list
    this.setAddInOptions("colType","sparkline",{barColor: "red"});

    // Option 2: function
    this.setAddInOptions("colType","sparkline",function(state){
        // Let's turn the second sparkline into a bar
        if(state.colIdx == "2"){
            return { type:'bar'};
        }
    });
}
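
How the two forms end up as one options object can be sketched as follows. This is only an illustration, under the assumption that options are shallow-merged over the defaults; resolveOptions is a hypothetical helper, not a CDF function, and the real merge logic may differ.

```javascript
// Hypothetical helper: turn static-or-function options into a plain object
// and shallow-merge the result over the AddIn defaults.
function resolveOptions(defaults, options, state) {
  var supplied = (typeof options === 'function') ? options(state) : options;
  // A function may return undefined for cells it does not care about.
  return Object.assign({}, defaults, supplied || {});
}

var defaults = { type: 'line' };

// Option 1: static list
var staticOpts = resolveOptions(defaults, { barColor: 'red' }, { colIdx: 1 });

// Option 2: function of the state (made-up state objects)
var byColumn = function (state) {
  if (state.colIdx == 2) {
    return { type: 'bar' };
  }
};
var col1 = resolveOptions(defaults, byColumn, { colIdx: 1 });
var col2 = resolveOptions(defaults, byColumn, { colIdx: 2 });

console.log(staticOpts, col1, col2);
```

In this sketch only the column with colIdx 2 switches to a bar; every other column keeps the default line type.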
Setting defaults
It's also possible to set site-wide / dashboard-wide defaults. Like the previous option, these can be either a static list or a function:
Dashboards.setAddInDefaults("Table","colType","sparkline",{fillColor:"#aaa"});

Dashboards.setAddInDefaults("Table","colType","sparkline",function(state){ 
    return state.rowIdx%2?{fillColor:"#aaa"}:{fillColor:"#fff"};
});
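
As a quick check of what the function form above yields, the same expression can be evaluated for a few row indices. The snippet is standalone: rowDefaults simply wraps the function passed to setAddInDefaults, and the state objects are made up for illustration.

```javascript
// Same body as the function passed to Dashboards.setAddInDefaults above.
function rowDefaults(state) {
  return state.rowIdx % 2 ? { fillColor: "#aaa" } : { fillColor: "#fff" };
}

// Made-up states for four rows (rowIdx assumed to start at 0).
var colors = [0, 1, 2, 3].map(function (rowIdx) {
  return rowDefaults({ rowIdx: rowIdx }).fillColor;
});

console.log(colors);
```

Odd row indices get the grey fill and even ones the white fill, which gives a zebra-stripe effect across the table.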


Implementation arguments
The implementation function has the following arguments:
    implementation: function (tgt, st, opt) {}
  • tgt: target for the action. E.g. on a colType, it will be the td cell
  • st: state, i.e. information about the specific AddIn call. On a colType it will be an object with: {rawData, tableData, colIdx, rowIdx, series, category, value, colFormat}
  • opt: options passed to this AddIn
Calling AddIns from components
When developing a component, it's very easy to define a new AddIn type. Here's the example that TableComponent uses:
    var addIn = myself.getAddIn("colType",colType);
    addIn.call(td,state,myself.getAddInOptions("colType",addIn.getName()));
From this point on, there will be a new colType available to register.
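
To make the flow concrete, here is a simplified stand-in for the registry that runs without CDF. Everything in it (registry, makeAddIn, registerAddIn, getAddIn, the "upper" AddIn and the plain-object td) is hypothetical; it only mirrors the register / look up / call sequence described above and is not how CDF is actually implemented.

```javascript
// Hypothetical wrapper exposing getName() and call(), like the snippet above uses.
function makeAddIn(spec) {
  return {
    getName: function () { return spec.name; },
    call: function (tgt, st, opt) {
      // Assumption: passed options are shallow-merged over the defaults.
      var merged = Object.assign({}, spec.defaults, opt);
      return spec.implementation(tgt, st, merged);
    }
  };
}

// Hypothetical registry keyed by component type, slot and AddIn name.
var registry = {};

function registerAddIn(component, slot, addIn) {
  registry[component] = registry[component] || {};
  registry[component][slot] = registry[component][slot] || {};
  registry[component][slot][addIn.getName()] = addIn;
}

function getAddIn(component, slot, name) {
  var slots = registry[component] || {};
  return (slots[slot] || {})[name];
}

// A toy AddIn: writes the cell value in upper case, plus an optional suffix.
registerAddIn("Table", "colType", makeAddIn({
  name: "upper",
  defaults: { suffix: "" },
  implementation: function (tgt, st, opt) {
    tgt.text = String(st.value).toUpperCase() + opt.suffix;
  }
}));

// What a component would do for each cell of a column typed "upper".
var td = { text: "" };                        // stands in for the td element
var state = { value: "ok", colIdx: 0, rowIdx: 0 };
var addIn = getAddIn("Table", "colType", "upper");
addIn.call(td, state, { suffix: "!" });

console.log(td.text);
```

The component only needs the slot name and the AddIn name; what happens to the cell is entirely up to the registered implementation.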

 Implemented AddIns:

  • sparkline
  • pvSparkline
  • dataBar
  • trendArrow
  • hyperlink

Here are some screenshots of the implemented AddIns. More details can be found under cde_samples in the solution repository after installing the ctools:

  All credits to Brandon Jackson for sponsoring this huge development!  /bow!