Aren't you glad that I don't make a living out of inventing names for projects?
It's a well known fact that the data access part in CDF is very.... hacked (I would call it something different but I'd probably be banned from blogger.com).
There's no common layer, each component has the responsibility of getting the data it needs and worst of all it can permit sql injection (thus sql support is disabled by default).
This item has been on my todo list for a too long, and it's time for me to tackle it.
It wasn't obvious at all for me where to put this data access layer. I need something that I can use both on CDF and on our Dashboard Editor. Classloading hell regarding pentaho plugins are also a concern.
Since others have asked for the same, I decided to create a new independent plugin: CDA - Community Data Access. It will be accessed by simple url calls and support the following datasources:
- SQL
- MDX
- Metadata
- Kettle
- etc.
- Compositions of the above
It will support multiple outputs:
- JSON
- XML
- CSV
- XLS
- HTML visual mode
<?xml version="1.0" encoding="utf8"?>
<CDADescriptor>
<!-- DataSource definition.
Type controls what the datasource type is.
The connection and query controls how the data is fetched. These values
are specific to each access type
-->
<DataSources>
<Connection id="1" type="jdbc">
<Driver>org.hsqldb.jdbcDriver</Driver>
<Url>jdbc:hsqldb:hsql://localhost:9001/sampledata</Url>
<User>pentaho_user</User>
<Pass>password</Pass>
</Connection>
<Connection id="2" type="jndi">
<Jndi>sampledata</Jndi>
<Catalog>./sampledata.mondrian.xml</Catalog>
<Cube>SteelWheelsSales</Cube>
</Connection>
</DataSources>
<!-- DataAccess object controls the query itself
Access controls if the datasource can be seen from the outside or is to
be used from a Compound datasource only
-->
<DataAccess id="1" connection="1" type="sql" access="public" cache="true" cacheDuration="3600">
<Query>
select o.YEAR_ID, o.STATUS, sum(o.TOTALPRICE) as price from orderfact o
where o.STATUS = ${status} and o.ORDERDATE > ${orderDate}
group by o.YEAR_ID, o.STATUS
</Query>
<!-- All parameters are passed as string from the outside. Their
type is defined here. If needed, a conversion pattern must be specified
-->
<Parameters>
<Parameter name="status" type="String" default="Shipped"/>
<Parameter name="orderDate" type="Date" pattern="yyyy-MM-dd" default="2003-03-01"/>
</Parameters>
<!-- Settings block here overrides defaults and general settings
Columns block allow specifying name and CalculatedColumn the creation of new
columns based on formulas
-->
<Settings>
<Columns>
<Column idx="0">
<Name>Year</Name>
</Column>
<CalculatedColumn>
<Name>Test</Name>
<Formula>=[1]/100</Formula>
</CalculatedColumn>
</Columns>
<Export type="xls" includeTotals="true">
<Column idx="0" aggregator="None"/>
<Column idx="1" aggregator="Average"/>
</Export>
</Settings>
<!-- Output controls what is outputed and by what order. Defaults to everything -->
<Output indexes="1,0,2"/>
</DataAccess>
<DataAccess id="2" connection="2" type="mdx" access="public">
<Query>
select {[Measures].[Sales]} ON COLUMNS,
NON EMPTY [Time].Children ON ROWS
from [SteelWheelsSales]
where ([Order Status].[${status}])
</Query>
<Parameters>
<Parameter name="status" type="String" default="In Process"/>
</Parameters>
</DataAccess>
<!-- CompoundDataAccess can be used to join queries. Some types are
supported, and the user is responsible for assuring the different dataaccess
are compatible:
* Union: Appends different sources
* Join: Joins 2 resultsets based on one or more keys. Results have to be
sorted by those keys.
-->
<CompoundDataAccess id="3">
<Union>
<Source id="1"/>
<Source id="2"/>
</Union>
</CompoundDataAccess>
<CompoundDataAccess>
<Join>
<Left keys="0"/>
<Right keys="0"/>
</Join>
<Output indexes="0,2,3"/>
</CompoundDataAccess>
</CDADescriptor>
This is about it. Suggestions are welcomed, and this is the correct time to integrate them since I'm still on the drawing board. (But hurry, cause I need to have this done very quickly :) )
Have fun


