CDC - Community Distributed Cache
One more member for the CTools - and this one is big. CDC is available through the ctools-installer using the -b dev flag, and it requires Pentaho 4.5.
About
CDC stands for Community Distributed Cache. It provides a high-performance, scalable, distributed in-memory cache cluster, based on Hazelcast, for both CDA and Mondrian.
CDC is a Pentaho plugin that provides the following features:
- CDA distributed cache support
- Mondrian distributed cache support
- Ability to switch between the default cache and the CDC cache for CDA and Mondrian
- Gracefully handles adding and removing cache nodes
- Allows selectively clearing the cache of specific CDE dashboards
- Allows selectively clearing the cache of specific Mondrian schemas, cubes or dimensions
- Provides an API to clean the cache from the outside (e.g. after running an ETL job)
- Provides a view over cluster status
- Supports multiple Pentaho servers using the same cluster (e.g. staging and production)
- Supports several memory configuration options
One added piece of functionality is the ability to clear the cache of only specific Mondrian cubes. Even though Mondrian has a very complete API to control the member cache, Pentaho only exposes a "clean all" function, which ends up being very limiting in production environments.
The ability of the cache to survive server restarts is a design bonus, and CDA supports it out of the box. Mondrian will support it as soon as MONDRIAN-1107 is fixed.
- Mondrian 3.4 or newer (in Pentaho 4.5)
- CDA 12.05.15
- Install CDC using either the installer (soon to be available) or ctools-installer. If you do a manual install, be sure to copy the contents of solution/system/cdc/pentaho/lib to the server's WEB-INF/lib
- Download the standalone cache node
- Run the standalone cache node (launch-hazelcast.sh) on the same machine as Pentaho or on the same internal network. Optionally, edit the script to change the memory settings (the default is 1 GB; increase as needed). You can launch as many nodes as you want.
- Launch Pentaho and click on the CDC button
- Enable cache usage on CDA and Mondrian
- Restart the Pentaho server
- Check that the values on the settings screen are satisfactory. Usually the defaults work fine.
Open Analyzer, JPivot or a CDE dashboard that uses CDA, and you should see the cache being populated.
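For the manual install mentioned in the first step, the library copy can be scripted. The following is only a minimal sketch: it assumes that "solution" in the path above refers to the Pentaho solution folder and that WEB-INF/lib lives under the Pentaho web application folder. The function name `copy_cdc_libs` and both directory arguments are hypothetical and not part of CDC.

```python
import shutil
from pathlib import Path


def copy_cdc_libs(solution_dir, webapp_dir):
    """Copy every file from <solution_dir>/system/cdc/pentaho/lib
    into <webapp_dir>/WEB-INF/lib.

    Both directories are assumptions about the local layout; adjust them
    to match your server. Returns the sorted names of the copied files.
    """
    source = Path(solution_dir) / "system" / "cdc" / "pentaho" / "lib"
    target = Path(webapp_dir) / "WEB-INF" / "lib"

    if not source.is_dir():
        raise FileNotFoundError(f"CDC lib folder not found: {source}")

    # Create the target folder if it is missing (it normally exists already).
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for item in sorted(source.iterdir()):
        if item.is_file():
            # copy2 keeps file timestamps; existing files are overwritten.
            shutil.copy2(item, target / item.name)
            copied.append(item.name)
    return copied
```

It would be called with the two locations of your own installation, for example `copy_cdc_libs("/path/to/pentaho-solutions", "/path/to/webapps/pentaho")`, before starting the server.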
Hazelcast has a very good Management Center, so reimplementing that kind of feature is outside the scope of CDC. However, we do provide a simple cluster information dashboard that gives an overview of the state of the nodes.
Note about lite nodes: the Pentaho server is itself a cache node. However, it is configured in such a way that it doesn't hold data, hence the term lite node.
CDC offers a solution navigator so that we can select a dashboard. When we select a dashboard, the cache of all the CDA queries used by that dashboard will be cleaned.
Clicking on the URL button gives us a URL that we can call externally (e.g. from an ETL job). Be aware that you need to add the user credentials when calling it from the outside (e.g. &userid=joe&password=password).
This one is very similar to the previous one, but it navigates through the available cubes. One can then clean the entire schema, a specific cube, or even the individual cell cache for a specific dimension (use this last option with care).
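As an illustration of calling such a URL from an ETL post-step, here is a minimal Python sketch. The helper names `add_credentials` and `clear_cache` are hypothetical, and the example URL is a placeholder, not a real CDC endpoint: the actual URL must be copied from the URL button. Only the `userid` and `password` parameter names come from the text above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import urlopen


def add_credentials(url, userid, password):
    """Return `url` with userid and password appended as query parameters."""
    parts = urlsplit(url)
    # Keep the parameters already present in the URL, then add the credentials.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [("userid", userid), ("password", password)]
    return urlunsplit(parts._replace(query=urlencode(query)))


def clear_cache(url, userid, password, timeout=30):
    """Call the cache-clearing URL and return the HTTP status code."""
    with urlopen(add_credentials(url, userid, password), timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Placeholder URL for illustration only; use the one CDC gives you.
    example = "http://example.invalid/clear?target=demo"
    print(add_credentials(example, "joe", "password"))
```

Because the credentials travel in the query string, they may end up in server and proxy logs, so a dedicated low-privilege user may be preferable for this kind of call.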
Webdetails CDC Project Page