Query and filter data cube OLAP operations PivotData C# examples
PivotData provides simple .NET API to make OLAP cube queries in LINQ-style with
SliceQuery
class:
it accepts any IPivotData
implementation and produces a new in-memory cube as result of OLAP operation.
Complete example code is here: NReco.PivotData github repository
Filter data cube slice and dice
PivotData pvtData; // lets assume that cube has dimensions: "product", "country", "year" var whereQuery = new SliceQuery(pvtData) .Where( "product", // dimension to filter "Product1", "Product2" // several values for IN match ) .Where( "year", 2015 // note: value should match actual dimension key type ); var slicedPvtData = whereQuery.Execute(); // resulted cube is filtered by product and year
Where method has overloads that accept predicate delegate for complex filtering conditions:
var whereQuery = new SliceQuery(pvtData).Where( "product", // dimension to filter (dimKey) => ((string)dimKey).StartsWith("Product") );
Select specific dimensions roll-up
PivotData pvtData; // lets assume that cube has dimensions: "product", "country", "city" var selectQuery = new SliceQuery(pvtData).Dimension("product").Dimension("city"); var resultPvtData = selectQuery.Execute(); // resulted cube has only "product" and "city" dimensions
Dimension
method has
an overload
that allows you to define "derived" dimension (that calculated from dimension keys of the source cube).
For example, you can calculate quarter (Q1-Q4) by the month or date dimension.
Define derived dimension expand hierarchy
It is possible to define derived dimension (= calculated from existing dimension key or keys) in the following way:
var pvtData = new PivotData(new[]{ "creation_date", ... }, ... ); var byMonthCube = new SliceQuery(pvtData).Dimension("creation_date_month", (dimKeys) => { // array of entry dimension keys var creationDateValue = (DateTime)dimKeys[0]; // #0 - index of "creation_date" dimension return creationDateValue.Month; } ); var byYearAndQuarter = new SliceQuery(pvtData).Dimension("creation_date_year_and_quarter", (dimKeys) => { var creationDateValue = (DateTime)dimKeys[0]; // #0 - index of "creation_date" dimension return String.Format("{0} Q{1}", creationDateValue.Year, GetQuarter(creationDateValue.Month) ); } );
This approach may be used for resolving lookup values by ID-based dimensions.
Select specific measures
If data cube has multiple measures (configured with
CompositeAggregatorFactory)
it is possible to select only specific metric with Measure
method:
PivotData pvtData; // lets assume that cube has 3 measures var oneMeasurePvtData = new SliceQuery(pvtData).Measure(0).Execute(); // take only measure at index #0
Measure
method may be called several times; it is possible to specify the same index more than once to duplicate a measure;
This might be usedful if pivot table should display the same metric with different formatting (say, absolute value + percentage or difference value).
Define derived measure
Measure
method can be used for defining derived measures (projected from another measures).
The following code snipper illustrates how to calculate new metric by existing measures:
var derivedAggrQuery = new SliceQuery(pvtData) .Measure( new SumAggregatorFactory("delta_fld"), // factory for new metric (aggr) => { var compositeAggr = aggr.AsComposite(); var aggr0val = Convert.ToDecimal( compositeAggr.Aggregators[0].Value ); var aggr1val = Convert.ToDecimal( compositeAggr.Aggregators[1].Value ); var diffVal = aggr0val - aggr1val; // result should correspond specified aggregator factory return new SumAggregator("delta_fld", // sum aggregator state new object[] { aggr.Count, // elements count diffVal // new aggr sum value }); } );
Derived measure is calculated when SliceQuery
is executed.
Define formula measure
AnotherMeasure
overload can be used for defining a
formula
measure with C# expression:
// average item price = sum of total / sum of quantity var avgItemPriceByYearQuery = new SliceQuery(pvtData) .Dimension("year") .Measure("Avg item price", (aggrArgs) => { // value of first argument (from measure #1) var sumOfTotal = ConvertHelper.ConvertToDecimal( aggrArgs[0].Value, 0M); // value of second argument (from measure #0) var sumOfQuantity = ConvertHelper.ConvertToDecimal( aggrArgs[1].Value, 0M); if (sumOfQuantity==0) return 0M; // prevent div by zero return sumOfTotal/sumOfQuantity; }, new int[] { 1, 0 } // indexes of measures for formula arguments );
Unlike projected measure, formula value is calculated "on-the-fly" (when value is accessed).
Also it is possible to define formula dynamically with a string expression (see "DynamicFormulaMeasure" example for more details).
Keyword-based filter Toolkit component
CubeKeywordFilter
can be used for filtering the cube by simple search-like query string:
IPivotData pvtData; // data cube instance var cubeFilter = new CubeKeywordFilter("2015, Product1"); var filteredPvtData = cubeFilter.Filter(pvtData);Keyword filter tries to find dimension keys that match specified keywords and performs appropriate filtering. The following search syntax is supported:
product | If "product" value is present pefroms exact match; otherwise, works like *product* |
---|---|
country=USA | Exact match filtering for dimension with "country" in the name (or resolved label) |
product="Product 1" | Exact match of value with spaces. |
product* | Match keys that start with "product" |
*es | Match keys that end with "es" |
year:2015 | Filter by "2015" is applied to dimension that contains "year" keyword |
-2015 | Exclude all keys of dimension that has "2015" value (except "2015") |
year>2022 | Filter keys of a dimension that has "year" in the name and keep only values that are greater than 2022. Possible conditions are: < , > , >= , <= |
count>1000 | If cube has "Count" measure this excludes all values of that measure that are less than or equal to 1000. |
year:2015+quarter:1, year:2015+quarter:2 | Filter by: (year=2015 AND quarter=1) OR (year=2015 and quarter=2) |
-"year:2015+quarter:1" | Exclude all values for year=2015 AND quarter=1 |