Query and filter data cube OLAP operations PivotData C# examples


Get PivotData .NET Library

NReco.PivotData implements in-memory data cube (multidimensional dataset) that can be easily accessed from C# code with simple LINQ-style queries.

PivotData provides simple .NET API to make OLAP cube queries in LINQ-style with SliceQuery class. SliceQuery accepts any IPivotData implementation and produces a new cube as result of OLAP operation.

Filter data cube slice and dice

PivotData pvtData; // lets assume that cube has dimensions: "product", "country", "year"
var whereQuery = new SliceQuery(pvtData)
    .Where(
         "product",  // dimension to filter 
         "Product1", "Product2"  // several values for IN match
    )
    .Where(
         "year",
         2015  // note: value should match actual dimension key type
    );
var slicedPvtData = whereQuery.Execute();  // resulted cube is filtered by product and year

Where method has overloads that accept predicate delegate for complex filtering conditions:

var whereQuery = new SliceQuery(pvtData).Where(
  "product",  // dimension to filter
  (dimKey) => ((string)dimKey).StartsWith("Product")
);

Select specific dimensionsroll-up

PivotData pvtData; // lets assume that cube has dimensions: "product", "country", "city"
var selectQuery = new SliceQuery(pvtData).Dimension("product").Dimension("city");
var resultPvtData = selectQuery.Execute(); // resulted cube has only "product" and "city" dimensions

Note that Dimension method has an overload that allows you to define "derived" dimension (that calculated from dimension keys of the source cube). For example, you can calculate quarter (Q1-Q4) by the month or date dimension.

Define derived dimension expand hierarchy

It is possible to define derived dimension - calculated from existing dimension key (or keys) - in the following way:

var pvtData = new PivotData(new[]{ "creation_date", ... }, ... );
var byMonthCube = new SliceQuery(pvtData).Dimension("creation_date_month",
	(dimKeys) => {  // array of entry dimension keys
		var creationDateValue = (DateTime)dimKeys[0]; // #0 - index of "creation_date" dimension
		return creationDateValue.Month;
	}
);
var byYearAndQuarter = new SliceQuery(pvtData).Dimension("creation_date_year_and_quarter",
	(dimKeys) => {
		var creationDateValue = (DateTime)dimKeys[0]; // #0 - index of "creation_date" dimension
		return String.Format("{0} Q{1}", 
			creationDateValue.Year, GetQuarter(creationDateValue.Month) );
	}
);

This approach may be used for resolving lookup values by ID-based dimensions.

Select specific measures

Data cube may have multiple measures (configured with CompositeAggregatorFactory) and it is possible to select only specific metric(s) with Measure method:

PivotData pvtData; // lets assume that cube has 3 measures
var oneMeasurePvtData = new SliceQuery(pvtData).Measure(0).Execute(); // take only measure at index #0

Define derived measures

Measure method may be used for defining derived measures (calculated from another measures). For example, you can calculate some metric as difference between 2 exiting measures:

var derivedAggrQuery = new SliceQuery(pvtData)
	.Measure(
		new SumAggregatorFactory("virtual_fld"),  // define factory for derived aggregator
		(aggr) => {
			var compositeAggr = aggr.AsComposite();
			var aggr0val = Convert.ToDecimal( compositeAggr.Aggregators[0].Value );
			var aggr1val = Convert.ToDecimal( compositeAggr.Aggregators[1].Value );
			var diffVal = aggr0val - aggr1val;
			// note: result should correspond specified aggregator factory
			return new SumAggregator("virtual_fld", 
				// sum aggregator state
				new object[] {
  					aggr.Count, // elements count
					diffVal // new aggr sum value
				});
		}
	);
Also Measure method can define a formula measure with C# expression:
// average item price = sum of total / sum of quantity
var avgItemPriceByYearQuery = new SliceQuery(pvtData)
    .Dimension("year")
    .Measure("Avg item price",
        (aggrArgs) => {
            // value of first argument (from measure #1)
            var sumOfTotal = ConvertHelper.ConvertToDecimal( aggrArgs[0].Value, 0M);  

            // value of second argument (from measure #0)
            var sumOfQuantity = ConvertHelper.ConvertToDecimal( aggrArgs[1].Value, 0M);
 
            if (sumOfQuantity==0)
                return 0M; // prevent div by zero
            return sumOfTotal/sumOfQuantity;
        },
        new int[] { 1, 0 } // indexes of measures for formula arguments
   );

You may download PivotData free package and check "QueryCube" example that illustrates cube queries listed above.