Load pre-aggregated data PivotData C# Examples


Normally PivotData class performs input data aggregation by itself (with ProcessData method), but in some cases data cube should be populated with existing values:

  • when data aggregation is performed on database level (with SQL GROUP BY, MongoDb aggregation pipleline, ElasicSearch aggregations etc)
  • metrics are loaded from OLAP server, or provided by 3rd party API, or pre-calculated in some other way

It is possible to load pre-aggregated data into PivotData with help of GroupedSourceReader component:

class MyModel {
  public string Product { get; set; }
  public string Country { get; set; }
  public int OrdersCount { get; set; }
  public int Qty_Min { get; set; }
  public int Qty_Average { get; set; }
}

IEnumerable<MyModel> inputData = ... ; 

var groupedPvtDataReader = new GroupedSourceReader(
  (processData) => {
    processData(inputData, new ObjectMember().GetValue);
  },
  "OrdersCount"  // field name with aggregated rows count
);
var pvtDataFactory = new PivotDataFactory();
var pvtData = groupedPvtDataReader.Read(
  // instead of using "PivotDataFactory.GetConfiguration" method
  // you can compose and pass "PivotDataConfiguration" directly 
  pvtDataFactory.GetConfiguration(
    new PivotData(new[] {"Product", "Country"}, 
	  new CompositeAggregatorFactory(
        new CountAggregatorFactory(), 
	    new AverageAggregatorFactory("Qty") ) )
  ), 
  pvtDataFactory);
GroupedSourceReader class has a constructor that accepts IPivotDataSource implementations:
  • For SQL data source DbCommandSource: see "ToolkitSqlDbSource" example
  • For CSV data CsvSource, JSON data JsonSource
  • For MongoDb data source: see "ToolkitMongoDbSource" example
  • For SSAS OLAP server data source: see "ToolkitAdomdSource" example

NOTE: field name for "count" should be provided even if you don't use CountAggregatorFactory in the PivotDataConfiguration provided to GroupedSourceReader.Read. If your input data doesn't have this value you can resolve it to "1":

var groupedPvtDataReader = new GroupedSourceReader(
   (processData) => {
     var objMemberAccessor = new ObjectMember();
     processData(query, (row, field) => {
        if (field=="Count") return 1;
        return objMemberAccessor.GetValue(row, field);
     });
   },
   "Count");

By default GroupedSourceReader can load the following aggregator types:

Aggregator Name Aggregator Factory Type Required Fields
Count CountAggregatorFactory "[count]" (integer)
Sum SumAggregatorFactory "[field]_Sum" (number)
Average AverageAggregatorFactory "[field]_Average" (number)
Min MinAggregatorFactory "[field]_Min" (any comparable type)
Max MaxAggregatorFactory "[field]_Max" (any comparable type)
ListUnique ListUniqueAggregatorFactory "[field]_List" (array of values)
CountUnique ListUniqueAggregatorFactory "[field]_CountUnique" (array of values)
Variance VarianceAggregatorFactory "[field]_Average" (number)
"[field]_Variance" (number)

This setup may be changed by manipulations with GroupedSourceReader.AggregatorStateComposers collection. Generic implementations of GroupedSourceReader.IAggregatorStateComposer could be used for registering custom aggregator type:

How to keep dimension keys order

By default GroupedSourceReader.Read method returns PivotData instance that doesn't preserve order of dimensions keys by input data; PivotTable created by this cube will be ordered A-Z. In some cases original input data order should be preserved.
This is possible with help of FixedPivotData class (read-only IPivotData implementation) and GroupedSourceReader.ReadState method:

IPivotData pvtData = new PivotData(new[] {"Product", "Country"}, new CountAggregatorFactory() );
var pvtDataState = groupedPvtDataReader.ReadState( pvtDataFactory.GetConfiguration(pvtData) );
pvtData = new FixedPivotData(pvtData.Dimensions, pvtData.AggregatorFactory, pvtDataState );
PivotTable applies A-Z order by default and special override is needed to prevent that and keep keys order as in input data:
public class NoSortPivotTable : PivotTable {
  public NoSortPivotTable(string[] rows, string[] columns, IPivotData pvtData) 
    : base(rows,columns,pvtData) { }
  protected override void SortKeys(ValueKey[] keys, string[] dimensions, IComparer<ValueKey> comparer) {
    /* do nothing */
  }
}

// create pivot table that preserves order of input data
var pvtTbl = new NoSortPivotTable(new[]{"Product"}, new[]{"Country"}, pvtData);

How to register additional aggregators for GroupedSourceReader

The following code snippet illustrates how to register "CountUnique" aggregator for input dataset that contains list of unique values:

// for test purposes .NET collection is used as a data source
// element object properties are used as fields referenced by dimensions/aggregator configuration
var inputData = new object[] {
  new { 
    dim1 = "A", 
    dim2 = "B", 
    count = 3, 
    col_CountUnique = new string[] {"V1","V2"} 
  }  // field with values for CountUniqueAggregator should have array type
};

var grpSourceRdr = new GroupedSourceReader(
  processData => processData(inputData, new ObjectMember().GetValue),
  "count");

// lets register "CountUnique" for GroupedSourceReader
// state object for "CountUniqueAggregator" is an array with 2 elements
// 1) count value (number of aggregated rows)
// 2) array of unique values
grpSourceRdr.AggregatorStateComposers.Add(
  new GroupedSourceReader.ArrayAggregatorStateComposer(
    "CountUnique",  // matches AggregatorFactoryConfiguration.Name
      new[] {
        "count",  // field for "count" value
        "{0}_CountUnique"  // field name for unique values array. {0} = first aggregator factory parameter
      },
      new[] {
        typeof(uint),  // field type for "count"
        typeof(object[])  // field type for unique values
      }));
var pvtDataFactory = new PivotDataFactory();
// custom aggregator should be registered in PivotDataFactory with RegisterAggregator method
// no needed to do this for "CountUnique" as PivotDataFactory knows about standard aggregators by default

var pvtData = grpSourceRdr.Read(
  new PivotDataConfiguration() {
    Dimensions = new[] { "dim1", "dim2" },
    Aggregators = new[] {
      new AggregatorFactoryConfiguration("CountUnique", new[]{"col"})  // matches "col_CountUnique" field
    }
  },
  pvtDataFactory
);
Console.WriteLine("R: {0}", pvtData["A","B"].Value);  // outputs: 2

NOTE: "CountUnique" is used for sample purposes, you don't need to register it manually as GroupedSourceReader has a state composer for all standard aggregators by default.