Load pre-aggregated data: PivotData C# Examples


Normally the PivotData class performs input data aggregation by itself (with the ProcessData method; this usual approach is sketched after the list below for contrast), but in some cases the data cube should be populated with existing, pre-aggregated values:

  • when data aggregation is performed at the database level (with SQL GROUP BY, the MongoDb aggregation pipeline, ElasticSearch aggregations, etc.)
  • when metrics are loaded from an existing OLAP server, provided by a 3rd-party API, or pre-calculated in some other way
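
For contrast, here is a minimal sketch of the usual in-memory aggregation (the OrderRow model and rawRows variable are illustrative only; this assumes the ProcessData overload that accepts an IEnumerable plus a field accessor delegate):

class OrderRow {
  public string Product { get; set; }
  public string Country { get; set; }
}

IEnumerable<OrderRow> rawRows = ... ;  // one element per source row, not pre-aggregated

// PivotData counts the rows itself while iterating the input
var rawPvtData = new PivotData(new[] {"Product", "Country"}, new CountAggregatorFactory());
rawPvtData.ProcessData(rawRows, new ObjectMember().GetValue);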

It is possible to load pre-aggregated data into PivotData with the help of the GroupedSourceReader component:

class MyModel {
  public string Product { get; set; }
  public string Country { get; set; }
  public int OrdersCount { get; set; }
}

IEnumerable<MyModel> inputData = ... ; 

var groupedPvtDataReader = new GroupedSourceReader(
  (processData) => {
    processData(inputData, new ObjectMember().GetValue);
  },
  "OrdersCount"  // field name with aggregated rows count
);
var pvtDataFactory = new PivotDataFactory();
var pvtData = groupedPvtDataReader.Read(
  // instead of using "PivotDataFactory.GetConfiguration" method
  // you can compose and pass "PivotDataConfiguration" directly 
  pvtDataFactory.GetConfiguration(
    new PivotData(new[] {"Product", "Country"}, new CountAggregatorFactory())
  ), 
  pvtDataFactory);
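
As the comment above suggests, the configuration can also be composed by hand instead of being derived from a PivotData instance. A minimal sketch (assuming the standard count aggregator is registered under the name "Count" and needs no field parameters):

var cfg = new PivotDataConfiguration() {
  Dimensions = new[] { "Product", "Country" },
  Aggregators = new[] {
    new AggregatorFactoryConfiguration("Count", new string[0])  // assumed registration name for CountAggregatorFactory
  }
};
var pvtDataFromCfg = groupedPvtDataReader.Read(cfg, pvtDataFactory);
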
The GroupedSourceReader class also has a constructor that accepts IPivotDataSource implementations (a sketch follows this list):
  • For an SQL data source, DbCommandSource: see the "ToolkitSqlDbSource" example
  • For CSV data, CsvSource; for JSON data, JsonSource
  • For a MongoDb data source: see the "ToolkitMongoDbSource" example
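
A minimal sketch with DbCommandSource (assumptions: a GroupedSourceReader constructor overload that accepts the IPivotDataSource plus the count field name, an already-open IDbConnection named dbConnection, and an "orders" table; see the "ToolkitSqlDbSource" example for a complete version):

// the GROUP BY query returns one row per (Product, Country) group
var cmd = dbConnection.CreateCommand();
cmd.CommandText =
  "SELECT Product, Country, COUNT(*) AS OrdersCount FROM orders GROUP BY Product, Country";

var sqlGroupedReader = new GroupedSourceReader(
  new DbCommandSource(cmd),  // IPivotDataSource implementation over the SQL result set
  "OrdersCount");            // field name that holds the aggregated rows count
var sqlPvtData = sqlGroupedReader.Read(
  pvtDataFactory.GetConfiguration(
    new PivotData(new[] {"Product", "Country"}, new CountAggregatorFactory())),
  pvtDataFactory);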

A field name for the "count" metric should always be provided, even if you specified another aggregator instead of CountAggregatorFactory (say, SumAggregatorFactory). If your input data doesn't contain this value, you can resolve it to 1:

var groupedPvtDataReader = new GroupedSourceReader(
  (processData) => {
    var objMemberAccessor = new ObjectMember();
    processData(inputData, (row, field) => {
      if (field=="Count") return 1;  // virtual "count" field: every pre-aggregated row counts as 1
      return objMemberAccessor.GetValue(row, field);
    });
  },
  "Count");

By default, GroupedSourceReader can load the following aggregator types (a usage sketch follows the list):

  • CountAggregatorFactory: the field name for this value is specified as a constructor parameter
  • SumAggregatorFactory: by convention, the field name for this value is "[field]_Sum"
  • AverageAggregatorFactory: by convention, the field name for this value is "[field]_Average"
  • MinAggregatorFactory: by convention, the field name for this value is "[field]_Min"
  • MaxAggregatorFactory: by convention, the field name for this value is "[field]_Max"
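
For example, to load a pre-aggregated sum the input rows should expose a field named according to this convention. A minimal sketch (the SalesGroupRow model, the "Total" measure field and the groupedRows variable are illustrative assumptions):

class SalesGroupRow {
  public string Product { get; set; }
  public string Country { get; set; }
  public int OrdersCount { get; set; }    // aggregated rows count for the group
  public decimal Total_Sum { get; set; }  // pre-aggregated SUM(Total); matches the "[field]_Sum" convention
}

IEnumerable<SalesGroupRow> groupedRows = ... ;

var sumReader = new GroupedSourceReader(
  processData => processData(groupedRows, new ObjectMember().GetValue),
  "OrdersCount");
var sumPvtData = sumReader.Read(
  pvtDataFactory.GetConfiguration(
    new PivotData(new[] {"Product", "Country"}, new SumAggregatorFactory("Total"))),
  pvtDataFactory);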

This setup may be changed by manipulating the GroupedSourceReader.AggregatorStateComposers collection with implementations of GroupedSourceReader.IAggregatorStateComposer (for example, the generic GroupedSourceReader.ArrayAggregatorStateComposer used in the "How to register additional aggregators" section below).

How to keep the order of dimension keys

By default the GroupedSourceReader.Read method returns a PivotData instance that doesn't preserve the order of dimension keys from the input data; a PivotTable created from this cube will be ordered A-Z. In some cases the original input data order should be preserved.
This is possible with the help of the FixedPivotData class (a read-only IPivotData implementation) and the GroupedSourceReader.ReadState method:

IPivotData pvtData = new PivotData(new[] {"Product", "Country"}, new CountAggregatorFactory() );
var pvtDataState = groupedPvtDataReader.ReadState( pvtDataFactory.GetConfiguration(pvtData) );
pvtData = new FixedPivotData(pvtData.Dimensions, pvtData.AggregatorFactory, pvtDataState );

PivotTable applies A-Z ordering by default, so a special override is needed to prevent that and keep the keys in the same order as in the input data:
public class NoSortPivotTable : PivotTable {
  public NoSortPivotTable(string[] rows, string[] columns, IPivotData pvtData) 
    : base(rows,columns,pvtData) { }
  protected override void SortKeys(ValueKey[] keys, string[] dimensions, IComparer<ValueKey> comparer) {
    /* do nothing */
  }
}

// create pivot table that preserves order of input data
var pvtTbl = new NoSortPivotTable(new[]{"Product"}, new[]{"Country"}, pvtData);

How to register additional aggregators for GroupedSourceReader

The following code snippet illustrates how to register a "CountUnique" aggregator for an input dataset that contains lists of unique values:

// for test purposes a .NET collection is used as the data source
// element object properties are used as the fields referenced by the dimensions/aggregators configuration
var inputData = new object[] {
  new { 
    dim1 = "A", 
    dim2 = "B", 
    count = 3, 
    col_CountUnique = new string[] {"V1","V2"} 
  }  // field with values for CountUniqueAggregator should have array type
};

var grpSourceRdr = new GroupedSourceReader(
  processData => processData(inputData, new ObjectMember().GetValue),
  "count");

// let's register "CountUnique" for GroupedSourceReader
grpSourceRdr.AggregatorStateComposers.Add(
  new GroupedSourceReader.ArrayAggregatorStateComposer(
    "CountUnique",  // matches AggregatorFactoryConfiguration.Name
    new[] {
      "count",  // field for "count" value
      "{0}_CountUnique"  // field for unique values array; {0} refers to the aggregator field name parameter
    },
    new[] {
      typeof(uint),  // field type for "count"
      typeof(object)  // field type for unique values
    }));
var pvtDataFactory = new PivotDataFactory();
// a custom aggregator should be registered in PivotDataFactory with the RegisterAggregator method;
// there is no need to do this for "CountUnique" because PivotDataFactory knows about the standard aggregators by default

var pvtData = grpSourceRdr.Read(
  new PivotDataConfiguration() {
    Dimensions = new[] { "dim1", "dim2" },
    Aggregators = new[] {
      new AggregatorFactoryConfiguration("CountUnique", new[]{"col"})  // matches "col_CountUnique" field
    }
  },
  pvtDataFactory
);
Console.WriteLine("R: {0}", pvtData["A","B"].Value);  // outputs: 2