Web pivot tables by SQL DB Data Source Setup

Cube schema is specified in the appsettings.json file PivotDataService:Cubes section.

Cube configuration for SQL database should have "SourceType": "sqldb" and an additional "SourceDb" section:

{
  "Id": "orders",
  "Name": "Orders from PostgreSql sample DB",
  "SourceType": "SqlDb",
  "SourceDb": {
    "Connector": "postgresql",
    "ConnectionString": "Host=raja.db.elephantsql.com;Port=5432;Database=eegppxet;User ID=eegppxet;Password=ikIlE1oaTo3Lz_6264iCmj-wCaH73ATZ;",
    "SelectSql": "select o.*, od.unit_price, od.quantity from order_details od left join orders o on (o.order_id=od.order_id)"
  },
  "InferSchema": true,  /* dimensions and measures are determined automatically by columns returned by "SelectSql" */
  "Dimensions": [ /* explicitely defined dimensions */ ],
  "Measures": [ /* explicitely defined measures */ ],
  "Parameters": [ /* report parameters for SQL-level filtering */ ]
}

SelectSql determines 'base' tabular dataset (fact table) that is used to generate report-specific SQL queries.
For most efficient SQL generation it is recommended to specify it in SELECT * FROM ... or SELECT mainTablePrefix.* FROM ... form, in this case engine can avoid wrapping this this base SELECT with an additional outer SELECT (* is replaced with columns list that are needed to load concrete report data).

Also it is much more effective to resolve lookups after grouping stage; "Star-schema support" section explains how to configure conditional post-group-by JOINs.

List of supported SQL connectors:

Connector	Description
`mssql`	MS SQL Server, Azure SQL. ConnectionString sample: Data Source=hostName;Database=db;User=user;Password=password; Complete list of connection string options: System.Data.SqlClient documentation.
`mysql`	MySql, MariaDB, MemSQL. ConnectionString sample: Server=hostName;Database=db;Uid=user;Pwd=password;ConvertZeroDateTime=True; Complete list of connection string options: MySqlConnector Driver documentation.
`postgresql`	PostgreSql, Amazon Redshift. ConnectionString sample: Host=hostName;Port=port;Database=db;User ID=user;Password=password; For Redshift in some cases you'll need to add: `Server Compatibility Mode=Redshift;` Complete list of connection string options: NpgSql .NET Driver documentation.
`clickhouse`	Yandex ClickHouse. ConnectionString for native TCP/IP protocol .NET driver (ClickHouse.Ado): Compress=True;Compressor=lz4;SocketTimeout=15000;Host=hostName;Port=portNumber;Database=default;User=default;Password= ConnectionString for alternative HTTP(S) interface .NET driver (ClickHouse.Client): Driver=ClickHouse.Client;Compression=True;Host=hostName;Port=8123;Protocol=http;Database=default;Username=default;Password=
`sqlite`	Local SQLite database file. ConnectionString sample: Data Source=@CurrentDirectory/App_Data/northwind.db; (you can use @CurrentDirectory token to specify working directory of the microservice.)
`oracle`	Oracle DB. ConnectionString sample: Data Source=(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=hostName)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=serviceName))); User Id=user;Password=password;
`snowflake`	Snowflake DB. ConnectionString sample: account=test;user=test;password=userpwd;warehouse=wh;db=snowflake_sample_data;host=test.snowflakecomputing.com;
`odbc`	Database that has ODBC Driver. ConnectionString sample: Driver={any odbc driver's name};OdbcKey=someValue; (Check concrete ODBC driver's documentation on available connection string options) Normally you'll not use "odbc" connector directly because most databases have own SQL dialect and ODBC connector should be configured as custom connector.
`<custom connector>`	Custom connector defined in "PivotDataService.DbConnectors" configuration entry.

NOTE: "InferSchema" is good for testing or development purposes; in production use it is better to switch it off (until you really need this behavior) and specify dimensions and measures explicitely to avoid overhead caused by an excessive query that PivotDataService executes to load sample data.

Dimensions

In SQL-based cubes dimension's definition refers to a table column (or SQL expression) that can be used as a grouping criteria in GROUP BY. The simplest case:

"Dimensions": [
  {
    "Name": "status"  /* or 'o.status' */
  }
]

"Name" is not obligated to reference a column, any valid SQL expression can be specified in this way:

"Dimensions": [
  {
    "Name": "orderDate_year",
    "LabelText": "Order Year",   
    "Params": [ "YEAR(orderDate)" ]
  },
  {
    "Name": "orderDate_month",
    "LabelText": "Order Month",   
    "Params": [ "MONTH(orderDate)" ],
	"Format": "{0:MMM}"  /* formats month number 1-12 as Jan,Feb,etc */
  }
]

By specifying "Type":"Expression" it is possible to define a dimension that is calculated by formula over aggregated resultset (on PivotDataService side).

Measures

In SQL-based cubes measure's definition refers to an aggregate function, for example:

  "Measures": [
     {
       "Type": "Count",  /* other types: "Sum", "Average", "Min", "Max", "FirstValue" */
     },
     {
       "Type": "Sum",
       "Params": [ "orderNumber" ],  /* refers to a column or any valid SQL expression */
       "Format": "{0:0.00}
     }  
  ]

If aggregate function requires arguments (any "Type" except Count) they should be listed in "Params". Column names that are specified as arguments should be valid columns for "SelectSql" query.

Like dimensions each measure should have an unique name; if property "Name" is not specified measure name is generated automatically by its type and list of parameters. For the sample config from above suggested names will be "Count and "SumOforderNumber".

It is possible to use non-standard SQL aggregate function with "FirstValue" measure type:

  "Measures": [
     {
       "Name": "usersCountUnique
       "Type": "FirstValue",
       "Params": [ "COUNT(DISTINCT user_id)" ]  /* custom SQL */
     }  
  ]

You can use report parameters in this custom SQL (starting from v.1.8.5).

Parameters

In SQL data sources report parameters could be used in SELECT query as ADO.NET command parameters, for example:

  "Parameters": [
    {
      "Name": "country",
      "DataType": "string",
      "Multivalue": true
    }
  ]

    "SelectSql": "SELECT o.* FROM orders o LEFT JOIN customers c ON (c.customerNumber=o.customerNumber) WHERE 1=1 @country[ AND c.country IN ({0}) ]"

Token @country[ AND c.country IN ({0}) ] refers to the parameter with name "country". It works in the following way: if report parameter is empty, nothing is added to WHERE; otherwise "AND c.country IN ('Austria')" condition is added.

Parameter's placeholder syntax:

@

identifies that this is a placeholder for the parameter

orderDate

parameter Name

[ ]

SQL part between square brackets is added to SQL command when parameter is defined. Optionally you may define an alternative SQP part that is used when parameter is not defined in this way:

@@paramName[ expression_when_defined ; expression_when_NOT_defined ]

{0}

an inner placeholder for the parameter's value. This value is inserted into the SQL as a command parameter → SQL injections are impossible. This also means that parameter value cannot contain SQL expression or used to provide a part of SQL command (only exception is when parameter's Expression uses Sql.Raw function).

Notes:

multivalue parameters may be used only with SQL IN condition

SELECT * FROM orders WHERE 1=1 @countries[ and o.country IN ({0}) ]

Inside a placeholder several symbols have special meaning, and if you need to use these symbols in the formatting template they should be escaped in this way:
; → ;; or \;
] → ]] or \]
{ → {{ or \{
} → }} or \}
\ → \\
@ → \@

it is possible to use nested placeholders. For example:

@@table_join_enabled[ LEFT JOIN table a ON (a.id=table_id @column_param[and a.column={0}] ) ]

NOTE: Multivalue parameters should be used only with "IN" operator.

Star-schema support

If database has star schema with main facts table and dimension tables it is possible to resolve lookup values without JOINs in the main query ("SelectSql"). Instead of that JOINs could be applied after data grouping only if they are needed for the concrete pivot table report; this is possible with "JoinsAfterGroup" config entry.

Lets assume that 'facts' table has 'company_id' column that refers to 'companies' dimension table with 'id' and 'title' columns. The following config illustrates how to configure conditional JOIN to resolve 'Company Title' dimension:

  "SourceDb": {
    "Connector": "...",
    "ConnectionString": "...",
    "SelectSql": "SELECT * FROM facts",
    "JoinsAfterGroup": [
      {
        "JoinSql": "LEFT JOIN companies c ON (c.id=t.company_id)",  /* "SelectSql" has an alias 't' */
        "ApplyOnFields": [ "c.title" ]  /* join is applied only for specfied dimension names */
      }    
    ]
  },
  "Dimensions": [
    {
      "Name": "c.title",  /* refers to join specified in "JoinsAfterGroup" */
      "LabelText": "Company Title",
      "Params": ["company_id"]  /* FK column in facts table for group-by */
    }
  ],

You can use report parameter tokens (@ParamName) in "JoinSql" template in the same manner as in "SelectSql".

Notice for ClickHouse connector: CH has its own JOIN syntax + it doesn't support table aliases for column names; this means that joined table should have unique column names (it is possible to join sub-query to get an unique column name).

Define custom ODBC-based connector

It is possible to customize default database connector setup by defining custom connector type in appsettings.json config (in "PivotDataService" section):

"DbConnectors" : [
  {
    "BaseConnector": "odbc",
    "Connector": "odbc-access",
    "IdentifierFormat": "[{0}]",
    "ConnectionStringTpl": "Driver={{Microsoft Access Driver (*.mdb)}}; DBQ={0}",
    "SelectWithLimitTpl": "SELECT TOP {1} * FROM ( {0} ) as t",
    "SqlFuncTpl": {
      "Year": "YEAR({0})",
      "Month": "MONTH({0})"
      "DayOfMonth": "DAY({0})"
    }
  }
]

In this case custom connector is based on "odbc" and it defines database-specific options for Access ODBC driver.

Option	Description
`BaseConnector`	Base SQL connector identifier (listed above). For ODBC driver use "odbc".
`Connector`	New customized connector identifier.
`IdentifierFormat`	Format string for database identifiers (table and column names). In most cases this will be `[{0}]` (SQL Server-like), `"{0}"` (SQL-standard), `{0}` (MySql-like)
`ConnectionStringTpl`	Template for the connection string that allows you to incapsulate part of the connection string (say, `Driver=Some ODBC driver name;{0}`). Specify `{0}` to use connection string from cube config without modifications.
`SelectWithLimitTpl`	Template for SQL command that applies database-level LIMIT syntax.
`SqlFuncTpl`	Define SQL expressions for derived dimensions like date parts for dates. They are used by "InferSchema" option and you don't need to specify these templates if you don't use infer schema feature.

How-to for ODBC-based connectors:

Use Google BigQuery with ODBC driver