OLAP cube definition
====================

Concept
-------

An OLAP cube in XLTable consists of measure groups and dimensions.

- Measure groups contain measures (for example, sum, count, average)
- Dimensions contain attributes (for example, regions, stores, time)

To make data available in Excel Pivot Tables, you must define the OLAP cube structure:

- which measures will be included
- which dimensions will be included
- which attributes each dimension contains
- which tables store data for measures and dimensions

If you have experience designing OLAP cubes in Microsoft SQL Server Analysis Services,
the overall logic will feel familiar.

In Analysis Services, cube structure is designed in a graphical environment and then deployed.
In XLTable, cube structure is defined using SQL scripts.

Cube definition storage
-----------------------

XLTable stores cube definitions in a database table.

Each cube definition is a sequence of SQL scripts describing:

- measure groups
- dimensions
- relationships
- calculated fields
- access rules
- Jinja logic

These scripts are written sequentially and stored in the analytical database in a table named ``olap_definition``.

Table olap_definition structure:

- ID — cube identifier
- Definition — SQL script defining cube structure

When a user connects from Excel:

1. XLTable reads cube definitions from this table
2. Displays available cubes
3. After selection, XLTable builds the list of measures and dimensions
4. Excel displays them in Pivot Table fields

Unified example
---------------

Follow this link for an example of creating an OLAP cube for a ClickHouse database: :ref:`unified_example` .

Cube definition rules
---------------------

Cube structure is defined using SQL tags embedded in SQL scripts.

Examples:

- olap_source
- olap_measures
- olap_dimensions

See the full list of tags: :ref:`sql_tags`

Measure group design
^^^^^^^^^^^^^^^^^^^^

The first step is defining the data source for a measure group.

Example:

.. code-block:: sql

   --olap_source Sales
   SELECT
   --olap_measures
       sum(sales.sale_qty) as sales_sum_qty
   FROM db.Sales sales

The order of blocks within an ``olap_source`` section is mandatory:

.. code-block:: text

   --olap_source <Name>         ← 1. source name
   SELECT                       ← 2. SELECT keyword (on its own line)
   --olap_measures              ← 3. section type (or --olap_dimensions)
       <field list>             ← 4. fields with aliases and tags
   FROM <table> <alias>         ← 5. main table
   LEFT JOIN ...                ← 6. joins (optional)

Important rules:

- table aliases must be unique across the cube
- the same table may be reused with a different alias

Measure definition
^^^^^^^^^^^^^^^^^^

A measure consists of:

1. source column
2. aggregation function
3. resulting column alias

Example:

.. code-block:: sql

   sum(sales.sale_qty) as sales_sum_qty

Naming recommendation:

::

   <table_alias>_<aggregation>_<column>

Example:

::

   sales_sum_qty

Measure metadata tags
^^^^^^^^^^^^^^^^^^^^^

Additional tags may be defined on the same line:

- translation — display name in Excel (optional; if omitted, the field alias is used as the display name)
- format — numeric format in Pivot Tables

Example:

.. code-block:: sql

   sum(sales.sale_qty) as sales_sum_qty --translation=`Sales Quantity` --format=`#,##0;-#,##0`

Important: we place each measure on a new line, separated by commas.
Next, use the ``olap_measures`` tag before the list of measures to identify them for the OLAP cube.
This tag must be preceded by a ``SELECT`` statement, creating a standalone script that you can run in your database to see the results. 
Finally, add the ``olap_source`` tag followed by the measure group name on the same line; this tells the system whether the section contains a measure group or a dimension.


Dimension design
^^^^^^^^^^^^^^^^

We suggest reviewing the definition of measure groups first, as they are very similar to dimensions.

Dimensions define the analytical context for measures.

Typical examples:

- stores
- regions
- products
- time

Example:

.. code-block:: sql

   --olap_source Stores
   SELECT
   --olap_dimensions
       stores.id as store_id
      ,stores.store_name as store_name
   FROM db.Stores stores

Dimension metadata tags
^^^^^^^^^^^^^^^^^^^^^^^

Attributes may include tags such as translation (optional; if omitted, the field alias is used as the display name).

Example:

.. code-block:: sql

   stores.store_name as store_name --translation=`Store`

Place the ``olap_dimensions`` tag before the attribute list, preceded by a SELECT statement to make the script executable. 
Above the SELECT statement, add the ``olap_source`` tag followed by the dimension name as it should appear in the Excel PivotTable. 
You can define multiple measure groups and dimensions this way, starting each with olap_source and separating the scripts with a blank line.


Hierarchies
^^^^^^^^^^^

Hierarchy defines parent-child relationships between dimension attributes.

Example:

.. code-block:: sql

   times.year as times_year --hierarchy=`Dates`
   times.quarter as times_quarter --hierarchy=`Dates`
   times.month as times_month --hierarchy=`Dates`   
   times.day as times_day  --hierarchy=`Dates`

Relationships
^^^^^^^^^^^^^

Relationships connect measures and dimensions.

Example:

.. code-block:: sql

   FROM db.Sales sales
   LEFT JOIN db.Stores stores ON sales.store_id = stores.id

Rules:

- always use LEFT JOIN
- joins must be explicit

Measure groups support both direct and indirect dimension relationships. Each link must be defined on a new line.
Indirect connections occur when a dimension links to a measure group via an intermediary dimension.

Special relationship types
^^^^^^^^^^^^^^^^^^^^^^^^^^

many-to-many:

.. code-block:: sql

   LEFT JOIN db.Managers managers ON sales.store_id = managers.store_id --relationship=`many-to-many``

Many-to-many relationships follow the classic Analysis Services model, where dimensions lack a unique key. Instead, a single measure group value maps to multiple dimension rows. For example, multiple managers can be assigned to the same store, causing overlapping results when filtering.

one-table:

.. code-block:: sql

   --olap_source Sales
   SELECT ...
   FROM db.sales sales
   LEFT JOIN db.sales sales --relationship=`one-table`

For denormalized sources like ClickHouse, use the relationship=`one-table` tag to link measures and dimensions within a single table. This bypasses the unique alias rule and the LEFT JOIN operation. The OLAP server will query the flat table directly; no ON clause or join columns are required.

Calculated fields
-----------------

Calculated fields are virtual measures computed from other measures.

Example:

.. code-block:: sql

   --olap_cube
   (sales_qty/stock_avg_qty) as turnover --translation=`Turnover`

CTE
---

CTE scripts define temporary datasets used in cube SQL.

Example:

.. code-block:: sql

   WITH calendar AS (
       SELECT ...
   )

CTEs can serve as data sources for both measure groups and dimensions.

User roles
----------

User roles control access to cube data.

Example:

.. code-block:: sql

   --olap_user_role
   --olap_user_groups
   finance_users

Visibility:

.. code-block:: sql

   --olap_calculated_fields_visible
   all
   --olap_measures_visible
   sales_sum
   --olap_dimensions_visible
   region, store

Access filters:

.. code-block:: sql

   --olap_access_filters
   region = 'EU'

SQL generation logic
--------------------

In short, XLTable works as follows: when a user selects fields in an Excel PivotTable, Excel sends an MDX query to the OLAP server. The server parses the MDX and, based on the cube's definition, generates several SQL queries to the database. To build efficient OLAP cubes, it is essential to understand how these SQL queries are constructed.

When a user selects fields in Excel:

1. Excel sends an MDX query
2. XLTable interprets selected measures and dimensions
3. SQL is generated only for selected elements
4. Queries are executed in the database
5. Results are returned to Excel Pivot Table

If multiple measure groups exist:

- SQL is generated per group
- results are merged using FULL JOIN
- shared dimension attributes are used as join keys

Put simply, SQL generation follows a basic principle: the queries executed are exactly what is defined in the cube metadata.
Enable logging in settings.json → WRITE_LOG to inspect generated SQL.

Jinja scripts
-------------

When working with Big Data, performance and database load are critical. This means your SQL queries must be both accurate and efficient. Furthermore, users often require complex metrics that exceed the standard capabilities of OLAP cube measures. Jinja templates allow you to control SQL syntax without limitations, dynamically adapting the queries based on the user's selected fields and filters in Excel.

Jinja scripts allow modifying generated SQL dynamically.

Use cases:

- performance optimization
- conditional SQL logic
- advanced metrics

Example of a Jinja script modifying SQL:

.. code-block:: sql

   --olap_jinja
   {{ sql_text | replace("salesly.date_sale", "addYears(salesly.date_sale, 1)") }}

The principle is simple: Jinja scripts are defined within the cube and applied to modify the generated SQL query. You can define scripts for specific measure groups, dimensions, or the entire cube. A script assigned to a measure group only affects its specific SQL segment, while a cube-level script applies to the overall query.

Execution order:

1. measure group Jinja
2. dimension Jinja
3. cube-level Jinja

Jinja scripts take the SQL query text and the context data as inputs. This context includes the cube definition, user-selected fields, active filters, and other metadata essential for modifying the query dynamically.
See: :ref:`jinja_var`

Best practices for cube design
------------------------------

Naming conventions:

- measures → <table_alias>_<aggregation>_<column>
- dimensions → <table_alias>_<column>

Aliases must be unique.

Table alias rules:

- every source must have a unique alias
- aliases must remain stable

Dimension strategy:

- use descriptive attributes
- avoid high-cardinality fields

Hierarchy design:

- build logical parent-child structures
- maintain natural ordering

Join strategy:

- always use LEFT JOIN
- define joins explicitly

Measure design:

- keep aggregations simple
- avoid nested SQL

Calculated fields:

- use only when required
- keep readable and testable

Performance:

- minimize joins
- pre-aggregate data in database
- reduce cube complexity

Cardinality:

- avoid using IDs as primary dimensions
- prefer grouped attributes

Jinja:

- use for small SQL adjustments
- avoid complex logic

Security:

- define roles early
- restrict sensitive measures

Maintainability:

- separate blocks clearly
- version control cube definitions

Design philosophy:

SQL first.

Everything in XLTable cubes is defined using SQL:

- structure
- logic
- metadata
- security