Discussion about this post

User's avatar
D-503's avatar

This is a fantastic start to defining the use cases that need to be addressed. I would add one more metric type that seems to be hinted at, but not defined fully. If we go back to Kimball's definition of metrics, we can add semi-additive metrics. These are metrics that can sometimes be aggregated across some dimensions, but not all. For instance, month-end balances can't be summed across months. They represent a running total. They can be averaged across months though. Another example is a budget or limit in a parent table that can't be aggregated across it's children, but needs to be aggregated across other budget records. In these instances, we can do nested sql to get the appropriate aggregations -- sum up line items, join to parent table and sum again. Also, we could join and sum making sure that we divided the parent metric by the # of line items so everything sums up correctly.

I would argue that these solutions are inefficient and a hack that makes self-service analytics by less technical people almost impossible to achieve. It's not that an end user shouldn't have to think about data granularity, but they shouldn't have to create technical solutions on their own for it. A metric store that can address this will be one that has the basic capabilities needed for creating analysis-ready data sets in an organization.

Aside from semi-additive metrics, the other question for metric stores is the reliance on relational models. How can a metric store function on top of a knowledge graph? How do we implement it as a graph or as a virtual graph (i.e. Stardog)? This has to be asked just due to the scalability and robust flexibility of graph modeling.

Expand full comment
YeongHer's avatar

My project: https://github.com/Agile-Data/flat-ql

It compiles flat query language into SQL dialect of various database for data analysis.

Expand full comment
6 more comments...

No posts