This is a fantastic start to defining the use cases that need to be addressed. I would add one more metric type that seems to be hinted at, but not defined fully. If we go back to Kimball's definition of metrics, we can add semi-additive metrics. These are metrics that can sometimes be aggregated across some dimensions, but not all. For instance, month-end balances can't be summed across months. They represent a running total. They can be averaged across months though. Another example is a budget or limit in a parent table that can't be aggregated across it's children, but needs to be aggregated across other budget records. In these instances, we can do nested sql to get the appropriate aggregations -- sum up line items, join to parent table and sum again. Also, we could join and sum making sure that we divided the parent metric by the # of line items so everything sums up correctly.

I would argue that these solutions are inefficient and a hack that makes self-service analytics by less technical people almost impossible to achieve. It's not that an end user shouldn't have to think about data granularity, but they shouldn't have to create technical solutions on their own for it. A metric store that can address this will be one that has the basic capabilities needed for creating analysis-ready data sets in an organization.

Aside from semi-additive metrics, the other question for metric stores is the reliance on relational models. How can a metric store function on top of a knowledge graph? How do we implement it as a graph or as a virtual graph (i.e. Stardog)? This has to be asked just due to the scalability and robust flexibility of graph modeling.

Expand full comment
May 28, 2022·edited May 28, 2022

My project: https://github.com/Agile-Data/flat-ql

It compiles flat query language into SQL dialect of various database for data analysis.

Expand full comment

Great blog post Amit.

At Humanic, we further believe that the metrics layer needs to be integrated within a context of a tool that the business owner (Revenue Leader) uses to take some revenue generating action, otherwise it becomes yet another tool that Data Scientists and Business Analysts will be utilizing.

That by itself is not a bad idea except that much is lost in translation when a business user needs to communicate data requirements to a Data Scientist.

Would love to get a coffee to discuss further! saksena@humanic.ai

Expand full comment

Amit, Excellent article and thanks for sharing your thoughts. I agree that there are benefits of unbundling the metrics layers and solution #3 is promising.

If solution #3 were adopted by a few vendors, the next question is, what will be the bridge between visualization grammar and the query generation tool.

Generation of queries for multiple aggregation levels is non-trivial. What if the tool vendor wishes to optimize the query created by a separate layer?

Expand full comment
Jan 15, 2022·edited Jan 15, 2022

Merits of metrics! 🙌 🎉

Expand full comment