February 14, 2023
In case you missed it, we hosted a webinar between Mode and dbt Labs on why semantic layers matter, how to think about them, and how Mode integrates with dbt’s semantic layer.
The conversation was between Anna Filippova, Director of Community & Data at dbt Labs, and Benn Stancil, co-founder and Chief Analytics Officer at Mode. We’ve summarized the highlights from the talk below.
For the full recap, download the webinar here.
Anna: “A semantic layer is a business representation of the data you have in your data warehouse. If the model data in the warehouse is a representation of business knowledge, it’s still not accessible to folks inside the business. It’s still very focused on the underlying representation. The semantic layer takes what you have defined as metrics and entities, and translates them into things the business can use consistently.”
The value of the semantic layer is the centralization and standardization of critical business metrics that can be used in other tools across a business. dbt’s Semantic Layer should contain the definition of any metric that more than one team needs to use, such as Annual Recurring Revenue (ARR).
ARR not only shows the financial performance of a company, but it can also affect the company’s valuation, and therefore its planning in the years to come. The definition of ARR differs in nuanced ways from company to company. Companies need to bake into the definition answers to questions like: How do we account for discounts? Does ARR include churned customers? What does churn mean: is it a canceled subscription, or simply a period of inactivity? Wherever a company lands in these debates about reporting ARR, centralizing the definition in the semantic layer allows everyone in the organization to share a true North Star.
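To make those questions concrete, here is a minimal Python sketch of an ARR definition in which each debate becomes an explicit, named choice. It is an illustration only, not dbt's metric syntax: the `Subscription` fields, the `ArrPolicy` options, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Subscription:
    # Hypothetical record shape, invented for this illustration.
    customer_id: str
    monthly_price: float            # list price per month
    discount_pct: float             # 0.0 (no discount) to 1.0 (free)
    canceled: bool
    days_since_last_activity: int


@dataclass(frozen=True)
class ArrPolicy:
    # Each field answers one of the definitional questions above.
    apply_discounts: bool = True                  # count discounted or list price?
    include_churned: bool = False                 # does ARR include churned customers?
    inactivity_churn_days: Optional[int] = None   # None: only cancellation is churn


def is_churned(sub: Subscription, policy: ArrPolicy) -> bool:
    """Churn is a cancellation, or (optionally) a period of inactivity."""
    if sub.canceled:
        return True
    if policy.inactivity_churn_days is not None:
        return sub.days_since_last_activity >= policy.inactivity_churn_days
    return False


def annual_recurring_revenue(subs: Iterable[Subscription], policy: ArrPolicy) -> float:
    """Sum annualized subscription revenue under the given policy."""
    total = 0.0
    for sub in subs:
        if is_churned(sub, policy) and not policy.include_churned:
            continue
        price = sub.monthly_price
        if policy.apply_discounts:
            price *= 1.0 - sub.discount_pct
        total += price * 12
    return total
```

Two teams that call `annual_recurring_revenue` with different policies will report different numbers from the same data, which is the inconsistency that a single shared definition is meant to prevent.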
Data teams cannot and should not define metrics on their own; they should define them in partnership with their business teams. Use this template to define metrics with stakeholders.
"The most important things that have to be consistent across the business are the things that should be defined in your semantic layer."
A template for data teams and stakeholders to define metrics together.
dbt’s semantic layer helps data folks avoid having to define critical concepts in multiple places. As someone who used to spend a lot of time in the data code base, Anna would have to define metrics in a dbt model and then again in another semantic layer to make them accessible more broadly to the rest of the organization. Now, with dbt, in most cases you only have to define a metric once, and then it can be used in your BI tools, like Mode.
And because dbt is vendor-agnostic as to what sits on top of it, this logic can be used from any BI tool (unlike other semantic layers, like LookML). Anna’s take was that the semantic layer should live close to the models that power it.
Anna: “The right number of layers is as few as possible, and those few as possible are very well integrated with each other.
We didn’t really have a standard for a Semantic Layer, we didn’t have a common representation of this that was available to folks at scale, and it’s non-trivial to migrate from one to the other, and so I think that’s why you see a variety of folks focusing on that problem.
I’m biased, I think that the semantic layer should live really close to the models that power the Semantic Layer because it’s a much better developer experience and a logical place to do all of your data governance.”
Benn: “The big differences to me are 1) the Semantic Layer not only promotes it in a table, it actually contains the code that generates it, and 2) because it has that, it is more definitionally correct, in the sense that it is like configuration code: it tells you what you’re doing. The code itself is the definition.
You don’t look up the way someone defined it when they documented it; you look up the code, which is definitionally correct.”
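As a rough illustration of the idea that the code itself is the definition, the Python sketch below stores a metric as a small piece of configuration and renders the query from it, so the definition a person reads is the same artifact that generates the number. This is not dbt's metric specification or API; the `Metric` class, its fields, and the table and column names are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Metric:
    # Hypothetical definition format, invented for this illustration.
    name: str
    table: str
    expression: str                 # SQL aggregate that computes the metric
    filters: Tuple[str, ...] = ()   # SQL predicates that are part of the definition

    def to_sql(self, group_by: Optional[str] = None) -> str:
        """Render the query straight from the definition."""
        columns = [f"{self.expression} AS {self.name}"]
        if group_by:
            columns.insert(0, group_by)
        sql = f"SELECT {', '.join(columns)} FROM {self.table}"
        if self.filters:
            sql += " WHERE " + " AND ".join(self.filters)
        if group_by:
            sql += f" GROUP BY {group_by}"
        return sql


# Defined once; every consumer renders its query from the same object.
arr = Metric(
    name="arr",
    table="subscriptions",
    expression="SUM(monthly_price * 12)",
    filters=("canceled = FALSE",),
)

print(arr.to_sql())
print(arr.to_sql(group_by="plan"))
```

A dashboard that wants ARR by plan and a report that wants the company-wide total both start from `arr`, so neither can drift from the shared definition without changing it for everyone.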
Anna: “Speaking from being a downstream consumer of this very same thing, one of the things I get really excited about is the ability to say ‘I want to have my KPIs and health measures of the dbt community I track internally to be easily accessible throughout the company. I want those to live in company dashboards, I want those to live in business review ... and be easily accessible to managers, etc.’
This is where I get really excited about partnerships like with Mode. The BI layer, the layer that Mode represents, gives my team lots of flexibility with how they approach working with those metrics without having to worry about how they’re defined under the hood. [They’re able to] focus on actually using them to make decisions about where we go in the future as opposed to getting them stood up and defined.”
Mode can be the front end to the dbt layer—or the point of access to the dbt metrics. To see how this works, learn about Mode's dbt Integration.
dbt is built around helping people move up the stack, letting folks focus more on the substance of the data problem they’re solving rather than fighting with the tools they have.
Anna: “It allows you to do more complex things in a more simple way. How do you make management of the underlying complexity of your data easier over time? How do you make it easier for someone who is downstream to participate in that process in the future? How do you make it easier for a CEO who doesn’t want to query things from a table in the warehouse?”
Benn and Anna agree there’s a long-term role AI can play in this, but it’s not the most critical one. Businesses should always have their best human minds defining the critical business concepts in the semantic layer. AI tools (like ChatGPT, the example given) can help you stop thinking about repetitive things, like writing a very standard sort of email or document, so they can allow you to think more broadly about the concept rather than the details of its execution.
A machine can handle tasks like automatically segmenting things, but it would struggle to infer what it means for a business to deliver value to its customers, with all the nuances of human needs and expectations.
Anna: “I think that your AI or deep learning model is only going to be able to surface things that already exist. Things that are already common patterns—and may not necessarily be good patterns… That level of importance, of things that are most critical to the business, still has to come from the humans because it’s humans who are making decisions about where to go next.”
Learn more about the differences between dbt’s Semantic Layer and Mode’s Datasets.