Additional Resources

template

Defining Metrics: A Template  for Working With Stakeholders

Guide

10 Things Modern Business  Intelligence Should Enable Today

video

Watch a Product Tour of Mode

Get started with Mode

Creating Persistent Derived Tables

Joel Carron, Data Scientist at Mode

February 8, 2017

NaN minute read

Now that the Mode SQL editor supports running multiple SQL statements, you can do so much more than pull data back from the database. One door this opens is the ability to create persistent derived tables, which are tables generated when the results of a SQL query are written back into your database. Once in your database, these tables can be queried just like any other table.

Why persistent derived tables are useful

Persistent derived tables provide performance gains by enabling you to query against data that has already been summarized, rather than performing transformations at the time of each query execution.

Suppose you have a raw event table called page_visits that contains a record of every page visit on your website. You'd like to report on the count of unique site visitors by day over the last year. Executing such a query against your raw event table directly could be very expensive, as you would have to aggregate across every row in page_visits for the last year. Instead, you can pre-calculate this aggregation by creating a persistent derived table with only the data you need (in this case, day and unique_visitor_count). Querying this table instead would be much more performant, as the expensive aggregation would have already taken place.

How to create persistent derived tables in Mode

You'll only be able to create a persistent derived table using Mode if you're authorized to write to at least one schema in your database.

Optional: Set up a dedicated schema for persistent derived tables

While it's not necessary, creating a dedicated schema can help you manage database permissions and keep persistent derived tables organized. Rather than granting write-access to your entire database, where people might inadvertently alter important tables or cause other issues, you can instead grant write-access to a single schema. The unique schema will also ensure your persistent derived tables can be easily discovered and are distinguishable from raw tables.

Step 1: Write a query

The first step is to write a query that returns the data you want to write back to your database as a table.

In this example, information about orders is aggregated by month and sales rep.

SELECT DATE_TRUNC('month',o.occurred_at) AS month,
         sr.name AS sales_rep,
         SUM(o.total) AS units_sold,
         SUM(o.total_amt_usd) AS sales_usd
    FROM demo.orders o
    JOIN demo.accounts a
      ON o.account_id = a.id
    JOIN demo.sales_reps sr
      ON sr.id = a.sales_rep_id
GROUP BY 1,2

Step 2: Create a persistent table from your results

Creating a table from your results will typically consist of three statements— beginning a transaction, creating the table, and committing the transaction. A database transaction is a block of code used to generate some change in a database. Each component of a transaction must execute successfully in order for the operation to complete. If a transaction fails midway through, all the changes made to that point get rolled back, ensuring your data is consistent.

The specific syntax you'll use to carry out each statement will depend on the database you're using. The table below has information on the syntax for these statements, with links to the database documentation.

*BigQuery does not require beginning and committing a transaction. Simply use a Create Table statement.

The example below (written for Postgres) creates a persistent derived table from the results of the previous query:

 BEGIN;

  CREATE TABLE monthly_sales AS (
  SELECT DATE_TRUNC('month',o.occurred_at) AS month,
         sr.name AS sales_rep,
         SUM(o.total) AS units_sold,
         SUM(o.total_amt_usd) AS sales_usd
    FROM demo.orders o
    JOIN demo.accounts a
      ON o.account_id = a.id
    JOIN demo.sales_reps sr
      ON sr.id = a.sales_rep_id
GROUP BY 1,2
        );

  COMMIT;

To update the persistent derived table, you’ll first need to drop it, as most databases won’t allow you to create a table that already exists. You can accomplish this by adding a DROP TABLE IF EXISTS transaction to the beginning of your query.

BEGIN;

   DROP TABLE IF EXISTS demo.monthly_sales;

 COMMIT;

  BEGIN;

  CREATE TABLE demo.monthly_sales AS (
  SELECT DATE_TRUNC('month',o.occurred_at) AS month,
           sr.name AS sales_rep,
           SUM(o.total) AS units_sold,
           SUM(o.total_amt_usd) AS sales_usd
    FROM demo.orders o
    JOIN demo.accounts a
      ON o.account_id = a.id
    JOIN demo.sales_reps sr
      ON sr.id = a.sales_rep_id
GROUP BY 1,2
        );

  COMMIT;

Step 3: Set up a schedule

Now that you've created a persistent derived table, schedule it to update on a regular cadence using Mode's report schedule feature.

Keep in mind that the query used to generate your persistent derived table will still be hitting your database whenever the schedule runs, so it's a good idea not to run it more frequently than necessary. Scheduling persistent derived tables to update at off-peak times is a good way to manage your database's resources. To find low-impact times to schedule these, Mode users on the Plus Plan can view a visualization of their query runs by hour of week on the Stats page.

Managing permissions

Mode organizations on the Plus Plan have some options for managing permissions for persistent derived tables.

You can limit who is able to create persistent derived tables on a schema by limiting access to the database connection that has write access. This will also prevent people who do not have access from editing the queries.

If you don't want to limit write-access to your database but you do want to restrict edit-access to the reports you're using to generate persistent derived tables, you can do so by moving those reports into a limited Space.

Pro Tip: Create an additional schema for scratch-work tables

Here at Mode, we set up an additional schema specifically for scratch-work, which includes persistent derived tables that are not intended for wide use across our organization. Because all of these tables exist in one schema that's designated for scratch-work, we can easily purge it in case people forget to drop stuff. It also enables us to grant everyone the ability to create this sort of work, without being overly restrictive.

Get our weekly data newsletter

Work-related distractions for data enthusiasts.

Additional Resources

template

Defining Metrics: A Template  for Working With Stakeholders

Guide

10 Things Modern Business  Intelligence Should Enable Today

video

Watch a Product Tour of Mode

Mode is the intelligence layer for your data stack

Get a firsthand look at Mode's key features & workflows

Get started with Mode

Creating Persistent Derived Tables

Joel Carron, Data Scientist at Mode

Why persistent derived tables are useful

How to create persistent derived tables in Mode

Optional: Set up a dedicated schema for persistent derived tables

Step 1: Write a query

Step 2: Create a persistent table from your results

Step 3: Set up a schedule

Managing permissions

Pro Tip: Create an additional schema for scratch-work tables

Get our weekly data newsletter

Get started with Mode

Related Articles

What is Tableau & What is it Used For? An Analyst's Guide

Use Common Table Expressions (CTE) to Keep Your SQL Clean