SQL LEFT JOIN

Starting here? This lesson is part of a full-length tutorial in using SQL for Data Analysis. Check out the beginning.

The LEFT JOIN command

{{ page.seo-title }}

Let's start by running an INNER JOIN on the Crunchbase dataset and taking a look at the results. We'll just look at company-permalink in each table, as well as a couple other fields, to get a sense of what's actually being joined.

SELECT companies.permalink AS companies_permalink,
       companies.name AS companies_name,
       acquisitions.company_permalink AS acquisitions_permalink,
       acquisitions.acquired_at AS acquired_date
  FROM tutorial.crunchbase_companies companies
  JOIN tutorial.crunchbase_acquisitions acquisitions
    ON companies.permalink = acquisitions.company_permalink

You may notice that "280 North" appears twice in this list. That is because it has two entries in the tutorial.crunchbase_acquisitions table, both of which are being joined onto the tutorial.crunchbase_companies table.

Now try running that query as a LEFT JOIN:

SELECT companies.permalink AS companies_permalink,
       companies.name AS companies_name,
       acquisitions.company_permalink AS acquisitions_permalink,
       acquisitions.acquired_at AS acquired_date
  FROM tutorial.crunchbase_companies companies
  LEFT JOIN tutorial.crunchbase_acquisitions acquisitions
    ON companies.permalink = acquisitions.company_permalink

You can see that the first two companies from the previous result set, #waywire and 1000memories, are pushed down the page by a number of results that contain null values in the acquisitions_permalink and acquired_date fields.

This is because the LEFT JOIN command tells the database to return all rows in the table in the FROM clause, regardless of whether or not they have matches in the table in the LEFT JOIN clause.

Sharpen your SQL skills

You can explore the differences between a LEFT JOIN and a JOIN by solving these practice problems:

Write a query that performs an inner join between the tutorial.crunchbase_acquisitions table and the tutorial.crunchbase_companies table, but instead of listing individual rows, count the number of non-null rows in each table.
Modify the query above to be a LEFT JOIN. Note the difference in results.

Now that you've got a sense of how left joins work, try this harder aggregation problem:

Count the number of unique companies (don't double-count companies) and unique acquired companies by state. Do not include results for which there is no state data, and order by the number of acquired companies from highest to lowest.

Next Lesson

SQL RIGHT JOIN

Looks like you've got a thing for cutting-edge data news.

Get the latest.

Get started