Starting here? This lesson is part of a full-length tutorial in using SQL for Data Analysis. Check out the beginning.
This lesson uses the same data from previous lessons, which was pulled from Crunchbase on Feb. 5, 2014. Learn more about this dataset.
Joining on multiple keys
There are couple reasons you might want to join tables on multiple foreign keys. The first has to do with accuracy.
The second reason has to do with performance. SQL uses “indexes” (essentially pre-defined joins) to speed up queries. This will be covered in greater detail the lesson on making queries run faster, but for all you need to know is that it can occasionally make your query run faster to join on multiple fields, even when it does not add to the accuracy of the query. For example, the results of the following query will be the same with or without the last line. However, it is possible to optimize the database such that the query runs more quickly with the last line included:
FROM tutorial.crunchbase_companies companies
LEFT JOIN tutorial.crunchbase_investments_part1 investments
ON companies.permalink = investments.company_permalink
AND companies.name = investments.company_name
It’s worth noting that this will have relatively little effect on small datasets.