mode helix
NEWIntroducing Helix—the first instant, responsive data engine.Learn more

SQL INNER JOIN

Starting here? This lesson is part of a full-length tutorial in using SQL for Data Analysis. Check out the beginning.

INNER JOIN

In the previous lesson, you learned the basics of SQL joins using a data about college football players. All of the players in the players table match to one school in the teams table. But what if the data isn't so clean? What if there are multiple schools in the teams table with the same name? Or if a player goes to a school that isn't in the teams table?

If there are multiple schools in the teams table with the same name, each one of those rows will get joined to matching rows in the players table. Returning to the previous example with Michael Campanaro, if there were three rows in the teams table where school_name = 'Wake Forest', the join query above would return three rows with Michael Campanaro.

It's often the case that one or both tables being joined contain rows that don't have matches in the other table. The way this is handled depends on whether you're making an inner join or an outer join.

We'll start with inner joins, which can be written as either JOIN benn.college_football_teams teams or INNER JOIN benn.college_football_teams teams. Inner joins eliminate rows from both tables that do not satisfy the join condition set forth in the ON statement. In mathematical terms, an inner join is the intersection of the two tables.

SQL Inner Join

Therefore, if a player goes to a school that isn't in the teams table, that player won't be included in the result from an inner join. Similarly, if there are schools in the teams table that don't match to any schools in the players table, those rows won't be included in the results either.

Joining tables with identical column names

When you join two tables, it might be the case that both tables have columns with identical names. In the below example, both tables have columns called school_name:

SELECT players.*,
       teams.*
  FROM benn.college_football_players players
  JOIN benn.college_football_teams teams
    ON teams.school_name = players.school_name

The results can only support one column with a given name—when you include 2 columns of the same name, the results will simply show the exact same result set for both columns even if the two columns should contain different data. You can avoid this by naming the columns individually. It happens that these two columns will actually contain the same data because they are used for the join key, but the following query technically allows these columns to be independent:

SELECT players.school_name AS players_school_name,
       teams.school_name AS teams_school_name
  FROM benn.college_football_players players
  JOIN benn.college_football_teams teams
    ON teams.school_name = players.school_name

Sharpen your SQL skills

Write a query that displays player names, school names and conferences for schools in the "FBS (Division I-A Teams)" division.

Next Lesson

SQL Outer Joins

Looks like you've got a thing for cutting-edge data news.

Get the latest.

Get started