Python Tutorial
Learn Python for business analysis using real-world data. No coding experience necessary.
Start Now
Mode Studio
The Collaborative Data Science Platform
SQL Data Types
In this lesson we'll cover:
Data types
In previous lessons, you learned that certain functions work on some data types, but not others. For example, COUNT
works with any data type, but SUM
only works for numerical data (if this doesn't sound familiar, you should revisit this lesson. This is actually more complicated than it appears: in order to use SUM
, the data must appear to be numeric, but it must also be stored in the database in a numeric form.
You might run into this, for example, if you have a column that appears to be entirely numeric, but happens to contain spaces or commas. Yes, it turns out that numeric columns cannot contain commas—If you upload data to Mode with commas in a column full of numbers, Mode will treat that column as non-numeric. Generally, numeric column types in various SQL databases do not support commas or currency symbols. To make things more complicated, SQL databases can store data in many different formats with different levels of precision.
The INTEGER
data type, for example, only stores whole numbers—no decimals. The DOUBLE PRECISION
data type, on the other hand, can store between 15 and 17 significant decimal digits (almost certainly more than you need unless you're a physicist). There are a lot of data types, so it doesn't make sense to list them all here. For the complete list, click here.
Here is the list of exact data types stored in Mode:
Imported as | Stored as | With these rules |
---|---|---|
String | VARCHAR(1024) | Any characters, with a maximum field length of 1024 characters. |
Date/Time | TIMESTAMP | Stores year, month, day, hour, minute and second values as YYYY-MM-DD hh:mm:ss. |
Number | DOUBLE PRECISION | Numerical, with up to 17 significant digits decimal precision. |
Boolean | BOOLEAN | Only TRUE or FALSE values. |
"Imported as" refers to the types that is selected in the import flow (see image below), "Stored as" refers to the official SQL data type, and the third column explains the rules associated with the SQL data type.
Changing a column's data type
It's certainly best for data to be stored in its optimal format from the beginning, but if it isn't, you can always change it in your query. It's particularly common for dates or numbers, for example, to be stored as strings. This becomes problematic when you want to sum a column and you get an error because SQL is reading numbers as strings. When this happens, you can use CAST
or CONVERT
to change the data type to a numeric one that will allow you to perform the sum.
You can actually achieve this with two different type of syntax. For example, CAST(column_name AS integer)
and column_name::integer
produce the same result.
You could replace integer
with any other data type that would make sense for that column—all values in a given column must fit with the new data types.
Mode Community (the site you're using to complete this tutorial) performs implicit conversion in certain circumstances, so data types are rarely likely to be problematic. However, if you're accessing an internal database (your employer's, for example), you may need to be careful about managing data types for some functions.
Sharpen your SQL skills
Practice Problem
Convert the funding_total_usd
and founded_at_clean
columns in the tutorial.crunchbase_companies_clean_date
table to strings (varchar format) using a different formatting function for each one.
Next Lesson
SQL Date Format