Python Tutorial
Learn Python for business analysis using real-world data. No coding experience necessary.
Python Methods, Functions, & Libraries
Starting here? This lesson is part of a full-length tutorial in using Python for Data Analysis. Check out the beginning.
Goals of this lesson
In this lesson, you’ll learn about:
- Methods of dictionaries, specifically .keys() and .values()
- Functions, specifically print, type(), and len()
- Importing libraries with import
Using Mode Python notebooks
Mode is an analytics platform that brings together a SQL editor, Python notebook, and data visualization builder. Throughout this tutorial, you can use Mode for free to practice writing and running Python code.
- Log into Mode or create an account.
- Navigate to this report and click 'Duplicate'. This will take you to the SQL Query Editor.
- Click 'Python Notebook' under 'Notebook' in the left navigation panel. This will open a new notebook.
Now you’re all ready to go.
Methods
Let's use a dictionary you created in the previous lesson:
city_population = {
'Tokyo': 13350000, # a key-value pair
'Los Angeles': 18550000,
'New York City': 8400000,
'San Francisco': 1837442,
}
For a quick refresher of dictionaries, click here.
Say you want to see all the keys in the dictionary (in this case, cities). You can use a method called .keys() to get a list of the dictionary's keys. A method is an action that you can take on an object. If a real-world window were an object, its methods might be .open() or .close().
In this case, the object is the dictionary, and the action is to return the .keys():
city_population.keys()
['New York City', 'San Francisco', 'Los Angeles', 'Tokyo']
Remember how everything in Python is an object? That output is a list object! You can double-check this using type(), as in the previous lesson.
type(city_population.keys())
list
To further illustrate this, you can get the third item in the list of keys by referencing its index:
city_population.keys()[2]
'Los Angeles'
You can also get the values of the dictionary using the .values() method:
city_population.values()
[8400000, 1837442, 18550000, 13350000]
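The outputs above come from Python 2, where .keys() and .values() return lists. As a minimal sketch for readers running Python 3 (an assumption, since this lesson's code is written for Python 2): there, both methods return view objects that cannot be indexed directly, so the example wraps them in list() first. Python 3.7 and later also keep dictionary keys in insertion order, so the order differs from the output shown above.

```python
city_population = {
    'Tokyo': 13350000,
    'Los Angeles': 18550000,
    'New York City': 8400000,
    'San Francisco': 1837442,
}

# In Python 3, .keys() and .values() return views, so convert them to lists
cities = list(city_population.keys())
populations = list(city_population.values())

print(cities)       # keys, in insertion order on Python 3.7+
print(cities[2])    # third key, selected by index
print(populations)  # values, in the same order as the keys
```

The names cities and populations are illustrative and do not appear elsewhere in this lesson.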
Practice Problem
To reiterate what you learned last lesson, get the population for Los Angeles.
Using methods on combined objects
You can use methods on combined objects as well.
In the last lesson, you created a dictionary of lists by nesting individual lists of cities and their municipalities in a dictionary object. Here it is again:
municipalities = {
'New York City': [
'Manhattan',
'The Bronx',
'Brooklyn',
'Queens',
'Staten Island'
],
'Tokyo': [
'Akihabara',
'Harajuku',
'Shimokitazawa',
'Nakameguro',
'Shibuya',
'Ebisu/Daikanyama',
'Shibuya District',
'Aoyama',
'Asakusa/Ueno',
'Bunkyo District',
'Ginza',
'Ikebukuro',
'Koto District',
'Meguro District',
'Minato District',
'Roppongi',
'Shinagawa District',
'Shinjuku',
'Shinjuku District',
'Sumida District',
'Tsukiji',
'Tsukishima']
}
As a reminder, to get a single municipality, you would enter the dictionary’s variable name (municipalities), followed by the key and the list index (['Tokyo'][3]).
municipalities['Tokyo'][3]
'Nakameguro'
The keys of the dictionary are:
municipalities.keys()
['New York City', 'Tokyo']
The values associated with the keys in that dictionary are:
municipalities.values()
[['Manhattan', 'The Bronx', 'Brooklyn', 'Queens', 'Staten Island'],
['Akihabara',
'Harajuku',
'Shimokitazawa',
'Nakameguro',
'Shibuya',
'Ebisu/Daikanyama',
'Shibuya District',
'Aoyama',
'Asakusa/Ueno',
'Bunkyo District',
'Ginza',
'Ikebukuro',
'Koto District',
'Meguro District',
'Minato District',
'Roppongi',
'Shinagawa District',
'Shinjuku',
'Shinjuku District',
'Sumida District',
'Tsukiji',
'Tsukishima']]
Remember that lists are denoted with square brackets? Notice that there are three sets of [] in the output. That’s because the .values() output is a list containing New York’s municipalities and Tokyo’s municipalities, both of which are lists as well.
In other words, municipalities.values() is a list containing two lists.
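One way to confirm the "list containing two lists" structure is to inspect it piece by piece. The sketch below uses an abbreviated copy of municipalities (only three Tokyo entries, to keep it short) and wraps .values() in list(), which is needed on Python 3 and harmless on Python 2. The name all_values is illustrative.

```python
# Abbreviated copy of the municipalities dictionary (assumption: shortened for brevity)
municipalities = {
    'New York City': ['Manhattan', 'The Bronx', 'Brooklyn', 'Queens', 'Staten Island'],
    'Tokyo': ['Akihabara', 'Harajuku', 'Shimokitazawa'],
}

all_values = list(municipalities.values())

print(type(all_values))     # the outer object is a list
print(len(all_values))      # it holds one item per key
print(type(all_values[0]))  # each item is itself a list
print(all_values[0][1])     # index twice to reach a single string
```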
Functions
So far, you've learned about objects and methods. Now it's time to look at a few functions. They're very similar to methods in that they perform an action, but unlike methods, functions are not tied to specific objects. To relate this concept to our earlier window example, the function throw_rock() could be used with a real-world object window, but also with vase or cow or a number of other objects.
Functions typically go in front of an object name (with the object wrapped in parentheses), whereas a method is appended to the end of an object name. For example, compare throw_rock(window) with window.open().
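To make the comparison concrete, here is a toy sketch of the window example. Everything in it is hypothetical: Window, Vase, and throw_rock() are not real library objects, and the class keyword (not covered in this lesson) is used only to build objects that have methods.

```python
class Window:
    """A toy object with its own methods."""
    def __init__(self):
        self.is_open = False
        self.is_broken = False

    def open(self):   # method: tied to Window objects
        self.is_open = True

    def close(self):  # method: tied to Window objects
        self.is_open = False


class Vase:
    """Another toy object; it has no open() or close() methods."""
    def __init__(self):
        self.is_broken = False


def throw_rock(target):
    """Function: works on any object that has an is_broken attribute."""
    target.is_broken = True
    return target


window = Window()
vase = Vase()

window.open()       # method call: object first, then .method()
throw_rock(window)  # function call: function first, object in parentheses
throw_rock(vase)    # the same function works on a different kind of object

print(window.is_open, window.is_broken, vase.is_broken)
```

Calling vase.open() would raise an error, because open() belongs only to Window objects, while throw_rock() accepts either one.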
Type
You've already used a function, type(), several times in this tutorial. It can be used on any object. Try it out here using different items within the municipalities object:
print type(municipalities)
print type(municipalities['Tokyo'])
print type(municipalities['Tokyo'][0])
<type 'dict'>
<type 'list'>
<type 'str'>
In the output above, 'dict' stands for dictionary, 'list' stands for list, and 'str' stands for string. In other words, municipalities is a dictionary, municipalities['Tokyo'] is a list, and municipalities['Tokyo'][0] is a string.
You may have noticed that we used print for every line of code. (Yes, print counts as a function! It can also be written as print("object"), which is the form Python 3 requires.) If you want more than one line to print output, you need to insert print at the beginning of each line. This is because, by default, the notebook only displays the output associated with the last line of code in the cell. You can see this in action by removing the print part of the statements above:
type(municipalities)
type(municipalities['Tokyo'])
type(municipalities['Tokyo'][0])
str
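For readers on Python 3 (an assumption; the cells above use Python 2 syntax), the same check can be sketched with print() written as a function call. The dictionary is an abbreviated copy of municipalities, and Python 3 reports types as <class 'dict'> rather than <type 'dict'>.

```python
# Abbreviated copy of the municipalities dictionary (assumption: shortened for brevity)
municipalities = {
    'New York City': ['Manhattan', 'The Bronx', 'Brooklyn', 'Queens', 'Staten Island'],
    'Tokyo': ['Akihabara', 'Harajuku', 'Shimokitazawa'],
}

# Each print() call produces its own line of output,
# so all three results appear instead of only the last one.
print(type(municipalities))              # <class 'dict'>
print(type(municipalities['Tokyo']))     # <class 'list'>
print(type(municipalities['Tokyo'][0]))  # <class 'str'>
```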
Practice Problem
What type are the values of city_population?
Make sure you look at the answer to this problem, as it contains important concepts.
The resulting type, int, means integer, or a number without decimal places. You'll learn more about integers in a later lesson.
You should also take a minute to understand how Python is evaluating the second answer. Everything is executed from the inside out, similar to the order of operations in math (remember PEMDAS?). Here's the order in which Python runs the above line of code:
city_population.values()
city_population.values()[0]
type(city_population.values()[0])
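The inside-out order can be made visible by giving each intermediate result its own name. This is a sketch, not the solution's original code: step_1, step_2, and step_3 are illustrative names, and list() is added so the indexing also works on Python 3.

```python
city_population = {
    'Tokyo': 13350000,
    'Los Angeles': 18550000,
    'New York City': 8400000,
    'San Francisco': 1837442,
}

step_1 = list(city_population.values())  # innermost: get all the values
step_2 = step_1[0]                       # next: pick the first value
step_3 = type(step_2)                    # outermost: ask for its type

print(step_3)

# The nested one-line version evaluates in the same order and gives the same result
nested = type(list(city_population.values())[0])
print(nested is step_3)
```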
Length
It looks like Tokyo has a large number of administrative districts. How many exactly?
The len() function will describe how long an object is:
len(municipalities['Tokyo'])
22
len() returns something different depending on the type of object for which you're getting the length. For a list, it returns the number of items in the list, as demonstrated above.
For a string, len() returns the number of characters in the string (including spaces):
print municipalities['Tokyo'][2]
len(municipalities['Tokyo'][2])
Shimokitazawa
13
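As a small side-by-side sketch, here is len() applied to three kinds of objects. The dictionary case is not covered above; for a dictionary, len() returns the number of keys. The municipalities copy is abbreviated, so its Tokyo list is shorter than the 22 entries in the full version.

```python
# Abbreviated copy of the municipalities dictionary (assumption: shortened for brevity)
municipalities = {
    'New York City': ['Manhattan', 'The Bronx', 'Brooklyn', 'Queens', 'Staten Island'],
    'Tokyo': ['Akihabara', 'Harajuku', 'Shimokitazawa'],
}

list_length = len(municipalities['New York City'])  # items in a list
string_length = len('New York City')                # characters, spaces included
dict_length = len(municipalities)                   # keys in a dictionary

print(list_length)    # 5
print(string_length)  # 13
print(dict_length)    # 2
```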
Libraries
A library is a bundle of code made to help you accomplish routine tasks more quickly. Seaborn, for example, allows you to create visualizations with as little as one line of code. Without such a library, you'd have to write a ton of code to take an object and render a chart. Python is a popular language for data analysis in part because it has extremely robust libraries for data manipulation, visualization, machine learning, and a host of other applications.
Libraries are usually maintained by groups of contributors and made available online to anyone who wants to use them. This kind of collaboratively maintained software is often referred to as open source.
You need to import a library in order to access the things in it (like its objects and methods). To import a library, you usually have to download the code to the computer where Python is running, and then import it. In Mode, a number of libraries are already available to import, so you don’t have to download anything. Let's import NumPy, a library that is commonly used for mathematical methods:
import numpy # name of the library
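The import statement has a few common forms. The sketch below uses math from Python's standard library rather than NumPy, purely so that it runs without installing anything; the alias m is an arbitrary choice.

```python
import math             # import the whole library under its own name
import math as m        # import it under a shorter alias
from math import sqrt   # import a single function by name

print(math.sqrt(16))  # call a function through the library name
print(m.pi)           # the alias refers to the same library
print(sqrt(16))       # a directly imported name needs no prefix
```

The same three forms work for any installed library, including NumPy.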
Now that you have the list of values, you can take their mean (also known as the average) using a function called mean() from the NumPy library. First, you’ll assign the list of population values, city_population.values(), to a new variable name, population_values.
population_values = city_population.values()
population_values
[8400000, 1837442, 18550000, 13350000]
As you can see, this produces the expected result, which matches the output of running city_population.values() on its own:
city_population.values()
[8400000, 1837442, 18550000, 13350000]
It's good practice to create variables that refer to objects you'll need later. Naming things will help you keep track of your work and will make your code much more readable to others.
Now that population_values is stored, you can use the numpy.mean() function on population_values:
numpy.mean(population_values)
10534360.5
The mean value of the city populations is about 10.5 million.
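As a cross-check that does not need NumPy, the mean can be computed by hand as the sum of the values divided by how many there are. This sketch assumes Python 3, where / always performs true division; on Python 2, dividing two integers would drop the .5. The name manual_mean is illustrative.

```python
population_values = [8400000, 1837442, 18550000, 13350000]

# mean = sum of the values / number of values
manual_mean = sum(population_values) / len(population_values)

print(manual_mean)  # 10534360.5
```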
Lesson summary
In this lesson, you learned about:
- Methods of dictionaries, specifically .keys() and .values()
- Functions, specifically print, type(), and len()
- Importing libraries with import
In the next lesson, you'll learn how to work with a DataFrame (a powerful tabular data structure from the pandas library) to view and process data.