Python Methods, Functions, & Libraries

Starting here? This lesson is part of a full-length tutorial in using Python for Data Analysis. Check out the beginning.

Goals of this lesson

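In this lesson, you’ll learn about:

  • Methods of dictionaries, specifically .keys() and .values()
  • Functions, specifically print, type(), and len()
  • Importing libraries with import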

Using Mode Python notebooks

Mode is an analytics platform that brings together a SQL editor, Python notebook, and data visualization builder. Throughout this tutorial, you can use Mode for free to practice writing and running Python code.

  1. Log into Mode or create an account.
  2. Navigate to this report and click 'Duplicate'. This will take you to the SQL Query Editor.
  3. Click 'Python Notebook' under 'Notebook' in the left navigation panel. This will open a new notebook.

Now you’re all ready to go.

Methods

Let's use a dictionary you created in the previous lesson:

Input

city_population = {
  'Tokyo': 13350000, # a key-value pair
  'Los Angeles': 18550000,
  'New York City': 8400000,
  'San Francisco': 1837442,
}
    

For a quick refresher of dictionaries, click here.

Say you want to see all the keys in the dictionary (in this case, cities). You can use a method called .keys() to get a list of the dictionary's keys. A method is an action that you can take on an object. If a real-world window were an object, its methods might be .open() or .close().

In this case, the object is the dictionary, and the action is to return the .keys():

Input

city_population.keys()
    
Output

['New York City', 'San Francisco', 'Los Angeles', 'Tokyo']
    

Remember how everything in Python is an object? That output is a list object! You can double-check this using type() as in the previous lesson.

Input

type(city_population.keys())
    
Output

list
    

To further illustrate this, you can get the third item in the list of keys by referencing its index:

Input

city_population.keys()[2]
    
Output

'Los Angeles'
    

You can also get the values of the dictionary using the .values() method:

Input

city_population.values()
    
Output

[8400000, 1837442, 18550000, 13350000]
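
The examples above show Python 2 behavior, where .keys() and .values() return lists in no guaranteed order. As a minimal sketch, assuming you are running Python 3 instead, the same methods return "view" objects that cannot be indexed directly, so you would wrap them in list() first. The variable names city_names and populations are illustrative only:

```python
# Python 3 sketch: .keys() and .values() return view objects, not lists.
city_population = {
    'Tokyo': 13350000,
    'Los Angeles': 18550000,
    'New York City': 8400000,
    'San Francisco': 1837442,
}

keys_view = city_population.keys()
print(type(keys_view).__name__)   # dict_keys

# Convert the views to lists before indexing into them.
city_names = list(city_population.keys())
populations = list(city_population.values())

# In Python 3.7+, dictionaries keep insertion order,
# so index 1 is the second pair that was written above.
print(city_names[1])    # Los Angeles
print(populations[1])   # 18550000
```

Because of the ordering difference, the item at a given index in Python 3 may not match the Python 2 output shown in this lesson.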
    

Practice Problem

To reiterate what you learned last lesson, get the population for Los Angeles.

View Solution

Using methods on combined objects

You can use methods on combined objects as well.

In the last lesson, you created a dictionary of lists by nesting individual lists of cities and their municipalities in a dictionary object. Here it is again:

Input

municipalities = {
    'New York City': [
        'Manhattan',
        'The Bronx',
        'Brooklyn',
        'Queens',
        'Staten Island'
    ],
    'Tokyo': [
        'Akihabara',
        'Harajuku',
        'Shimokitazawa',
        'Nakameguro',
        'Shibuya',
        'Ebisu/Daikanyama',
        'Shibuya District',
        'Aoyama',
        'Asakusa/Ueno',
        'Bunkyo District',
        'Ginza',
        'Ikebukuro',
        'Koto District',
        'Meguro District',
        'Minato District',
        'Roppongi',
        'Shinagawa District',
        'Shinjuku',
        'Shinjuku District',
        'Sumida District',
        'Tsukiji',
        'Tsukishima']
}
    

As a reminder, to get a single municipality, you would enter the dictionary’s variable name (municipalities), followed by the key in square brackets (['Tokyo']) and then the list index ([3]).

Input

municipalities['Tokyo'][3]
    
Output

'Nakameguro'
    

The keys of the dictionary are:

Input

municipalities.keys()
    
Output

['New York City', 'Tokyo']
    

The values associated with the keys in that dictionary are:

Input

municipalities.values()
    
Output

[['Manhattan', 'The Bronx', 'Brooklyn', 'Queens', 'Staten Island'],
 ['Akihabara',
  'Harajuku',
  'Shimokitazawa',
  'Nakameguro',
  'Shibuya',
  'Ebisu/Daikanyama',
  'Shibuya District',
  'Aoyama',
  'Asakusa/Ueno',
  'Bunkyo District',
  'Ginza',
  'Ikebukuro',
  'Koto District',
  'Meguro District',
  'Minato District',
  'Roppongi',
  'Shinagawa District',
  'Shinjuku',
  'Shinjuku District',
  'Sumida District',
  'Tsukiji',
  'Tsukishima']]
    

Remember that lists are denoted with square brackets? Notice that there are three sets of [] in the output. That’s because the .values() output is a list containing New York’s municipalities and Tokyo’s municipalities, both of which are lists, as well.

In other words, municipalities.values() is a list containing two lists.
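
To make the "list containing two lists" idea concrete, here is a small sketch. It assumes Python 3 (hence the list() wrapper) and uses a shortened Tokyo list to keep the example brief. The name all_lists is illustrative only:

```python
# Shortened version of the municipalities dictionary (Tokyo list truncated).
municipalities = {
    'New York City': ['Manhattan', 'The Bronx', 'Brooklyn', 'Queens', 'Staten Island'],
    'Tokyo': ['Akihabara', 'Harajuku', 'Shimokitazawa', 'Nakameguro', 'Shibuya'],
}

# The outer list holds one inner list per key.
all_lists = list(municipalities.values())
print(len(all_lists))   # 2

# Each item of the outer list is itself a list.
for inner in all_lists:
    print(type(inner).__name__, len(inner))

# Two indexes in a row: the first picks an inner list, the second picks an item from it.
print(all_lists[1][3])  # Nakameguro
```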

Functions

So far, you've learned about objects and methods. Now it's time to look at a few functions. They're very similar to methods in that they perform an action, but unlike methods, functions are not tied to specific objects. To relate this concept to our earlier window example, the function throw_rock() could be used with a real-world object window, but also with vase or cow or a number of other objects.

Functions typically go in front of an object name (with the object wrapped in parentheses), whereas a method is appended to the end of an object name. For example, compare throw_rock(window) with window.open().
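
The same contrast can be sketched with built-in Python pieces rather than windows and rocks. In this illustrative example (the variable names are made up for the sketch), the len() function accepts different kinds of objects, while the .keys() method exists only on dictionaries. hasattr() is a built-in function that checks whether an object has a given method or attribute:

```python
cities = ['Tokyo', 'Los Angeles']    # a list
name = 'Tokyo'                       # a string
population = {'Tokyo': 13350000}     # a dictionary

# A function goes in front, and the same function works on different objects.
list_length = len(cities)
string_length = len(name)
print(list_length, string_length)    # 2 5

# A method is attached to a specific kind of object.
dict_has_keys = hasattr(population, 'keys')
list_has_keys = hasattr(cities, 'keys')
print(dict_has_keys, list_has_keys)  # True False
```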

Type

You've already used a function—type()—several times in this tutorial. It can be used on any object. Try it out here using different items within the municipalities object:

Input

print type(municipalities)
print type(municipalities['Tokyo'])
print type(municipalities['Tokyo'][0])
    
Output

<type 'dict'>
<type 'list'>
<type 'str'>
    

In the output above, 'dict' stands for dictionary, 'list' stands for list, and 'str' stands for string. In other words, municipalities is a dictionary, municipalities['Tokyo'] is a list, and municipalities['Tokyo'][0] is a string.

You may have noticed that we used print on every line of code. (In Python 2, which the code in this lesson uses, print is technically a statement. It can also be written as print("object"), which is the form Python 3 requires, because there print is a true function.) If you want more than one line to display output, you need to insert print at the beginning of each line. This is because, by default, the notebook only displays the output associated with the last line of code in the cell. You can see this in action by removing the print part of the statements above:

Input

type(municipalities)
type(municipalities['Tokyo'])
type(municipalities['Tokyo'][0])
    
Output

str
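
For comparison, here is a sketch of the same type checks written for Python 3, where print needs parentheses and types are displayed as <class '...'> rather than <type '...'>. The Tokyo list is shortened for brevity:

```python
# Shortened version of the municipalities dictionary.
municipalities = {
    'New York City': ['Manhattan', 'The Bronx', 'Brooklyn', 'Queens', 'Staten Island'],
    'Tokyo': ['Akihabara', 'Harajuku', 'Shimokitazawa'],
}

# In Python 3, print is a function, so its argument goes inside parentheses.
print(type(municipalities))              # <class 'dict'>
print(type(municipalities['Tokyo']))     # <class 'list'>
print(type(municipalities['Tokyo'][0]))  # <class 'str'>
```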
    

Practice Problem

What type are the values of city_population?

Make sure you look at the answer to this problem, as it contains important concepts.

View Solution

The resulting type, int, means integer, or a number without decimal places. You'll learn more about integers in a later lesson.

You should also take a minute to understand how Python is evaluating the second answer. Everything is executed from the inside out, similar to the order of operations in math (remember PEMDAS?). Here's the order in which Python runs the above line of code:

  1. city_population.values()
  2. city_population.values()[0]
  3. type(city_population.values()[0])
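
The three steps above can be sketched by storing each intermediate result in its own variable. This assumes Python 3, so .values() is wrapped in list() before indexing. The names step_1, step_2, and step_3 are illustrative only:

```python
city_population = {
    'Tokyo': 13350000,
    'Los Angeles': 18550000,
    'New York City': 8400000,
    'San Francisco': 1837442,
}

# Step 1: the innermost expression runs first and produces the values.
step_1 = list(city_population.values())

# Step 2: the index picks the first value out of that list.
step_2 = step_1[0]

# Step 3: the outermost function, type(), runs last on that single value.
step_3 = type(step_2)
print(step_3)   # <class 'int'>

# The one-line version gives the same answer.
one_line = type(list(city_population.values())[0])
print(one_line == step_3)   # True
```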

Length

It looks like Tokyo has a large number of administrative districts. How many exactly?

The len() function will describe how long an object is:

Input

len(municipalities['Tokyo'])
    
Output

22
    

len() returns something different depending on the type of object for which you're getting the length. For a list, it returns the number of items in the list, as demonstrated above.

For a string, len() returns the number of characters in the string (including spaces):

Input

print municipalities['Tokyo'][2]
len(municipalities['Tokyo'][2])
    
Output

Shimokitazawa

13
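
As a further sketch of how len() counts spaces in a string, consider the New York City list, which contains a two-word name. The variable name boroughs is illustrative only:

```python
boroughs = ['Manhattan', 'The Bronx', 'Brooklyn', 'Queens', 'Staten Island']

# For a list, len() counts the items.
borough_count = len(boroughs)
print(borough_count)             # 5

# For a string, len() counts every character, including the space.
staten_island_length = len(boroughs[4])
print(staten_island_length)      # 13

# The two words are 6 letters each, and the space accounts for the remaining 1.
print(len('Staten') + len(' ') + len('Island'))   # 13
```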

Practice Problem

What is the "length" of municipalities? What does that mean?

View Solution

Libraries

A library is a bundle of code made to help you accomplish routine tasks more quickly. Seaborn, for example, allows you to create visualizations with as little as one line of code. Without such a library, you'd have to write a ton of code to take an object and render a chart. Python is a popular language for data analysis in part because it has extremely robust libraries for data manipulation, visualization, machine learning, and a host of other applications.

Libraries are usually maintained by groups of contributors and made available online to anyone who wants to use them. This kind of collaboratively maintained software is often referred to as open source.

You need to import a library in order to access the things in it (like its objects and functions). To use a library, you usually have to install its code on the computer where Python is running, and then import it. In Mode, a number of libraries are already available to import, so you don’t have to install anything. Let's import NumPy, a library that is commonly used for mathematical operations:

Input

import numpy # name of the library
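
The import statement has a few common forms. The sketch below uses math, a module from Python's standard library, as a stand-in because it needs no installation. The same syntax applies to NumPy, which is often imported with the alias np (import numpy as np):

```python
# Form 1: import the whole library and use its full name as a prefix.
import math
print(math.sqrt(16))   # 4.0

# Form 2: import the library under a shorter alias.
import math as m
print(m.sqrt(16))      # 4.0

# Form 3: import a single function so no prefix is needed.
from math import sqrt
print(sqrt(16))        # 4.0
```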
    

With NumPy imported, you can take the mean (also known as the average) of the population values using the numpy.mean() function. First, assign the list of population values, city_population.values(), to a new variable name, population_values.

Input

population_values = city_population.values()
population_values
    
Output

[8400000, 1837442, 18550000, 13350000]
    

As you can see, population_values holds the expected result. It matches what you get by running city_population.values() on its own:

Input

city_population.values()
    
Output

[8400000, 1837442, 18550000, 13350000]
    

It's good practice to create variables that refer to objects you'll need later. Naming things will help you keep track of your work and will make your code much more readable to others.

Now that population_values is stored, you can use the numpy.mean() function on population_values:

Input

numpy.mean(population_values)
    
Output

10534360.5
    

The mean value of the city populations is about 10.5 million.
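
To see what the mean calculation is doing, here is a sketch that reproduces the result without NumPy. It computes the mean by hand with the built-in sum() and len() functions, then checks it against statistics.mean() from Python's standard library. It assumes Python 3, where / always performs decimal division:

```python
import statistics

# The same four values shown in the output above.
population_values = [8400000, 1837442, 18550000, 13350000]

# Mean = total of the values divided by how many values there are.
manual_mean = sum(population_values) / len(population_values)
print(manual_mean)    # 10534360.5

# The standard library's statistics module gives the same answer.
library_mean = statistics.mean(population_values)
print(library_mean)   # 10534360.5
```

If you run the lesson's NumPy code in Python 3, city_population.values() may need to be wrapped in list() before it is passed to numpy.mean().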

Lesson summary

In this lesson, you learned about:

  • Methods of dictionaries, specifically .keys() and .values()
  • Functions, specifically print, type(), and len()
  • Importing libraries with import

In the next lesson, you'll learn how to work with a DataFrame (a powerful tabular data structure from the pandas library) to view and process data.

Next Lesson

Creating Pandas DataFrames & Selecting Data
