SQL Analytics Training
Understanding Search Functionality
Before starting, be sure to read the overview to learn a bit about Yammer as a company.
The product team is determining priorities for the next development cycle and they are considering improving the site's search functionality. It currently works as follows:
- There is a search box in the header that persists on every page of the website. It prompts users to search for people, groups, and conversations.
- When a user begins to type in the search box, a dropdown list with the most relevant results appears. The results are separated by category (people, conversations, files, etc.). There is also an option to view all results.
- When the user hits enter or selects “view all results” from the dropdown, she is taken to a results page, with results separated by tabs for different categories (people, conversations, etc.). Each tab is ordered by relevance and chronology (more recent posts surface higher).
- The search results page also has an “advanced search” box that allows the user to search again within a specific Yammer group or date range.
Before tackling search, the product team wants to make sure that the engineering team's time will be well spent in doing so. After all, each new feature comes at the expense of some other potential feature(s). The product team is most interested in determining whether they should even work on search in the first place and, if so, how they should modify it.
Before looking at the data, develop some hypotheses about how users might interact with search. What is the purpose of search? How would you know if it is fulfilling that purpose? How might you (quantitatively) understand the general quality of an individual user's search experience?
If you're looking for some hints, check out the first part of the answer key.
There are two tables that are relevant to this problem. Most critically, there are certain events that you will want to look into in the events table below:
- search_autocomplete: This is logged when a user clicks on a search option from autocomplete.
- search_run: This is logged when a user runs a search and sees the search results page.
- search_click_result_X: This is logged when a user clicks on a search result. X, which ranges from 1 to 10, describes which search result was clicked.
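As a starting point for working with these events, here is a minimal Python sketch that sorts an event name into one of the three search categories and extracts the clicked position. The function name `classify_search_event` and the category labels are hypothetical, and the click pattern assumes the `search_click_result_X` spelling given in the event_name definitions of the Events table.

```python
import re
from typing import Optional, Tuple

# Assumed spelling of click events: "search_click_result_X", as listed in the
# event_name definitions of the Events table.
CLICK_PATTERN = re.compile(r"search_click_result_(\d+)")


def classify_search_event(event_name: str) -> Optional[Tuple[str, Optional[int]]]:
    """Return (kind, rank) for a search event, or None for any other event.

    kind is "autocomplete", "run", or "click" (hypothetical labels);
    rank is the clicked result position for clicks, otherwise None.
    """
    if event_name == "search_autocomplete":
        return ("autocomplete", None)
    if event_name == "search_run":
        return ("run", None)
    match = CLICK_PATTERN.fullmatch(event_name)
    if match:
        rank = int(match.group(1))
        # The lesson states that X ranges from 1 to 10.
        if 1 <= rank <= 10:
            return ("click", rank)
    return None
```

Returning `None` for non-search events makes it easy to filter a mixed event stream down to search activity before counting anything.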
The table names and column definitions are listed below.
Table 1: Users
This table includes one row per user, with descriptive information about that user's account.
The table name in Mode is tutorial.yammer_users
|user_id:||A unique ID per user. Can be joined to user_id in either of the other tables.|
|created_at:||The time the user was created (first signed up)|
|state:||The state of the user (active or pending)|
|activated_at:||The time the user was activated, if they are active|
|company_id:||The ID of the user's company|
|language:||The chosen language of the user|
Table 2: Events
This table includes one row per event, where an event is an action that a user has taken on Yammer. These events include login events, messaging events, search events, events logged as users progress through a signup funnel, and events around received emails.
The table name in Mode is tutorial.yammer_events
|user_id:||The ID of the user logging the event. Can be joined to user_id in either of the other tables.|
|occurred_at:||The time the event occurred.|
|event_type:||The general event type. There are two values in this dataset: "signup_flow", which refers to anything occurring during the process of a user's authentication, and "engagement", which refers to general product usage after the user has signed up for the first time.|
|event_name:||The specific action the user took. Possible values include:|
- create_user: User is added to Yammer's database during signup process
- enter_email: User begins the signup process by entering her email address
- enter_info: User enters her name and personal information during signup process
- complete_signup: User completes the entire signup/authentication process
- home_page: User loads the home page
- like_message: User likes another user's message
- login: User logs into Yammer
- search_autocomplete: User selects a search result from the autocomplete list
- search_run: User runs a search query and is taken to the search results page
- search_click_result_X: User clicks search result X on the results page, where X is a number from 1 through 10.
- send_message: User posts a message
- view_inbox: User views messages in her inbox
|location:||The country from which the event was logged (collected through IP address).|
|device:||The type of device used to log the event.|
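To get a feel for how the two tables fit together, the sketch below loads a few invented rows into an in-memory SQLite database and counts search events by name. Everything about the setup is an assumption for illustration: the rows are made up, the helper names `build_demo_db` and `count_search_events` are hypothetical, and the `tutorial.` schema prefix is dropped because SQLite has no such schema here. In Mode you would run a similar query against tutorial.yammer_events and tutorial.yammer_users, adjusting for that SQL dialect.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a handful of invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE yammer_users (
            user_id INTEGER PRIMARY KEY, created_at TEXT, state TEXT,
            activated_at TEXT, company_id INTEGER, language TEXT
        );
        CREATE TABLE yammer_events (
            user_id INTEGER, occurred_at TEXT, event_type TEXT,
            event_name TEXT, location TEXT, device TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO yammer_users VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, "2014-01-01 09:00:00", "active", "2014-01-01 09:05:00", 10, "english"),
            (2, "2014-01-02 09:00:00", "active", "2014-01-02 09:05:00", 11, "french"),
        ],
    )
    conn.executemany(
        "INSERT INTO yammer_events VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, "2014-05-01 10:00:00", "engagement", "login", "United States", "laptop"),
            (1, "2014-05-01 10:01:00", "engagement", "search_run", "United States", "laptop"),
            (1, "2014-05-01 10:01:30", "engagement", "search_click_result_2", "United States", "laptop"),
            (2, "2014-05-02 09:00:00", "engagement", "search_autocomplete", "France", "phone"),
            (2, "2014-05-02 09:05:00", "engagement", "search_run", "France", "phone"),
        ],
    )
    return conn


def count_search_events(conn: sqlite3.Connection):
    """Return (event_name, event_count, user_count) for search events."""
    query = """
        SELECT e.event_name,
               COUNT(*) AS event_count,
               COUNT(DISTINCT e.user_id) AS user_count
        FROM yammer_events e
        JOIN yammer_users u ON u.user_id = e.user_id
        -- "_" is a LIKE wildcard, so escape it to match a literal underscore
        WHERE e.event_name LIKE 'search!_%' ESCAPE '!'
        GROUP BY e.event_name
        ORDER BY e.event_name
    """
    return conn.execute(query).fetchall()
```

Calling `count_search_events(build_demo_db())` returns one row per search event name, which is a reasonable first look at how often each kind of search action occurs and how many users perform it.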
Making a recommendation
Once you have an understanding of the data, try to validate some of the hypotheses you formed earlier. In particular, you should seek to answer the following questions:
- Are users' search experiences generally good or bad?
- Is search worth working on at all?
- If search is worth working on, what, specifically, should be improved?
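One possible way to start on the first question is to summarize how often full searches lead to clicks and how far down the results users click. The sketch below is only an illustration under stated assumptions: the function name `summarize_search` and the metrics it reports (clicks per search run, mean clicked position) are hypothetical choices rather than the lesson's prescribed method, and the event names follow the event_name definitions in the Events table.

```python
from collections import Counter
from typing import Dict, Iterable, Optional

CLICK_PREFIX = "search_click_result_"  # spelling taken from the Events table


def summarize_search(event_names: Iterable[str]) -> Dict[str, Optional[float]]:
    """Summarize search activity from a sequence of event_name values."""
    names = list(event_names)  # materialize so the data can be scanned twice
    counts = Counter(names)
    runs = counts["search_run"]
    autocompletes = counts["search_autocomplete"]

    # Collect the clicked position X from each search_click_result_X event.
    click_ranks = []
    for name in names:
        if name.startswith(CLICK_PREFIX):
            suffix = name[len(CLICK_PREFIX):]
            if suffix.isdigit():
                click_ranks.append(int(suffix))
    clicks = len(click_ranks)

    return {
        "runs": runs,
        "autocompletes": autocompletes,
        "clicks": clicks,
        # None signals "undefined" when there is nothing to divide by.
        "clicks_per_run": clicks / runs if runs else None,
        "mean_click_rank": sum(click_ranks) / clicks if clicks else None,
    }
```

A low number of clicks per run, or clicks concentrated far down the list, might suggest that results are not surfacing what users want, though either reading would need to be checked against the hypotheses you formed earlier.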
Come up with a brief presentation describing the state of search at Yammer. Display your findings graphically. You should be prepared to recommend what, if anything, should be done to improve search. If you determine that you do not have sufficient information to test anything you deem relevant, discuss the caveats.
Finally, determine a way to understand whether your feature recommendations are actually improvements over the old search (assuming that anything you recommend will be completed).
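The lesson leaves the evaluation method open. One common option, assumed here rather than taken from the source, is an A/B test that compares the share of searches ending in a click between the old and new search. The sketch below implements a standard two-proportion z-test; the function name and the example counts are hypothetical.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    successes_a: int, total_a: int, successes_b: int, total_b: int
) -> Tuple[float, float]:
    """Compare two success rates; return (z statistic, two-sided p-value).

    Group A is the control (old search) and group B the treatment (new
    search). A positive z means group B has the higher rate.
    """
    if total_a <= 0 or total_b <= 0:
        raise ValueError("both groups need at least one observation")
    rate_a = successes_a / total_a
    rate_b = successes_b / total_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (successes_a + successes_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        raise ValueError("standard error is zero; rates are all 0 or all 1")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

For example, `two_proportion_z_test(100, 1000, 150, 1000)` compares a hypothetical 10% click rate in the control group with 15% in the treatment group. Whatever metric you choose, it should match the purpose of search you identified when forming your hypotheses.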
If you want to check your work against the answer key, click here.
Understanding Search Functionality: Answers