Top Data Science Articles
Each quarter we round up the most popular data science articles, videos, and podcasts from Mode's weekly newsletter, the Analytics Dispatch.
Simple Tricks To Up-level Your Analytics Reports
This is what happens if your report is missing the proper context: “Most executives will take a look, realize the difficulty in interpretation in 15 – 20 seconds, and go back to shooting from the gut. Even if the report has hidden gold.”-Occam’s Razor
Your Data Scientist Does Not Need a STEM Ph.D.
Here’s a list of rebuttals for almost every argument you’ll hear for using a STEM Ph.D. as a hiring screen.-Towards Data Science
What SQL Analysts Need to Know About Python
Here's some info on the importance of Python and how to use it in day-to-day analysis.-Segment
How Instacart Uses Redshift to Drive Growth
In this interview, Fareed Mosavat, growth PM at Instacart, shares how his team combines behavior, shipping, and fulfillment data to inform product decisions. Check out how his team uses SQL to define internal metrics, conduct A/B tests, and discover how many touches it takes before a user makes their first order.-Segment
8 Data Science Skills That Every Employee Needs
A nice primer to share with your colleagues.-Amplitude
You’re Measuring Daily Active Users Wrong
A high number of daily active users (DAU) may sound impressive, but does it actually mean anything? To make your DAU metric actionable, you need to measure how often users are getting core value out of your product, not how many times they log in.-Amplitude
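To make the distinction concrete, here is a minimal sketch that counts daily active users two ways: by any activity at all, and by a "core value" event only. The event log, the user IDs, and the choice of `order_placed` as the core-value event are all hypothetical; the right qualifying event depends on your product.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event log: (user_id, day, event_name).
EVENTS = [
    ("u1", date(2017, 1, 2), "login"),
    ("u1", date(2017, 1, 2), "order_placed"),
    ("u2", date(2017, 1, 2), "login"),
    ("u3", date(2017, 1, 2), "login"),
    ("u3", date(2017, 1, 3), "order_placed"),
]

# Assumption: placing an order is the action that delivers core value.
CORE_EVENTS = {"order_placed"}


def daily_active_users(events, qualifying=None):
    """Count distinct users per day.

    If `qualifying` is given, only events whose name is in that set
    make a user count as active on that day.
    """
    users_by_day = defaultdict(set)
    for user_id, day, event_name in events:
        if qualifying is None or event_name in qualifying:
            users_by_day[day].add(user_id)
    return {day: len(users) for day, users in sorted(users_by_day.items())}


any_activity = daily_active_users(EVENTS)            # logins count as "active"
core_value = daily_active_users(EVENTS, CORE_EVENTS)  # only core-value events count

for day in any_activity:
    print(day, "any activity:", any_activity[day], "core value:", core_value.get(day, 0))
```

In this toy log, three users show up on January 2 but only one of them places an order, which is the gap the article is pointing at.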
The Three Faces of Bayes
The term “Bayesian” can refer to a variety of philosophies and ideas. Read this article before the next quant-heavy cocktail party you attend, so you’ll know what’s what.-Slackpropagation
My Experience as a Freelance Data Scientist
Itching to strike out on your own? Read up on the pros and cons before you give your two weeks’ notice.-Greg Reda
Ten Ways Your Data Project is Going to Fail
“Many companies seem to go through a pattern of hiring a data science team only for the entire team to quit or be fired around 12 months later. Why is the failure rate so high?”-Martin Goodson
What I Wish I Knew About Data For Startups
One entrepreneur reflects on his learnings from four years of working with data at a startup. It’s a goldmine of advice on building a strong, scalable data culture. Don’t skip this one. Seriously.-Jean-Nicholas Hould
How statistics lost their power – and why we should fear what comes next
“Not only are statistics viewed by many as untrustworthy, there appears to be something almost insulting or arrogant about them. Reducing social and economic issues to numerical aggregates and averages seems to violate some people’s sense of political decency.”-Guardian
Practical advice for analysis of large, complex data sets
“This document has been read more than anything else I’ve done at Google over the last eleven years. Even four years after the last major update, I find that there are multiple Googlers with the document open any time I check.”-The Unofficial Google Data Science Blog
What I Learned Recreating One Chart Using 24 Tools
An incredibly insightful and nuanced lay of the land for charting tools.-Source
Star Wars, In One Chart
How does the most fearsome military force in the galaxy get whittled down from 6.8 million troops to 700k? This chart chronicles the casualties sustained by the Empire, from Rogue One to Return of the Jedi.-FiveThirtyEight
R Psychologist
Puzzled by p-values? Confounded by confidence intervals? Stumped by significance testing? This site is a bevy of interactive visualizations illustrating tricky statistical concepts. Even if you’re a statistical genius, it’s worth a visit to play around.-Kristoffer Magnusson
What’s the state of the job market in data science and machine learning?
“Th[e] proliferation of courses, resources, books and startups would hint that machine learning is becoming more and more accessible to the average programmer and that the market is on track to getting saturated quickly. Is this the current trend?”-Hacker News
How To (Actually) Calculate CAC
Quick! What’s the difference between customer acquisition cost (CAC) and cost per acquisition (CPA)? If you hesitated, this post is for you.-Brian Balfour
How These Three Women Made Mid-Career Pivots Into Data Science
How do we narrow the gender gap in data science? Early STEM education for girls isn’t the only solution. Here are the journeys of three women who switched from creative jobs to data roles mid-career.-Fast Company
Building & Maintaining a Master Data Dictionary: Part 2
Check out these ideas for structuring key metric definitions to keep everyone at your organization on the same page.-The Data Point
Choosing a Database for Analytics
A comprehensive rundown of criteria to consider when you’re ready to dedicate a database to analytics. Use this guide to evaluate your options depending on the type and size of your data, the state of your engineering resources, and your need to analyze data in real-time.-Segment
A visual guide to Bayesian thinking
The best single source we’ve found for demystifying how Bayes’ Rule works, the intuition behind it, and how you can use it to inform your thinking.-Julia Galef
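For a quick feel of what Bayes’ Rule does, here is a small worked sketch for a yes/no hypothesis. The numbers (a 1% base rate, a 90% true-positive rate, a 5% false-positive rate) are made up for illustration and do not come from the linked video.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(hypothesis | evidence) for a binary hypothesis via Bayes' Rule.

    prior:               P(hypothesis) before seeing the evidence
    likelihood:          P(evidence | hypothesis is true)
    false_positive_rate: P(evidence | hypothesis is false)
    """
    # Total probability of seeing the evidence under either outcome.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical inputs: rare hypothesis, fairly reliable signal.
p = posterior(prior=0.01, likelihood=0.90, false_positive_rate=0.05)
print(f"Posterior probability: {p:.1%}")
```

Even with a fairly reliable signal, the posterior stays near 15% because the hypothesis was rare to begin with; the prior matters as much as the evidence.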
Escaping Excel Hell with Python & Pandas
A great presentation on the problems that arise from spreadsheet analysis and how you can ditch Excel by learning some Python.-Chris Moffitt
The Theorem Every Data Scientist Should Know
Quick! Define the Central Limit Theorem. Scratching your head? You’re not alone. And yet, this theorem is key to what data scientists do every day: make statistical inferences about data.-Jean-Nicholas Hould
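If you want to see the theorem rather than define it, a short simulation helps. The sketch below is not taken from the linked article; it assumes an exponential distribution (mean 1, standard deviation 1, heavily skewed) and shows that the means of repeated samples cluster around the true mean with a spread close to the standard deviation divided by the square root of the sample size.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable


def sample_means(draw, sample_size, n_samples):
    """Draw `n_samples` samples of `sample_size` values each and return their means."""
    return [
        statistics.fmean(draw() for _ in range(sample_size))
        for _ in range(n_samples)
    ]


def draw_exponential():
    """One draw from a skewed distribution with mean 1 and standard deviation 1."""
    return random.expovariate(1.0)


SAMPLE_SIZE = 50
means = sample_means(draw_exponential, sample_size=SAMPLE_SIZE, n_samples=2000)

center = statistics.fmean(means)      # should land near the true mean, 1.0
spread = statistics.stdev(means)      # should land near sigma / sqrt(n)
expected_spread = 1 / SAMPLE_SIZE ** 0.5

print(f"mean of sample means: {center:.3f}")
print(f"std of sample means:  {spread:.3f} (theory: {expected_spread:.3f})")
```

Plot a histogram of `means` and it looks roughly bell-shaped even though the individual draws are far from normal.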
Two Alternatives to Using a Second Y-Axis
“Almost as often as I see a pie chart with a hundred tiny slivers, I see line graphs using two y-axes. And it is just as bad.”-Stephanie Evergreen
Building Thumbtack’s Data Infrastructure
In this post, Thumbtack data engineer Nate Kupp sheds light on the company’s process for evaluating tools to add to their tech stack. It’s a goldmine for startups contemplating how to build a sustainable data infrastructure.-Thumbtack Engineering
Visualizing Hundreds of My Favorite Songs on Spotify
A deep statistical dive into defining songs with attributes—such as tempo, energy, and valence.-Cuepoint
Awesome visualization research
A curated list of data visualization research papers, books, blog posts, and other readings. It’s pretty fresh, so submit a pull request and contribute!-Matthew Conlen
The Five-Step Guide to Robust Help Center Metrics
When a documentation manager set out to revamp her company’s help site content, she was surprised to find very few resources on how to measure her project. Thankfully, she documented her journey so we can all learn from it. Great tips in here for anyone looking to make their help center more, well… helpful.-RJMetrics
Scaling Data Science at Stitchfix
Not many companies can say they employ 80 data scientists. The folks at Stitchfix share their tactics for making data and compute resources more accessible—which in turn keeps data scientists happy and infrastructure healthy.-MultiThreaded
Data Science for Beginners
“These videos are basic but useful, whether you’re interested in doing data science or you work with data scientists.”-Microsoft Azure
Let's Chart: stop those lying line charts
In a quest for connected points and smoothed lines, we may be implying continuity where it doesn’t exist.-Signal v. Noise
Building a data science portfolio
Much like writers and designers, data scientists are now expected to provide portfolios when they apply for jobs. Here’s what you need to know to get started.-Dataquest
Asking good questions is hard (but worth it)
Although this framework is written from a programmer’s perspective, it’s a great read for analysts and the folks who ask them questions day-in and day-out.-Julia Evans
The Simpsons by the Data
America’s favorite family has been around for 27 years, providing plenty of data to analyze. Find out who’s the most talkative side character in Springfield, if Homer was always the star, and how much longer the show’s ratings can last.-Todd Schneider
Visualizing Distributions
16 ways to display distributions, from the barcode chart to the bean plot.-Darkhorse Analytics
To the point: 7 reasons you should use dot graphs
The pros of dot plots (illustrated with real-world examples) and why they’re often a better choice than bar and line charts.-Maarten Lambrechts
415 Data Visualization Tools
This collection of tools might seem overwhelming at first. Fear not! Filtering by features, data types, cost, and several other variables will help you find what you need, fast.-Adil Yalçın
Our nine-point guide to spotting a dodgy statistic
Numbers might appear unwavering and objective, but they’re easily manipulated—especially by politicians. Here are some common ways people spin numbers to support their agenda, with real-life examples from Brexit, the U.S. presidential election, and more.-The Guardian
The best R package for learning to “think about visualization”
Spoiler alert: it’s ggplot2.-Sharp Sight Labs
The Data Driven Daily
This newsletter defines business KPIs and explains how to calculate them for your business. This week they’re covering how to determine the size of your potential customer market. The archive is well worth perusing; past segments include revenue calculation and pricing strategy.-Outlier
The State of Data Engineering
What makes a data engineer, well, a data engineer? And why does it feel like everyone is looking to hire one? This new study of LinkedIn data reveals that the number of data engineers doubled from 2013 to 2015, but demand still far outpaces supply.-Stitch Data
Non-Mathematical Feature Engineering techniques for Data Science
This article is worth Pocketing for the straightforward, plain-English explanation of feature engineering alone. (And the best practices for pre-processing data ain’t bad either.)-Sachin Joglekar
3 Reasons Counting is the Hardest Thing in Data Science
Counting isn’t technically difficult; the real challenge lies in managing relationships and office politics that surround the task.-Dayne Batten
Thinking in SQL vs Thinking in Python
Using a new language requires a new mindset. Our chief analyst shares his learnings from adding Python to his SQL workflow.-Mode
Real-world data cleanup with Pandas and Python
Cleaning data is a tedious yet essential part of every analyst’s day. Learn how to use Python and Pandas to ensure that your data is clean, without worrying about overlooking any potential issues.-TrendCT
10 Significant Visualisation Developments: January to June 2016
Every six months, visualization expert Andy Kirk puts together a list of people and projects he feels have impacted the field. This roundup includes climate spiral plots, #MakeoverMonday, and a talk from the Deputy Graphics Editor at The New York Times.-Visualising Data