Machine Learning Articles

Machine learning, deep learning, artificial intelligence... The science of getting machines to perform actions without explicitly programming them to do so can be intimidating for the uninitiated. These machine learning articles aim to unpack the black box for beginners, with introductions to overall concepts and tutorials for training a model of their own.

Thoughts on ML Engineering After a Year of My PhD

“People keep talking about how ML engineering (MLE) is a subset of software engineering or should be treated as such. But over the last 15 months of graduate school, I’ve been thinking about MLE through the lens of data engineering.”-Shreya Shankar

Moving Beyond Mimicry in Artificial Intelligence

What makes pre-trained AI models so impressive—and potentially harmful.-Nautilus

Physiognomic Artificial Intelligence

This paper covers how computer vision is a central vector for physiognomic AI technologies and unpacks how computer vision reanimates physiognomy in conception, form, and practice and the dangers this trend presents for civil liberties.-Fordham Intellectual Property, Media, & Entertainment Law Journal

Supervised Machine Learning for Text Analysis in R

This book is designed to provide practical guidance and directly applicable knowledge for data scientists and analysts who want to integrate text into their modeling pipelines.-Emil Hvitfeldt and Julia Silge

Davis Summarizes Papers

Every week, Davis reads all the machine learning arXiv submissions and summarizes 10 to 20 of his favorites.-Davis Blalock

Creating Confidence Intervals for Machine Learning Classifiers

Confidence intervals are no silver bullet, but at the very least, they can offer an additional glimpse into the uncertainty of the reported accuracy and performance of a model.-Sebastian Raschka

All Roads Lead to Rome: The Machine Learning Job Market in 2022

In the next decade we may see software companies adopt an “Artificial General Intelligence strategy” as a means to make their software more adaptive and generally useful.-Eric Jang

Bayesian Rock Climbing Rankings

Spoiler: this model built on an outdoor climbing dataset ends up being wrong. But the journey to get there is worth it.-Ethan Rosenthal

Artificial Intelligence Is Creating a New Colonial World Order

“The more users a company can acquire for its products, the more subjects it can have for its algorithms, and the more resources—data—it can harvest from their activities, their movements, and even their bodies.”-MIT Technology Review

Machine Learning Has a Validity Problem

“One of the central tenets of machine learning warns the more times you run experiments with the same test set, the more you overfit to that test set. This conventional wisdom is mostly wrong and prevents machine learning from reconciling its inductive nihilism with the rest of the empirical sciences.”-arg min blog

Redesigning Etsy’s Machine Learning Platform

In 2017, Etsy built in-house solutions for their small data science team. In 2020, with a scaled-up team, it was time to cut the cord and make use of industry-standard tooling and managed solutions such as Google Cloud.-Code as Craft

Startup Opportunities in Machine Learning Infrastructure

From the perspective of an investor and previous early-stage operator in the space.-Leigh Marie Braswell

ML and NLP Research Highlights of 2021

Dig into the 15 most interesting papers and research advancements of last year.-Sebastian Ruder

My Machine Learning Process (Mistakes Included)

Many of the tutorial blog posts out there show perfect, unblemished code. This post shows the actual process of creating a model, warts and all.-David Neuzerling

How to Keep Learning About Machine Learning

There’s lots of great advice in here about learning about anything, really: do a personal project that stretches you, adopt a beginner’s mind, and find mentors who are a few steps ahead.-Eugene Yan

Real-time Machine Learning: Challenges and Solutions

“In the last year, I’ve talked to ~30 companies in different industries about their challenges with real-time machine learning. This post outlines the solutions for (1) online prediction and (2) continual learning, with step-by-step use cases, considerations, and technologies required for each level.”-Chip Huyen

How to Read Research Papers: A Pragmatic Approach for ML Practitioners

Is it necessary for data scientists or machine-learning experts to read research papers? Yes. But don’t worry if you lack a formal academic background. This hands-on tutorial will make the endeavor far less intimidating.-NVIDIA Developer Blog

The Steep Cost of Capture

Tech firms are startlingly well positioned to shape what we do—and do not—know about AI and the business behind it, at the same time that their AI products are working to shape our lives and institutions.-Interactions

Transformers from Scratch

If you've always wondered how transformers work but know nothing about machine learning, here's a peek behind the curtain.-Brandon Rohrer

Why Machine Learning Hates Vegetables

Also inspired by the Zillow discussion, this post details a bad use case for machine learning: automated dietary advice.-Emily Riederer

How Wadhwani AI Uses PyTorch to Empower Cotton Farmers

By using PyTorch, Wadhwani AI researchers have created a model that can accurately predict the location of pests within cotton crops.-PyTorch

AI Research: The Unreasonably Narrow Path and How Not to Be Miserable

Apparently, there are two paths for AI research: industry and academia. And both are hiring the same kind of people, with the same rigid rubric.-Roseanne Liu

Reaching MLE (Machine Learning Enlightenment)

“This is the job, writing and gluing together the code that makes drastically different systems speak to each other in data-oriented language.”-Vicki Boykis

Red Hot: The 2021 Machine Learning, AI and Data (MAD) Landscape

Every company is becoming not just a software company, but also a data company.-Matt Turck

Participatory Data Stewardship

This framework rejects practices of data collection, storage, sharing, and use in ways that are opaque or seek to manipulate people, in favor of practices that empower people to help inform, shape, and govern their own data.-Ada Lovelace Institute

Bad Labels

“The issue here isn’t just that we might have bad labels in our training set, the issue is that it appears in the validation set. If a machine learning model can become state of the art by squeezing another 0.5% out of a validation set one has to wonder. Are we really making a better model?”-Vincent D. Warmerdam

Challenges and Opportunities in NLP Benchmarking

Natural language processing models have become so powerful over the last few years that we need new benchmarks for measuring their performance.-Sebastian Ruder

Introduction to Causal Inference from a Machine Learning Perspective

Here’s Sean Taylor on why causal inference is important for working data scientists: “Typically in causal inference, researchers try to estimate some quantity of interest one time for a publication. In industry, we must build systems to reliably estimate quantities, at scale, over time, for a variety of contexts.”-Brady Neal

Mitigating Dataset Harms Requires Stewardship: Lessons From 1000 Papers

Efforts to implement higher ethical standards and transparency in the machine learning dataset creation process can be more effective if we understand how datasets are used in practice in the research community.-arXiv.org

Is GitHub Copilot a Blessing, or a Curse?

To see real improvements in program synthesis, we’ll need to go beyond just language models, to a more holistic solution that incorporates best practices around human-computer interaction, software engineering, testing, and many other disciplines.-fast.ai

Homemade Machine Learning

This repository of popular machine learning algorithms is implemented in Python and explains the math behind each algorithm. It’s a great place to get started with machine learning.-Oleksii Trekhleb

Machine Learning Cohorts

What is a "data scientist" or "machine learning engineer," really? This analysis uses tools, libraries, and frameworks to cluster engineers into cohort groups, which might be more indicative of their day-to-day work than overloaded job titles.-Paige Bailey

Data Capitalism

“Data capitalism, like capitalism itself, reinforces dynamics of power and profits. More power creates more profits, and more profits creates more power. Inequality ensures companies get richer and more influential, while everyday people wield less and less power.”-Data for Black Lives

Introducing mltrace

mltrace is designed for collaborative teams working on production machine learning pipelines. It makes it easy to follow a prediction or model’s output back to its most upstream, raw data file.-Shreya Shankar

Māori Are Trying to Save Their Language From Big Tech

With just its first 320 hours of Māori language data, Te Hiku, a small non-profit radio station in New Zealand, was able to build a speech-to-text engine with an initial word error rate of 14 percent. Developing language tools themselves is a way for Māori to decolonize the sound of their language.-Wired UK

The Rise of HuggingFace

There’s a lot machine learning startups can learn from HuggingFace about community building.-Breaking the Stagnation

Paper Notes by Vitaly Kurin

Every weekday, this Oxford student reads a machine learning paper, takes notes, and shares them. And now you get to benefit from their studious habits.-Vitaly Kurin

The Giant Leaps In Language Technology — and Who’s Left Behind

What happens when a language is omitted from the digital landscape? And what can be gained when technology acts as a bridge instead of a barrier?-TED

Finding Structure in Users’ Evolving Listening Preferences

This dynamic model of 100K Spotify users shows how music tastes change over time.-Spotify Research

A Recipe for Training Neural Networks

“Suffering is a perfectly natural part of getting a neural network to work well, but it can be mitigated by being thorough, defensive, paranoid, and obsessed with visualizations of basically every possible thing. The qualities that in my experience correlate most strongly to success in deep learning are patience and attention to detail.”-Andrej Karpathy

Moving Beyond “Algorithmic Bias Is a Data Problem”

The model is a problem, too. Recognizing that allows for new ways to reduce harm that are easier than comprehensive data collection.-Patterns

Reducing Toxicity in Language Models

Many large pre-trained models unavoidably acquire certain toxic behavior and biases from the online data they’re trained on. Here are some ways to limit unsafe and harmful content in the finished product.-Lil’Log

Who Is Making Sure the A.I. Machines Aren’t Racist?

When Google forced out two well-known artificial intelligence experts, a long-simmering research controversy burst into the open.-The New York Times

Artificial Intelligence and Inclusion: Formerly Gang-Involved Youth as Domain Experts for Analyzing Unstructured Twitter Data

Folks who train algorithms, take note: this mixture of social work and data science is a shining example of how to create AI with a positive social impact.-SAGE Journals

Introduction to Machine Learning Reliability Engineering

Like site reliability engineers (SREs), machine learning reliability engineers (MLREs) are responsible for reducing toil, keeping costs within budget, and ensuring smooth releases, but with more data savvy.-TestDriven.io

Fairness for Unobserved Characteristics: Insights from Technological Impacts on Queer Communities

“Advances in algorithmic fairness have largely omitted sexual orientation and gender identity. We explore queer concerns in privacy, censorship, language, online safety, health, and employment to study the positive and negative effects of artificial intelligence on queer communities.”-arXiv.org

AI Incident Database

This repository of problems caused by AI is intended to help researchers and developers prevent similar harms from happening again.-AI Incident Database

My Friend Radicalized. This Made Me Rethink How I Build AI

“As I watched a mob batter down the windows of the Capitol, inspired by online echo chambers in tweets and newsfeeds, I thought of my work in machine learning, thought of that drone strike, and wondered ‘did I unwittingly help create this?’”-Jaan Altosaar

Why You Should Do NLP Beyond English

7000+ languages are spoken around the world but NLP research has mostly focused on English. It’s time to change that.-Sebastian Ruder

Baking with Machine Learning

An ML-powered tool for understanding the science behind what differentiates a cake from a bread or a cookie.-Sara Robinson

Corporate Reporting in the Era of Artificial Intelligence

“Companies have long seen annual reports and other corporate disclosures as opportunities to portray their business health in a positive light. Increasingly, the audience for these disclosures is not just humans, but also machine readers that process the information as an input to investment recommendations.”-National Bureau of Economic Research

The Underlying Values of Machine Learning Research

An analysis of the 68 most highly-cited, influential ML papers of the last few years finds that the majority don’t mention societal need or negative consequences. Instead, these papers assess themselves according to a set of values internal to the field.-Resistance AI Workshop

Machine Learning Is Going Real-time

What does real-time machine learning even mean? How is it done? What are the use cases?-Chip Huyen

Data and Its (Dis)contents: A Survey of Dataset Development and Use in Machine Learning Research

“In this paper, we survey the many concerns raised about the way we collect and use data in machine learning and advocate that a more cautious and thorough understanding of data is necessary to address several of the practical and ethical issues of the field.”-arXiv.org

The Most Important Thing

In machine learning research, it’s easy to fall into the trap of only working with benchmark data sets. You can guard against this by working on concrete problems. You don’t need a grand goal, either, just a concrete one.-Brandon Rohrer

Datasheets for Datasets

“The machine learning community currently has no standardized process for documenting datasets, which can lead to severe consequences in high-stakes domains... [W]e propose that every dataset be accompanied with a datasheet that documents its motivation, composition, collection process, recommended uses, and so on.”-arXiv.org

Building a Gigascale ML Feature Store with Redis, Binary Serialization, String Hashing, and Compression

“Our benchmarking results indicated that Redis was the best option, so we decided to optimize our feature storage mechanism, tripling our cost reduction. Additionally, we also saw a 38% decrease in Redis latencies, helping to improve the runtime performance of serving models.”-DoorDash Engineering

Indigenous AI Position Paper

This paper is a starting place for those who want to design and create AI from an ethical position that centers Indigenous concerns. It is an attempt to capture multiple layers of a discussion that happened over 20 months, across 20 timezones, during two workshops, and between Indigenous people...-Indigenous AI

Creating a Custom Corpus

If you want to use natural language processing (NLP), you need a corpus (a collection of texts) to work with. Here’s how to build a corpus of one’s own.-Python in Plain English

Is Facial Recognition Too Biased to Be Let Loose?

“Amba Kak and others support a moratorium on any use of facial recognition, not just because the technology isn’t good enough yet, but also because there needs to be a broader discussion of how to prevent it from being misused.”-Nature

Who Am I to Decide When Algorithms Should Make Important Decisions?

“Technical know-how, whether in government or in the technology industry, cannot substitute for contextual understanding and lived experiences in determining whether it’s appropriate to apply AI systems in sensitive social domains.”-The Boston Globe

Weird AI Yankovic: Generating Parody Lyrics

Syllable and rhyme scheme go in, parody lyrics come out.-arXiv.org

To Apply AI for Good, Think Form Extraction

Form extraction is a hard and important problem. Healthcare professionals, climate scientists, and human rights workers all rely on data that’s often locked away in non-digitized documents.-Jonathan Stray

Characterising Bias in Compressed Models

It is not just the data. Popular compression techniques can amplify bias in deep neural networks.-arXiv.org

Human Learn

What if you could draw a machine learning model? This new library makes it easier to create rule based systems that are scikit-learn compatible.-Vincent D. Warmerdam

Target Didn’t Figure Out a Teenager Was Pregnant Before Her Father Did, and That One Article That Said They Did Was Silly and Bad

“[T]his story in 2012 launched the idea into public consciousness that companies can create algorithms that can diagnose, solve, predict the future, and generally model the human situation better than humans can.” But for the most part, that’s just not true, even seven years later.-Colin Fraser

Machine Learning: The High Interest Credit Card of Technical Debt

Machine learning allows you to build complex systems fast, but it’s dangerous to think of these quick wins as coming for free.-Google Research

Effective Testing for Machine Learning Systems

What’s the difference between testing a machine learning system and a traditional software system? Between model testing and model evaluation? And how do you write a model test, period?-Jeremy Jordan


A straightforward example of deploying a sklearn model using Flask and a Docker container. You’ll need some basic knowledge of Docker to get started.-Chris Albon

Traffic Prediction with Advanced Graph Neural Networks

How DeepMind and Google Maps teamed up to improve the accuracy of ETAs by up to 50% in some cities.-DeepMind

How Salesforce Infuses Ethics into Its AI

To support ethical AI, you need to build a culture where employees have the right mindset to create ethical products. Salesforce has a number of processes and programs for doing just that.-Salesforce

An A.I. Training Tool Has Been Passing Its Bias to Algorithms for Almost Two Decades

The CoNLL-2003 dataset is one of the most widely used open source datasets for building natural language processing systems. Over the past 17 years, it’s been cited more than 2,500 times in research literature. The problem? It includes five times as many male names as female names.-OneZero

AI for AG: Production Machine Learning for Agriculture

How one company trained a neural network model that identifies crops and weeds, and then deployed that model to robots in the fields.-PyTorch

The Whiteness of AI

Across the board—in the voices of chatbots and virtual assistants, in stock images, and in film and television—AI is portrayed as white people, and that’s a problem.-Philosophy & Technology

Algorithmic Colonization of Africa

“The continent would do well to adopt a dose of critical appraisal when regulating, deploying, and reporting AI. This requires challenging the mindset that portrays AI with God-like power and as something that exists and learns independent of those that create it. People create, control, and are responsible for any system.”-SCRIPTed

Categorizing Products at Scale

How the Shopify team implemented a model to categorize all their products at Shopify, and enabled cross-platform teams to deliver personalized insights to business owners.-Shopify Engineering

Thoughts on Genderify, Gender Discrimination, Transphobia, and (Un)ethical AI

“Products like Genderify are harmful. They’re built on top of biased and inaccurate data, by people who seem to have no interest in risk management or the societal impact of their product, and released for basically no cost into the open for anyone to use.”-Sarah L. Fossheim


A curated collection of blogs, articles, and papers describing how different companies have been using machine learning in production.-Eugene Yan

Combating Anti-Blackness in the AI Community

“The aim of this work is to help community members better identify and understand the scale and scope of anti-Black bias within our AI community and illustrate some concrete steps that members can take to help mitigate these issues and build a more just community.”-Devin Guillory

MIT Apologizes, Permanently Pulls Offline Huge Dataset That Taught AI Systems to Use Racist, Misogynistic Slurs

If you’re using 80 Million Tiny Images to benchmark computer-vision algorithms, it’s time to find another data set.-The Register

Reflecting on a Year of Making Machine Learning Actually Useful

“I discuss how working at Viaduct opened my eyes to the challenges of operationalizing machine learning, and how neither my classes nor research forced me to consider these challenges.”-Shreya Shankar

Ethics in NLP

And while you’re learning about natural language processing, add this bibliography to your reading list to make sure you’re mindful about the real harm NLP can cause, no matter your intentions.-Association for Computational Linguistics

NLP Roadmap

For visual learners especially, this mind map of keywords can guide you through what natural language processing topic to study next.-Tae-Hwan Jung

Using GitHub Actions for MLOps & Data Science

GitHub offers almost no features for data science, but they’re trying to change that. This series of GitHub Actions integrates parts of the data science and machine learning workflow with a software development workflow.-The GitHub Blog

Getting Machine Learning to Production

This guide covers the process of creating an end-to-end proof-of-concept machine learning product, from start to finish.-Vicki Boykis

Modern Rules-Based Models

Deep learning models may be all the rage, but let’s shine a light on some of their less-well-known cousins: rules-based models.-R Views

Q&A: Sabelo Mhlambi on What AI can Learn from Ubuntu Ethics

“To think we can come up with a value system to guide AI without looking into other cultures’ value systems, and then to call it universal, is off.”-People + AI Research

Tonks: Building One (Multi-Task) Model to Rule Them All!

What’s even cooler than a multi-task deep learning library? Reading the story of how two co-workers built this library together.-ShopRunner

Tidymodels: Tidy Machine Learning in R

The tidyverse's take on machine learning is finally here.-Rebecca Barter

Anti-patterns in Open Sourced ML Research Code

Unlike most places on the internet, the comments section on this Reddit post has a lot of constructive, helpful advice.-Jari Safi

Masakhane — Machine Translation for Africa

Africa has over 2,000 languages, but African languages account for a small portion of available resources and publications in Natural Language Processing. To fix that problem, an open-source, continent-wide research effort was formed.-Cornell University

Going Beyond SQuAD

The Stanford Question Answering Dataset (SQuAD) has become the archetypal QA dataset for NLP modeling. The emergence of non-English SQuAD replicas means that NLP can be truly democratized. And it’s a good reminder that English is neither synonymous with nor representative of natural language.-Towards Data Science

The Big Bad NLP Database

300 (and counting!) datasets for training your natural language processing models.-Quantum Stat

“Just What I Needed”: Making Machine Learning Scalable and Accessible at Grubhub

Before Grubhub had a suite of tools to help with machine learning model deployments, the difficulty of getting scheduled jobs in production resulted in multiple bespoke solutions, duplicated code, and lots of model maintenance overhead.-Grubhub Bytes

Deep Learning Book Series

These notes with code, examples, and drawings are great for beginners who want to understand enough linear algebra to be comfortable with machine learning and deep learning.-Hadrien Jean

You’re Not Paid to Model

If you want to know how data science projects at companies actually work, there’s no better talk to watch than this one.-Jacqueline Nolis

Flow Fields

“Flow fields are something that many programmers reach for early on when they first get into creating algorithmic artwork, but few take the time to polish their use and explore the crazy variety of ways they can be used.”-Tyler Hobbs

The Most Important Sorting Algorithm You Need to Know

How and why the default sorting algorithm for Python became widely used, yet relatively unrecognized.-Dev

ML and NLP Publications in 2019

Who was the most prolific author of machine learning and natural language processing papers in 2019? The most prolific company? The most prolific country?-Marek Rei

Curriculum for Reinforcement Learning

Curriculums help humans progressively go from understanding simple concepts to solving hard problems. They might do the same for reinforcement learning models.-Lil’Log

Albert Learns to Read

Is this deep learning model as smart as a first grader? You might just while away an hour or two feeding it stories and asking it questions.-Albert Learns to Read

Interactive Tools for Machine Learning, Deep Learning and Math

Sometimes the best way to unpack a complicated concept is to play with it, visually.-Machine Learning Tokyo

The Twelve Truths of Machine Learning for the Real World

“In the Real World, the distribution of input more likely changes than not, ‘curve balls’ from long-tails come out of nowhere, and you don’t always have an answer.”-Delip Rao

Machine Unlearning

Once users have shared their data online, it is generally difficult for them to revoke access and ask for the data to be deleted. Machine learning exacerbates this problem because any model trained with said data may have memorized it, putting users at risk of a successful privacy attack exposing their information. Yet, having models unlearn is notoriously difficult.-Cornell University

Supervised Machine Learning Case Studies in R

This free course covers exploratory data analysis, preparing data so it’s ready for predictive modeling, training supervised machine learning models, and evaluating those models—all using real-world data.-Julia Silge

Powered by AI: Instagram’s Explore Recommender System

Recommending the most relevant content out of billions of options in real time at scale introduces a ton of machine learning engineering problems. Here’s a detailed look at the system Instagram built to provide its users with personalized content.-Instagram Engineering

Biased Algorithms Are Easier to Fix Than Biased People

Changing algorithms is easier than changing people: software on computers can be updated; the “wetware” in our brains has so far proven much less pliable.-The New York Times

AI Dungeon 2: Creating Infinitely Generated Text Adventures with Deep Learning Language Models

This completely AI-generated text adventure will respond logically to just about any command you enter, such as “Eat the moon,” “Summon a giraffe,” or “Join the Great British Bakeoff.”-Perception, Control, Cognition

A Guide to Production Level Deep Learning

No matter how many deep learning models you’ve successfully trained, deploying those models in production is a whole different ballgame.-Alireza Dirafzoon

Managing Bias and Risk at Every Step of the AI-Building Process

“As the field is still young, many machine-learning developers lack experience in building enterprise applications, and many business stakeholders have insufficient knowledge of machine learning to know what questions to ask as they scope and manage projects.”-Harvard Business Review

Machine Learning Systems Design

Case studies, resources, and 27 exercises for learning how to deploy a machine learning system in the real world.-Chip Huyen

Machine Learning Engineering

This book is offered as free to read on the “read first, buy later” principle. The first three chapters are available now, and you can sign up for the mailing list to get updated when more are added.-Andriy Burkov

Algorithms Were Supposed to Make Virginia Judges Fairer. What Happened Was Far More Complicated.

“[A] formula designed to reduce prison populations in Virginia led some judges to impose harsher sentences for young or black defendants, and more lenient ones for rapists.”-The Washington Post

DReCon: Data-Driven Responsive Control of Physics-Based Characters

This article will be a treat for those interested in video games, deep reinforcement learning, AND physics.-Ubisoft Montreal

We Looked Into Why Our Subscribers Churned–with the Help of Machine Learning

A Swedish media company discovered many correlations between churn and reader activity, including reader gender, number of images viewed, and number of push notifications sent.-MittMedia

The Abstraction and Reasoning Corpus (ARC)

Brandon Rohrer offers high praise for this long read: “If you like to think deeply about building and measuring intelligence, I highly recommend this read. It's a cogent review of intelligence measurement, its challenges, past approaches, and their limitations. [And it] takes the brave step of proposing a path forward.”-François Chollet

Detecting Audio Deepfakes With AI

Earlier this year, thieves used AI to impersonate a CEO’s voice and successfully demand a fraudulent transfer of nearly $250,000. To defend against similar cyber crimes, one company built a detector system to discern between real and fake audio examples—and you can too.-Dessa News

The Problem With Metrics is a Big Problem for AI

“I am not opposed to metrics; I am alarmed about the harms caused when metrics are overemphasized, a phenomenon that we see frequently with AI, and which is having a negative, real-world impact.”-fast.ai

The State of Machine Learning Frameworks in 2019

The war for machine learning frameworks has two remaining big contenders: PyTorch and TensorFlow. Researchers prefer one, while industry practitioners prefer the other.-The Gradient

Sis: Simple Image Search Engine

You can launch this open-source, visual similarity engine just by running two Python scripts.-Yusuke Matsui

Neural Nets Are Just People All the Way Down

“Every single piece of decision-making in a high-tech neural network initially rests on a human being manually putting something together and making a choice.”-Normcore Tech

AI Deserts

“Open any magazine, click randomly on any article on Medium, visit any public event at a think tank; chances are, concerns raised by the age of AI is the topic. Some of it will be bunk, some of it very thoughtful, but the topic is not exactly under-discussed. What is under-discussed is how unevenly this change will happen, because we misunderstand and overestimate the preconditions for AI outside the private sector.”-Code for America Blog

When is a Neural Net Too Big for Production?

This post shows examples of successfully shipping large natural language processing transformer models.-Towards Data Science

Designing Your Neural Networks

What’s a good learning rate? How many hidden layers should your network have? Is dropout actually useful? Why are your gradients vanishing?-Towards Data Science

Dungeon Crawling or Lucid Dreaming?

With a neural net as your Dungeon Master, you can conjure anything simply by referring to it and teleport anywhere by just saying the word.-AI Weirdness

A Beginner’s Guide to the Mathematics of Neural Networks

This resource is made for non-experts, and the illustrations are really helpful for wrapping your mind around new concepts.-King’s College London

Simple Beginner’s Guide to Reinforcement Learning & Its Implementation

One of the research scientists at DeepMind (https://twitter.com/iamtrask/status/1163737631395651584) gives this tutorial a ringing endorsement: “This is one of the first truly introductory tutorials of RL & Deep Learning I've seen. My kind of tutorial... lots of toy code examples and simple analogies!!!”-Analytics Vidhya

Machine Learning, Faster

“Speed is not a word that is regularly associated with machine learning teams. When we talk and write about accomplishments in machine learning, there is often a focus on the problem, the algorithmic approach, and the results—but no mention of the time that it took to get there.”-Neal Lathia

DeepMind's Losses and the Future of Artificial Intelligence

Alphabet’s DeepMind lost $572 million last year. Does this mean AI is falling apart? No. But considering DeepMind’s strategy does raise some interesting questions about how much it can offer society beyond mastering the game of Go.-Wired

Intro to Pyenv for Machine Learning

Escape Python dependency hell! Instead of configuring a unique Docker container for each project, how about a unique Python environment?-Weights & Biases

How I Became a Machine Learning Practitioner

“I wasn’t quite prepared for just how much I would feel like a beginner. You need to give yourself the space and time to fail. If you learn from enough failures, you’ll succeed.”-Greg Brockman

Rules of Machine Learning: Best Practices for ML Engineering

This reference guide gets a ringing endorsement from Senior Research Scientist Andrew: “Machine learning in a company is 10% data science and 90% other challenges. It's VERY hard. Everything in this guide is ON POINT, and it's stuff you won't learn in an ML book.”-Google

Awesome Production Machine Learning

This repository contains a curated list of open source libraries that will help you deploy, monitor, version, scale, and secure your production machine learning.-The Institute for Ethical Machine Learning

A Code-First Introduction to Natural Language Processing

Learn topic modeling, classification, language modeling, and translation, completely free.-fast.ai

Generate custom Magic: The Gathering cards from an AI using GPT-2

Input a card name, card type, and card mana cost... get a custom card image and card text back!-Max Woolf

Learning to Traverse Latent Spaces for Musical Score Inpainting

This paper shows how to train a deep learning-based model to fill in missing or lost information in a piece of music.-Georgia Tech

Mutual Exclusivity as a Challenge for Neural Networks

Children use the mutual exclusivity bias to learn new words. Standard neural nets show the opposite bias, making it harder for them to learn in common scenarios.-Cornell University

Word2vec: fish + music = bass

Get ready to chuckle at gems like “yeti – snow + economics = homo economicus” and “American Idol – singers = Project Runway.”-graceavery

The BS-Industrial Complex of Phony A.I.

“For a while, we resisted the A.I. label, understanding that our platform wasn’t going to make Watson sweat anytime soon. But eventually, we gave up and just decided to kind of go along with the hype. The market wanted us to be an A.I. company so we chuckled and decided to call ourselves one.”-Gen

Bayesian Cyber Risk Quantification With Industry-Specific Models

The machine learning community is obsessed with deep learning on big dense datasets, but problems like cyber insurance with small sparse data require Bayesian methods.-Tower Street

AI Adoption is Being Fueled by an Improved Tool Ecosystem

In 2010, the ratio of AI scientific papers to patents filed was 8:1. In 2016, it was 3:1. We’re in the implementation phase now.-O’Reilly

18 Impressive Applications of Generative Adversarial Networks (GANs)

It feels like GANs pop up everywhere these days. If you’re looking for a fundamental understanding of what GANs can do, this is a great overview.-Machine Learning Mastery

Deepfake Propaganda Is Not a Real Problem

There’s real damage being done by deepfake techniques, but it’s happening in pornography, not politics.-The Verge

Once Again, a Neural Net Tries to Name Cats

Start off your Monday morning with a good chuckle.-Janelle Shane

GANs And Deepfakes Could Revolutionize The Fashion Industry

GANs' impact on fashion goes way deeper than virtual fitting rooms and creating avatars to customers' measurements.-Forbes

Machine Learning Product Management: Lessons Learned

Product management for ML projects can be difficult because engineering changes from a deterministic process to a probabilistic one. It requires “an approach that involves learning from data instead of programmatically following a set of human rules.”-Domino Data Lab

Railyard: How We Rapidly Train Machine Learning Models With Kubernetes

Stripe trains hundreds of new models each day, each powered by billions of data points. Running infrastructure at this scale poses a very practical data science and ML problem: how do you give every team the tools they need to train their models without requiring them to operate their own infrastructure?-Stripe

Notes on AI Bias

Bias in AI doesn’t mean just bias against people. Sometimes it’s just picking up the wrong signal, period. Case in point: a system built to detect skin cancer was detecting rulers instead.-Benedict Evans

Fooling Automated Surveillance Cameras: Adversarial Patches to Attack Person Detection

Not only could a simple color printout hide someone from an AI video surveillance system for nefarious purposes, it could offer protection to people who don't want to be tracked in everyday life.-arXiv

This YouTube Channel Streams AI-Generated Death Metal 24/7

Dadabots was developed by two music technologists who wanted to prove that a neural network was capable of capturing the subtle stylistic differences between Death Metal, Math Rock, and other lesser-known genres.-Motherboard

One Model to Rule Them All

This post discusses the obsession with finding the best model and emphasizes what should be done instead: Take a step back and see the bigger picture in which the machine learning model is embedded.-bentoML

Discriminating Systems: Gender, Race and Power in AI

“The field of research on bias and fairness needs to go beyond technical debiasing to include a wider social analysis of how AI is used in context. This necessitates including a wider range of disciplinary expertise.”-AI Now Institute

Open Questions about Generative Adversarial Networks

Practical improvements to image synthesis models are happening almost too quickly to keep up with, but there are still several open research problems left to tackle.-Distill

Scaling Uber’s Customer Support Ticket Assistant (COTA) System with Deep Learning

“Our online tests validate that the COTA v2 deep learning system performs significantly better than the COTA v1 system in terms of key metrics, including model performance, ticket handling time, and customer satisfaction.”-Uber Engineering

My Machine Learning Research Jobhunt

One PhD graduate shares their experience trying to find an AI research position in Europe, from the application process through to salary negotiations.-Generalized Error

Active Learner

Supervised machine learning, while powerful, needs labeled data to be effective. This visualization shows how active learning data labeling strategies can improve your models.-Fast Forward Labs

A Framework for Understanding Unintended Consequences of Machine Learning

The concept of biased data is often too broad to be useful. This framework includes 5 ways of categorizing bias: historical, representation, measurement, evaluation, and aggregation.-Cornell University

A Gentle Introduction to Learning Curves for Diagnosing Machine Learning Model Performance

Discover learning curves and how they can be used to diagnose the learning and generalization behavior of machine learning models.-Machine Learning Mastery

Money Machines: An Interview with an Anonymous Algorithmic Trader

An insider explains how algorithms are rewiring finance.-Logic

Unsolved Research Problems vs. Real-world Threat Models

“I personally think adversarial examples are highly worth studying, and should inspire serious concern. However, most of the justifications for why exactly they’re worrisome strike me as overly literal. I think much of the confusion comes from conflating an unsolved research problem with a real-world threat model.”-Catherine Olsson

Game of Thrones Reigns Supreme Among AT&T’s Assets. Here’s How We Used Wikidata’s Entities and Ontology to Find That Out.

Six companies own most U.S. media. Given that each of these companies owns thousands of these types of assets, how do you determine which ones are the most important?-Parse.ly Engineering

Tackling Bias in Machine Learning

This article digs into the hows and whys of the Python package Fair Classifier, which quantifies the fairness of a model and uses an adversarial network to help ensure equity in machine learning models.-Insight

Coconet: the ML model behind today’s Bach Doodle

Last week, Google celebrated J.S. Bach’s 334th birthday with “the first AI-powered Google Doodle.” Here's how their team built a model that takes a user-created melody and harmonizes it in Bach’s style.-Magenta

How I Eat For Free in NYC Using Python, Automation, Artificial Intelligence, and Instagram

That's one way to save money in the Big Apple! Here's how a data engineer created a 100% automated Instagram account to earn free meals at restaurants looking for promotion.-Chris Buetti

What's the difference between data science, machine learning, and artificial intelligence?

Whip this out the next time you tell someone you're a data scientist, and they ask “Does that mean you work on artificial intelligence?”-Variance Explained

Modeling Censored Time-to-Event Data Using Pyro, an Open Source Probabilistic Programming Language

When churn models just weren't cutting it for Uber, they created their own language in Python to properly model the time from a user's first ride to their second.-Uber Engineering

Using Deep Learning to “Read Your Thoughts” — With Keras and EEG

Saying a word in one’s mind, even if not spoken aloud, can result in the firing of the nerves controlling the muscles involved in speech. With some readily available equipment, you can train a model to classify these sub-vocalized words in less than a day.-Justin Alvey

AI Interprets What Rodents Are Saying

With a deep learning-based system for detection and analysis of rodent vocalizations, researchers can better understand their test subjects. And it's adorably named “DeepSqueak.”-Psychology Today

Cocktail Similarity

Thanks to a difference algorithm, you now have the perfect guide for figuring out your minimum-viable at-home bar setup.-Tom MacWright

The Limitations of Deep Learning for Vision and How We Might Fix Them

“Now it is difficult to publish anything that is not neural network related. This is not a good development. We suspect that the field would progress faster if researchers pursued a diversity of approaches and techniques instead of chasing the current vogue.”-The Gradient

Data Versioning

The degrees of freedom in versioning machine learning systems pose a unique challenge. Each broad approach to tackle this problem has pros and cons to keep in mind.-Emily Gorcenski

We Analyzed 16,625 Papers to Figure Out Where AI Is Headed Next

This study of 25 years of artificial-intelligence research suggests that deep learning may soon be on its way out.-MIT Technology Review

The Best Defense Against Deepfake AI Might Be . . . Blinking

Researchers can now detect AI-generated fake video with a 95% success rate. Because few images are available online showing people with their eyes closed, there's less training data available for deepfakes to get natural blinking right.-Fast Company

Why Are Machine Learning Projects So Hard to Manage?

“I’ve watched lots of companies attempt to deploy machine learning—some succeed wildly and some fail spectacularly. One constant is that machine learning teams have a hard time setting goals and setting expectations. Why is this?”-Lukas Biewald

What Can Neural Networks Learn?

It’s tricky to know what neural networks are actually learning as they're trained. This post does a good job breaking down what's going on inside.-Data Science and Robots

POET: Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions Through the Paired Open-Ended Trailblazer

“Sometimes… we do not just want to solve known problems, because unknown problems are also important. Consequently, we are exploring algorithms that continually invent both problems and solutions of increasing complexity and diversity.”-Uber Engineering

Most Impactful AI Trends of 2018: the Rise of ML Engineering

Was 2018 an important inflection point for the machine learning industry? Check out this roundup of key trends and the impact they may have on ML this year.-Emmanuel Ameisen

lazydata: Scalable Data Dependencies for Python projects

This library might help out when you need data version control for your next machine learning project.-rstojnic

Gender and Jobs in Online Image Searches

A really cool example of using machine vision to spot gender bias in Google Image search results.-Pew Research Center

Kernel Density Estimation

Kernel density estimation (KDE) is a useful statistical tool that’s way less scary than it sounds. This interactive shows how KDE lets you create a smooth curve given a set of data.-Matthew Conlen
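
To make the "smooth curve" idea concrete, here is a minimal sketch of a Gaussian KDE in plain Python. It is an illustration only, not the code behind the interactive: the sample data, the bandwidth of 0.5, and the helper name gaussian_kde are all assumptions chosen for the example.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density at x from the samples."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, centered on that observation.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical data: two loose clusters of points.
data = [1.0, 1.3, 1.8, 2.1, 5.0, 5.4, 6.1]
density = gaussian_kde(data, bandwidth=0.5)

# Evaluate the curve on a grid and approximate the area under it.
# A valid density estimate should integrate to roughly 1.
grid = [i * 0.01 for i in range(-500, 1500)]  # covers -5.0 to 15.0
area = sum(density(x) * 0.01 for x in grid)
print(f"area under the curve: {area:.4f}")
```

A smaller bandwidth gives a bumpier curve that hugs individual points, while a larger one smooths the two clusters together.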

Concepts in object detection

Naming and locating several objects at once in an image with no prior information about how many objects are supposed to be detected is much harder than identifying a single object. Here’s how to do it using TensorFlow and R.-TensorFlow for R Blog

The Seductive Diversion of ‘Solving’ Bias in Artificial Intelligence

“In accepting the existing narratives about A.I., vast zones of contest and imagination are relinquished. What is achieved is resignation — the normalization of massive data capture, a one-way transfer to technology companies, and the application of automated, predictive solutions to each and every societal problem.”-Medium

AI Art Gallery

Check out this collection of art, music and design using machine learning from a NeurIPS 2018 workshop.-Neural Information Processing Systems

These incredibly realistic fake faces show how algorithms can now mess with us

The latest advance in generative adversarial networks allowed researchers to generate fake images of faces with a previously unknown level of control over elements like age, race, gender—even freckles.-MIT Technology Review

State of Deep Learning : H2 2018 Review

“The growth rate of machine learning papers has been around 3.5% a month since July — which is around a 50% growth rate annually. This means around 2,200 machine learning papers a month and that we can expect around 30,000 new machine learning papers next year.”-Atlas ML
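
The arithmetic in this quote can be checked with a few lines of Python. This is a rough sketch that assumes the 3.5% monthly rate compounds and applies from the first month onward; the variable names are illustrative only.

```python
monthly_growth = 0.035    # 3.5% a month, as quoted
papers_per_month = 2200   # current monthly volume, as quoted

# Compounding twelve monthly steps gives the equivalent annual growth rate.
annual_growth = (1 + monthly_growth) ** 12 - 1

# Projected papers over the next 12 months if the monthly rate holds.
# Assumption: growth applies from the first month onward.
next_year_total = sum(
    papers_per_month * (1 + monthly_growth) ** month for month in range(1, 13)
)

print(f"annual growth: {annual_growth:.1%}")
print(f"papers next year: {next_year_total:,.0f}")
```

Under these assumptions the results land close to the quoted figures, with the yearly total coming out somewhat above 30,000.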

Public Attitudes Toward Computer Algorithms

“58% of Americans feel that computer programs will always reflect some level of human bias – although 40% think these programs can be designed in a way that is bias-free.”-Pew Research Center

Beating the State-of-the-art in NLP With HMTL

Learn how Multi-Task Learning—a general method in which a single architecture is trained towards learning several different tasks at the same time—can be applied to natural language processing.-Hugging Face

Is this AI? We drew you a flowchart to work it out

It's a bit hard to read, but if you squint hard enough this flowchart should help you discern if something's truly AI—or just hyped up and mislabeled.-MIT Technology Review

AI adoption advances, but foundational barriers remain

One highlight from this global survey about how AI is used at companies: those working in manufacturing and risk see AI as more valuable than those in other fields like marketing and sales or human resources.-McKinsey & Company

What You Have To Fear From Artificial Intelligence

Vicki Boykis describes this long read best: “Great piece the real, practical concerns of deep learning applications (aka not robots killing us): fake images, text, and soundbytes that we won't know aren't human-generated.”-Current Affairs

Reinforcement Learning with Prediction-Based Rewards

When a reinforcement learning agent was incentivized to be curious and avoid “boredom” while playing Mario, it discovered warp levels, how to defeat bosses, and more.-OpenAI

Deepfake-busting apps can spot even a single pixel out of place

Speaking of AI-generated imagery... it's so easy to use that anyone can make a fake video or image, no matter their motives. Luckily, technology for discerning true images from manipulated creations is catching up.-MIT Technology Review

Generating custom photo-realistic faces using AI

Generating realistic images based on descriptions is much harder than describing an image—for humans and computers. But this new generative model is making that task easier.-Insight

How do you like your ML career?

“Over the last few years ML has lost some of its luster in my mind - the hype around deep learning and ML has added a lot of noise into the system, and for someone who cares about doing good science that's been hard for me.”-r/MachineLearning

Mask R-CNN Benchmark

A fast and modular implementation for Faster R-CNN and Mask R-CNN written entirely in PyTorch 1.0. It's 30% quicker than mmdetection during training.-Facebook Research

How Three French Students Used Borrowed Code to Put the First AI Portrait in Christie’s

The code used to generate this portrait is mostly the work of another artist and programmer. This raises a question about attribution in the open and collaborative AI art community, which is taking its first steps into mainstream attention.-The Wall Street Journal

Deepfake Videos Are Getting Real and That's a Problem

Changing photos used to be tedious and time-consuming. Fast-forward to now: nearly anyone can use deep learning and AI to generate incredibly realistic “fake videos”—President Obama saying something he never said, for instance.-The Wall Street Journal

Training Neural Nets on Larger Batches: Practical Tips for 1-GPU, Multi-GPU & Distributed setups

How can you train your model on large batches when your GPU can’t hold more than a few samples? Let's find out.-Hugging Face
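
One common answer to this question is gradient accumulation: process several small micro-batches, sum their scaled gradients, and update the weights once. The sketch below is a framework-free illustration, not the linked post's code. The one-weight model y = w * x, the dataset, and the learning rate of 0.01 are all assumptions made for the example.

```python
def grad_mse(w, batch):
    """Gradient of mean squared error for the toy model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


# Hypothetical dataset of (x, y) pairs and a starting weight.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8),
        (5.0, 10.1), (6.0, 12.2), (7.0, 13.8), (8.0, 16.1)]
w = 0.5
micro_batch_size = 2  # pretend this is all that fits in memory
accumulation_steps = len(data) // micro_batch_size

# Accumulate gradients over small micro-batches before a single update.
accumulated = 0.0
for step in range(accumulation_steps):
    start = step * micro_batch_size
    micro_batch = data[start:start + micro_batch_size]
    # Divide by the number of steps so the sum matches the large-batch mean.
    accumulated += grad_mse(w, micro_batch) / accumulation_steps

full_batch = grad_mse(w, data)      # what one large batch would have produced
w_updated = w - 0.01 * accumulated  # one update per accumulated batch

print(f"accumulated: {accumulated:.6f}, full batch: {full_batch:.6f}")
```

With equally sized micro-batches, the accumulated gradient matches the full-batch gradient up to floating-point error, so the update behaves as if the large batch had fit in memory.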

Artwork Personalization at Netflix

Ever notice how the preview image for the same show or movie on Netflix changes whenever you log back in? Here's a peek into the system that figures out which piece of artwork is the best for convincing a particular member why that title is “for them.”-The Netflix Tech Blog

Data: A key requirement for your Machine Learning (ML) product

For all the PMs out there: here are some tips for how to talk about data in your Product Requirement Document for a machine learning product.-The Lever

A Review of the Neural History of Natural Language Processing

It's kind of crazy that neural network NLP is now old enough to have its own historical timeline. This post condenses about 15 years of work into eight milestones that impacted how these technologies are used today.-aylien

Introduction to Machine Learning for Coders: Launch

This new course uses modern tools and libraries, including Python, pandas, scikit-learn, and PyTorch. Unlike many educational materials in the field, this approach is “code first” rather than “math first.”-fast.ai

Why building your own Deep Learning Computer is 10x cheaper than AWS

Avoid hefty cloud GPU costs by building a computer from scratch.-The Mission

Tabular Data in Scikit-Learn and Dask-ML

Take advantage of Scikit-Learn's latest improvements for working with tabular data.-datas-frame

Help! I can’t reproduce a machine learning project!

Reproducibility breaks down in three main places: the code, the data and the environment. This guide should help you narrow down where your reproducibility problems are, so you can focus on fixing them.-No Free Hunch

Anatomy of an AI System

“The stack that is required to interact with an Amazon Echo goes well beyond the multi-layered ‘technical stack’ of data modeling, hardware, servers and networks. The full stack reaches much further into capital, labor and nature, and demands an enormous amount of each. The true costs of these systems – social, environmental, economic, and political – remain hidden and may stay that way for some time.”-Anatomy of an AI System

Retracing your steps in Machine Learning: Versioning

New prediction systems are fragile things. Change one thing, and the accuracy of the model can drop dramatically, leading to a long troubleshooting process to find the root cause. Skip the headache with this guide to building a robust versioning system for your ML projects.-The Lever

No Machine Learning in your product? Start here

Just how much does a product owner need to know about machine learning? A Google PM shares his experience integrating machine learning into an existing product: Google Forms.-The Lever

Human translators are still on top—for now

Machine translation works well for sentences. For full documents? Not so much.-MIT Technology Review

VerbiAge: Using NLP to help writers craft age-specific writing

This app for tailoring a book’s description for a target K-12 age is a nice example of how machine learning can aid in creative tasks.-Insight

What HBR Gets Wrong About Algorithms and Bias

This post injects some much-needed nuance into the biased algorithms discussion: humans vs machines is not a helpful framing and most critics of unjust bias aren’t anti-algorithm.-fast.ai

Learning Meaning and Semantics in Natural Language Processing

A few weeks ago, data science Twitter spun out a fascinating mega-thread on NLP meaning and semantics. Since Twitter threads can be tricky to parse after-the-fact, this summary, interactive tweet tree, and commented map provide three entry points into the discussion.-Hugging Face

Differentiable Image Parameterizations

This powerful, under-explored tool for neural network visualizations and art produces vibrant images that look like they came straight out of Annihilation.-Distill

ACL 2018 Highlights: Understanding Representations and Evaluation in More Challenging Settings

This post digs into two themes of the Association for Computational Linguistics 2018 conference: gaining a better understanding of what NLP models capture and exposing them to more challenging settings.-Sebastian Ruder

Machine Learning Glossary

Find yourself dragged under by wave after wave of machine learning jargon? Part of Google's Machine Learning Crash Course, this glossary provides plain-English descriptions of the terms you've heard thrown around by ML experts, without sacrificing accuracy.-Google

Reinforcement learning’s foundational flaw

“Does it really make sense to start learning a new skill based only on its reward signal, with neither prior experience nor higher-level instruction?”-The Gradient

Feature-wise transformations

Many real-world problems require integrating multiple sources of information. Feature-wise transformations offer a way to effectively capture and leverage the relationship of various sources, across a wide range of problem settings like image recognition, reinforcement learning, and style transfer.-Distill

What do machine learning practitioners actually do?

“Any solution to the shortage of machine learning expertise requires answering this question: whether it’s so we know what skills to teach, what tools to build, or what processes to automate.”-fast.ai

Papers with Code

A searchable site that links machine learning papers on arXiv with code on GitHub.-Papers with Code

Model Tuning and the Bias-Variance Tradeoff

This visual intro to machine learning covers how errors can arise due to assumptions that are overly simple (bias) or overly complex (variance).-R2D3

Gender Shades

This evaluation compares how well IBM, Microsoft, and Face++ products are able to classify gender across skin types. All companies perform better on lighter subjects as a whole than on darker subjects as a whole with an 11.8% - 19.2% difference in error rates, and all companies perform worst on darker females.-Joy Buolamwini

Why the Future of Machine Learning is Tiny

“I’m convinced that machine learning can run on tiny, low-power chips, and that this combination will solve a massive number of problems we have no solutions for right now.”-Pete Warden

Machine learning predicts World Cup winner

Researchers have predicted the outcome after simulating the entire soccer tournament 100,000 times. (Good news awaits if you’re pulling for Brazil, Germany, or Spain!)-MIT Technology Review

A Developer’s Guide to Building AI Applications

O'Reilly and Microsoft collaborated on a free e-book that walks you through the process of building intelligent cloud-based bots (with relevant code samples available on GitHub).-Microsoft Machine Learning Blog

How The New York Times Uses Software To Recognize Members of Congress

The most interesting part of this project isn't the models used (Amazon's Rekognition API), but the practical considerations the team faced when introducing the “Who the Hill” app to the real world: poor lighting for photos in the Capitol halls, bad cell phone reception, and celebrity doppelgängers.-Times Open

Why you need to improve your training data, and how to do it

When you use deep learning as part of an application, getting better training data is vastly more effective than making model adjustments.-Pete Warden

Launching Cutting Edge Deep Learning for Coders: 2018 edition

Part 2 of fast.ai’s free deep learning course is here! All you need is high school math and 1 year of coding experience.-fast.ai

Smart Compose: Using Neural Networks to Help Write Emails

The engineers behind Smart Compose—a Gmail feature that offers sentence completion suggestions as you type—dig into how they tackled the challenges of fairness and privacy, latency, and scale.-Google AI Blog

Feature Engineering and Selection: A Practical Approach for Predictive Models

This book on predictive modeling is about 60% done and the authors are looking for feedback. The section on Engineering Numeric Predictors alone is fantastic.-Max Kuhn and Kjell Johnson

Qualitative before Quantitative: How Qualitative Methods Support Better Data Science

“Have you ever been embarrassed by the first iteration of one of your machine learning projects, where you didn’t include obvious and important features? In the practical hustle and bustle of trying to build models, we can often forget about the observation step in the scientific method and jump straight to hypothesis testing.”-Indeed Data Science

Picking Trending Topics and Celebrities Using Machine Learning

The machine learning engineers at Conde Nast applied their expertise to help Vanity Fair’s writers and editors better craft stories that have a broad, meaningful impact.-Conde Nast Technology

Get Started with Eager Execution in TensorFlow

The folks at TensorFlow are putting their tutorials directly into Google Colab notebooks (which require zero setup to run!). If you've ever wanted to learn more about machine learning, the time is now. Especially since a recent survey suggests that most data scientists lack advanced machine learning expertise.-TensorFlow

Artist + AI

Here's a new Twitter account for you to follow. This artist combines her hand-drawn work with generative adversarial networks (GANs) to create something completely new.-Helena Sarin

Demystifying Docker for Data Scientists – A Docker Tutorial for Your Deep Learning Projects

Is Docker really the best thing since sliced bread? Find out in this tutorial, which covers the basics of how to interact with Docker containers and create custom Docker images for your AI workloads.-Microsoft's Machine Learning Blog

The Building Blocks of Interpretability

This article really gets you inside a neural network's “head” by explaining the thought process as it decides between two labels for an image, like a bowtie and a pair of sunglasses.-Distill

The Malicious Use of Artificial Intelligence

This 101-page report “surveys the landscape of potential security threats from malicious uses of artificial intelligence technologies, and proposes ways to better forecast, prevent, and mitigate these threats.” Divvy it out across your commutes and moments of downtime this week.-maliciousaireport.com

Descriptive mAchine Learning EXplanations (DALEX)

Unpack some black boxes with this handy cheatsheet for understanding how complex ML models work.-Przemyslaw Biecek

Manifesto for Data Practices

Give this a read, whether you sign it or not.-DataPractices.org

So, How Many ML Models You Have NOT Built?

“What will put us out of our job is Machine Learning Overkill. I have seen implementation of Machine Learning algorithms to very frivolous problems and worse still the companies have invested heavily into the idea. It is a ticking time bomb. The moment the companies realize that the ROI is negative, they will shun the Data Science practice altogether.”-Towards Data Science

THREAD: How computer vision and natural-language processing systems reflect societal stereotypes

A rabbit hole worthy of your time: various types of machine learning bias as tracked by academic papers.-Arvind Narayanan

Exploring Recommendation Systems

In practice, recommenders don’t always work as well as we’d like them to. This post sets out to discover why.-Fast Forward Labs

Turning Design Mockups Into Code With Deep Learning

Ever wish you could automate the front-end engineering process? Here’s how to teach a neural network to code a basic HTML and CSS website from a design mockup.-FloydHub

Learning Curves for Machine Learning

How do you diagnose bias and variance? And what actions should you take once you’ve detected these errors?-Dataquest

Machine Learning: The High-Interest Credit Card of Technical Debt

There’s no such thing as a free machine learning project. Avoid or refactor these risk factors and design patterns to keep technical debt from piling up.-Research at Google

2017: The year AI beat us at all our own games

“Over the past 12 months AI crossed a series of new thresholds, finally beating human players in a variety of different games, from the ancient game of Go to the dynamic and interactive card game, Texas Hold-Em Poker.”-New Atlas

How many images do you need to train a neural network?

The technically correct answer is: “It depends.” The ballpark answer is: “1,000 representative images for each class.” (With some caveats of course.)-Pete Warden

Deep Learning Achievements Over the Past Year

Carve out some time in your holiday schedule to explore 2017's most exciting developments in text, voice, and computer vision technologies.-Stats & Bots

The U.S. Leads in Artificial Intelligence, but for How Long?

Government policies such as the tax bill, reduced funding, and tightening of rules on immigration for international researchers threaten the U.S.’s advantage in AI.-MIT Technology Review

NIPS 2017 — Highlights

If you didn’t attend the conference on Neural Information Processing Systems last week, never fear! Catch up on the latest in AI with these day-by-day summaries.-Insight Data

Improving Palliative Care with Deep Learning

80% of Americans prefer to spend their final days in their home, but only 20% actually do. This 18-layer deep neural network identifies hospitalized patients with a high risk of death in the next 3-12 months, so they can get access to palliative care sooner.-Stanford ML Group

Innovating Faster on Personalization Algorithms at Netflix Using Interleaving

“The interleaving approach allows us to quickly prune down the initial set of ranking algorithms to the most promising candidates, enabling us to conduct experiments at a rate much faster than traditional A/B testing to identify winning ideas.”-Netflix Technology Blog

[VIDEO] Livecoding Madness: Let’s Build a Deep Learning Library

This is interesting on two levels: “how to build a deep learning library” and “how someone who’s not me writes Python” (in this case, the answer is: incredibly fast).-Joel Grus

Fairness Measures

Awareness of the bias of algorithms is important, but here’s a way to actually do something about it. Run your dataset through this Python package and you’ll get back a measure that quantifies discrimination within that dataset.-Fairness Measures

The era of easily faked, AI-generated photos is quickly emerging

Nvidia’s researchers trained algorithms on 30,000 images of celebrities, and it’s nearly impossible to tell the generated images from the real ones.-Quartz

Scalable Machine Learning (Part 1)

What do you do when your training dataset fits in memory, but the dataset you're making predictions on doesn't? This post identifies where the usual in-memory analytics workflow built on pandas and scikit-learn breaks down and offers some solutions for scaling out to larger problems.-Tom Augspurger

Can Neural Nets Detect Sexual Orientation? A Data Scientist’s Perspective

Dig into the data behind Stanford's controversial paper Deep Neural Networks Can Detect Sexual Orientation From Faces.-fast.ai

My Neural Network isn't working! What should I do?

11 mistakes you may make while implementing a neural network—and how to fix them.-Daniel Holden

Train, Score, Repeat, Watch Out! Zillow's Andrew Martin on modeling pitfalls in a dynamic world.

One of Zillow's data scientists addresses the challenges that don’t crop up in standard textbook problems or most ML competitions: feedback loops, dynamic datasets, and temporal consistency. A great read for Kagglers and non-Kagglers alike.-No Free Hunch

Switching to a Probabilistic Model for Venue Search in Foursquare

How Foursquare’s engineering team improved the accuracy and user experience of their location intelligence by switching from a search ranking algorithm to regression trees and probabilities.-Foursquare Engineering

BuzzFeed News Trained A Computer To Search For Hidden Spy Planes. This Is What We Found.

Learn how BuzzFeed trained a random forest algorithm to spot planes flown by the FBI and DHS.-BuzzFeed

Improving the Realism of Synthetic Images

Producing a large, diverse, and accurate training set for machine learning models is a pricey endeavor. Apple provides a rare behind-the-scenes look at how they cut costs and improved their models by making simulated images look more realistic.-Apple Machine Learning Journal

Technical Debt in Machine Learning

What do feedback loops, correction cascades, and hobo-features have in common? They’re all machine learning anti-patterns that can slowly creep into your infrastructure and create a ticking time bomb.-Towards Data Science

Inside Facebook’s AI Workshop

When Joaquin Candela first started at Facebook, he worked on an ad-targeting algorithm with a handful of engineers. Five years later, he runs the Applied Machine Learning team, which comprises hundreds of employees running thousands of experiments a day. Here’s how he scaled up Facebook’s AI factory at breakneck speed.-Harvard Business Review

Using Machine Learning to Predict Value of Homes On Airbnb

How Airbnb used internal and open-source tools (like Python!) to lower the overall development costs of customer lifetime value (LTV) modeling. Code examples abound.-Airbnb Engineering and Data Science

Human-Centered Machine Learning

For UX folks: A 7-step guide to stay focused on human needs when designing with machine learning.-Google Design

Visualizing High Dimensional Data In Augmented Reality

When you’re trying to understand the relationships in a really big dataset (three-million-grocery-orders big), a 2D scatterplot might not cut it. This immersive 3D visualization technique offers a way to make sense of data with multiple attributes and improve machine learning features and models.-Inside Machine Learning

How HBO’s Silicon Valley built “Not Hotdog” with mobile TensorFlow, Keras & React Native

The use-case may be farcical, but the deep learning and edge computing behind it are very real.-Hacker Noon

Predicting the Success of a Reddit Submission with Deep Learning and Keras

It all comes down to two things: the time of day and a catchy title.-Max Woolf

Vertical AI Startups: Solving Industry-specific Problems by Combining AI and Subject Matter Expertise

“While most of the machine learning talent works in big tech companies, massive and timely problems are lurking in every major industry outside tech.”-Bradford Cross

J.P. Morgan’s massive guide to machine learning and big data jobs in finance

Get the key takeaways from this 280-page report, including essential data analysis packages, hiring tips, and which machine learning techniques to apply to which problems.-efinancialcareers

Is Your Organization Ready for ML?

Don’t make this mistake: “[M]any organizations rush to hire ML experts without laying the proper foundation to ensure their success, including creating proper database architecture, building out essential data science technology, establishing data governance, and instilling data-driven decision-making throughout the organization.”-RE•WORK


Save this hashtag for the moments when you need to jog your memory on some basic concepts.-Chris Albon

Machine Learning for Product Managers

A brilliant, non-technical read for anyone who designs, supports, manages, or plans for products that use machine learning.-Hacker Noon

Distill: An Interactive, Visual Journal for Machine Learning Research

This new online publication is bringing academic journals into the 21st century: “A Distill article… isn’t just a paper. It’s an interactive medium that lets users – 'readers' is no longer sufficient – work directly with machine learning models.”-Y Combinator

Tips & Tricks for Feature Engineering / Applied Machine Learning

One commenter put it best: 'Probably the best feature engineering slides I have found [on] the internet.' Need we say more?-HJ van Veen

Learning about Machine Learning with an Earthquake Example

How well can we predict whether or not someone is prepared for an earthquake?-Simply Statistics

How Fitbit’s data science team scales machine learning

Workout regimens need to be tailored to each individual. Directional correctness isn’t enough. Fitbit’s head of data science shares how his team builds a model for every user to increase motivation and prevent injuries.-Mixpanel

Fake News Challenge

This grassroots effort is inviting teams to harness AI technologies to help human fact checkers identify hoaxes and deliberate misinformation in news stories. The top three teams get a cash prize, so grab a couple of friends and check out the training dataset.-Fake News Challenge

Machine Learning Videos

More of a visual learner? Here’s a repository of recorded talks at machine learning conferences, workshops, seminars, and more.-Dustin Tran

What is artificial intelligence? A three part definition

“As soon as it works, no one calls it AI anymore.”-Simply Statistics


You could be a poet, and not know it. Feed the works of your favorite author through this new Python library to generate as many lines of verse as you want.-Anthony Federico

What I Learned Implementing a Classifier from Scratch in Python

With libraries like scikit-learn, it’s easy to run an algorithm on some data and automagically get an answer—without understanding exactly how you arrived there. Prepare to unpack the black box.-Jean-Nicholas Hould

What’s the state of the job market in data science and machine learning?

“Th[e] proliferation of courses, resources, books and startups would hint that machine learning is becoming more and more accessible to the average programmer and that the market is on track to getting saturated quickly. Is this the current trend?”-Hacker News

20 Weird & Wonderful Datasets for Machine Learning

Getting your hands on a robust dataset is the hardest part of machine learning. Finding interesting datasets is tougher still. From UFO sightings to beautiful Flickr photos, you’re sure to find something to train your model.-Oliver Cameron

Deep-Fried Data

Opening your data can lead to unpredictable benefits, but requires being open to unexpected uses of your data.-Idle Words

Deep Learning Isn’t a Dangerous Magic Genie. It’s Just Math

This essay is a godsend for those of us who have trouble understanding or explaining what exactly deep learning is.-WIRED

Boosting Sales With Machine Learning

One developer shares how his team used natural language processing and machine learning in Python to pre-qualify sales leads so reps don’t have to spend hours doing it manually.-Xeneta

Hybrid Intelligence: How Artificial Assistants Work

When humans and machines work together, they accomplish a lot more than either could on their own. This is known as hybrid intelligence—a pretty intimidating term for those unfamiliar with machine learning. Here’s a breakdown.-Clare Corthell

The real prerequisite for machine learning isn’t math, it’s data analysis

Machine learning amateurs, take heart. Proficiency with high-level math may be essential for machine learning theory. But with out-of-the-box tools like R’s gmodels package or Python’s scikit-learn library, you don’t need to know linear algebra or calculus to build a successful predictive model. You do, however, need to know your way around a dataset.-Sharp Sight Labs

How Kalman Filters Work, Part 1

This article unpacks different filtering algorithms in an incredibly intuitive way. It’s a long read, but you’ll come away having learned a ton (did you know that NASA used Kalman filters to help Apollo spacecraft navigate to the moon?).-An Uncommon Lab

Microsoft’s Tay is an Example of Bad Design

Or Why Interaction Design Matters, and so does QA-ing.-Caroline Sinders

Here's How We Prevent The Next Racist Chatbot

Tay.ai is the consequence of poor training-Popular Science

Why Microsoft Accidentally Unleashed a Neo-Nazi Sexbot

It’s not surprising that Microsoft’s chatbot spewed racist invective, but here’s how it could have been avoided.-MIT Technology Review

Explained Visually

This website is an incredible collection of interactive visualizations aimed at making tricky concepts like Markov chains and regression easy to understand. Schedule a few hours to explore this one—you’re gonna need them.-Explained Visually

Lift analysis - A data scientist’s secret weapon

Learn how to spot flaws in machine learning models with lift analysis (and why you should add it to your list of evaluation metrics).-Andy Goldschmidt

We Now Have Algorithms To Predict Police Misconduct

You’ve probably heard of predictive policing, but what about predictive policing for the police? One police department teamed up with researchers to test an algorithm that detects troublesome behavior of officers early on.-FiveThirtyEight

Are Your Predictive Models like Broken Clocks?

How can you ensure you’ve picked the “right model” for a very big and very complex dataset?-Rocket-Powered Data Science

Startups Aim to Exploit a Deep-Learning Skills Gap

What do you do when every company wants to build a deep-learning network, but the experts are in short supply? Launch a product, of course. Some startups have created computer chips and software libraries that can accelerate algorithm training, all without having to hire an experienced team of deep-learning experts.-MIT Technology Review

Georgia Tech Researchers Demonstrate How the Brain Can Handle So Much Data

Random projection is frequently used in machine learning to make sense of big, diverse data. It turns out this method could be one of the ways that humans learn, too.-Georgia Tech

The current state of machine intelligence 2.0

These days, it feels like every other article in our newsfeeds is touting the potential of machine intelligence. This article cuts through the hype and presents this year’s major accomplishments in two categories—“(1) the emergence of autonomous systems in both the physical and virtual world and (2) startups shifting away from building broad technology platforms to focusing on solving specific business problems.”-O'Reilly
