Machine Learning Articles

Machine learning, deep learning, artificial intelligence... The science of getting machines to perform actions without explicitly programming them to do so can be intimidating for the uninitiated. These machine learning articles aim to unpack the black box for beginners, with introductions to overall concepts and tutorials for training a model of their own.

Bayesian Cyber Risk Quantification With Industry-Specific Models

The machine learning community is obsessed with deep learning on big dense datasets, but problems like cyber insurance with small sparse data require Bayesian methods. - Tower Street

AI Adoption is Being Fueled by an Improved Tool Ecosystem

In 2010, the ratio of AI scientific papers to patents filed was 8:1. In 2016, it was 3:1. We’re in the implementation phase now. - O’Reilly

18 Impressive Applications of Generative Adversarial Networks (GANs)

It feels like GANs pop up everywhere these days. If you’re looking for a fundamental understanding of what GANs can do, this is a great overview. - Machine Learning Mastery

Deepfake Propaganda Is Not a Real Problem

There’s real damage being done by deepfake techniques, but it’s happening in pornography, not politics. - The Verge

Once Again, a Neural Net Tries to Name Cats

Start off your Monday morning with a good chuckle. - Janelle Shane

GANs And Deepfakes Could Revolutionize The Fashion Industry

GANs' impact on fashion goes way deeper than virtual fitting rooms and avatars built to customers' measurements. - Forbes

Machine Learning Product Management: Lessons Learned

Product management for ML projects can be difficult because engineering changes from a deterministic process to a probabilistic one. It requires “an approach that involves learning from data instead of programmatically following a set of human rules.” - Domino Data Lab

Railyard: How We Rapidly Train Machine Learning Models With Kubernetes

Stripe trains hundreds of new models each day, each powered by billions of data points. Running infrastructure at this scale poses a very practical data science and ML problem: how do you give every team the tools they need to train their models without requiring them to operate their own infrastructure? - Stripe

Notes on AI Bias

Bias in AI doesn’t mean just bias against people. Sometimes it's just picking up the wrong signal, period. Case in point: a system built to detect skin cancer was detecting rulers instead. - Benedict Evans

Fooling Automated Surveillance Cameras: Adversarial Patches to Attack Person Detection

Not only could a simple color printout hide someone from an AI video surveillance system for nefarious purposes, it could offer protection to people who don't want to be tracked in everyday life. - arXiv

This YouTube Channel Streams AI-Generated Death Metal 24/7

Dadabots was developed by two music technologists who wanted to prove that a neural network was capable of capturing the subtle stylistic differences between Death Metal, Math Rock, and other lesser-known genres. - Motherboard

One Model to Rule Them All

This post discusses the obsession with finding the best model and emphasizes what should be done instead: Take a step back and see the bigger picture in which the machine learning model is embedded. - bentoML

Discriminating Systems: Gender, Race and Power in AI

“The field of research on bias and fairness needs to go beyond technical debiasing to include a wider social analysis of how AI is used in context. This necessitates including a wider range of disciplinary expertise.” - AI Now Institute

Open Questions about Generative Adversarial Networks

Practical improvements to image synthesis models are happening almost too quickly to keep up with, but there are still several open research problems left to tackle. - Distill

Scaling Uber’s Customer Support Ticket Assistant (COTA) System with Deep Learning

“Our online tests validate that the COTA v2 deep learning system performs significantly better than the COTA v1 system in terms of key metrics, including model performance, ticket handling time, and customer satisfaction.” - Uber Engineering

My Machine Learning Research Jobhunt

One PhD graduate shares their experience trying to find an AI research position in Europe, from the application process through to salary negotiations. - Generalized Error

Active Learner

Supervised machine learning, while powerful, needs labeled data to be effective. This visualization shows how active learning data labeling strategies can improve your models. - Fast Forward Labs

A Framework for Understanding Unintended Consequences of Machine Learning

The concept of biased data is often too broad to be useful. This framework includes 5 ways of categorizing bias: historical, representation, measurement, evaluation, and aggregation. - Cornell University

A Gentle Introduction to Learning Curves for Diagnosing Machine Learning Model Performance

Discover learning curves and how they can be used to diagnose the learning and generalization behavior of machine learning models. - Machine Learning Mastery

Money Machines: An Interview with an Anonymous Algorithmic Trader

An insider explains how algorithms are rewiring finance. - Logic

Game of Thrones Reigns Supreme Among AT&T’s Assets. Here’s How We Used Wikidata’s Entities and Ontology to Find That Out.

Six companies own most U.S. media. Given that each of these companies owns thousands of these types of assets, how do you determine which ones are the most important? - Parse.ly Engineering

Unsolved Research Problems vs. Real-world Threat Models

“I personally think adversarial examples are highly worth studying, and should inspire serious concern. However, most of the justifications for why exactly they’re worrisome strike me as overly literal. I think much of the confusion comes from conflating an unsolved research problem with a real-world threat model.” - Catherine Olsson

Tackling Bias in Machine Learning

This article digs into the hows and whys of the Python package Fair Classifier, which quantifies the fairness of a model and uses an adversarial network to help ensure equity in machine learning models. - Insight

Coconet: the ML model behind today’s Bach Doodle

Last week, Google celebrated J.S. Bach’s 334th birthday with “the first AI-powered Google Doodle.” Here's how their team built a model that takes a user-created melody and harmonizes it in Bach’s style. - Magenta

How I Eat For Free in NYC Using Python, Automation, Artificial Intelligence, and Instagram

That's one way to save money in the Big Apple! Here's how a data engineer created a 100% automated Instagram account to earn free meals at restaurants looking for promotion. - Chris Buetti

What's the difference between data science, machine learning, and artificial intelligence?

Whip this out the next time you tell someone you're a data scientist, and they ask “Does that mean you work on artificial intelligence?” - Variance Explained

Modeling Censored Time-to-Event Data Using Pyro, an Open Source Probabilistic Programming Language

When churn models just weren't cutting it for Uber, they created their own language in Python to properly model the time from a user's first ride to their second. - Uber Engineering

Using Deep Learning to “Read Your Thoughts” — With Keras and EEG

Saying a word in one’s mind, even if not spoken aloud, can result in the firing of the nerves controlling the muscles involved in speech. With some readily available equipment, you can train a model to classify these sub-vocalized words in less than a day. - Justin Alvey

AI Interprets What Rodents Are Saying

With a deep learning-based system for detection and analysis of rodent vocalizations, researchers can better understand their test subjects. And it's adorably named “DeepSqueak.” - Psychology Today

Cocktail Similarity

Thanks to a difference algorithm, you now have the perfect guide for figuring out your minimum-viable at-home bar setup. - Tom MacWright

Interpretable Machine Learning: A Guide for Making Black Box Models Explainable

Make room for this on your reading list. - Christoph Molnar

SC-FEGAN: Face Editing Generative Adversarial Network with User's Sketch and Color

This gives a whole new meaning to “Photoshopped.” - Youngjoo Jo

The Limitations of Deep Learning for Vision and How We Might Fix Them

“Now it is difficult to publish anything that is not neural network related. This is not a good development. We suspect that the field would progress faster if researchers pursued a diversity of approaches and techniques instead of chasing the current vogue.” - The Gradient

Data Versioning

The degrees of freedom in versioning machine learning systems pose a unique challenge. Each broad approach to tackling this problem has pros and cons to keep in mind. - Emily Gorcenski

We Analyzed 16,625 Papers to Figure Out Where AI Is Headed Next

This study of 25 years of artificial-intelligence research suggests that deep learning may soon be on its way out. - MIT Technology Review

Why Are Machine Learning Projects So Hard to Manage?

“I’ve watched lots of companies attempt to deploy machine learning—some succeed wildly and some fail spectacularly. One constant is that machine learning teams have a hard time setting goals and setting expectations. Why is this?” - Lukas Biewald

The Best Defense Against Deepfake AI Might Be . . . Blinking

Researchers can now detect AI-generated fake video with a 95% success rate. Because few images are available online showing people with their eyes closed, there's less training data available for deepfakes to get natural blinking right. - Fast Company

What Can Neural Networks Learn?

It’s tricky to know what neural networks are actually learning as they're trained. This post does a good job breaking down what's going on inside. - Data Science and Robots

POET: Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions Through the Paired Open-Ended Trailblazer

“Sometimes… we do not just want to solve known problems, because unknown problems are also important. Consequently, we are exploring algorithms that continually invent both problems and solutions of increasing complexity and diversity.” - Uber Engineering

lazydata: Scalable Data Dependencies for Python projects

This library might help out when you need data version control for your next machine learning project. - rstojnic

Most Impactful AI Trends of 2018: the Rise of ML Engineering

Was 2018 an important inflection point for the machine learning industry? Check out this roundup of key trends and the impact they may have on ML this year. - Emmanuel Ameisen

Gender and Jobs in Online Image Searches

A really cool example of using machine vision to spot gender bias in Google Image search results. - Pew Research Center

Kernel Density Estimation

Kernel density estimation (KDE) is a useful statistical tool that’s way less scary than it sounds. This interactive shows how KDE lets you create a smooth curve given a set of data. - Matthew Conlen
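
To make the "smooth curve" idea concrete, here is a minimal sketch of a Gaussian KDE using only the Python standard library. It is not taken from the linked interactive: the function name `gaussian_kde`, the sample values, and the bandwidth of 0.5 are all illustrative assumptions.

```python
import math


def gaussian_kde(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each point in `grid`.

    Each sample contributes one bell curve of width `bandwidth`; the
    estimate is the average of those curves.
    """
    # Normalizing constant so the estimated density integrates to 1.
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]


# Hypothetical data: two loose clusters of points (illustrative values only).
samples = [1.0, 1.3, 2.1, 4.0, 4.2, 4.4]

# Evenly spaced evaluation points from -2.0 to 8.0.
step = 0.02
grid = [-2.0 + i * step for i in range(501)]

density = gaussian_kde(samples, grid, bandwidth=0.5)

# Riemann-sum check: the area under a density should be close to 1.
area = sum(density) * step
print(f"area under the curve: {area:.4f}")
```

A smaller bandwidth makes the curve hug the individual points more tightly, while a larger one smooths the two clusters together; choosing that value is the main judgment call when using KDE.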

Concepts in object detection

Naming and locating several objects at once in an image with no prior information about how many objects are supposed to be detected is much harder than identifying a single object. Here’s how to do it using TensorFlow and R. - TensorFlow for R Blog

The Seductive Diversion of ‘Solving’ Bias in Artificial Intelligence

“In accepting the existing narratives about A.I., vast zones of contest and imagination are relinquished. What is achieved is resignation — the normalization of massive data capture, a one-way transfer to technology companies, and the application of automated, predictive solutions to each and every societal problem.” - Medium

AI Art Gallery

Check out this collection of art, music and design using machine learning from a NeurIPS 2018 workshop. - Neural Information Processing Systems

These incredibly realistic fake faces show how algorithms can now mess with us

The latest advance in generative adversarial networks allowed researchers to generate fake images of faces with a previously unknown level of control over elements like age, race, gender, and even freckles. - MIT Technology Review

State of Deep Learning : H2 2018 Review

“The growth rate of machine learning papers has been around 3.5% a month since July — which is around a 50% growth rate annually. This means around 2,200 machine learning papers a month and that we can expect around 30,000 new machine learning papers next year.” - Atlas ML
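
The compounding arithmetic in the quote above can be checked with a short sketch. It assumes a constant 3.5% monthly rate and treats 2,200 papers as the current monthly count; both readings are assumptions, and the variable names are illustrative.

```python
# Assumed inputs, read off the quote above.
monthly_rate = 0.035
papers_this_month = 2200

# Compounding a monthly rate over twelve months gives the annual rate.
annual_growth = (1 + monthly_rate) ** 12 - 1

# Two rough bounds for the next twelve months of output:
# no further growth at all, versus growth continuing at the same monthly rate.
flat_total = papers_this_month * 12
compounding_total = sum(
    papers_this_month * (1 + monthly_rate) ** month for month in range(1, 13)
)

print(f"annualized growth: {annual_growth:.1%}")
print(f"flat total: {flat_total:,}")
print(f"compounding total: {compounding_total:,.0f}")
```

Under these assumptions the annualized rate comes out slightly above 50%, and the quoted figure of around 30,000 falls between the flat and compounding totals.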

Public Attitudes Toward Computer Algorithms

“58% of Americans feel that computer programs will always reflect some level of human bias – although 40% think these programs can be designed in a way that is bias-free.” - Pew Research Center

Beating the State-of-the-art in NLP With HMTL

Learn how Multi-Task Learning—a general method in which a single architecture is trained towards learning several different tasks at the same time—can be applied to natural language processing. - Hugging Face

Is this AI? We drew you a flowchart to work it out

It's a bit hard to read, but if you squint hard enough this flowchart should help you discern if something's truly AI—or just hyped up and mislabeled. - MIT Technology Review

AI adoption advances, but foundational barriers remain

One highlight from this global survey about how AI is used at companies: those working in manufacturing and risk see AI as more valuable than those in other fields like marketing and sales or human resources. - McKinsey & Company

What You Have To Fear From Artificial Intelligence

Vicki Boykis describes this long read best: “Great piece [on] the real, practical concerns of deep learning applications (aka not robots killing us): fake images, text, and soundbytes that we won't know aren't human-generated.” - Current Affairs

Reinforcement Learning with Prediction-Based Rewards

When a reinforcement learning agent was incentivized to be curious and avoid “boredom” while playing Mario, it discovered warp levels, how to defeat bosses, and more. - OpenAI

Deepfake-busting apps can spot even a single pixel out of place

Speaking of AI-generated imagery... it's so easy to use that anyone can make a fake video or image, no matter their motives. Luckily, technology for discerning true images from manipulated creations is catching up. - MIT Technology Review

Generating custom photo-realistic faces using AI

Generating realistic images based on descriptions is much harder than describing an image—for humans and computers. But this new generative model is making that task easier. - Insight

How do you like your ML career?

“Over the last few years ML has lost some of its luster in my mind - the hype around deep learning and ML has added a lot of noise into the system, and for someone who cares about doing good science that's been hard for me.” - r/MachineLearning

Mask R-CNN Benchmark

A fast and modular implementation for Faster R-CNN and Mask R-CNN written entirely in PyTorch 1.0. It's 30% quicker than mmdetection during training. - Facebook Research

How Three French Students Used Borrowed Code to Put the First AI Portrait in Christie’s

The code used to generate this portrait is mostly the work of another artist and programmer. This raises a question about attribution in the open and collaborative AI art community, which is taking its first steps into mainstream attention. - The Wall Street Journal

Deepfake Videos Are Getting Real and That's a Problem

Changing photos used to be tedious and time-consuming. Fast-forward to now: nearly anyone can use deep learning and AI to generate incredibly realistic “fake videos”—President Obama saying something he never said, for instance. - The Wall Street Journal

Training Neural Nets on Larger Batches: Practical Tips for 1-GPU, Multi-GPU & Distributed setups

How can you train your model on large batches when your GPU can’t hold more than a few samples? Let's find out. - Hugging Face

Data: A key requirement for your Machine Learning (ML) product

For all the PMs out there: here are some tips for how to talk about data in your Product Requirement Document for a machine learning product. - The Lever

Artwork Personalization at Netflix

Ever notice how the preview image for the same show or movie on Netflix changes whenever you log back in? Here's a peek into the system that figures out which piece of artwork is the best for convincing a particular member why that title is “for them.” - The Netflix Tech Blog

A Review of the Neural History of Natural Language Processing

It's kind of crazy that neural network NLP is now old enough to have its own historical timeline. This post condenses about 15 years of work into eight milestones that impacted how these technologies are used today. - aylien

Introduction to Machine Learning for Coders: Launch

This new course uses modern tools and libraries, including Python, pandas, scikit-learn, and PyTorch. Unlike many educational materials in the field, this approach is “code first” rather than “math first.” - fast.ai

Why building your own Deep Learning Computer is 10x cheaper than AWS

Avoid hefty cloud GPU costs by building a computer from scratch. - The Mission

Tabular Data in Scikit-Learn and Dask-ML

Take advantage of Scikit-Learn's latest improvements for working with tabular data. - datas-frame

Anatomy of an AI System

“The stack that is required to interact with an Amazon Echo goes well beyond the multi-layered ‘technical stack’ of data modeling, hardware, servers and networks. The full stack reaches much further into capital, labor and nature, and demands an enormous amount of each. The true costs of these systems – social, environmental, economic, and political – remain hidden and may stay that way for some time.” - Anatomy of an AI System

Help! I can’t reproduce a machine learning project!

Reproducibility breaks down in three main places: the code, the data and the environment. This guide should help you narrow down where your reproducibility problems are, so you can focus on fixing them. - No Free Hunch

Retracing your steps in Machine Learning: Versioning

New prediction systems are fragile things. Change one thing, and the accuracy of the model can drop dramatically, leading to a long troubleshooting process to find the root cause. Skip the headache with this guide to building a robust versioning system for your ML projects. - The Lever

Human translators are still on top—for now

Machine translation works well for sentences. For full documents? Not so much. - MIT Technology Review

No Machine Learning in your product? Start here

Just how much does a product owner need to know about machine learning? A Google PM shares his experience integrating machine learning into an existing product: Google Forms. - The Lever

VerbiAge: Using NLP to help writers craft age-specific writing

This app for tailoring a book’s description for a target K-12 age is a nice example of how machine learning can aid in creative tasks. - Insight

What HBR Gets Wrong About Algorithms and Bias

This post injects some much-needed nuance into the biased algorithms discussion: humans vs machines is not a helpful framing and most critics of unjust bias aren’t anti-algorithm. - fast.ai

Learning Meaning and Semantics in Natural Language Processing

A few weeks ago, data science Twitter spun out a fascinating mega-thread on NLP meaning and semantics. Since Twitter threads can be tricky to parse after-the-fact, this summary, interactive tweet tree, and commented map provide three entry points into the discussion. - Hugging Face

ACL 2018 Highlights: Understanding Representations and Evaluation in More Challenging Settings

This post digs into two themes of the Association for Computational Linguistics 2018 conference: gaining a better understanding of what NLP models capture and exposing them to more challenging settings. - Sebastian Ruder

Differentiable Image Parameterizations

This powerful, under-explored tool for neural network visualizations and art produces vibrant images that look like they came straight out of Annihilation. - Distill

Machine Learning Glossary

Find yourself dragged under by wave after wave of machine learning jargon? Part of Google's Machine Learning Crash Course, this glossary provides plain-English descriptions of the terms you've heard thrown around by ML experts, without sacrificing accuracy. - Google

Reinforcement learning’s foundational flaw

“Does it really make sense to start learning a new skill based only on its reward signal, with neither prior experience nor higher-level instruction?” - The Gradient

Feature-wise transformations

Many real-world problems require integrating multiple sources of information. Feature-wise transformations offer a way to effectively capture and leverage the relationship of various sources, across a wide range of problem settings like image recognition, reinforcement learning, and style transfer. - Distill

What do machine learning practitioners actually do?

“Any solution to the shortage of machine learning expertise requires answering this question: whether it’s so we know what skills to teach, what tools to build, or what processes to automate.” - fast.ai

AdamW and Super-convergence is now the fastest way to train neural nets

It’s time to give Adam another go. - fast.ai

Papers with Code

A searchable site that links machine learning papers on arXiv with code on GitHub. - Papers with Code

Model Tuning and the Bias-Variance Tradeoff

This visual intro to machine learning covers how errors can arise due to assumptions that are overly simple (bias) or overly complex (variance). - R2D3

Gender Shades

This evaluation compares how well IBM, Microsoft, and Face++ products are able to classify gender across skin types. All companies perform better on lighter subjects as a whole than on darker subjects as a whole with an 11.8% - 19.2% difference in error rates, and all companies perform worst on darker females. - Joy Buolamwini

Why the Future of Machine Learning is Tiny

“I’m convinced that machine learning can run on tiny, low-power chips, and that this combination will solve a massive number of problems we have no solutions for right now.” - Pete Warden

Machine learning predicts World Cup winner

Researchers have predicted the outcome after simulating the entire soccer tournament 100,000 times. (Good news awaits if you’re pulling for Brazil, Germany, or Spain!) - MIT Technology Review

How The New York Times Uses Software To Recognize Members of Congress

The most interesting part of this project isn't the models used (Amazon's Rekognition API), but the practical considerations the team faced when introducing the “Who the Hill” app to the real world: poor lighting for photos in the Capitol halls, bad cell phone reception, and celebrity doppelgängers. - Times Open

A Developer’s Guide to Building AI Applications

O'Reilly and Microsoft collaborated on a free e-book that walks you through the process of building intelligent cloud-based bots (with relevant code samples available on GitHub). - Microsoft Machine Learning Blog

Why you need to improve your training data, and how to do it

When you use deep learning as part of an application, getting better training data is vastly more effective than making model adjustments. - Pete Warden

Launching Cutting Edge Deep Learning for Coders: 2018 edition

Part 2 of fast.ai’s free deep learning course is here! All you need is high school math and 1 year of coding experience. - fast.ai

Smart Compose: Using Neural Networks to Help Write Emails

The engineers behind Smart Compose—a Gmail feature that offers sentence completion suggestions as you type—dig into how they tackled the challenges of fairness and privacy, latency, and scale. - Google AI Blog

Feature Engineering and Selection: A Practical Approach for Predictive Models

This book on predictive modeling is about 60% done and the authors are looking for feedback. The section on Engineering Numeric Predictors alone is fantastic. - Max Kuhn and Kjell Johnson

Qualitative before Quantitative: How Qualitative Methods Support Better Data Science

“Have you ever been embarrassed by the first iteration of one of your machine learning projects, where you didn’t include obvious and important features? In the practical hustle and bustle of trying to build models, we can often forget about the observation step in the scientific method and jump straight to hypothesis testing.” - Indeed Data Science

Picking Trending Topics and Celebrities Using Machine Learning

The machine learning engineers at Conde Nast applied their expertise to help Vanity Fair’s writers and editors better craft stories that have a broad, meaningful impact. - Conde Nast Technology

Get Started with Eager Execution in TensorFlow

The folks at TensorFlow are putting their tutorials directly into Google Colab notebooks (which require zero setup to run!). If you've ever wanted to learn more about machine learning, the time is now, especially since a recent survey suggests that most data scientists lack advanced machine learning expertise. - TensorFlow

Artist + AI

Here's a new Twitter account for you to follow. This artist combines her hand-drawn work with generative adversarial networks (GANs) to create something completely new. - Helena Sarin

Demystifying Docker for Data Scientists – A Docker Tutorial for Your Deep Learning Projects

Is Docker really the best thing since sliced bread? Find out in this tutorial, which covers the basics of how to interact with Docker containers and create custom Docker images for your AI workloads. - Microsoft's Machine Learning Blog

The Building Blocks of Interpretability

This article really gets you inside a neural network's “head” by explaining the thought process as it decides between two labels for an image, like a bowtie and a pair of sunglasses. - Distill

The Malicious Use of Artificial Intelligence

This 101-page report “surveys the landscape of potential security threats from malicious uses of artificial intelligence technologies, and proposes ways to better forecast, prevent, and mitigate these threats.” Divvy it out across your commutes and moments of downtime this week. - maliciousaireport.com

Descriptive mAchine Learning EXplanations (DALEX)

Unpack some black boxes with this handy cheatsheet for understanding how complex ML models work. - Przemyslaw Biecek

Manifesto for Data Practices

Give this a read, whether you sign it or not. - DataPractices.org

So, How Many ML Models You Have NOT Built?

“What will put us out of our job is Machine Learning Overkill. I have seen implementation of Machine Learning algorithms to very frivolous problems and worse still the companies have invested heavily into the idea. It is a ticking time bomb. The moment the companies realize that the ROI is negative, they will shun the Data Science practice altogether.” - Towards Data Science

THREAD: How computer vision and natural-language processing systems reflect societal stereotypes

A rabbit hole worthy of your time: various types of machine learning bias as tracked by academic papers. - Arvind Narayanan

Exploring Recommendation Systems

In practice, recommenders don’t always work as well as we’d like them to. This post sets out to discover why. - Fast Forward Labs

Turning Design Mockups Into Code With Deep Learning

Ever wish you could automate the front-end engineering process? Here’s how to teach a neural network to code a basic HTML and CSS website from a design mockup. - FloydHub

Learning Curves for Machine Learning

How do you diagnose bias and variance? And what actions should you take once you’ve detected these errors? - Dataquest

Machine Learning: The High-Interest Credit Card of Technical Debt

There’s no such thing as a free machine learning project. Avoid or refactor these risk factors and design patterns to keep technical debt from piling up. - Research at Google

2017: The year AI beat us at all our own games

“Over the past 12 months AI crossed a series of new thresholds, finally beating human players in a variety of different games, from the ancient game of Go to the dynamic and interactive card game, Texas Hold-Em Poker.” - New Atlas

Deep Learning Achievements Over the Past Year

Carve out some time in your holiday schedule to explore 2017's most exciting developments in text, voice, and computer vision technologies. - Stats & Bots

How many images do you need to train a neural network?

The technically correct answer is: “It depends.” The ballpark answer is: “1,000 representative images for each class.” (With some caveats of course.) - Pete Warden

The U.S. Leads in Artificial Intelligence, but for How Long?

Government policies such as the tax bill, reduced funding, and tightening of rules on immigration for international researchers threaten the U.S.’s advantage in AI. - MIT Technology Review

NIPS 2017 — Highlights

If you didn’t attend the conference on Neural Information Processing Systems last week, never fear! Catch up on the latest in AI with these day-by-day summaries. - Insight Data

[VIDEO] Livecoding Madness: Let’s Build a Deep Learning Library

This is interesting on two levels: “how to build a deep learning library” and “how someone who’s not me writes Python” (in this case, the answer is: incredibly fast). - Joel Grus

Innovating Faster on Personalization Algorithms at Netflix Using Interleaving

“The interleaving approach allows us to quickly prune down the initial set of ranking algorithms to the most promising candidates, enabling us to conduct experiments [at] a rate much faster than traditional A/B testing to identify winning ideas.” - Netflix Technology Blog

Improving Palliative Care with Deep Learning

80% of Americans prefer to spend their final days in their home, but only 20% actually do. This 18-layer deep neural network identifies hospitalized patients with a high risk of death in the next 3-12 months, so they can get access to palliative care sooner. - Stanford ML Group

Fairness Measures

Awareness of the bias of algorithms is important, but here’s a way to actually do something about it. Run your dataset through this Python package and you’ll get back a measure that quantifies discrimination within that dataset. - Fairness Measures

The era of easily faked, AI-generated photos is quickly emerging

Nvidia’s researchers trained algorithms on 30,000 images of celebrities, and it’s nearly impossible to tell the generated images from the real ones. - Quartz

Scalable Machine Learning (Part 1)

What do you do when your training dataset fits in memory, but the dataset you're making predictions on doesn't? This post identifies where the usual in-memory analytics workflow built on pandas and scikit-learn breaks down and offers some solutions for scaling out to larger problems. - Tom Augspurger

Can Neural Nets Detect Sexual Orientation? A Data Scientist’s Perspective

Dig into the data behind Stanford's controversial paper Deep Neural Networks Can Detect Sexual Orientation From Faces. - fast.ai

My Neural Network isn't working! What should I do?

11 mistakes you may make while implementing a neural network—and how to fix them. - Daniel Holden

Train, Score, Repeat, Watch Out! Zillow's Andrew Martin on modeling pitfalls in a dynamic world.

One of Zillow's data scientists addresses the challenges that don’t crop up in standard textbook problems or most ML competitions: feedback loops, dynamic datasets, and temporal consistency. A great read for Kagglers and non-Kagglers alike. - No Free Hunch

Switching to a Probabilistic Model for Venue Search in Foursquare

How Foursquare’s engineering team improved the accuracy and user experience of their location intelligence by switching from a search ranking algorithm to regression trees and probabilities. - Foursquare Engineering

BuzzFeed News Trained A Computer To Search For Hidden Spy Planes. This Is What We Found.

Learn how BuzzFeed trained a random forest algorithm to spot planes flown by the FBI and DHS. - BuzzFeed

Technical Debt in Machine Learning

What do feedback loops, correction cascades, and hobo-features have in common? They’re all machine learning anti-patterns that can slowly creep into your infrastructure and create a ticking time bomb. - Towards Data Science

Inside Facebook’s AI Workshop

When Joaquin Candela first started at Facebook, he worked on an ad-targeting algorithm with a handful of engineers. Five years later, he runs the Applied Machine Learning team, which comprises hundreds of employees running thousands of experiments a day. Here’s how he scaled up Facebook’s AI factory at breakneck speed. - Harvard Business Review

Using Machine Learning to Predict Value of Homes On Airbnb

How Airbnb used internal and open-source tools (like Python!) to lower the overall development costs of customer lifetime value (LTV) modeling. Code examples abound. - Airbnb Engineering and Data Science

Improving the Realism of Synthetic Images

Producing a large, diverse, and accurate training set for machine learning models is a pricey endeavor. Apple provides a rare behind-the-scenes look at how they cut costs and improved their models by making simulated images look more realistic. - Apple Machine Learning Journal

Human-Centered Machine Learning

For UX folks: A 7-step guide to stay focused on human needs when designing with machine learning. - Google Design

Visualizing High Dimensional Data In Augmented Reality

When you’re trying to understand the relationships in a really big dataset (three-million-grocery-orders big), a 2D scatterplot might not cut it. This immersive 3D visualization technique offers a way to make sense of data with multiple attributes and improve machine learning features and models. - Inside Machine Learning

How HBO’s Silicon Valley built “Not Hotdog” with mobile TensorFlow, Keras & React Native

The use-case may be farcical, but the deep learning and edge computing behind it are very real. - Hacker Noon

Predicting the Success of a Reddit Submission with Deep Learning and Keras

It all comes down to two things: the time of day and a catchy title. - Max Woolf

Vertical AI Startups: Solving Industry-specific Problems by Combining AI and Subject Matter Expertise

“While most of the machine learning talent works in big tech companies, massive and timely problems are lurking in every major industry outside tech.” - Bradford Cross

J.P. Morgan’s massive guide to machine learning and big data jobs in finance

Get the key takeaways from this 280-page report, including essential data analysis packages, hiring tips, and which machine learning techniques to apply to which problems. - efinancialcareers

“Many enterprise ‘AI products’ and ‘machine intelligence’ products built today have limited appeal or impact”

One investor’s self-described “unpopular” opinion. - Sarah Guo

Is Your Organization Ready for ML?

Don’t make this mistake: “[M]any organizations rush to hire ML experts without laying the proper foundation to ensure their success, including creating proper database architecture, building out essential data science technology, establishing data governance, and instilling data-driven decision-making throughout the organization.” - RE•WORK

#machinelearningflashcards

Save this hashtag for the moments when you need to jog your memory on some basic concepts. - Chris Albon

Machine Learning for Product Managers

A brilliant, non-technical read for anyone who designs, supports, manages, or plans for products that use machine learning. - Hacker Noon

Distill: An Interactive, Visual Journal for Machine Learning Research

This new online publication is bringing academic journals into the 21st century: “A Distill article… isn’t just a paper. It’s an interactive medium that lets users – 'readers' is no longer sufficient – work directly with machine learning models.” - Y Combinator

Tips & Tricks for Feature Engineering / Applied Machine Learning

One commenter put it best: 'Probably the best feature engineering slides I have found [on] the internet.' Need we say more? - HJ van Veen

Learning about Machine Learning with an Earthquake Example

How well can we predict whether or not someone is prepared for an earthquake? - Simply Statistics

How Fitbit’s data science team scales machine learning

Workout regimens need to be tailored to each individual. Directional correctness isn’t enough. Fitbit’s head of data science shares how his team builds a model for every user to increase motivation and prevent injuries. - Mixpanel

Fake News Challenge

This grassroots effort is inviting teams to harness AI technologies to help human fact checkers identify hoaxes and deliberate misinformation in news stories. The top three teams get a cash prize, so grab a couple of friends and check out the training dataset. - Fake News Challenge

Machine Learning Videos

More of a visual learner? Here’s a repository of recorded talks at machine learning conferences, workshops, seminars, and more. - Dustin Tran

What is artificial intelligence? A three part definition

“As soon as it works, no one calls it AI anymore.” - Simply Statistics

Poesy

You could be a poet, and not know it. Feed the works of your favorite author through this new Python library to generate as many lines of verse as you want. - Anthony Federico

What I Learned Implementing a Classifier from Scratch in Python

With libraries like scikit-learn, it’s easy to run an algorithm on some data and automagically get an answer—without understanding exactly how you arrived there. Prepare to unpack the black box. - Jean-Nicholas Hould

What’s the state of the job market in data science and machine learning?

“Th[e] proliferation of courses, resources, books and startups would hint that machine learning is becoming more and more accessible to the average programmer and that the market is on track to getting saturated quickly. Is this the current trend?” - Hacker News

20 Weird & Wonderful Datasets for Machine Learning

Getting your hands on a robust dataset is the hardest part of machine learning. Finding interesting datasets is tougher still. From UFO sightings to beautiful Flickr photos, you’re sure to find something to train your model. - Oliver Cameron

Deep-Fried Data

Opening your data can lead to unpredictable benefits, but requires being open to unexpected uses of your data. - Idle Words

Deep Learning Isn’t a Dangerous Magic Genie. It’s Just Math

This essay is a godsend for those of us who have trouble understanding or explaining what exactly deep learning is. - WIRED

Boosting Sales With Machine Learning

One developer shares how his team used natural language processing and machine learning in Python to pre-qualify sales leads so reps don’t have to spend hours doing it manually. - Xeneta

Hybrid Intelligence: How Artificial Assistants Work

When humans and machines work together, they accomplish a lot more than either could on their own. This is known as hybrid intelligence—a pretty intimidating term for those unfamiliar with machine learning. Here’s a breakdown. - Clare Corthell

The real prerequisite for machine learning isn’t math, it’s data analysis

Machine learning amateurs, take heart. Proficiency with high-level math may be essential for machine learning theory. But with out-of-the-box tools like R’s gmodels package or Python’s scikit-learn library, you don’t need to know linear algebra or calculus to build a successful predictive model. You do, however, need to know your way around a dataset. - Sharp Sight Labs

How Kalman Filters Work, Part 1

This article unpacks different filtering algorithms in an incredibly intuitive way. It’s a long read, but you’ll come away having learned a ton (did you know that NASA used Kalman filters to help Apollo spacecraft navigate to the moon?). - An Uncommon Lab
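
For a feel of the core idea, here is a minimal one-dimensional Kalman filter that estimates a roughly constant value from noisy readings. It is a sketch under simplifying assumptions (scalar state, no control input), not code from the article; the function name and sample readings are made up.

```python
def kalman_1d(measurements, meas_var, process_var, x0=0.0, p0=1.0):
    """Return the filtered estimate after each measurement.

    meas_var:    variance of the measurement noise
    process_var: variance added to the state uncertainty at each step
    x0, p0:      initial estimate and its variance
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is assumed constant, so only uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + meas_var)
        x = x + k * (z - x)
        p = (1 - k) * p
        estimates.append(x)
    return estimates


# Hypothetical noisy readings of a quantity whose true value is about 5.
readings = [5.1, 4.9, 5.2, 4.8, 5.0]
estimates = kalman_1d(readings, meas_var=0.1, process_var=1e-4)
```

The gain `k` is the interesting part: when the filter is unsure of its own estimate it leans on the new measurement, and as its confidence grows it leans on the prediction.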

Explained Visually

This website is an incredible collection of interactive visualizations aimed at making tricky concepts like Markov chains and regression easy to understand. Schedule a few hours to explore this one—you’re gonna need them. - Explained Visually

Lift analysis - A data scientist’s secret weapon

Learn how to spot flaws in machine learning models with lift analysis (and why you should add it to your list of evaluation metrics). - Andy Goldschmidt
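
A minimal sketch of the calculation, assuming binary labels and one model score per example: rank the examples by score, split them into equal-sized buckets, and divide each bucket's positive rate by the overall positive rate. The function name and toy data are hypothetical.

```python
def lift_by_bucket(scores, labels, n_buckets=10):
    """Return the lift of each bucket, from highest-scored to lowest-scored."""
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal length")
    base_rate = sum(labels) / len(labels)
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no positive labels")
    # Labels ordered by descending model score.
    ranked = [label for _, label in
              sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)]
    n = len(ranked)
    lifts = []
    for b in range(n_buckets):
        bucket = ranked[b * n // n_buckets:(b + 1) * n // n_buckets]
        if bucket:
            lifts.append((sum(bucket) / len(bucket)) / base_rate)
    return lifts


# Hypothetical scores and outcomes for ten examples, split into five buckets.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
lifts = lift_by_bucket(scores, labels, n_buckets=5)
```

A useful model shows lift well above 1 in the top buckets and a steady decline after that; a flat or erratic curve is a warning sign.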

Here's How We Prevent The Next Racist Chatbot

Tay.ai is the consequence of poor training. - Popular Science

Why Microsoft Accidentally Unleashed a Neo-Nazi Sexbot

It’s not surprising that Microsoft’s chatbot spewed racist invective, but here’s how it could have been avoided. - MIT Technology Review

Microsoft’s Tay is an Example of Bad Design

Or Why Interaction Design Matters, and so does QA-ing. - Caroline Sinders

We Now Have Algorithms To Predict Police Misconduct

You’ve probably heard of predictive policing, but what about predictive policing for the police? One police department teamed up with researchers to test an algorithm that detects troublesome behavior of officers early on. - FiveThirtyEight

Are Your Predictive Models like Broken Clocks?

How can you ensure you’ve picked the “right model” for a very big and very complex dataset? - Rocket-Powered Data Science

Startups Aim to Exploit a Deep-Learning Skills Gap

What do you do when every company wants to build a deep-learning network, but the experts are in short supply? Launch a product, of course. Some startups have created computer chips and software libraries that can accelerate algorithm training, all without having to hire an experienced team of deep-learning experts. - MIT Technology Review

Georgia Tech Researchers Demonstrate How the Brain Can Handle So Much Data

Random projection is frequently used in machine learning to make sense of big, diverse data. It turns out this method could be one of the ways that humans learn, too. - Georgia Tech
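
For readers unfamiliar with the technique, the sketch below shows a Gaussian random projection using only the standard library: each point is multiplied by a random matrix that maps it to fewer dimensions while roughly preserving distances between points. The function name, dimensions, and data are illustrative assumptions, not taken from the research.

```python
import math
import random


def random_projection(points, target_dim, seed=0):
    """Project each point onto target_dim random Gaussian directions."""
    rng = random.Random(seed)
    source_dim = len(points[0])
    # Scaling by 1/sqrt(target_dim) keeps expected squared distances unchanged.
    scale = 1.0 / math.sqrt(target_dim)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(source_dim)]
              for _ in range(target_dim)]
    return [[sum(w * x for w, x in zip(row, point)) for row in matrix]
            for point in points]


# Two hypothetical 500-dimensional points reduced to 200 dimensions.
data_rng = random.Random(1)
a = [data_rng.random() for _ in range(500)]
b = [data_rng.random() for _ in range(500)]
proj_a, proj_b = random_projection([a, b], target_dim=200)
distance_ratio = math.dist(proj_a, proj_b) / math.dist(a, b)
```

The ratio of projected to original distance should land close to 1, which is what makes the method useful for compressing high-dimensional data.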

The current state of machine intelligence 2.0

These days, it feels like every other article in our newsfeeds is touting the potential of machine intelligence. This article cuts through the hype and presents this year’s major accomplishments in two categories—“(1) the emergence of autonomous systems in both the physical and virtual world and (2) startups shifting away from building broad technology platforms to focusing on solving specific business problems.” - O'Reilly