Machine Learning Articles
Machine learning, deep learning, artificial intelligence... The science of getting machines to perform actions without explicitly programming them to do so can be intimidating for the uninitiated. These machine learning articles aim to unpack the black box for beginners, with introductions to overall concepts and tutorials for training a model of their own.
Thoughts on ML Engineering After a Year of My PhD
“People keep talking about how ML engineering (MLE) is a subset of software engineering or should be treated as such. But over the last 15 months of graduate school, I’ve been thinking about MLE through the lens of data engineering.”-Shreya Shankar
Moving Beyond Mimicry in Artificial Intelligence
What makes pre-trained AI models so impressive—and potentially harmful.-Nautilus
Physiognomic Artificial Intelligence
This paper covers how computer vision is a central vector for physiognomic AI technologies and unpacks how computer vision reanimates physiognomy in conception, form, and practice and the dangers this trend presents for civil liberties.-Fordham Intellectual Property, Media, & Entertainment Law Journal
Supervised Machine Learning for Text Analysis in R
This book is designed to provide practical guidance and directly applicable knowledge for data scientists and analysts who want to integrate text into their modeling pipelines.-Emil Hvitfeldt and Julia Silge
Davis Summarizes Papers
Every week, Davis reads all the machine learning arXiv submissions and summarizes 10 to 20 of his favorites.-Davis Blalock
Creating Confidence Intervals for Machine Learning Classifiers
Confidence intervals are no silver bullet, but at the very least, they can offer an additional glimpse into the uncertainty of the reported accuracy and performance of a model.-Sebastian Raschka
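As a rough illustration of the idea, here is a minimal sketch of one common textbook approach, the normal-approximation interval for accuracy measured on a held-out test set. The function name and the example counts (850 correct out of 1,000) are hypothetical, and the linked article may use different or additional methods.

```python
import math


def accuracy_confidence_interval(correct, total, z=1.96):
    """Normal-approximation interval for a classifier's test accuracy.

    z=1.96 corresponds to a ~95% confidence level. This is an
    illustrative helper, not code taken from the linked article.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    acc = correct / total
    # Standard error of a binomial proportion.
    half_width = z * math.sqrt(acc * (1.0 - acc) / total)
    # Clip to the valid accuracy range [0, 1].
    return max(0.0, acc - half_width), min(1.0, acc + half_width)


# Hypothetical example: 850 correct predictions on 1,000 test examples.
lower, upper = accuracy_confidence_interval(correct=850, total=1000)
print(f"accuracy = 0.850, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

The approximation is least reliable for small test sets or accuracies near 0 or 1, which is one reason to treat the interval as a rough guide rather than a guarantee.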
All Roads Lead to Rome: The Machine Learning Job Market in 2022
In the next decade we may see software companies adopt an “Artificial General Intelligence strategy” as a means to make their software more adaptive and generally useful.-Eric Jang
Bayesian Rock Climbing Rankings
Spoiler: this model built on an outdoor climbing dataset ends up being wrong. But the journey to get there is worth it.-Ethan Rosenthal
Artificial Intelligence Is Creating a New Colonial World Order
“The more users a company can acquire for its products, the more subjects it can have for its algorithms, and the more resources—data—it can harvest from their activities, their movements, and even their bodies.”-MIT Technology Review
Machine Learning Has a Validity Problem
“One of the central tenets of machine learning warns the more times you run experiments with the same test set, the more you overfit to that test set. This conventional wisdom is mostly wrong and prevents machine learning from reconciling its inductive nihilism with the rest of the empirical sciences.”-arg min blog
Redesigning Etsy’s Machine Learning Platform
In 2017, Etsy built in-house solutions for their small data science team. In 2020, with a scaled-up team, it was time to cut the cord and make use of industry-standard tooling and managed solutions such as Google Cloud.-Code as Craft
Startup Opportunities in Machine Learning Infrastructure
From the perspective of an investor and previous early-stage operator in the space.-Leigh Marie Braswell
ML and NLP Research Highlights of 2021
Dig into the 15 most interesting papers and research advancements of last year.-Sebastian Ruder
My Machine Learning Process (Mistakes Included)
Many of the tutorial blog posts out there show perfect, unblemished code. This post shows the actual process of creating a model, warts and all.-David Neuzerling
How to Keep Learning About Machine Learning
There’s lots of great advice in here about learning about anything, really: do a personal project that stretches you, adopt a beginner’s mind, and find mentors who are a few steps ahead.-Eugene Yan
Real-time Machine Learning: Challenges and Solutions
“In the last year, I’ve talked to ~30 companies in different industries about their challenges with real-time machine learning. This post outlines the solutions for (1) online prediction and (2) continual learning, with step-by-step use cases, considerations, and technologies required for each level.”-Chip Huyen
How to Read Research Papers: A Pragmatic Approach for ML Practitioners
Is it necessary for data scientists or machine-learning experts to read research papers? Yes. But don’t worry if you lack a formal academic background. This hands-on tutorial will make the endeavor far less intimidating.-NVIDIA Developer Blog
The Steep Cost of Capture
Tech firms are startlingly well positioned to shape what we do—and do not—know about AI and the business behind it, at the same time that their AI products are working to shape our lives and institutions.-Interactions
Transformers from Scratch
If you've always wondered how transformers work but know nothing about machine learning, here's a peek behind the curtain.-Brandon Rohrer
Why Machine Learning Hates Vegetables
Also inspired by the Zillow discussion, this post details a bad use case for machine learning: automated dietary advice.-Emily Riederer
How Wadhwani AI Uses PyTorch to Empower Cotton Farmers
By using PyTorch, Wadhwani AI researchers have created a model that accurately predicts the location of pests within cotton crops.-PyTorch
AI Research: The Unreasonably Narrow Path and How Not to Be Miserable
Apparently, there are two paths for AI research: industry and academia. And both are hiring the same kind of people, with the same rigid rubric.-Rosanne Liu
Reaching MLE (Machine Learning Enlightenment)
“This is the job, writing and gluing together the code that makes drastically different systems speak to each other in data-oriented language.”-Vicki Boykis
Red Hot: The 2021 Machine Learning, AI and Data (MAD) Landscape
Every company is becoming not just a software company, but also a data company.-Matt Turck
Participatory Data Stewardship
This framework rejects practices of data collection, storage, sharing, and use in ways that are opaque or seek to manipulate people, in favor of practices that empower people to help inform, shape, and govern their own data.-Ada Lovelace Institute
Bad Labels
“The issue here isn’t just that we might have bad labels in our training set, the issue is that it appears in the validation set. If a machine learning model can become state of the art by squeezing another 0.5% out of a validation set one has to wonder. Are we really making a better model?”-Vincent D. Warmerdam
Challenges and Opportunities in NLP Benchmarking
Natural language processing models have become so powerful over the last few years that we need new benchmarks for measuring their performance.-Sebastian Ruder
Introduction to Causal Inference from a Machine Learning Perspective
Here’s Sean Taylor on why causal inference is important for working data scientists: “Typically in causal inference, researchers try to estimate some quantity of interest one time for a publication. In industry, we must build systems to reliably estimate quantities, at scale, over time, for a variety of contexts.”-Brady Neal
Mitigating Dataset Harms Requires Stewardship: Lessons From 1000 Papers
Efforts to implement higher ethical standards and transparency in the machine learning dataset creation process can be more effective if we understand how datasets are used in practice in the research community.-arXiv.org
Is GitHub Copilot a Blessing, or a Curse?
To see real improvements in program synthesis, we’ll need to go beyond just language models, to a more holistic solution that incorporates best practices around human-computer interaction, software engineering, testing, and many other disciplines.-fast.ai
Homemade Machine Learning
This repository of popular machine learning algorithms is implemented in Python and explains the math behind each algorithm. It’s a great place to get started with machine learning.-Oleksii Trekhleb
Machine Learning Cohorts
What is a "data scientist" or "machine learning engineer," really? This analysis uses tools, libraries, and frameworks to cluster engineers into cohort groups, which might be more indicative of their day-to-day work than overloaded job titles.-Paige Bailey
Data Capitalism
“Data capitalism, like capitalism itself, reinforces dynamics of power and profits. More power creates more profits, and more profits creates more power. Inequality ensures companies get richer and more influential, while everyday people wield less and less power.”-Data for Black Lives
Introducing mltrace
mltrace is designed for collaborative teams working on production machine learning pipelines. It makes it easy to follow a prediction or model’s output back to its most upstream, raw data file.-Shreya Shankar
Māori Are Trying to Save Their Language From Big Tech
With just its first 320 hours of Māori language data, Te Hiku, a small non-profit radio station in New Zealand, was able to build a speech-to-text engine with an initial word error rate of 14 percent. Developing language tools themselves is a way for Māori to decolonize the sound of their language.-Wired UK
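For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the engine's output, divided by the number of words in the reference. The sketch below assumes simple whitespace tokenization; the function name and sample sentences are illustrative only and are not taken from Te Hiku's system.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by reference length.

    Illustrative helper: real evaluations usually normalize case and
    punctuation before scoring, which this sketch does not do.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution and one deletion over four reference words.
print(word_error_rate("the cat sat down", "the dog sat"))  # 0.5
```

Under this definition, a 14 percent word error rate means roughly 14 word-level edits are needed per 100 reference words.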
The Rise of HuggingFace
There’s a lot machine learning startups can learn from HuggingFace about community building.-Breaking the Stagnation
Paper Notes by Vitaly Kurin
Every weekday, this Oxford student reads a machine learning paper, takes notes, and shares them. And now you get to benefit from their studious habits.-Vitaly Kurin
The Giant Leaps In Language Technology — and Who’s Left Behind
What happens when a language is omitted from the digital landscape? And what can be gained when technology acts as a bridge instead of a barrier?-TED
Finding Structure in Users’ Evolving Listening Preferences
This dynamic model of 100K Spotify users shows how music tastes change over time.-Spotify Research
A Recipe for Training Neural Networks
“Suffering is a perfectly natural part of getting a neural network to work well, but it can be mitigated by being thorough, defensive, paranoid, and obsessed with visualizations of basically every possible thing. The qualities that in my experience correlate most strongly to success in deep learning are patience and attention to detail.”-Andrej Karpathy
Moving Beyond “Algorithmic Bias Is a Data Problem”
The model is a problem, too. Recognizing that allows for new ways to reduce harm that are easier than comprehensive data collection.-Patterns
Reducing Toxicity in Language Models
Many large pre-trained models unavoidably acquire certain toxic behavior and biases from the online data they’re trained on. Here are some ways to limit unsafe and harmful content in the finished product.-Lil’Log
Who Is Making Sure the A.I. Machines Aren’t Racist?
When Google forced out two well-known artificial intelligence experts, a long-simmering research controversy burst into the open.-The New York Times
Artificial Intelligence and Inclusion: Formerly Gang-Involved Youth as Domain Experts for Analyzing Unstructured Twitter Data
Folks who train algorithms, take note: this mixture of social work and data science is a shining example of how to create AI with a positive social impact.-SAGE Journals
Introduction to Machine Learning Reliability Engineering
Like site reliability engineers (SREs), machine learning reliability engineers (MLREs) are responsible for reducing toil, keeping costs within budget, and ensuring smooth releases, but with more data savvy.-TestDriven.io
Fairness for Unobserved Characteristics: Insights from Technological Impacts on Queer Communities
“Advances in algorithmic fairness have largely omitted sexual orientation and gender identity. We explore queer concerns in privacy, censorship, language, online safety, health, and employment to study the positive and negative effects of artificial intelligence on queer communities.”-arXiv.org
AI Incident Database
This repository of problems caused by AI is intended to help researchers and developers prevent similar harms from happening again.-AI Incident Database
My Friend Radicalized. This Made Me Rethink How I Build AI
“As I watched a mob batter down the windows of the Capitol, inspired by online echo chambers in tweets and newsfeeds, I thought of my work in machine learning, thought of that drone strike, and wondered ‘did I unwittingly help create this?’”-Jaan Altosaar
Why You Should Do NLP Beyond English
7000+ languages are spoken around the world but NLP research has mostly focused on English. It’s time to change that.-Sebastian Ruder
Baking with Machine Learning
An ML-powered tool for understanding the science behind what differentiates a cake from a bread or a cookie.-Sara Robinson
Corporate Reporting in the Era of Artificial Intelligence
“Companies have long seen annual reports and other corporate disclosures as opportunities to portray their business health in a positive light. Increasingly, the audience for these disclosures is not just humans, but also machine readers that process the information as an input to investment recommendations.”-National Bureau of Economic Research
The Underlying Values of Machine Learning Research
An analysis of the 68 most highly cited, influential ML papers of the last few years finds that the majority don’t mention societal need or negative consequences. Instead, these papers assess themselves according to a set of values internal to the field.-Resistance AI Workshop
Machine Learning Is Going Real-time
What does real-time machine learning even mean? How is it done? What are the use cases?-Chip Huyen
Data and Its (Dis)contents: A Survey of Dataset Development and Use in Machine Learning Research
“In this paper, we survey the many concerns raised about the way we collect and use data in machine learning and advocate that a more cautious and thorough understanding of data is necessary to address several of the practical and ethical issues of the field.”-arXiv.org
The Most Important Thing
In machine learning research, it’s easy to fall into the trap of only working with benchmark data sets. You can guard against this by working on concrete problems. You don’t need a grand goal, either, just a concrete one.-Brandon Rohrer
Datasheets for Datasets
“The machine learning community currently has no standardized process for documenting datasets, which can lead to severe consequences in high-stakes domains... [W]e propose that every dataset be accompanied with a datasheet that documents its motivation, composition, collection process, recommended uses, and so on.”-arXiv.org
Building a Gigascale ML Feature Store with Redis, Binary Serialization, String Hashing, and Compression
“Our benchmarking results indicated that Redis was the best option, so we decided to optimize our feature storage mechanism, tripling our cost reduction. Additionally, we also saw a 38% decrease in Redis latencies, helping to improve the runtime performance of serving models.”-DoorDash Engineering
Indigenous AI Position Paper
This paper is a starting place for those who want to design and create AI from an ethical position that centers Indigenous concerns. It is an attempt to capture multiple layers of a discussion that happened over 20 months, across 20 timezones, during two workshops, and between Indigenous people...-Indigenous AI
Creating a Custom Corpus
If you want to use natural language processing (NLP), you need a corpus (a collection of texts) to work with. Here’s how to build a corpus of one’s own.-Python in Plain English
Is Facial Recognition Too Biased to Be Let Loose?
“Amba Kak and others support a moratorium on any use of facial recognition, not just because the technology isn’t good enough yet, but also because there needs to be a broader discussion of how to prevent it from being misused.”-Nature
Who Am I to Decide When Algorithms Should Make Important Decisions?
“Technical know-how, whether in government or in the technology industry, cannot substitute for contextual understanding and lived experiences in determining whether it’s appropriate to apply AI systems in sensitive social domains.”-The Boston Globe
Weird AI Yankovic: Generating Parody Lyrics
Syllable and rhyme scheme go in, parody lyrics come out.-arXiv.org
To Apply AI for Good, Think Form Extraction
Form extraction is a hard and important problem. Healthcare professionals, climate scientists, and human rights workers all rely on data that’s often locked away in non-digitized documents.-Jonathan Stray
Characterising Bias in Compressed Models
It is not just the data. Popular compression techniques can amplify bias in deep neural networks.-arXiv.org
Human Learn
What if you could draw a machine learning model? This new library makes it easier to create rule-based systems that are scikit-learn compatible.-Vincent D. Warmerdam
Target Didn’t Figure Out a Teenager Was Pregnant Before Her Father Did, and That One Article That Said They Did Was Silly and Bad
“[T]his story in 2012 launched the idea into public consciousness that companies can create algorithms that can diagnose, solve, predict the future, and generally model the human situation better than humans can.” But for the most part, that’s just not true, even seven years later.-Colin Fraser
Machine Learning: The High Interest Credit Card of Technical Debt
Machine learning allows you to build complex systems fast, but it’s dangerous to think of these quick wins as coming for free.-Google Research
Effective Testing for Machine Learning Systems
What’s the difference between testing a machine learning system and a traditional software system? Between model testing and model evaluation? And how do you write a model test, period?-Jeremy Jordan
sklearn-flask-docker
A straightforward example of deploying a sklearn model using Flask and a Docker container. You’ll need some basic knowledge of Docker to get started.-Chris Albon
Traffic Prediction with Advanced Graph Neural Networks
How DeepMind and Google Maps teamed up to improve the accuracy of ETAs by up to 50% in some cities.-DeepMind
How Salesforce Infuses Ethics into Its AI
To support ethical AI, you need to build a culture where employees have the right mindset to create ethical products. Salesforce has a number of processes and programs for doing just that.-Salesforce
An A.I. Training Tool Has Been Passing Its Bias to Algorithms for Almost Two Decades
The CoNLL-2003 dataset is one of the most widely used open source datasets for building natural language processing systems. Over the past 17 years, it’s been cited more than 2,500 times in research literature. The problem? It includes five times as many male names as female names.-OneZero
AI for AG: Production Machine Learning for Agriculture
How one company trained a neural network model that identifies crops and weeds, and then deployed that model to robots in the fields.-PyTorch
The Whiteness of AI
Across the board—in the voices of chatbots and virtual assistants, in stock images, and in film and television—AI is portrayed as white people, and that’s a problem.-Philosophy & Technology
Algorithmic Colonization of Africa
“The continent would do well to adopt a dose of critical appraisal when regulating, deploying, and reporting AI. This requires challenging the mindset that portrays AI with God-like power and as something that exists and learns independent of those that create it. People create, control, and are responsible for any system.”-SCRIPTed
Categorizing Products at Scale
How the Shopify team implemented a model to categorize every product on the platform and enabled cross-platform teams to deliver personalized insights to business owners.-Shopify Engineering
Thoughts on Genderify, Gender Discrimination, Transphobia, and (Un)ethical AI
“Products like Genderify are harmful. They’re built on top of biased and inaccurate data, by people who seem to have no interest in risk management or the societal impact of their product, and released for basically no cost into the open for anyone to use.”-Sarah L. Fossheim
applied-ml
A curated collection of blogs, articles, and papers describing how different companies have been using machine learning in production.-Eugene Yan
Combating Anti-Blackness in the AI Community
“The aim of this work is to help community members better identify and understand the scale and scope of anti-Black bias within our AI community and illustrate some concrete steps that members can take to help mitigate these issues and build a more just community.”-Devin Guillory
MIT Apologizes, Permanently Pulls Offline Huge Dataset That Taught AI Systems to Use Racist, Misogynistic Slurs
If you’re using 80 Million Tiny Images to benchmark computer-vision algorithms, it’s time to find another data set.-The Register
Reflecting on a Year of Making Machine Learning Actually Useful
“I discuss how working at Viaduct opened my eyes to the challenges of operationalizing machine learning, and how neither my classes nor research forced me to consider these challenges.”-Shreya Shankar
Ethics in NLP
And while you’re learning about natural language processing, add this bibliography to your reading list to make sure you’re mindful about the real harm NLP can cause, no matter your intentions.-Association for Computational Linguistics
NLP Roadmap
For visual learners especially, this mind map of keywords can guide you through what natural language processing topic to study next.-Tae-Hwan Jung
Using GitHub Actions for MLOps & Data Science
GitHub offers almost no features for data science, but they’re trying to change that. This series of GitHub Actions integrates parts of the data science and machine learning workflow with a software development workflow.-The GitHub Blog
Getting Machine Learning to Production
This guide covers the process of creating an end-to-end proof-of-concept machine learning product, from start to finish.-Vicki Boykis
Modern Rules-Based Models
Deep learning models may be all the rage, but let’s shine a light on some of their less-well-known cousins: rules-based models.-R Views
Q&A: Sabelo Mhlambi on What AI can Learn from Ubuntu Ethics
“To think we can come up with a value system to guide AI without looking into other cultures’ value systems, and then to call it universal, is off.”-People + AI Research
Tonks: Building One (Multi-Task) Model to Rule Them All!
What’s even cooler than a multi-task deep learning library? Reading the story of how two co-workers built this library together.-ShopRunner
Tidymodels: Tidy Machine Learning in R
The tidyverse's take on machine learning is finally here.-Rebecca Barter
Anti-patterns in Open Sourced ML Research Code
Unlike most places on the internet, the comments section on this Reddit post has a lot of constructive, helpful advice.-Jari Safi
Masakhane — Machine Translation for Africa
Africa has over 2,000 languages, but African languages account for a small portion of available resources and publications in Natural Language Processing. To fix that problem, an open-source, continent-wide research effort was formed.-Cornell University
Going Beyond SQuAD
The Stanford Question Answering Dataset (SQuAD) has become the archetypal QA dataset for NLP modeling. The emergence of non-English SQuAD replicas means that NLP can be truly democratized. And it’s a good reminder that English is neither synonymous with nor representative of natural language.-Towards Data Science
The Big Bad NLP Database
300 (and counting!) datasets for training your natural language processing models.-Quantum Stat
“Just What I Needed”: Making Machine Learning Scalable and Accessible at Grubhub
Before Grubhub had a suite of tools to help with machine learning model deployments, the difficulty of getting scheduled jobs in production resulted in multiple bespoke solutions, duplicated code, and lots of model maintenance overhead.-Grubhub Bytes
Deep Learning Book Series
These notes with code, examples, and drawings are great for beginners who want to understand enough linear algebra to be comfortable with machine learning and deep learning.-Hadrien Jean
You’re Not Paid to Model
If you want to know how data science projects at companies actually work, there’s no better talk to watch than this one.-Jacqueline Nolis
Flow Fields
“Flow fields are something that many programmers reach for early on when they first get into creating algorithmic artwork, but few take the time to polish their use and explore the crazy variety of ways they can be used.”-Tyler Hobbs
The Most Important Sorting Algorithm You Need to Know
How and why the default sorting algorithm for Python became widely used, yet relatively unrecognizable.-Dev
ML and NLP Publications in 2019
Who was the most prolific author of machine learning and natural language processing papers in 2019? The most prolific company? The most prolific country?-Marek Rei
Curriculum for Reinforcement Learning
Curriculums help humans progressively go from understanding simple concepts to solving hard problems. They might do the same for reinforcement learning models.-Lil’Log
Albert Learns to Read
Is this deep learning model as smart as a first grader? You might just while away an hour or two feeding it stories and asking it questions.-Albert Learns to Read
Interactive Tools for Machine Learning, Deep Learning and Math
Sometimes the best way to unpack a complicated concept is to play with it, visually.-Machine Learning Tokyo
The Twelve Truths of Machine Learning for the Real World
“In the Real World, the distribution of input more likely changes than not, ‘curve balls’ from long-tails come out of nowhere, and you don’t always have an answer.”-Delip Rao
Machine Unlearning
Once users have shared their data online, it is generally difficult for them to revoke access and ask for the data to be deleted. Machine learning exacerbates this problem because any model trained with said data may have memorized it, putting users at risk of a successful privacy attack exposing their information. Yet, having models unlearn is notoriously difficult.-Cornell University
Supervised Machine Learning Case Studies in R
This free course covers exploratory data analysis, preparing data so it’s ready for predictive modeling, training supervised machine learning models, and evaluating those models—all using real-world data.-Julia Silge
Powered by AI: Instagram’s Explore Recommender System
Recommending the most relevant content out of billions of options in real time at scale introduces a ton of machine learning engineering problems. Here’s a detailed look at the system Instagram built to provide its users with personalized content.-Instagram Engineering
Biased Algorithms Are Easier to Fix Than Biased People
Changing algorithms is easier than changing people: software on computers can be updated; the “wetware” in our brains has so far proven much less pliable.-The New York Times
AI Dungeon 2: Creating Infinitely Generated Text Adventures with Deep Learning Language Models
This completely AI-generated text adventure will respond logically to just about any command you enter, such as “Eat the moon,” “Summon a giraffe,” or “Join the Great British Bakeoff.”-Perception, Control, Cognition
A Guide to Production Level Deep Learning
No matter how many deep learning models you’ve successfully trained, deploying those models in production is a whole different ballgame.-Alireza Dirafzoon
Managing Bias and Risk at Every Step of the AI-Building Process
“As the field is still young, many machine-learning developers lack experience in building enterprise applications, and many business stakeholders have insufficient knowledge of machine learning to know what questions to ask as they scope and manage projects.”-Harvard Business Review
Machine Learning Systems Design
Case studies, resources, and 27 exercises for learning how to deploy a machine learning system in the real world.-Chip Huyen
Machine Learning Engineering
This book is offered as free to read on the “read first, buy later” principle. The first three chapters are available now, and you can sign up for the mailing list to get updated when more are added.-Andriy Burkov
Algorithms Were Supposed to Make Virginia Judges Fairer. What Happened Was Far More Complicated.
“[A] formula designed to reduce prison populations in Virginia led some judges to impose harsher sentences for young or black defendants, and more lenient ones for rapists.”-The Washington Post
DReCon: Data-Driven Responsive Control of Physics-Based Characters
This article will be a treat for those interested in video games, deep reinforcement learning, AND physics.-Ubisoft Montreal
We Looked Into Why Our Subscribers Churned–with the Help of Machine Learning
A Swedish media company discovered many correlations between churn and reader activity, including reader gender, number of images viewed, and number of push notifications sent.-MittMedia
The Abstraction and Reasoning Corpus (ARC)
Brandon Rohrer offers high praise for this long read: “If you like to think deeply about building and measuring intelligence, I highly recommend this read. It's a cogent review of intelligence measurement, its challenges, past approaches, and their limitations. [And it] takes the brave step of proposing a path forward.”-François Chollet
Detecting Audio Deepfakes With AI
Earlier this year, thieves used AI to impersonate a CEO’s voice and successfully demand a fraudulent transfer of nearly $250,000. To defend against similar cyber crimes, one company built a detector system to discern between real and fake audio examples—and you can too.-Dessa News
The Problem With Metrics is a Big Problem for AI
“I am not opposed to metrics; I am alarmed about the harms caused when metrics are overemphasized, a phenomenon that we see frequently with AI, and which is having a negative, real-world impact.”-fast.ai
The State of Machine Learning Frameworks in 2019
The war for machine learning frameworks has two remaining big contenders: PyTorch and TensorFlow. Researchers prefer one, while industry practitioners prefer the other.-The Gradient
Sis: Simple Image Search Engine
You can launch this open-source, visual similarity engine just by running two Python scripts.-Yusuke Matsui
Neural Nets Are Just People All the Way Down
“Every single piece of decision-making in a high-tech neural network initially rests on a human being manually putting something together and making a choice.”-Normcore Tech
AI Deserts
“Open any magazine, click randomly on any article on Medium, visit any public event at a think tank; chances are, concerns raised by the age of AI is the topic. Some of it will be bunk, some of it very thoughtful, but the topic is not exactly under-discussed. What is under-discussed is how unevenly this change will happen, because we misunderstand and overestimate the preconditions for AI outside the private sector.”-Code for America Blog
When is a Neural Net Too Big for Production?
This post shows examples of successfully shipping large natural language processing transformer models.-Towards Data Science
Designing Your Neural Networks
What’s a good learning rate? How many hidden layers should your network have? Is dropout actually useful? Why are your gradients vanishing?-Towards Data Science
Dungeon Crawling or Lucid Dreaming?
With a neural net as your Dungeon Master, you can conjure anything simply by referring to it and teleport anywhere by just saying the word.-AI Weirdness
A Beginner’s Guide to the Mathematics of Neural Networks
This resource is made for non-experts, and the illustrations are really helpful for wrapping your mind around new concepts.-King’s College London
Simple Beginner’s Guide to Reinforcement Learning & Its Implementation
One of the research scientists (https://twitter.com/iamtrask/status/1163737631395651584) at DeepMind gives this tutorial a ringing endorsement: “This is one of the first truly introductory tutorials of RL & Deep Learning I've seen. My kind of tutorial... lots of toy code examples and simple analogies!!!”-Analytics Vidhya
Machine Learning, Faster
“Speed is not a word that is regularly associated with machine learning teams. When we talk and write about accomplishments in machine learning, there is often a focus on the problem, the algorithmic approach, and the results—but no mention of the time that it took to get there.”-Neal Lathia
DeepMind's Losses and the Future of Artificial Intelligence
Alphabet’s DeepMind lost $572 million last year. Does this mean AI is falling apart? No. But considering DeepMind’s strategy does raise some interesting questions about how much it can offer society beyond mastering the game of Go.-Wired
Intro to Pyenv for Machine Learning
Escape Python dependency hell! Instead of configuring a unique Docker container for each project, how about a unique Python environment?-Weights & Biases
How I Became a Machine Learning Practitioner
“I wasn’t quite prepared for just how much I would feel like a beginner. You need to give yourself the space and time to fail. If you learn from enough failures, you’ll succeed.”-Greg Brockman
Rules of Machine Learning: Best Practices for ML Engineering
This reference guide gets a ringing endorsement from Senior Research Scientist Andrew: “Machine learning in a company is 10% data science and 90% other challenges. It's VERY hard. Everything in this guide is ON POINT, and it's stuff you won't learn in an ML book.”-Google
Awesome Production Machine Learning
This repository contains a curated list of open source libraries that will help you deploy, monitor, version, scale, and secure your production machine learning.-The Institute for Ethical Machine Learning
A Code-First Introduction to Natural Language Processing
Learn topic modeling, classification, language modeling, and translation, completely free.-fast.ai
Generate custom Magic: The Gathering cards from an AI using GPT-2
Input a card name, card type, and card mana cost... get a custom card image and card text back!-Max Woolf
Learning to Traverse Latent Spaces for Musical Score Inpainting
This paper shows how to train a deep learning-based model to fill in missing or lost information in a piece of music.-Georgia Tech
Mutual Exclusivity as a Challenge for Neural Networks
Children use the mutual exclusivity bias to learn new words. Standard neural nets show the opposite bias, making it harder for them to learn in common scenarios.-Cornell University
Word2vec: fish + music = bass
Get ready to chuckle at gems like “yeti – snow + economics = homo economicus” and “American Idol – singers = Project Runway.”-graceavery
The BS-Industrial Complex of Phony A.I.
“For a while, we resisted the A.I. label, understanding that our platform wasn’t going to make Watson sweat anytime soon. But eventually, we gave up and just decided to kind of go along with the hype. The market wanted us to be an A.I. company so we chuckled and decided to call ourselves one.”-Gen
Bayesian Cyber Risk Quantification With Industry-Specific Models
The machine learning community is obsessed with deep learning on big dense datasets, but problems like cyber insurance with small sparse data require Bayesian methods.-Tower Street
AI Adoption is Being Fueled by an Improved Tool Ecosystem
In 2010, the ratio of AI scientific papers to patents filed was 8:1. In 2016, it was 3:1. We’re in the implementation phase now.-O’Reilly
18 Impressive Applications of Generative Adversarial Networks (GANs)
It feels like GANs pop up everywhere these days. If you’re looking for a fundamental understanding of what GANs can do, this is a great overview.-Machine Learning Mastery
Deepfake Propaganda Is Not a Real Problem
There’s real damage being done by deepfake techniques, but it’s happening in pornography, not politics.-The Verge
Once Again, a Neural Net Tries to Name Cats
Start off your Monday morning with a good chuckle.-Janelle Shane
GANs And Deepfakes Could Revolutionize The Fashion Industry
GANs' impact on fashion goes way deeper than virtual fitting rooms and creating avatars to customers' measurements.-Forbes
Machine Learning Product Management: Lessons Learned
Product management for ML projects can be difficult because engineering changes from a deterministic process to a probabilistic one. It requires “an approach that involves learning from data instead of programmatically following a set of human rules.”-Domino Data Lab
Railyard: How We Rapidly Train Machine Learning Models With Kubernetes
Stripe trains hundreds of new models each day, each powered by billions of data points. Running infrastructure at this scale poses a very practical data science and ML problem: how do you give every team the tools they need to train their models without requiring them to operate their own infrastructure?-Stripe
Notes on AI Bias
Bias in AI doesn’t mean just bias against people. Sometimes it's just picking up the wrong signal, period. Case in point: a system built to detect skin cancer was detecting rulers instead.-Benedict Evans
Fooling Automated Surveillance Cameras: Adversarial Patches to Attack Person Detection
Not only could a simple color printout hide someone from an AI video surveillance system for nefarious purposes, it could offer protection to people who don't want to be tracked in everyday life.-arXiv
This YouTube Channel Streams AI-Generated Death Metal 24/7
Dadabots was developed by two music technologists who wanted to prove that a neural network was capable of capturing the subtle stylistic differences between Death Metal, Math Rock, and other lesser-known genres.-Motherboard
One Model to Rule Them All
This post discusses the obsession with finding the best model and emphasizes what should be done instead: Take a step back and see the bigger picture in which the machine learning model is embedded.-bentoML
Discriminating Systems: Gender, Race and Power in AI
“The field of research on bias and fairness needs to go beyond technical debiasing to include a wider social analysis of how AI is used in context. This necessitates including a wider range of disciplinary expertise.”-AI Now Institute
Open Questions about Generative Adversarial Networks
Practical improvements to image synthesis models are happening almost too quickly to keep up with, but there are still several open research problems left to tackle.-Distill
Scaling Uber’s Customer Support Ticket Assistant (COTA) System with Deep Learning
“Our online tests validate that the COTA v2 deep learning system performs significantly better than the COTA v1 system in terms of key metrics, including model performance, ticket handling time, and customer satisfaction.”-Uber Engineering
My Machine Learning Research Jobhunt
One PhD graduate shares their experience trying to find an AI research position in Europe, from the application process through to salary negotiations.-Generalized Error
Active Learner
Supervised machine learning, while powerful, needs labeled data to be effective. This visualization shows how active learning data labeling strategies can improve your models.-Fast Forward Labs
A Framework for Understanding Unintended Consequences of Machine Learning
The concept of biased data is often too broad to be useful. This framework includes 5 ways of categorizing bias: historical, representation, measurement, evaluation, and aggregation.-Cornell University
A Gentle Introduction to Learning Curves for Diagnosing Machine Learning Model Performance
Discover learning curves and how they can be used to diagnose the learning and generalization behavior of machine learning models.-Machine Learning Mastery
Money Machines: An Interview with an Anonymous Algorithmic Trader
An insider explains how algorithms are rewiring finance.-Logic
Unsolved Research Problems vs. Real-world Threat Models
“I personally think adversarial examples are highly worth studying, and should inspire serious concern. However, most of the justifications for why exactly they’re worrisome strike me as overly literal. I think much of the confusion comes from conflating an unsolved research problem with a real-world threat model.”-Catherine Olsson
Game of Thrones Reigns Supreme Among AT&T’s Assets. Here’s How We Used Wikidata’s Entities and Ontology to Find That Out.
Six companies own most U.S. media. Given that each of these companies owns thousands of these types of assets, how do you determine which ones are the most important?-Parse.ly Engineering
Tackling Bias in Machine Learning
This article digs into the hows and whys of the Python package Fair Classifier, which quantifies the fairness of a model and uses an adversarial network to help ensure equity in machine learning models.-Insight
Coconet: the ML model behind today’s Bach Doodle
Last week, Google celebrated J.S. Bach’s 334th birthday with “the first AI-powered Google Doodle.” Here's how their team built a model that takes a user-created melody and harmonizes it in Bach’s style.-Magenta
How I Eat For Free in NYC Using Python, Automation, Artificial Intelligence, and Instagram
That's one way to save money in the Big Apple! Here's how a data engineer created a 100% automated Instagram account to earn free meals at restaurants looking for promotion.-Chris Buetti
What's the difference between data science, machine learning, and artificial intelligence?
Whip this out the next time you tell someone you're a data scientist, and they ask “Does that mean you work on artificial intelligence?”-Variance Explained
Modeling Censored Time-to-Event Data Using Pyro, an Open Source Probabilistic Programming Language
When churn models just weren't cutting it for Uber, they created their own language in Python to properly model the time from a user's first ride to their second.-Uber Engineering
Using Deep Learning to “Read Your Thoughts” — With Keras and EEG
Saying a word in one’s mind, even if not spoken aloud, can result in the firing of the nerves controlling the muscles involved in speech. With some readily available equipment, you can train a model to classify these sub-vocalized words in less than a day.-Justin Alvey
AI Interprets What Rodents Are Saying
With a deep learning-based system for detection and analysis of rodent vocalizations, researchers can better understand their test subjects. And it's adorably named “DeepSqueak.”-Psychology Today
Cocktail Similarity
Thanks to a difference algorithm, you now have the perfect guide for figuring out your minimum-viable at-home bar setup.-Tom MacWright
Interpretable Machine Learning: A Guide for Making Black Box Models Explainable
Make room for this on your reading list.-Christoph Molnar
SC-FEGAN: Face Editing Generative Adversarial Network with User's Sketch and Color
This gives a whole new meaning to “Photoshopped.”-Youngjoo Jo
The Limitations of Deep Learning for Vision and How We Might Fix Them
“Now it is difficult to publish anything that is not neural network related. This is not a good development. We suspect that the field would progress faster if researchers pursued a diversity of approaches and techniques instead of chasing the current vogue.”-The Gradient
Data Versioning
The degrees of freedom in versioning machine learning systems pose a unique challenge. Each broad approach to tackling this problem has pros and cons to keep in mind.-Emily Gorcenski
We Analyzed 16,625 Papers to Figure Out Where AI Is Headed Next
This study of 25 years of artificial-intelligence research suggests that deep learning may soon be on its way out.-MIT Technology Review
The Best Defense Against Deepfake AI Might Be . . . Blinking
Researchers can now detect AI-generated fake video with a 95% success rate. Because few images are available online showing people with their eyes closed, there's less training data available for deepfakes to get natural blinking right.-Fast Company
Why Are Machine Learning Projects So Hard to Manage?
“I’ve watched lots of companies attempt to deploy machine learning—some succeed wildly and some fail spectacularly. One constant is that machine learning teams have a hard time setting goals and setting expectations. Why is this?”-Lukas Biewald
What Can Neural Networks Learn?
It’s tricky to know what neural networks are actually learning as they're trained. This post does a good job breaking down what's going on inside.-Data Science and Robots
POET: Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions Through the Paired Open-Ended Trailblazer
“Sometimes… we do not just want to solve known problems, because unknown problems are also important. Consequently, we are exploring algorithms that continually invent both problems and solutions of increasing complexity and diversity.”-Uber Engineering
Most Impactful AI Trends of 2018: the Rise of ML Engineering
Was 2018 an important inflection point for the machine learning industry? Check out this roundup of key trends and the impact they may have on ML this year.-Emmanuel Ameisen
lazydata: Scalable Data Dependencies for Python projects
This library might help out when you need data version control for your next machine learning project.-rstojnic
Gender and Jobs in Online Image Searches
A really cool example of using machine vision to spot gender bias in Google Image search results.-Pew Research Center
Kernel Density Estimation
Kernel density estimation (KDE) is a useful statistical tool that’s way less scary than it sounds. This interactive shows how KDE lets you create a smooth curve given a set of data.-Matthew Conlen
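For a feel of what "a smooth curve given a set of data" means in practice, here is a minimal sketch of a Gaussian KDE in plain NumPy. The function name `gaussian_kde_curve`, the bandwidth of 0.5, and the five sample values are assumptions made up for illustration; none of them come from the linked interactive.

```python
import numpy as np

def gaussian_kde_curve(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at every point in `grid`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Scaled distance from each grid point to each sample: shape (len(grid), len(samples)).
    z = (grid[:, None] - samples[None, :]) / bandwidth
    # Standard normal kernel evaluated at every scaled distance.
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    # Average the kernels and divide by the bandwidth so the curve integrates to 1.
    return kernels.mean(axis=1) / bandwidth

samples = [1.0, 1.3, 2.1, 4.0, 4.2]      # made-up data points
grid = np.linspace(-5.0, 10.0, 1501)     # where the curve is evaluated
density = gaussian_kde_curve(samples, grid, bandwidth=0.5)

# Riemann-sum check that the curve behaves like a probability density.
area = float(np.sum(density) * (grid[1] - grid[0]))
```

A smaller bandwidth gives a bumpier curve that hugs each data point, and a larger one gives a smoother, flatter curve.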
Concepts in object detection
Naming and locating several objects at once in an image with no prior information about how many objects are supposed to be detected is much harder than identifying a single object. Here’s how to do it using TensorFlow and R.-Tensorflow for R Blog
The Seductive Diversion of ‘Solving’ Bias in Artificial Intelligence
“In accepting the existing narratives about A.I., vast zones of contest and imagination are relinquished. What is achieved is resignation — the normalization of massive data capture, a one-way transfer to technology companies, and the application of automated, predictive solutions to each and every societal problem.”-Medium
AI Art Gallery
Check out this collection of art, music and design using machine learning from a NeurIPS 2018 workshop.-Neural Information Processing Systems
These incredibly realistic fake faces show how algorithms can now mess with us
The latest advance in generative adversarial networks allowed researchers to generate fake images of faces with a previously unknown level of control over elements like age, race, gender, and even freckles.-MIT Technology Review
State of Deep Learning : H2 2018 Review
“The growth rate of machine learning papers has been around 3.5% a month since July — which is around a 50% growth rate annually. This means around 2,200 machine learning papers a month and that we can expect around 30,000 new machine learning papers next year.”-Atlas ML
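The arithmetic in this quote can be checked in a few lines of Python. This is a rough sketch: it assumes the 3.5% monthly rate compounds steadily and that the 2,200-papers-a-month figure is the starting point, neither of which the quote spells out.

```python
monthly_growth = 0.035      # 3.5% a month, from the quote
papers_this_month = 2200    # from the quote

# Compounding a monthly rate over 12 months gives the annual rate.
annual_growth = (1 + monthly_growth) ** 12 - 1

# Assumption: growth continues at the same monthly rate for the next 12 months.
papers_next_year = sum(
    papers_this_month * (1 + monthly_growth) ** month for month in range(1, 13)
)

print(f"annual growth: {annual_growth:.1%}")
print(f"papers over the next 12 months: {papers_next_year:,.0f}")
```

Under these assumptions both results land in the same ballpark as the quoted "around 50%" and "around 30,000."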
Public Attitudes Toward Computer Algorithms
“58% of Americans feel that computer programs will always reflect some level of human bias – although 40% think these programs can be designed in a way that is bias-free.”-Pew Research Center
Beating the State-of-the-art in NLP With HMTL
Learn how Multi-Task Learning—a general method in which a single architecture is trained towards learning several different tasks at the same time—can be applied to natural language processing.-Hugging Face
Is this AI? We drew you a flowchart to work it out
It's a bit hard to read, but if you squint hard enough this flowchart should help you discern if something's truly AI—or just hyped up and mislabeled.-MIT Technology Review
AI adoption advances, but foundational barriers remain
One highlight from this global survey about how AI is used at companies: those working in manufacturing and risk see AI as more valuable than those in other fields like marketing and sales or human resources.-McKinsey & Company
What You Have To Fear From Artificial Intelligence
Vicki Boykis describes this long read best: “Great piece the real, practical concerns of deep learning applications (aka not robots killing us): fake images, text, and soundbytes that we won't know aren't human-generated.”-Current Affairs
Reinforcement Learning with Prediction-Based Rewards
When a reinforcement learning agent was incentivized to be curious and avoid “boredom” while playing Mario, it discovered warp levels, how to defeat bosses, and more.-OpenAI
Deepfake-busting apps can spot even a single pixel out of place
Speaking of AI-generated imagery... it's so easy to use that anyone can make a fake video or image, no matter their motives. Luckily, technology for discerning true images from manipulated creations is catching up.-MIT Technology Review
Generating custom photo-realistic faces using AI
Generating realistic images based on descriptions is much harder than describing an image—for humans and computers. But this new generative model is making that task easier.-Insight
How do you like your ML career?
“Over the last few years ML has lost some of its luster in my mind - the hype around deep learning and ML has added a lot of noise into the system, and for someone who cares about doing good science that's been hard for me.”-r/MachineLearning
Mask R-CNN Benchmark
A fast and modular implementation for Faster R-CNN and Mask R-CNN written entirely in PyTorch 1.0. It's 30% quicker than mmdetection during training.-Facebook Research
How Three French Students Used Borrowed Code to Put the First AI Portrait in Christie’s
The code used to generate this portrait is mostly the work of another artist and programmer. This raises a question about attribution in the open and collaborative AI art community, which is taking its first steps into mainstream attention.-The Wall Street Journal
Deepfake Videos Are Getting Real and That's a Problem
Changing photos used to be tedious and time-consuming. Fast-forward to now: nearly anyone can use deep learning and AI to generate incredibly realistic “fake videos”—President Obama saying something he never said, for instance.-The Wall Street Journal
Training Neural Nets on Larger Batches: Practical Tips for 1-GPU, Multi-GPU & Distributed setups
How can you train your model on large batches when your GPU can’t hold more than a few samples? Let's find out.-Hugging Face
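One common answer to this question is gradient accumulation: run several small micro-batches, add up their gradients, and update the weights only once. The sketch below illustrates the idea with NumPy and a toy linear regression rather than a real deep learning framework. The data, the `mse_gradient` helper, and the batch sizes are all hypothetical, and this is not necessarily how the linked post implements it.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 3))   # 32 made-up samples with 3 features
y = rng.normal(size=32)
w = np.zeros(3)                # model weights

def mse_gradient(X_batch, y_batch, w):
    """Gradient of 0.5 * mean((X w - y)^2) with respect to w."""
    residual = X_batch @ w - y_batch
    return X_batch.T @ residual / len(y_batch)

# Reference: one gradient computed on the full "large" batch.
full_grad = mse_gradient(X, y, w)

# Accumulation: process micro-batches of 8 samples and sum their gradients.
micro_batch_size = 8
num_steps = len(y) // micro_batch_size
accumulated = np.zeros_like(w)
for step in range(num_steps):
    rows = slice(step * micro_batch_size, (step + 1) * micro_batch_size)
    # Dividing by num_steps makes the sum equal the mean over the full batch.
    accumulated += mse_gradient(X[rows], y[rows], w) / num_steps

# A single weight update after all micro-batches have been processed.
learning_rate = 0.1
w_new = w - learning_rate * accumulated
```

With equally sized micro-batches, `accumulated` matches `full_grad` up to floating-point error, so the update is the same one a single large batch would have produced.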
Artwork Personalization at Netflix
Ever notice how the preview image for the same show or movie on Netflix changes whenever you log back in? Here's a peek into the system that figures out which piece of artwork is the best for convincing a particular member why that title is “for them.”-The Netflix Tech Blog
Data: A key requirement for your Machine Learning (ML) product
For all the PMs out there: here are some tips for how to talk about data in your Product Requirement Document for a machine learning product.-The Lever
A Review of the Neural History of Natural Language Processing
It's kind of crazy that neural network NLP is now old enough to have its own historical timeline. This post condenses about 15 years of work into eight milestones that impacted how these technologies are used today.-aylien
Introduction to Machine Learning for Coders: Launch
This new course uses modern tools and libraries, including Python, pandas, scikit-learn, and PyTorch. Unlike many educational materials in the field, this approach is “code first” rather than “math first.”-fast.ai
Why building your own Deep Learning Computer is 10x cheaper than AWS
Avoid hefty cloud GPU costs by building a computer from scratch.-The Mission
Tabular Data in Scikit-Learn and Dask-ML
Take advantage of Scikit-Learn's latest improvements for working with tabular data.-datas-frame
Help! I can’t reproduce a machine learning project!
Reproducibility breaks down in three main places: the code, the data and the environment. This guide should help you narrow down where your reproducibility problems are, so you can focus on fixing them.-No Free Hunch
Anatomy of an AI System
“The stack that is required to interact with an Amazon Echo goes well beyond the multi-layered ‘technical stack’ of data modeling, hardware, servers and networks. The full stack reaches much further into capital, labor and nature, and demands an enormous amount of each. The true costs of these systems – social, environmental, economic, and political – remain hidden and may stay that way for some time.”-Anatomy of an AI System
Retracing your steps in Machine Learning: Versioning
New prediction systems are fragile things. Change one thing, and the accuracy of the model can drop dramatically, leading to a long troubleshooting process to find the root cause. Skip the headache with this guide to building a robust versioning system for your ML projects.-The Lever
No Machine Learning in your product? Start here
Just how much does a product owner need to know about machine learning? A Google PM shares his experience integrating machine learning into an existing product: Google Forms.-The Lever
Human translators are still on top—for now
Machine translation works well for sentences. For full documents? Not so much.-MIT Technology Review
VerbiAge: Using NLP to help writers craft age-specific writing
This app for tailoring a book’s description for a target K-12 age is a nice example of how machine learning can aid in creative tasks.-Insight
What HBR Gets Wrong About Algorithms and Bias
This post injects some much-needed nuance into the biased algorithms discussion: humans vs machines is not a helpful framing and most critics of unjust bias aren’t anti-algorithm.-fast.ai
Learning Meaning and Semantics in Natural Language Processing
A few weeks ago, data science Twitter spun out a fascinating mega-thread on NLP meaning and semantics. Since Twitter threads can be tricky to parse after-the-fact, this summary, interactive tweet tree, and commented map provide three entry points into the discussion.-Hugging Face
Differentiable Image Parameterizations
This powerful, under-explored tool for neural network visualizations and art produces vibrant images that look like they came straight out of Annihilation.-Distill
ACL 2018 Highlights: Understanding Representations and Evaluation in More Challenging Settings
This post digs into two themes of the Association for Computational Linguistics 2018 conference: gaining a better understanding of what NLP models capture and exposing them to more challenging settings.-Sebastian Ruder
Machine Learning Glossary
Find yourself dragged under by wave after wave of machine learning jargon? Part of Google's Machine Learning Crash Course, this glossary provides plain-English descriptions of the terms you've heard thrown around by ML experts, without sacrificing accuracy.-Google
Reinforcement learning’s foundational flaw
“Does it really make sense to start learning a new skill based only on its reward signal, with neither prior experience nor higher-level instruction?”-The Gradient
Feature-wise transformations
Many real-world problems require integrating multiple sources of information. Feature-wise transformations offer a way to effectively capture and leverage the relationship of various sources, across a wide range of problem settings like image recognition, reinforcement learning, and style transfer.-Distill
What do machine learning practitioners actually do?
“Any solution to the shortage of machine learning expertise requires answering this question: whether it’s so we know what skills to teach, what tools to build, or what processes to automate.”-fast.ai
AdamW and Super-convergence is now the fastest way to train neural nets
It’s time to give Adam another go.-fast.ai
Papers with Code
A searchable site that links machine learning papers on arXiv with code on GitHub.-Papers with Code
Model Tuning and the Bias-Variance Tradeoff
This visual intro to machine learning covers how errors can arise due to assumptions that are overly simple (bias) or overly complex (variance).-R2D3
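A small experiment makes the tradeoff tangible. The sketch below fits polynomials of increasing degree to noisy synthetic data and compares the error on the training set with the error on fresh data. The sine-wave data, the noise level, and the degrees 1, 3, and 9 are assumptions chosen for illustration and are not taken from the R2D3 piece.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def make_data(n):
    """Draw n noisy samples from a sine wave (synthetic data)."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = np.sin(2.0 * np.pi * x) + rng.normal(scale=0.3, size=n)
    return x, y

x_train, y_train = make_data(20)    # small training set
x_test, y_test = make_data(200)     # fresh data the model never saw

def mse(model, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((model(x) - y) ** 2))

errors = {}
for degree in (1, 3, 9):
    # Least-squares polynomial fit of the given degree.
    model = Polynomial.fit(x_train, y_train, degree)
    errors[degree] = (mse(model, x_train, y_train), mse(model, x_test, y_test))

for degree, (train_err, test_err) in errors.items():
    print(f"degree {degree}: train MSE {train_err:.3f}, test MSE {test_err:.3f}")
```

Training error can only go down as the degree increases, because each larger model contains the smaller ones. The test error is the number to watch: a straight line is typically too simple for this data (bias), while a very flexible polynomial tends to chase the noise in the 20 training points (variance).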
Gender Shades
This evaluation compares how well IBM, Microsoft, and Face++ products are able to classify gender across skin types. All companies perform better on lighter subjects as a whole than on darker subjects as a whole with an 11.8% - 19.2% difference in error rates, and all companies perform worst on darker females.-Joy Buolamwini
Why the Future of Machine Learning is Tiny
“I’m convinced that machine learning can run on tiny, low-power chips, and that this combination will solve a massive number of problems we have no solutions for right now.”-Pete Warden
Machine learning predicts World Cup winner
Researchers have predicted the outcome after simulating the entire soccer tournament 100,000 times. (Good news awaits if you’re pulling for Brazil, Germany, or Spain!)-MIT Technology Review
A Developer’s Guide to Building AI Applications
O'Reilly and Microsoft collaborated on a free e-book that walks you through the process of building intelligent cloud-based bots (with relevant code samples available on GitHub).-Microsoft Machine Learning Blog
How The New York Times Uses Software To Recognize Members of Congress
The most interesting part of this project isn't the models used (Amazon's Rekognition API), but the practical considerations the team faced when introducing the “Who the Hill” app to the real world: poor lighting for photos in the Capitol halls, bad cell phone reception, and celebrity doppelgängers.-Times Open
Why you need to improve your training data, and how to do it
When you use deep learning as part of an application, getting better training data is vastly more effective than making model adjustments.-Pete Warden
Launching Cutting Edge Deep Learning for Coders: 2018 edition
Part 2 of fast.ai’s free deep learning course is here! All you need is high school math and 1 year of coding experience.-fast.ai
Smart Compose: Using Neural Networks to Help Write Emails
The engineers behind Smart Compose—a Gmail feature that offers sentence completion suggestions as you type—dig into how they tackled the challenges of fairness and privacy, latency, and scale.-Google AI Blog
Feature Engineering and Selection: A Practical Approach for Predictive Models
This book on predictive modeling is about 60% done and the authors are looking for feedback. The section on Engineering Numeric Predictors alone is fantastic.-Max Kuhn and Kjell Johnson
Qualitative before Quantitative: How Qualitative Methods Support Better Data Science
“Have you ever been embarrassed by the first iteration of one of your machine learning projects, where you didn’t include obvious and important features? In the practical hustle and bustle of trying to build models, we can often forget about the observation step in the scientific method and jump straight to hypothesis testing.”-Indeed Data Science
Picking Trending Topics and Celebrities Using Machine Learning
The machine learning engineers at Conde Nast applied their expertise to help Vanity Fair’s writers and editors better craft stories that have a broad, meaningful impact.-Conde Nast Technology
Get Started with Eager Execution in TensorFlow
The folks at TensorFlow are putting their tutorials directly into Google Colab notebooks (which require zero setup to run!). If you've ever wanted to learn more about machine learning, the time is now. Especially since a recent survey suggests that most data scientists lack advanced machine learning expertise.-TensorFlow
Artist + AI
Here's a new Twitter account for you to follow. This artist combines her hand-drawn work with generative adversarial networks (GANs) to create something completely new.-Helena Sarin
Demystifying Docker for Data Scientists – A Docker Tutorial for Your Deep Learning Projects
Is Docker really the best thing since sliced bread? Find out in this tutorial, which covers the basics of how to interact with Docker containers and create custom Docker images for your AI workloads.-Microsoft's Machine Learning Blog
The Building Blocks of Interpretability
This article really gets you inside a neural network's “head” by explaining the thought process as it decides between two labels for an image, like a bowtie and a pair of sunglasses.-Distill
The Malicious Use of Artificial Intelligence
This 101-page report “surveys the landscape of potential security threats from malicious uses of artificial intelligence technologies, and proposes ways to better forecast, prevent, and mitigate these threats.” Divvy it out across your commutes and moments of downtime this week.-maliciousaireport.com
Descriptive mAchine Learning EXplanations (DALEX)
Unpack some black boxes with this handy cheatsheet for understanding how complex ML models work.-Przemyslaw Biecek
Manifesto for Data Practices
Give this a read, whether you sign it or not.-DataPractices.org
So, How Many ML Models You Have NOT Built?
“What will put us out of our job is Machine Learning Overkill. I have seen implementation of Machine Learning algorithms to very frivolous problems and worse still the companies have invested heavily into the idea. It is a ticking time bomb. The moment the companies realize that the ROI is negative, they will shun the Data Science practice altogether.”-Towards Data Science
THREAD: How computer vision and natural-language processing systems reflect societal stereotypes
A rabbit hole worthy of your time: various types of machine learning bias as tracked by academic papers.-Arvind Narayanan
Exploring Recommendation Systems
In practice, recommenders don’t always work as well as we’d like them to. This post sets out to discover why.-Fast Forward Labs
Turning Design Mockups Into Code With Deep Learning
Ever wish you could automate the front-end engineering process? Here’s how to teach a neural network to code a basic HTML and CSS website from a design mockup.-FloydHub
Learning Curves for Machine Learning
How do you diagnose bias and variance? And what actions should you take once you’ve detected these errors?-Dataquest
Machine Learning: The High-Interest Credit Card of Technical Debt
There’s no such thing as a free machine learning project. Avoid or refactor these risk factors and design patterns to keep technical debt from piling up.-Research at Google
2017: The year AI beat us at all our own games
“Over the past 12 months AI crossed a series of new thresholds, finally beating human players in a variety of different games, from the ancient game of Go to the dynamic and interactive card game, Texas Hold-Em Poker.”-New Atlas
How many images do you need to train a neural network?
The technically correct answer is: “It depends.” The ballpark answer is: “1,000 representative images for each class.” (With some caveats of course.)-Pete Warden
Deep Learning Achievements Over the Past Year
Carve out some time in your holiday schedule to explore 2017's most exciting developments in text, voice, and computer vision technologies.-Stats & Bots
The U.S. Leads in Artificial Intelligence, but for How Long?
Government policies such as the tax bill, reduced funding, and tightening of rules on immigration for international researchers threaten the U.S.’s advantage in AI.-MIT Technology Review
NIPS 2017 — Highlights
If you didn’t attend the conference on Neural Information Processing Systems last week, never fear! Catch up on the latest in AI with these day-by-day summaries.-Insight Data
Improving Palliative Care with Deep Learning
80% of Americans prefer to spend their final days in their home, but only 20% actually do. This 18-layer deep neural network identifies hospitalized patients with a high risk of death in the next 3-12 months, so they can get access to palliative care sooner.-Stanford ML Group
Innovating Faster on Personalization Algorithms at Netflix Using Interleaving
“The interleaving approach allows us to quickly prune down the initial set of ranking algorithms to the most promising candidates, enabling us to conduct experiments a rate much faster than traditional A/B testing to identify winning ideas.”-Netflix Technology Blog
[VIDEO] Livecoding Madness: Let’s Build a Deep Learning Library
This is interesting on two levels: “how to build a deep learning library” and “how someone who’s not me writes Python” (in this case, the answer is: incredibly fast).-Joel Grus
Fairness Measures
Awareness of the bias of algorithms is important, but here’s a way to actually do something about it. Run your dataset through this Python package and you’ll get back a measure that quantifies discrimination within that dataset.-Fairness Measures
The era of easily faked, AI-generated photos is quickly emerging
Nvidia’s researchers trained algorithms on 30,000 images of celebrities, and it’s nearly impossible to tell the generated images from the real ones.-Quartz
Scalable Machine Learning (Part 1)
What do you do when your training dataset fits in memory, but the dataset you're making predictions on doesn't? This post identifies where the usual pandas and scikit-learn workflow for in-memory analytics breaks down and offers some solutions for scaling out to larger problems.-Tom Augspurger
Can Neural Nets Detect Sexual Orientation? A Data Scientist’s Perspective
Dig into the data behind Stanford's controversial paper Deep Neural Networks Can Detect Sexual Orientation From Faces.-fast.ai
My Neural Network isn't working! What should I do?
11 mistakes you may make while implementing a neural network—and how to fix them.-Daniel Holden
Train, Score, Repeat, Watch Out! Zillow's Andrew Martin on modeling pitfalls in a dynamic world.
One of Zillow's data scientists addresses the challenges that don’t crop up in standard textbook problems or most ML competitions: feedback loops, dynamic datasets, and temporal consistency. A great read for Kagglers and non-Kagglers alike.-No Free Hunch
Switching to a Probabilistic Model for Venue Search in Foursquare
How Foursquare’s engineering team improved the accuracy and user experience of their location intelligence by switching from a search ranking algorithm to regression trees and probabilities.-Foursquare Engineering
BuzzFeed News Trained A Computer To Search For Hidden Spy Planes. This Is What We Found.
Learn how BuzzFeed trained a random forest algorithm to spot planes flown by the FBI and DHS.-BuzzFeed
Improving the Realism of Synthetic Images
Producing a large, diverse, and accurate training set for machine learning models is a pricey endeavor. Apple provides a rare behind-the-scenes look at how they cut costs and improved their models by making simulated images look more realistic.-Apple Machine Learning Journal
Technical Debt in Machine Learning
What do feedback loops, correction cascades, and hobo-features have in common? They’re all machine learning anti-patterns that can slowly creep into your infrastructure and create a ticking time bomb.-Towards Data Science
Inside Facebook’s AI Workshop
When Joaquin Candela first started at Facebook, he worked on an ad-targeting algorithm with a handful of engineers. Five years later, he runs the Applied Machine Learning team, which comprises hundreds of employees running thousands of experiments a day. Here’s how he scaled up Facebook’s AI factory at breakneck speed.-Harvard Business Review
Using Machine Learning to Predict Value of Homes On Airbnb
How Airbnb used internal and open-source tools (like Python!) to lower the overall development costs of customer lifetime value (LTV) modeling. Code examples abound.-Airbnb Engineering and Data Science
Human-Centered Machine Learning
For UX folks: A 7-step guide to stay focused on human needs when designing with machine learning.-Google Design
Visualizing High Dimensional Data In Augmented Reality
When you’re trying to understand the relationships in a really big dataset (three-million-grocery-orders big), a 2D scatterplot might not cut it. This immersive 3D visualization technique offers a way to make sense of data with multiple attributes and improve machine learning features and models.-Inside Machine Learning
How HBO’s Silicon Valley built “Not Hotdog” with mobile TensorFlow, Keras & React Native
The use-case may be farcical, but the deep learning and edge computing behind it are very real.-Hacker Noon
Predicting the Success of a Reddit Submission with Deep Learning and Keras
It all comes down to two things: the time of day and a catchy title.-Max Woolf
Vertical AI Startups: Solving Industry-specific Problems by Combining AI and Subject Matter Expertise
“While most of the machine learning talent works in big tech companies, massive and timely problems are lurking in every major industry outside tech.”-Bradford Cross
J.P. Morgan’s massive guide to machine learning and big data jobs in finance
Get the key takeaways from this 280-page report, including essential data analysis packages, hiring tips, and which machine learning techniques to apply to which problems.-efinancialcareers
“Many enterprise ‘AI products’ and ‘machine intelligence’ products built today have limited appeal or impact”
One investor’s self-described “unpopular” opinion-Sarah Guo
Is Your Organization Ready for ML?
Don’t make this mistake: “[M]any organizations rush to hire ML experts without laying the proper foundation to ensure their success, including creating proper database architecture, building out essential data science technology, establishing data governance, and instilling data-driven decision-making throughout the organization.”-RE•WORK
#machinelearningflashcards
Save this hashtag for the moments when you need to jog your memory on some basic concepts.-Chris Albon
Machine Learning for Product Managers
A brilliant, non-technical read for anyone who designs, supports, manages, or plans for products that use machine learning.-Hacker Noon
Distill: An Interactive, Visual Journal for Machine Learning Research
This new online publication is bringing academic journals into the 21st century: “A Distill article… isn’t just a paper. It’s an interactive medium that lets users – 'readers' is no longer sufficient – work directly with machine learning models.”-Y Combinator
Tips & Tricks for Feature Engineering / Applied Machine Learning
One commenter put it best: 'Probably the best feature engineering slides I have found [on] the internet.' Need we say more?-HJ van Veen
Learning about Machine Learning with an Earthquake Example
How well can we predict whether or not someone is prepared for an earthquake?-Simply Statistics
How Fitbit’s data science team scales machine learning
Workout regimens need to be tailored to each individual. Directional correctness isn’t enough. Fitbit’s head of data science shares how his team builds a model for every user to increase motivation and prevent injuries.-Mixpanel
Fake News Challenge
This grassroots effort is inviting teams to harness AI technologies to help human fact checkers identify hoaxes and deliberate misinformation in news stories. The top three teams get a cash prize, so grab a couple of friends and check out the training dataset.-Fake News Challenge
Machine Learning Videos
More of a visual learner? Here’s a repository of recorded talks at machine learning conferences, workshops, seminars, and more.-Dustin Tran
What is artificial intelligence? A three part definition
“As soon as it works, no one calls it AI anymore.”-Simply Statistics
Poesy
You could be a poet, and not know it. Feed the works of your favorite author through this new Python library to generate as many lines of verse as you want.-Anthony Federico
What I Learned Implementing a Classifier from Scratch in Python
With libraries like scikit-learn, it’s easy to run an algorithm on some data and automagically get an answer—without understanding exactly how you arrived there. Prepare to unpack the black box.-Jean-Nicholas Hould
What’s the state of the job market in data science and machine learning?
“Th[e] proliferation of courses, resources, books and startups would hint that machine learning is becoming more and more accessible to the average programmer and that the market is on track to getting saturated quickly. Is this the current trend?”-Hacker News
20 Weird & Wonderful Datasets for Machine Learning
Getting your hands on a robust dataset is the hardest part of machine learning. Finding interesting datasets is tougher still. From UFO sightings to beautiful Flickr photos, you’re sure to find something to train your model.-Oliver Cameron
Deep-Fried Data
Opening your data can lead to unpredictable benefits, but requires being open to unexpected uses of your data.-Idle Words
Deep Learning Isn’t a Dangerous Magic Genie. It’s Just Math
This essay is a godsend for those of us who have trouble understanding or explaining what exactly deep learning is.-WIRED
Boosting Sales With Machine Learning
One developer shares how his team used natural language processing and machine learning in Python to pre-qualify sales leads so reps don’t have to spend hours doing it manually.-Xeneta
Hybrid Intelligence: How Artificial Assistants Work
When humans and machines work together, they accomplish a lot more than either could on their own. This is known as hybrid intelligence—a pretty intimidating term for those unfamiliar with machine learning. Here’s a breakdown.-Clare Corthell
The real prerequisite for machine learning isn’t math, it’s data analysis
Machine learning amateurs, take heart. Proficiency with high-level math may be essential for machine learning theory. But with out-of-the-box tools like R’s gmodels package or Python’s scikit-learn library, you don’t need to know linear algebra or calculus to build a successful predictive model. You do, however, need to know your way around a dataset.-Sharp Sight Labs
How Kalman Filters Work, Part 1
This article unpacks different filtering algorithms in an incredibly intuitive way. It’s a long read, but you’ll come away having learned a ton (did you know that NASA used Kalman filters to help Apollo spacecraft navigate to the moon?).-An Uncommon Lab
Microsoft’s Tay is an Example of Bad Design
Or Why Interaction Design Matters, and so does QA-ing.-Caroline Sinders
Here's How We Prevent The Next Racist Chatbot
Tay.ai is the consequence of poor training-Popular Science
Why Microsoft Accidentally Unleashed a Neo-Nazi Sexbot
It’s not surprising that Microsoft’s chatbot spewed racist invective, but here’s how it could have been avoided.-MIT Technology Review
Explained Visually
This website is an incredible collection of interactive visualizations aimed at making tricky concepts like Markov chains and regression easy to understand. Schedule a few hours to explore this one—you’re gonna need them.-Explained Visually
Lift analysis - A data scientist’s secret weapon
Learn how to spot flaws in machine learning models with lift analysis (and why you should add it to your list of evaluation metrics).-Andy Goldschmidt
We Now Have Algorithms To Predict Police Misconduct
You’ve probably heard of predictive policing, but what about predictive policing for the police? One police department teamed up with researchers to test an algorithm that detects troublesome behavior of officers early on.-FiveThirtyEight
Are Your Predictive Models like Broken Clocks?
How can you ensure you’ve picked the “right model” for a very big and very complex dataset?-Rocket-Powered Data Science
Startups Aim to Exploit a Deep-Learning Skills Gap
What do you do when every company wants to build a deep-learning network, but the experts are in short supply? Launch a product, of course. Some startups have created computer chips and software libraries that let companies accelerate algorithm training without having to hire an experienced team of deep-learning experts.-MIT Technology Review
Georgia Tech Researchers Demonstrate How the Brain Can Handle So Much Data
Random projection is frequently used in machine learning to make sense of big, diverse data. It turns out this method could be one of the ways that humans learn, too.-Georgia Tech
The current state of machine intelligence 2.0
These days, it feels like every other article in our newsfeeds is touting the potential of machine intelligence. This article cuts through the hype and presents this year’s major accomplishments in two categories—“(1) the emergence of autonomous systems in both the physical and virtual world and (2) startups shifting away from building broad technology platforms to focusing on solving specific business problems.”-O'Reilly