NLP with spaCy. Making machine learning actually useful. Adventures in R.
July 6, 2020
Pull request study. Ethics in NLP. Tips for a PhD.
June 29, 2020
The gaps between white and Black America. GitHub actions for data science. Common table expressions.
June 22, 2020
Data as protest. Causal inference. Machine learning in production.
June 15, 2020
Amplify Black voices. Protect Black lives.
June 8, 2020
Bye, Internet Explorer. Exploring missing values. Pixelate to communicate.
June 1, 2020
Ubuntu ethics in AI. Rules-based models. Scalable user privacy.
May 25, 2020
Fuzzy name matching. Solar’s cheap future. Deleting data across microservices.
May 18, 2020
Animal Crossing sentiment analysis. Mode improvements. Tricky predictions.
May 11, 2020
Boosting A/B test power. Shortcuts of Mode power users. Uncertainty visualization.
May 4, 2020
What developers need to know about databases. NYC sidewalk widths. Unprecedented line charts.
April 27, 2020
Machine learning in R. Homeschooling + working.
April 20, 2020
Slack guidelines. Rice measurements. Hadley Wickham interview.
April 13, 2020
R packages for exploratory analysis. Building data science infrastructure. Making model diagrams.
April 6, 2020
Weathering economic headwinds. Building DS infrastructure. Visualizing different things about COVID-19.
March 30, 2020
COVID-19 tracking project. Data inclusivity. Medical lit primer.
March 23, 2020
Responsible coronavirus viz. Going remote. Exponential growth.
March 16, 2020
Twitter for R programmers. Big NLP database. DS research challenge areas.
March 9, 2020
Grubhub’s model deployment tools. Deep learning book series. Gender identity data.
March 2, 2020
How AI businesses are different from software. Data-driven weather forecasting. Recommender model metrics.
February 24, 2020
Understanding uncertainty. Bokeh tutorial. Everyday dev tools.
February 17, 2020
10x data scientists? No thanks. A Timsort history. Flow fields.
February 10, 2020
Farewell, OLAP cube. Data training for the underserved. Reinforcement learning curriculum.
February 3, 2020
Wayfair product ranking. Misconceptions about names. A first grade deep learning model.
January 27, 2020
Data engineering observability. Global power lines. Code skills check-in.
January 20, 2020
4 helpful distribution charts. A data project checklist. Interactive tools for ML.
January 13, 2020
Machine learning for the real world. A fun SQL problem. An R cookbook.
December 30, 2019
State of JS. Getting help in R. Keeping it simple.
December 23, 2019
ACLU’s data transformations. Board games + data viz. User-Agent history.
December 16, 2019
Deep learning in production. AI text adventures. Guido van Rossum.
December 9, 2019
Machine learning systems design. Calculating customer types. Democratic primary.
December 2, 2019
Film flowers. A new R color palette. Machine learning engineering.
November 25, 2019
Working in football analytics. “Biased data.” Churn correlations.
November 18, 2019
A funny dataset. A cleaning package. A new way to measure intelligence.
November 12, 2019
Character encodings. Detecting audio deepfakes. DS archetypes.
November 4, 2019
Boring technology. Metrics problems. Racially biased medical algorithms.
October 28, 2019
PyTorch vs TensorFlow. Country goodness. A neural net rabbit hole.
October 21, 2019
Introducing Helix. Time series databases. Fall foliage.
October 15, 2019
1-line data exploration. Survival analysis. Handling dates.
October 7, 2019
Neural network design. Chart taxonomies. Analyzing Amazon purchases.
September 30, 2019
Red flags in interviews. Data integrity. Fashion algorithm.
September 23, 2019
Lyft’s hyper-accurate maps. Neural net DM. Selling data science.
September 16, 2019
Data sculptures. Load testing. Abstractions.
September 9, 2019
DS salaries. UX of data. Stats 101.
September 3, 2019
Speedy ML. Fitness app data. NPS overnight stays.
August 26, 2019
Mastering Shiny. DeepMind’s losses. Music trends.
August 19, 2019
More time, please. Pandas tricks. Animating data viz.
August 12, 2019
Machine learning... learning. Behavior funnels. DS team models.
August 5, 2019
Vision science. R-Ladies. “De-identification.”
July 29, 2019
Team names. Podcast structures. ML best practices.
July 22, 2019
Free NLP course. Tidy log odds ratio. Colorspace.
July 15, 2019
Practical psychology for DS. Filling in missing music. Deploying models and microservices.
July 8, 2019
BS AI industrial complex. Marketing automation at Lyft. Word2vec.
July 1, 2019
Hadoop’s failure. GANs everywhere. AI adoption.
June 24, 2019
Hadoop’s failure. GANs everywhere. AI adoption.
June 17, 2019
Neural nets name cats. Uncertainty in viz. Misunderstood data engineers.
June 10, 2019
Fashion deepfakes. Research quality data. Instagram analysis.
June 3, 2019
Career advice. Liverpool's analytics. Type stable estimation.
May 27, 2019
Fullstack D3. ML product management lessons. Matrices as tensor network diagrams.
May 20, 2019
Altair. Railyard. profvis.
May 13, 2019
May 6, 2019
Fooling AI surveillance. Skittles math. Data viz on mobile and desktop.
April 29, 2019
DataCamp reflections. dplyr filter. Deadline statistics.
April 22, 2019
Machine learning job hunt. Open GAN questions. The English block on programming.
April 15, 2019
Unintended consequences. Partnership culture. Active learning data labeling.
April 8, 2019
Data viz freelancing. Unpacking adversarial examples. Null reminder.
April 1, 2019
Escaping Excel hell. P-value put down. Harmonizing with Bach.
March 25, 2019
Advice for new data scientists. R surprises. Specialists vs generalists.
March 18, 2019
The art of analytical persuasion. Deep learning & mind reading. Time-to-event data.
March 11, 2019
AI for rats. SQL vs Python for pipelines. NYT's data editor.
March 4, 2019
Visual search engines. Face editing GANs. Tidy Tuesday.
February 25, 2019
Data versioning. Deep learning limitations. Successful career transitions.
February 19, 2019
Podcasts. Data viz criticism. Health risk scores.
February 11, 2019
ML project management. Analytics engineer. Minimally sufficient Pandas.
February 4, 2019
Data science jobs up 29%. rstudio::conf recap. Curriculum roadmap.
January 28, 2019
Why analytics initiatives fail. Model representation in R. Bye, mid-range shot.
January 22, 2019
Uber's new AI. Location data for sale. Data roadmapping.
January 14, 2019
2018 AI trends. Gender bias in Google Images. The power of great analysts.
January 7, 2019
190+ R-stats tasks. KDEs. Selection bias.
January 2, 2019
The saddest Christmas song. Data science vs engineering. Mongo → Postgres.
December 26, 2018
Next-gen GANs. Tidy tutorial. AI art gallery.
December 17, 2018
5 soft skills analysts need. State of deep learning. AI-detected fields and crops.
December 10, 2018
Airflow on your MacBook. Modeling time-lagged conversion rates. Google Earth open data.
December 3, 2018
Public opinion on algorithms. Obscure Tidyverse packages. Gov't data troves.
November 26, 2018
Data dictionaries. New McKinsey AI survey. Tidy Tuesday.
November 19, 2018
Analytics-DevOps harmony. Tidyeval tutorial. Client engagement.
November 13, 2018
3 million election ads. Other data science tools. Montezuma’s Revenge.
November 5, 2018
Satisfaction in ML careers. Making your product freemium. Relearning code.
October 29, 2018
A summer at Airbnb. Deepfakes look too real. Analysts on production.
October 22, 2018
Amazon's secret AI recruiting tool. Data PMs. Altair tutorial.
October 15, 2018
Chromebook data science. Shaky CDC data. An NLP history lesson.
October 8, 2018
Citizen data scientists. ML for coders. Connections across America.
October 1, 2018
Anatomy of an AI system. Reproducing ML projects. 5 public datasets to explore.
September 24, 2018
Strata slides. A DS ethics checklist. Life expectancy in your neighborhood.
September 17, 2018
PhD considerations. SQL vs Python. Machine translators.
September 10, 2018
California's wildfires. Integrating ML into a product. Twitter toxicity.
August 27, 2018
SQL queries for Salesforce. The beauty of annotations. Capturing data evolution.
August 20, 2018
BigQuery table clusters. Unpacking an NLP Twitter thread. First days on the job.
August 13, 2018
3 million troll tweets. The Holy Grail of email. Partitioning variation.
August 6, 2018
IMDb analysis. Differentiable image parameterizations. ACL 2018 highlights.
July 30, 2018
W. E. B. Du Bois' data viz. Machine learning glossary. Speedier R work.
July 23, 2018
What do ML practitioners do? Ditching microservices. Feature-wise transformations.
July 16, 2018
What 85-year-olds are up to. Red flags in interviews. 12 ggplot2 extensions.
July 9, 2018
2018 data viz survey. Data engineering frameworks. ML papers with code.
July 2, 2018
Constrained optimization. Census oddities. Bias-variance tradeoff.
June 25, 2018
ML's future is tiny. One year as a data scientist. Problems with gender classification.
June 18, 2018
Shazam, for Congress. The future of data engineering. Trustworthy data analysis.
June 11, 2018
Better training data. Rethinking academic data sharing. Volcanic history.
June 4, 2018
purrr tutorial. LaCroix color palettes. The challenges of Smart Compose.
May 21, 2018
Strategies for optimizing Python code. The NYC subway crisis. Russian Facebook ads.
May 14, 2018
Attracting top-notch candidates. Data violence. Linear vs log scale.
May 7, 2018
Grubhub's seemingly impossible data problem. Qualitative before quantitative. R package dependencies.
April 30, 2018
Mode Studio. A Shiny app for Fido. Streaming 100 billion analytics events.
April 23, 2018
Why data scientists should take a Hippocratic oath. Machine learning at Condé Nast. New viz tools.
April 16, 2018
Odes to notebooks. Overcoming objections. Lumpers and splitters.
April 9, 2018
Academia → industry. Lessons from video games. TensorFlow sans setup.
April 2, 2018
A massive NCAA data set. GANs + art. The benefits of blameless postmortems.
March 26, 2018
Stack Overflow's developer survey. Docker for deep learning. Data-driven unit testing.
March 19, 2018
SQL → Pandas. Prophet in Mode. Visualizing outliers.
March 12, 2018
Love for the star schema. 8 in-app analytics examples. “Wall time” semantics.
March 5, 2018
Down with pipeline debt. Malicious use of AI.
February 26, 2018
Pythonic cookies. A data viz engineer definition. Manifesto for data practices.
February 19, 2018
The Olympics. ML models you haven't built. Visualizing missing data.
February 12, 2018
Awesome in-app analytics. Another data privacy mishap. DJ Patil rallies the troops.
February 5, 2018
Window functions in Python & SQL. The future of pandas. Free resource roundup.
January 29, 2018
Data engineering for dummies. The mortality rate of JS frameworks. A new DS podcast.
January 22, 2018
Graphics reporter Q&A. Automating front-end. Job hunting postmortem.
January 15, 2018
Selecting a cloud provider. Academia → industry. Early-stage analytics.
January 8, 2018
Junior DS roles. A literal gamechanger. ML technical debt.
January 2, 2018
The next Bechdel Test. PyCon proposal myths. A re:Invent recap.
December 26, 2017
Apache Airflow, explained. The 3rd dimension of customer success analysis. Molecule's custom reports.
December 18, 2017
U.S. AI threatened. NIPS highlights. Median pitfalls.
December 11, 2017
Netflix's A/B test alternative. Predicting palliative care. Building a deep learning library.
December 4, 2017
Ethics in practice. Data meta-metrics. Numerical optimization.
November 27, 2017
Crossword heatmaps. Generative music. Tips for building a diverse data team.
November 20, 2017
The 3-degree world. Causal inference. Changepoint analysis.
November 13, 2017
Halloween episodes. Ethical responsibility. Word cloud designs.
November 6, 2017
4 data mistakes startups make. Interviewing data scientists. A massive font database.
October 30, 2017
NBA analytics. Power calculations. State of data journalism.
October 23, 2017
12 A/B test pitfalls to avoid. Taxis vs cabs vs the subway. Evaluating ETL tools.
October 16, 2017
zulily's data platform. R for journalists. The NYC job search.
October 9, 2017
Debunking studio exec claims. An ETL company's stack. What closes deals.
October 2, 2017
Accelerating GeoPandas. New R community. Making analytics meaningful.
September 25, 2017
Finding the best data jobs. Legos + text mining. Scalable machine learning.
September 18, 2017
10x data scientists. 30 years of hurricanes. Communicating uncertainty.
September 11, 2017
Language gaps. Packaging metrics. A Python cheat sheet.
September 4, 2017
Foursquare's location intelligence. Giving your first data science talk. 10 Python mistakes.
August 28, 2017
Optimizing for Burning Man. Choosing an ETL tool. Scaling with Python.
August 14, 2017
Cargo cult data science. Millions of Intercom messages. Query optimization.
August 7, 2017
Predicting LTV at Airbnb. Technical debt in ML. What's difficult about histograms.
July 31, 2017
Gender representation in comics. Data systems. Designing enterprise tables.
July 24, 2017
Joy plots. How to spot a misleading graph. Marrying UX & ML.
July 17, 2017
Rise of the data PM. Augmented reality viz. New NYC boroughs.
July 10, 2017
Optimizing Reddit submissions. R at Microsoft. Blogging about data.
July 3, 2017
Millions of doodles. 2 years at Stack Overflow. Coding on the go.
June 26, 2017
3 stages of data infrastructure. 29 common Python errors. 200,000 Uber and Lyft trips.
June 19, 2017
Root cause analyses. How histograms work. Analytics at Athos.
June 12, 2017
The MLB's new metric. How to hire a product analyst. The Paris Agreement.
June 5, 2017
Airbnb's Data University. 30 GBs of federal payroll records. The top DS software.
May 29, 2017
Big news from Mode. The Hitchhiker's Guide to d3.js. Detecting overspend in AWS.
May 22, 2017
Duolingo's language learning model. Etsy's development process. Instacart's strategy for building DS teams.
May 15, 2017
Winning marital arguments with R. 3 million Instacart orders. Dashboards that deliver.
May 8, 2017
Spotify's event delivery system. Craft beers and Python. Data viz vs UI.
May 1, 2017
Machine learning flash cards. Teaching SQL. Analytics trends in 2017.
April 24, 2017
Statistics in D3. Proving yourself without a degree. More on interactive viz.
April 17, 2017
Avoiding analytic rabbit holes. The Data Wheel of Death. Rebuilding an analytics stack.
April 10, 2017
ML for product managers. Analytics for startup founders. Scrabble analyses.
April 3, 2017
Group-by from scratch. Corporate data viz. Test-driving Prophet.
March 27, 2017
Switching programming languages. Data hackathons. Is interactive viz done for?
March 20, 2017
A data GIF tutorial. DS on the Silicon Beach. Blind date data.
March 13, 2017
Hiring a data scientist. The future of Airflow. Advice for switching careers.
March 6, 2017
Predicting earthquake preparedness, partisan conflict, and feature engineering.
February 27, 2017
Online DS courses, ranked. Critical data literacy. Unlearning descriptive statistics.
February 20, 2017
Spotting visualization lies. Data humanism. Encoding categorical values.
February 13, 2017
Mode's stance on Trump. ML at Fitbit. The cleanest NYC restaurants.
February 6, 2017
Data science at Stitch Fix. ML videos. A data engineer's manifesto.
January 30, 2017
Redefining “AI.” Behind-the-scenes of sports analytics. Building a master data dictionary.
January 23, 2017
Uber Movement. Q&A w/ Monica Rogati. Visual vocabulary.
January 16, 2017
The NFL and causal inference. Generating poetry with Python. Classifiers from scratch.
January 9, 2017
Mid-career pivots. TV fandoms. Rationality + empathy.
January 2, 2017
Star Wars casualties. The state of the DS job market. CAC calculations.
December 26, 2016
#DataRefuge. A chat with DJ Patil. Analyzing Google trends.
December 19, 2016
Skittles debates. Time series analysis in Python. Ditching vanity metrics.
December 12, 2016
A data detective story. BitTorrent for professors. Seasonality in search engines.
December 5, 2016
Rebuilding trust in analytics, data limitations, and a text analysis tutorial.
November 28, 2016
Data skills we all need, election post mortems, and runner routes.
November 21, 2016
UFO sightings data. 415 viz tools. The science of unpredictability in... science.
November 14, 2016
Why data projects fail. An AI speechwriter. The end of baseball's analytics war.
November 7, 2016
Flash forecasting. The father of soccer analytics. A new viz technique.
October 31, 2016
The problem with North Star metrics, the secret to designing smart products, and the popular vote.
October 24, 2016
The impact of outliers, marathon performance, and why machine learning is like deep frying.
October 17, 2016
Nobel Prize winners, your typical farmers market, and The Simpsons side characters.
October 10, 2016
The year's best data visualizations, fact checking the debate, and movie magic with data viz.
October 3, 2016
Gender roles in Hollywood, stats for soccer fans, and four results from one election poll.
September 26, 2016
Summary analysis, creativity in data viz, and the income increase.
September 19, 2016
Digital economists, swing states, and the art of asking good questions.
September 12, 2016
The pros and cons of urban cycling, rebuilding a Graphics team, and the joys of dot plots.
September 5, 2016
One color palette generator, 8 Python data cleaning libraries, and the fastest men in the world.
August 29, 2016
Visualizing clickbait, counting conundrums, and the problem with the Rio pool.
August 22, 2016
Trump tweets, Olympic data viz, and tips for designing better tables.
August 15, 2016
A Star Trek network viz, ethics for algorithms, and the Olympics.
August 8, 2016
Data viz developments, dodgy statistics, and genomics.
August 1, 2016
Amazon reviews, Bayesian thinking explained visually, and dashboard design.
July 25, 2016
Pop music genealogy, FiveThirtyEight's R workflow, and a series of stunning drought maps.
July 18, 2016
Data mining story arcs, theories of everything, and the history of the infographic.
July 11, 2016
Feature engineering, cartogram challenges, and an analysis Leslie Knope would love.
July 4, 2016
Mode now supports Plotly, data science portfolios, and fantasy football.
June 27, 2016
Escaping Excel hell, real-time dashboards, and the Data Journalism Awards.
June 20, 2016
Pie chart research, a Python cheat sheet, and machine learning for sales.
June 13, 2016
50 years of pop music, hybrid intelligence, and HBR data viz advice.
June 6, 2016
Tighter data security, Foursquare + Uber, and data anonymization best practices.
May 30, 2016
News on the Python front, 24 charting tools, and SF rental prices.
May 23, 2016
Sales data, pandas video tutorials, and data science in healthcare.
May 16, 2016
Kalman filters, pandas tutorials, and why newsrooms should own their data.
May 9, 2016
The power of proprietary data, pirated papers, and a glorious data viz catalog.
May 2, 2016
12 data science methods and 1 big HBO show.
April 25, 2016
Thumbtack's data stack, storytelling at Jawbone, and 15 data viz interviews.
April 18, 2016
FBI spy planes, measuring MRR, and the Hollywood gender gap.
April 11, 2016
Game of Thrones, scaling the data science org, and conversion optimization.
April 4, 2016
Lift analysis, genocidal chatbots, and the plight of pie charts.
March 28, 2016
Moneyball for book publishers, CAC, and the engineer-analyst relationship.
March 21, 2016
Back-end analytics, help center metrics, and predicting police misconduct.
March 14, 2016
Shattering NBA records, 2 million chess games, and statistically significant growth hacking.
March 7, 2016
Punctuation in code, PM employee onboarding advice, and practical data science skills.
February 27, 2016
If Facebook were a pollster, BuzzFeed analytics, and the virtues of keeping things simple.
February 21, 2016
Interstellar cover songs, 10 TED talks, and the presidential primary.
February 14, 2016
Central Africa's dearth of data, alternatives to open data portals, and data viz empathy.
February 6, 2016
Flint failings, research parasites, and Disney princess linguistics.
February 1, 2016
Sabermetrics, DAU, and holiday shopper retention.
January 25, 2016
Edtech analytics, Python prep, and Powerball.
January 18, 2016
Missing ordinals, football analytics, deep-learning chips, and more.
January 11, 2016
'Star Wars,' random projection, inviting dissent, and Nick Felton's final report.
January 4, 2016
A machine intelligence progress report, mesmerizing viz, insightful data science talks, and delivery analytics.
December 21, 2015
Best practices, Google's effect on the 2016 election, climate change, p-values, and data-related stocking stuffers.
December 14, 2015
Smiles, agriculture, Airbnb's data release, and more.
December 7, 2015