A few months ago, Stack Overflow confirmed what many people had been predicting for years and crowned Python king of the software jungle. According to the results of its 2017 survey, Python is currently both the fastest-growing language and the most visited Stack Overflow tag.*
Among data scientists and analysts, too, Python is a top contender for the title of “scripting language of choice.” Kaggle's 2017 survey found that Python is currently the most commonly used tool among data scientists in general (though statisticians still prefer R).
Python's well-developed ecosystem of scientific computing libraries (like NumPy, SciPy, and pandas) lets analysts use Python for solving all kinds of problems. You can pull, clean, and manipulate data in Python, or run statistics, models, and predictions. Python notebooks (available natively in Mode) have drawn lots of fans among analysts since they make literate programming (in which you present code or results surrounded by contextualizing prose) easy.
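To make the pull, clean, and summarize workflow described above concrete, here is a minimal sketch. It deliberately uses only Python's standard library (`csv`, `io`, `statistics`) so it runs anywhere; in practice most analysts would likely reach for pandas instead. The sample data and the names `RAW_CSV`, `load_rows`, `clean_rows`, and `summarize` are hypothetical and exist only for illustration.

```python
import csv
import io
import statistics

# Hypothetical sample data, inlined so the sketch runs without any files.
RAW_CSV = """city,temperature_c
Oslo,4.5
Lima,19.0
Cairo,
Pune,27.5
"""


def load_rows(text):
    """Pull: parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows):
    """Clean: drop rows with a missing temperature, convert the rest to float."""
    cleaned = []
    for row in rows:
        value = row["temperature_c"].strip()
        if value:
            cleaned.append({"city": row["city"], "temperature_c": float(value)})
    return cleaned


def summarize(rows):
    """Run statistics: count, mean, and median of the cleaned temperatures."""
    temps = [row["temperature_c"] for row in rows]
    return {
        "count": len(temps),
        "mean": statistics.mean(temps),
        "median": statistics.median(temps),
    }


rows = clean_rows(load_rows(RAW_CSV))
summary = summarize(rows)
print(summary)
```

The row with the missing value is dropped during cleaning, so the summary covers the three remaining cities. In a notebook, each of these steps would typically live in its own cell, with prose in between explaining the choice.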
Whether you're just getting started with Python for data science, or you're already a pro Pythonista, listening in on conversations among influencers in the community is a great way to stay up to speed. We've compiled a list of the people we follow most closely in the Python data science community. If you want to stay in the know, too, here are 7 people you should follow on Twitter.
- Wes McKinney. Creator of pandas, the most widely used Python library for data analysis. Author of Python for Data Analysis. Senior Vice President and Software Architect at Two Sigma.
Follow Wes: @wesmckinn
- Hilary Mason. Founder of Fast Forward Labs. Data Scientist in Residence at Accel. Founder of HackNY and co-host of DataGotham.
There's nothing like real data to surprise with odd artifacts or strange fields. What's the weirdest thing you've seen in a data set lately?
Hilary Mason (@hmason), October 24, 2017
Follow Hilary: @hmason
- Jake VanderPlas. Author of Python Data Science Handbook. Senior Data Science Fellow and Director of Research at the University of Washington's eScience Institute.
Idea: Jupyter notebooks could have a "reproducibility mode" where:
1) Code cells are read-only once executed
2) New code cells cannot be inserted above previously executed cells
3) No cell can be executed until all previous cells are executed
Jake VanderPlas (@jakevdp), November 27, 2017
Follow Jake: @jakevdp
- Sarah Guido. Co-author of Introduction to Machine Learning with Python. Senior Data Scientist at Mashable. Co-organizer of the New York Python Meetup.
Follow Sarah: @sarah_guido
- Renee Teate. Creator and host of the Becoming a Data Scientist podcast. Creator of the Data Science Learning Club. Data Scientist at HelioCampus.
Highest-rated courses/online course programs on @DataSciGuide so far:
- @DataCamp https://t.co/IJLLwTdmM6
- @dataquestio https://t.co/lyKj9bdQ7Q
- The Analytics Edge https://t.co/AVvDD81XLU
- Harvard CS 109 https://t.co/yYY7NdNBoU
Data Science Renee (@BecomingDataSci), November 17, 2017
Follow Renee: @becomingdatasci
- Lorena Mesa. Co-organizer of PyLadies Chicago and city organizer for Tech Ladies Chicago. Python Software Foundation director. Software engineer on data science at Sprout Social.
Follow Lorena: @looorenanicole
- Randy Olson. Senior Data Scientist at Penn Institute for Biomedical Informatics. Community leader for DataIsBeautiful. Co-organizer of Data Science Philly.
Colaboratory looks like a great new tool from #Google: Now you can run Jupyter Notebooks in a Google Docs-like environment. No local install required.
Unfortunately, only supports #Python 2.7. #MachineLearning #DataScience https://t.co/kIeXwvZkNZ pic.twitter.com/Yz7vfs5XlK
Randy Olson (@randal_olson), November 7, 2017
Follow Randy: @randal_olson
Are there Python data science influencers you think we should add to this list? We'd love to hear who you follow! Reach out to us on the forum, where you can talk to other analysts and data scientists and compare notes.
*This statistic reflects Stack Overflow traffic from high-income nations only.