Statsmodels
As its name implies, statsmodels is a Python library built specifically for statistics. It is built on top of NumPy, SciPy, and pandas, with matplotlib as an optional dependency for plotting. It provides statistical tests and models, such as regression with full diagnostic summaries and time series models, that go beyond what general-purpose numerical libraries like NumPy and SciPy offer.
Statsmodels tutorials
The tutorials below cover a variety of statsmodels' features.
Linear regression
- A friendly introduction to linear regression (using Python) (Data School)
- Linear Regression with Python (Connor Johnson)
- Using Python statsmodels for OLS linear regression (Mark the Graph)
- Linear Regression (Official statsmodels documentation)
Multiple regression
- Multiple Regression using Statsmodels (DataRobot)
Logistic regression
Time series analysis
- A Simple Time Series Analysis Of The S&P 500 Index (John Wittenauer)
- Time Series Analysis in Python with statsmodels (Wes McKinney, Josef Perktold, and Skipper Seabold)
- Time Series Analysis (Official statsmodels documentation)
Statistical tests
- Regression Diagnostics and Specification Tests (Official statsmodels documentation)
Statsmodels resources
- Chapter 11: Regression of Think Stats (Allen B. Downey) - This chapter covers aspects of multiple and logistic regression in statsmodels. It explains the concepts behind the code, but you'll still need familiarity with basic statistics before diving in.
- The statsmodels section of Cross Validated - A question and answer site for people interested in statistics, machine learning, data analysis, data mining, and data visualization.
- Logistic regression vs. multiple regression (CoolData) - Not Python-related, but it provides a helpful breakdown of the differences between logistic and multiple regression.
- Official statsmodels documentation