It’s been over a decade since I last worked with the Python programming language on a day-to-day basis, but it’s pretty clear from the libraries, tutorials, and projects that Python is the de facto standard for data science and machine learning. Here’s how to get up and running, on both Windows and Linux, with the latest toolkits.
Windows (10 Desktop)
- Download and install Anaconda with Python 3.5.
- From the Anaconda shell, run: pip install theano algopy
NOTE: TensorFlow does not currently support Windows.
Linux (Ubuntu 14.04 LTS)
sudo apt-get install python3-setuptools
sudo easy_install3 pip
sudo apt-get install python3.5-dev
sudo apt-get install python3.4-dev
sudo pip3 install theano scikit-learn algopy pandas https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.9.0-cp34-cp34m-linux_x86_64.whl
NOTE: These directions worked in June 2016; expect the versions to be different later on.
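The TensorFlow wheel in the pip3 command is built for one specific interpreter: the cp34 in its filename is the wheel's Python tag, meaning CPython 3.4. Here is a minimal sketch of how you might check a wheel filename against the interpreter you are running before installing. The helper names (wheel_tags, interpreter_tag, wheel_matches) are my own, not part of pip, and the check only covers CPython-specific tags like cp34, not generic ones like py3.

```python
import sys

WHEEL_URL = (
    "https://storage.googleapis.com/tensorflow/linux/cpu/"
    "tensorflow-0.9.0-cp34-cp34m-linux_x86_64.whl"
)


def wheel_tags(filename):
    """Return (python_tag, abi_tag, platform_tag) from a wheel filename or URL."""
    name = filename.rsplit("/", 1)[-1]  # drop any URL or directory prefix
    if not name.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = name[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("unexpected wheel filename: %r" % filename)
    # The last three dash-separated fields are always the tags.
    return tuple(parts[-3:])


def interpreter_tag(version_info=None):
    """Build a CPython tag such as 'cp34' from a version tuple."""
    info = sys.version_info if version_info is None else version_info
    return "cp%d%d" % (info[0], info[1])


def wheel_matches(filename, version_info=None):
    """True if the wheel's Python tag names the given CPython version."""
    python_tag = wheel_tags(filename)[0]
    # A Python tag may list several interpreters separated by dots.
    return interpreter_tag(version_info) in python_tag.split(".")


if __name__ == "__main__":
    print("wheel tags:", wheel_tags(WHEEL_URL))
    print("matches this interpreter:", wheel_matches(WHEEL_URL))
```

If the last line prints False, pip3 is pointing at a different Python version than the wheel was built for, and you need the wheel that matches your interpreter.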
Following these instructions will give you an isolated install of Python 3 with lots of great libraries, ready to go:
- numpy: Numerical Python library offering linear algebra and useful matrix functions.
- scipy: Scientific Python library with many useful numerical routines.
- algopy: Algorithmic Differentiation library for evaluation of higher-order derivatives.
- pandas: Data analysis and manipulation library.
- sklearn: scikit-learn offers machine learning algorithms.
- theano: Theano evaluates mathematical expressions over multi-dimensional arrays, building on numpy, and is widely used for deep learning.
- tensorflow: TensorFlow, from Google, offers numerical computation using data flow graphs.
To test the install, run
import theano, numpy, scipy, algopy, sklearn, pandas, tensorflow
in the Python shell (leave out tensorflow on Windows). No errors means a successful install. I’ll be delving into the use of these tools in future posts.
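If one of the imports fails, the single import line stops at the first error and tells you nothing about the rest. A small script along these lines reports on each library separately. It is only a sketch: check_imports is a name I made up, and it assumes each package exposes a __version__ attribute, falling back to "unknown" when it does not.

```python
import importlib

LIBRARIES = ["theano", "numpy", "scipy", "algopy", "sklearn", "pandas", "tensorflow"]


def check_imports(names):
    """Try to import each module; map name -> (ok, version or error text)."""
    results = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # usually ImportError, but packages can raise others on load
            results[name] = (False, repr(exc))
        else:
            results[name] = (True, getattr(module, "__version__", "unknown"))
    return results


if __name__ == "__main__":
    results = check_imports(LIBRARIES)
    for name in LIBRARIES:
        ok, detail = results[name]
        print("%-12s %-7s %s" % (name, "OK" if ok else "MISSING", detail))
```

On Windows, expect tensorflow to show up as MISSING, since the steps above do not install it there.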