24 June 2018

Machine learning deserves to be treated as a field separate from software engineering: its focus is different, and its importance is growing rapidly. I’m not just jumping on the bandwagon either; I started on ML projects and coursework around 10 years ago.

For exploration and learning, Python is probably one of the best environments to work in: many of the most important ML libraries are available for it, and it is highly popular in the ML community.

That being said, working with Python on macOS takes a little extra setup if you want a clean and manageable environment. The key idea is to avoid touching the system Python that ships with macOS. It should be left alone for use by the OS, and it can change during OS upgrades.

Instead, I recommend pyenv and pyenv-virtualenv for creating separate Python environments that can be reserved for machine learning and other purposes.

The reason is that once you set up your ML libraries, you don’t want them or their dependencies to change when you need Python for something else. It also lets you keep separate versions of Python for separate purposes. Python 2.7.x is still needed for tasks like building Chromium.

I recommend pyenv-virtualenv in addition to pyenv because it creates separate environments under the same version of Python, whereas pyenv by itself is for installing different versions of Python.

Please see the latest installation docs for each tool on its GitHub page.

On macOS, you will want to have Python installed as a framework to use features like integrated plotting with matplotlib. This can be done by setting an environment variable during the installation of your desired version.

$ PYTHON_CONFIGURE_OPTS="--enable-framework" pyenv install 3.6.5
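
To confirm that the interpreter you just built really is a framework build, one option is to ask Python itself. The following is a minimal sketch, assuming CPython's `PYTHONFRAMEWORK` build variable is non-empty for framework builds and empty or missing otherwise. The helper name `is_framework_build` and its `config_value` parameter are my own, added only for illustration.

```python
import sysconfig


def is_framework_build(config_value=None):
    """Return True if this looks like a macOS framework build of CPython.

    `config_value` is a hypothetical override so the logic can be tried
    without a framework build; by default it is read from sysconfig.
    """
    if config_value is None:
        # Typically "Python" for --enable-framework builds,
        # and an empty string (or None) otherwise.
        config_value = sysconfig.get_config_var("PYTHONFRAMEWORK")
    return bool(config_value)


if __name__ == "__main__":
    print("framework build:", is_framework_build())
```

Run it with the interpreter that pyenv installed; a result of False suggests the configure option was not picked up during the build.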

The framework option is only needed for the base version; it is unnecessary for virtual environments subsequently created with a command like:

$ pyenv virtualenv 3.6.5 python-3-for-ml

Then, switching into a virtual environment can be done with pyenv alone. For example:

$ pyenv shell python-3-for-ml
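
After switching, it can be reassuring to check from inside Python that the expected environment is the one running. This is a small sketch rather than anything pyenv provides: the function name `active_environment` is my own, and it assumes the environment was created by either venv or virtualenv, which mark themselves through `sys.base_prefix` or the older `sys.real_prefix`.

```python
import sys


def active_environment():
    """Describe the interpreter that is currently running."""
    # venv-style environments point base_prefix at the parent installation;
    # older virtualenv releases set real_prefix instead.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "in_virtualenv": base != sys.prefix,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
    }


if __name__ == "__main__":
    for key, value in active_environment().items():
        print(key, "=", value)
```

If the printed prefix points at the system Python rather than a directory under your pyenv root, the shell is probably not using the environment you intended.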

You can always see what is installed using the versions command. On my system, I have something like the following:

$ pyenv versions
* miniconda3-latest (set by PYENV_VERSION environment variable)

When using pip for package management, installed packages can be listed with pip list.

For example:

$ pip list
Package            Version
------------------ --------
appnope            0.1.0
asn1crypto         0.24.0
attrs              19.1.0
backcall           0.1.0
bleach             3.1.0
botocore           1.12.86
certifi            2019.3.9
cffi               1.11.5
chardet            3.0.4
colorama           0.3.9
conda              4.6.14
cryptography       2.6.1
cycler             0.10.0
dbgp               1.0
decorator          4.4.0
defusedxml         0.6.0
docutils           0.14
entrypoints        0.3
idna               2.7
ipykernel          5.1.0
ipython            7.5.0
ipython-genutils   0.2.0
ipywidgets         7.4.2
jedi               0.13.3
Jinja2             2.10.1
jmespath           0.9.3
jsonschema         3.0.1
jupyter            1.0.0
jupyter-client     5.2.4
jupyter-console    6.0.0
jupyter-core       4.4.0
kiwisolver         1.0.1
MarkupSafe         1.1.1
matplotlib         3.0.2
mistune            0.8.4
nbconvert          5.5.0
nbformat           4.4.0
notebook           5.7.8
numpy              1.15.4
pandas             0.23.4
pandocfilters      1.4.2
parso              0.4.0
pexpect            4.7.0
pickleshare        0.7.5
pip                19.1.1
prometheus-client  0.6.0
prompt-toolkit     2.0.9
ptyprocess         0.6.0
pyasn1             0.4.5
pycosat            0.6.3
pycparser          2.18
Pygments           2.4.0
pyOpenSSL          18.0.0
pyparsing          2.3.0
pyrsistent         0.14.11
PySocks            1.6.8
python-dateutil    2.8.0
pytz               2018.7
PyYAML             3.13
pyzmq              18.0.0
qtconsole          4.4.4
requests           2.19.1
rsa                3.4.2
ruamel-yaml        0.15.46
s3transfer         0.1.13
scikit-learn       0.20.0
scipy              1.1.0
Send2Trash         1.5.0
setuptools         40.2.0
six                1.11.0
sklearn            0.0
terminado          0.8.2
testpath           0.4.2
tornado            6.0.2
tqdm               4.28.1
traitlets          4.3.2
urllib3            1.23
virtualenv         16.1.0
wcwidth            0.1.7
webencodings       0.5.1
wheel              0.31.1
widgetsnbextension 3.4.2
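
Since the point of a dedicated environment is that the ML libraries and their dependencies don't change underneath you, a quick check for drift can be handy. The sketch below assumes Python 3.8 or later, where `importlib.metadata` is part of the standard library (it is not available on 3.6.5). The helper names `installed_versions` and `find_drift` are hypothetical, and the pinned versions are simply copied from the listing above.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_versions(names):
    """Map each distribution name to its installed version, or None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def find_drift(expected, installed):
    """Return {name: (expected, installed)} for every mismatch."""
    return {
        name: (want, installed.get(name))
        for name, want in expected.items()
        if installed.get(name) != want
    }


if __name__ == "__main__":
    # Pins copied from the listing above; adjust to your own environment.
    pins = {"numpy": "1.15.4", "pandas": "0.23.4", "scikit-learn": "0.20.0"}
    for name, (want, have) in find_drift(pins, installed_versions(pins)).items():
        print(f"{name}: expected {want}, found {have}")
```

No output means every pinned package matches; a package that is not installed at all is reported with a found value of None.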
