I treat machine learning as a field separate from software engineering because it has a different focus and is rapidly growing in importance. I’m not just jumping on the bandwagon, either: I started with ML projects and coursework around 10 years ago.
For exploration and learning, Python is probably one of the best environments to work in, because many of the most important ML libraries are available for it and it is highly popular in the ML community.
That being said, working with Python on macOS takes a little extra setup if you want to have a clean and manageable environment. The key idea is to avoid messing with the system Python that comes with macOS. That one should be left untouched for use by the OS because it can be changed during OS upgrades.
Instead, I recommend pyenv-virtualenv for creating separate Python environments that can be reserved for machine learning and other purposes.
The reason is that once you set up your ML libraries, you don’t want them or their dependencies to change if you need to use Python for something else. It also lets you keep separate versions of Python for separate purposes. For example, Python 2.7.x is still needed for tasks like building Chromium.
I recommend pyenv-virtualenv in addition to pyenv because it allows the creation of separate environments under the same version of Python, whereas pyenv by itself is used for installing different versions of Python.
Please see the latest installation docs for each tool, linked below, on their respective GitHub pages.
On macOS, you will want to have Python installed as a framework to use features like integrated plotting with matplotlib. This can be done by setting an environment variable during the installation of your desired version.
$ PYTHON_CONFIGURE_OPTS="--enable-framework" pyenv install 3.6.5
This is only needed for the base version and is unnecessary for subsequent virtual environments created with a command like
$ pyenv virtualenv 3.6.5 python-3-for-ml
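To confirm that the base interpreter really was built as a framework, one option is to ask Python itself. The following is a minimal sketch rather than an official check: `framework_name` is a hypothetical helper name, and it assumes a CPython build, which records the framework name in the `PYTHONFRAMEWORK` build variable. Because a virtual environment reuses the base interpreter's build, it should report the same result from inside the environment.

```python
import sysconfig


def framework_name():
    """Return the framework name of the running interpreter, or None.

    CPython stores the framework name in the PYTHONFRAMEWORK build
    variable; it is empty (or missing) for non-framework builds.
    """
    return sysconfig.get_config_var("PYTHONFRAMEWORK") or None


if __name__ == "__main__":
    name = framework_name()
    if name:
        print("Framework build:", name)
    else:
        print("Not a framework build")
```

If this reports a non-framework build, the base version was most likely installed without the `--enable-framework` option and would need to be reinstalled.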
Then, switching into a virtual environment can be done with pyenv alone. For example:
$ pyenv shell python-3-for-ml
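After switching, it can be reassuring to verify which interpreter the shell actually picked up. The sketch below is one way to do that from Python; the helper names `in_virtualenv` and `describe_interpreter` are made up for illustration. It assumes the environment was created either with the standard `venv` module (which sets `sys.base_prefix`) or with an older `virtualenv` release (which sets `sys.real_prefix`).

```python
import os
import sys


def in_virtualenv():
    """Return True if the running interpreter belongs to a virtual environment."""
    # venv exposes the base install as sys.base_prefix; older virtualenv
    # releases use sys.real_prefix instead.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix


def describe_interpreter():
    """Collect a few facts that identify the active Python."""
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "in_virtualenv": in_virtualenv(),
        # Set in the current shell by `pyenv shell`; None if it is not set.
        "pyenv_version": os.environ.get("PYENV_VERSION"),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Inside the environment from the example, the executable path would be expected to sit under the pyenv directory rather than under a system location.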
You can always see what is installed using the versions argument. On my system, I have something like the following:
$ pyenv versions
  system
  2.7.15
  3.6.5
  3.6.5/envs/python-3-for-ml
  3.7.1
* miniconda3-latest (set by PYENV_VERSION environment variable)
  python-3-for-ml
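The listing mixes base installs with virtual environments, which appear both as `<version>/envs/<name>` and under their short name. As a rough illustration of that distinction, here is a small sketch that separates the two. The function names are hypothetical, and the parsing assumes only the output format shown in this post; other pyenv releases may print extra details that it does not handle.

```python
def parse_pyenv_versions(text):
    """Parse `pyenv versions` output into a list of (name, is_active) pairs."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        is_active = line.startswith("*")
        # Drop the "*" marker, then any trailing "(set by ...)" note.
        name = line.lstrip("*").strip().split(" (set by", 1)[0]
        entries.append((name, is_active))
    return entries


def split_base_and_envs(entries):
    """Separate base installs from virtualenvs listed as <version>/envs/<name>."""
    env_names = {name.split("/envs/", 1)[1] for name, _ in entries if "/envs/" in name}
    base = [n for n, _ in entries if "/envs/" not in n and n not in env_names]
    return base, sorted(env_names)


if __name__ == "__main__":
    sample = """\
  system
  3.6.5
  3.6.5/envs/python-3-for-ml
* python-3-for-ml (set by PYENV_VERSION environment variable)
"""
    base, envs = split_base_and_envs(parse_pyenv_versions(sample))
    print("base versions:", base)
    print("virtualenvs:", envs)
```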
With pip for package management, installed packages can be listed with
$ pip list
Package            Version
------------------ --------
appnope            0.1.0
asn1crypto         0.24.0
attrs              19.1.0
backcall           0.1.0
bleach             3.1.0
botocore           1.12.86
certifi            2019.3.9
cffi               1.11.5
chardet            3.0.4
colorama           0.3.9
conda              4.6.14
cryptography       2.6.1
cycler             0.10.0
dbgp               1.0
decorator          4.4.0
defusedxml         0.6.0
docutils           0.14
entrypoints        0.3
idna               2.7
ipykernel          5.1.0
ipython            7.5.0
ipython-genutils   0.2.0
ipywidgets         7.4.2
jedi               0.13.3
Jinja2             2.10.1
jmespath           0.9.3
jsonschema         3.0.1
jupyter            1.0.0
jupyter-client     5.2.4
jupyter-console    6.0.0
jupyter-core       4.4.0
kiwisolver         1.0.1
MarkupSafe         1.1.1
matplotlib         3.0.2
mistune            0.8.4
nbconvert          5.5.0
nbformat           4.4.0
notebook           5.7.8
numpy              1.15.4
pandas             0.23.4
pandocfilters      1.4.2
parso              0.4.0
pexpect            4.7.0
pickleshare        0.7.5
pip                19.1.1
prometheus-client  0.6.0
prompt-toolkit     2.0.9
ptyprocess         0.6.0
pyasn1             0.4.5
pycosat            0.6.3
pycparser          2.18
Pygments           2.4.0
pyOpenSSL          18.0.0
pyparsing          2.3.0
pyrsistent         0.14.11
PySocks            1.6.8
python-dateutil    2.8.0
pytz               2018.7
PyYAML             3.13
pyzmq              18.0.0
qtconsole          4.4.4
requests           2.19.1
rsa                3.4.2
ruamel-yaml        0.15.46
s3transfer         0.1.13
scikit-learn       0.20.0
scipy              1.1.0
Send2Trash         1.5.0
setuptools         40.2.0
six                1.11.0
sklearn            0.0
terminado          0.8.2
testpath           0.4.2
tornado            6.0.2
tqdm               4.28.1
traitlets          4.3.2
urllib3            1.23
virtualenv         16.1.0
wcwidth            0.1.7
webencodings       0.5.1
wheel              0.31.1
widgetsnbextension 3.4.2
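When only a handful of ML packages matter, the same information can be queried from inside Python instead of scanning the full list. This is a sketch under one notable assumption: it uses `importlib.metadata`, which is in the standard library only from Python 3.8, so it would not run on the 3.6.5 interpreter used in this post. The helper name `installed_versions` and the package selection are illustrative.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    core = ["numpy", "scipy", "pandas", "matplotlib", "scikit-learn"]
    for name, version in installed_versions(core).items():
        print(f"{name:<14}{version or 'not installed'}")
```

Running this in each environment gives a quick way to check that the ML environment still has the versions you set it up with.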