Advanced Installation

This document details installation philosophies and approaches for maintaining reproducible packages.

Philosophy on Reproducibility

Problem: The same package can behave differently based on:

  1. the dependency environment and versions
  2. third-party updates to dependencies
  3. operating system (OS)

The first and second items are addressed with pinned dependencies, one method for building repeatable packages. The third item is addressed through continuous integration, specifically with Travis and Appveyor (for Linux and Windows systems, respectively). We discuss proposals for the first two items only.

Proposal: We endorse freezing dependencies at the start and end of the development release cycle.

  • Start: freeze the conda environment in an environment.yaml file
  • End: freeze all dependencies in dev_requirements.txt and critical dependencies in requirements.txt
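
The end-of-cycle freeze can be sketched in a few lines of Python. This is a minimal illustration, not LamAna's actual tooling: the function name split_frozen and the choice of numpy, pandas and matplotlib as the "critical" packages are assumptions, and the frozen lines would typically come from the output of pip freeze.

```python
def split_frozen(frozen_lines, critical_names):
    """Return (dev_pins, critical_pins) from `pip freeze`-style lines.

    dev_pins keeps every pinned line; critical_pins keeps only the
    packages named in critical_names (compared case-insensitively).
    """
    critical = {name.lower() for name in critical_names}
    dev_pins, critical_pins = [], []
    for line in frozen_lines:
        line = line.strip()
        # Skip blanks, comments and anything that is not an exact pin.
        if not line or line.startswith("#") or "==" not in line:
            continue
        dev_pins.append(line)
        name = line.split("==", 1)[0].strip().lower()
        if name in critical:
            critical_pins.append(line)
    return dev_pins, critical_pins


# Hypothetical frozen output; the nose version is illustrative only.
frozen = ["matplotlib==1.5.1", "nose==1.3.7", "numpy==1.10.2", "pandas==0.17.1"]
dev, critical = split_frozen(frozen, ["numpy", "pandas", "matplotlib"])
print(critical)
```

Here dev would be written to dev_requirements.txt and critical to requirements.txt, one pin per line.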

How Reproducibility Is Approached in LamAna

If packages work flawlessly, reproducible environments are generally not necessary for successful package use. Reproducible environments do become important when dependencies conflict with the package due to buggy patches or API changes in sub-dependencies. LamAna supports either a “hands-off” or “hands-on” approach to versioning dependencies.

Hands-off Approach

By default, LamAna (like most Python packages) assumes that dependencies are maintained with few API changes that break existing code. Some caveats remain: for example, sub-dependencies may require non-Python extensions, such as C/C++ compilers, to build correctly. If so, warnings are commonly issued to the user. With this in mind, users can simply run:

$ pip install lamana

This command searches for dependencies in the install_requires key of the setup.py file. Dependencies are intentionally unpinned here, which means a user will download the latest version of every dependency listed.
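
For illustration, an unpinned install_requires list might look like the sketch below. The dependency names are assumed from the major dependencies mentioned later in this document, so the actual setup.py may list others. The helper is_pinned is hypothetical and only shows what "unpinned" means here.

```python
# Hypothetical excerpt of the keyword arguments passed to setuptools.setup().
SETUP_KWARGS = {
    "name": "lamana",
    # No version specifiers: pip picks the latest release it can install.
    "install_requires": ["numpy", "pandas", "matplotlib"],
}


def is_pinned(requirement):
    """Return True if the requirement string fixes an exact version."""
    return "==" in requirement


unpinned = [r for r in SETUP_KWARGS["install_requires"] if not is_pinned(r)]
print(unpinned)
```

A pinned entry, by contrast, would read like numpy==1.10.2 and belongs in a requirements file rather than in setup.py under this approach.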

Hands-on Approach

In the case where a dependency change breaks the package, the user is empowered to recreate the dependency environment in which the release was originally developed and known to work. The recreated environment installs pinned dependencies from a frozen requirements.txt file. This file represents the last known list of dependencies that work correctly with the package.

$ pip install -r </path/to/requirements.txt>

$ pip install lamana                        # source

Locating this file is not hard. Each release is shipped with a requirements.txt file. The file simply needs to be downloaded from the archive of the correct version of lamana, hosted at GitHub releases or found by searching PyPI. Extract the file to your computer and run the commands above.
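
The extraction step can also be scripted. The following is a minimal sketch, assuming the release was downloaded as a .zip archive and that requirements.txt sits somewhere inside it (typically under a top-level folder); the function name extract_requirements is hypothetical.

```python
import os
import zipfile


def extract_requirements(archive_path, dest_dir):
    """Copy requirements.txt out of a release .zip into dest_dir.

    Returns the path of the extracted file, or None if the archive
    does not contain one.
    """
    with zipfile.ZipFile(archive_path) as archive:
        for member in archive.namelist():
            # Match on the file name only, ignoring any leading folders.
            if os.path.basename(member) == "requirements.txt":
                target = os.path.join(dest_dir, "requirements.txt")
                with open(target, "wb") as out:
                    out.write(archive.read(member))
                return target
    return None
```

The returned path can then be passed to pip install -r as shown above.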

Note that installing pinned dependencies will change the current environment by upgrading or, more likely, downgrading existing packages to the versions assigned in the requirements file. A development environment is recommended for testing installations.
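
To preview how a set of pins would alter an environment before installing, a comparison along these lines may help. It is a sketch under stated assumptions: the function names are hypothetical, the "installed" mapping is made up for the example, and versions are assumed to be purely numeric (release candidates and similar suffixes would need a proper version parser).

```python
def parse_version(text):
    """Turn '1.10.2' into (1, 10, 2) for simple numeric comparison."""
    return tuple(int(part) for part in text.split("."))


def plan_changes(pins, installed):
    """Describe what installing the pinned versions would do.

    pins and installed both map package name -> version string.
    """
    plan = {}
    for name, pinned in pins.items():
        current = installed.get(name)
        if current is None:
            plan[name] = "install"
        elif parse_version(current) > parse_version(pinned):
            plan[name] = "downgrade"
        elif parse_version(current) < parse_version(pinned):
            plan[name] = "upgrade"
        else:
            plan[name] = "unchanged"
    return plan


pins = {"numpy": "1.10.2", "pandas": "0.17.1", "matplotlib": "1.5.1"}
installed = {"numpy": "1.11.0", "pandas": "0.17.1"}  # hypothetical environment
print(plan_changes(pins, installed))
```

In this made-up environment, numpy would be downgraded, pandas left alone and matplotlib newly installed.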

Installing from wheels (optional)

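Sometimes installing from source is slow. You can force the latter installation method to install from faster binaries (wheels). Note that the --use-wheel flag applies to older versions of pip; recent versions prefer wheels by default and no longer accept this flag.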

$ pip install lamana --use-wheel            # binary

Creating a developer environment with conda

The previous methods can be very slow, especially when installing dependencies that rely on C extensions (numpy, pandas). Anaconda serves as the most consistent option for building dependencies and sub-dependencies. Here is a supporting rationale for using conda in Travis. The following creates a fresh conda environment with critical dependencies that trigger installation of the sub-dependencies required for LamAna.

$ git clone -b <branch name> https://github.com/par2/lamana
$ conda create -n <testenv name> pip nose numpy matplotlib pandas
$ source activate <testenv name>       # exclude source for Windows
$ pip install -r dev_requirements.txt
$ pip install .                        # within lamana directory

The first command downloads the repo from a specific branch using git. The second command creates a reproducible virtual environment using conda, in which isolated versions of pip and nose are installed. The latest versions of the specified dependencies are downloaded into this environment; they bring a necessary backend of sub-dependencies that are difficult to install manually. The third command activates the environment. Once the conda environment is set up, pip downgrades the existing versions to the pinned versions found in the dev_requirements.txt file. Finally, the package is installed, mimicking the original release environment.

This installation method should work fine. To check, the following command should run without errors:

$ nosetests

Now you should be able to run the included Jupyter notebook demos.

Installing dependencies from source

In the absence of Anaconda, installing the three major dependencies from source (numpy, pandas and matplotlib) can be tedious and arduous. Here are some reasons and tips 1, 2 for installing dependencies if they are not set up on your system.

On Debian-based systems, install the following pre-requisites.

$ apt-get install build-essential python3-dev

On Windows systems, be certain to install the appropriate Visual Studio C-compilers.

Note

Installing dependencies on Windows can be troublesome. See the installation guide for matplotlib. Try this or this for issues installing matplotlib. Future developments will work towards OS agnosticism with continuous integration on Linux, OS X and Windows using Travis and Appveyor.

Important

If issues still arise, ensure the following requisites are satisfied:

  • the conda environment is properly set up with dependencies and compiled sub-dependencies e.g. C-extensions (see above)
  • the appropriate compiler libraries are installed on your specific OS, i.e. gcc for Linux, Visual Studio for Windows. With conda, this should not be necessary.
  • sufficient memory is available to compile C-extensions, e.g. 0.5-1 GB minimum
  • the appropriate LamAna version, a compatible Python version and dependency versions are installed according to requirements.txt (see the Dependencies chart)

Dependencies

The following table shows tested builds compatible with LamAna:

lamana | python | dependency | OS
0.4.8 | 2.7.6, 2.7.10, 3.3, 3.4, 3.5, 3.5.1 | numpy==1.10.1, pandas==0.16.2, matplotlib==1.5.0 | linux, local win(?)
0.4.9 | 2.7, 3.3, 3.4, 3.5, 3.5.1 | conda==3.19.0, numpy==1.10.1, pandas==0.16.2, matplotlib==1.4.3 | linux, win(?)
0.4.10 | 2.7, 2.7.11, 3.3, 3.4, 3.5, 3.5.1 | conda==3.19.0, numpy==1.10.2, pandas==0.17.1, matplotlib==1.5.1 | linux
0.4.10 | 2.7 (x32, x64), 3.4 (x32), 3.5 (x32, x64) | conda==3.19.0, numpy==1.10.2, pandas==0.17.1, matplotlib==1.5.1 | win
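
The chart can also be kept as data, which makes it easy to regenerate pinned requirement lines for a given release. The sketch below is illustrative only: the names TESTED_PINS and requirement_lines are hypothetical, the versions are copied from the chart, and conda is left out of the generated lines on the assumption that it is set up separately rather than through a requirements file.

```python
# Pinned versions per LamAna release, copied from the chart above.
TESTED_PINS = {
    "0.4.8": {"numpy": "1.10.1", "pandas": "0.16.2", "matplotlib": "1.5.0"},
    "0.4.9": {"conda": "3.19.0", "numpy": "1.10.1",
              "pandas": "0.16.2", "matplotlib": "1.4.3"},
    "0.4.10": {"conda": "3.19.0", "numpy": "1.10.2",
               "pandas": "0.17.1", "matplotlib": "1.5.1"},
}


def requirement_lines(lamana_version):
    """Return sorted 'name==version' lines for one tested release."""
    pins = TESTED_PINS[lamana_version]
    return [
        "{}=={}".format(name, version)
        for name, version in sorted(pins.items())
        if name != "conda"  # assumed to be installed outside of pip
    ]


print("\n".join(requirement_lines("0.4.10")))
```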