Install gensim and nltk into Virtualenv

This post introduces how to install gensim and nltk into a virtualenv. It is always a good strategy to install some package(s)/library(s) you often use (together) into a separate virtualenv, so it will not be interrupted by other libraries (because different libraries may depend on different versions of another libraries).

$ sudo apt-get install python-pip python-dev python-virtualenv
  • Create a Virtualenv environment in the directory for python  ~/gensim-venv:

Note that you can setup your virtualenv in any directory you want — just replace the ~/gensim-venv with your preferred directory path. But be sure to change it accordingly when you follow the rest of the tutorial.

$ virtualenv --system-site-packages -p python ~/gensim-venv # for python 

$ virtualenv --system-site-packages -p python3 ~/gensim-venv3 # for python 3, it is better to add "venv3" when naming your virtualenv, so you know it is for python3

The --system-site-packages Option

If you build with virtualenv --system-site-packages ENV, your virtual environment will inherit packages from /usr/lib/python2.7/site-packages(or wherever your global site-packages directory is).

This can be used if you have control over the global site-packages directory, and you want to depend on the packages there. If you want isolation from the global system, do not use this flag (and note that if you do not use the “system-site-packages” flag, you NEED to install the version of the python you need in the virtualenv that you just created)

  • Activate the virtual environment:
$ source ~/gensim-venv/bin/activate  # If using bash
(gensim-venv)$  # Your prompt should change
  • Install gensim in the virtualenv for python:
pip install --upgrade gensim
  • After the install you will activate the Virtualenv environment each time you want to use gensim.
  • With the Virtualenv environment activated, you can now test your gensim installation.
  •  When you are done using gensim, deactivate the environment.
(gensim-venv)$ deactivate

$  # Your prompt should change back

To use gensim later you will have to activate the Virtualenv environment again:

$ source ~/gensim-venv/bin/activate  # If using bash.

(gensim-venv)$  # Your prompt should change.
# Run Python programs that use gensim.
...
# When you are done using gensim, deactivate the environment.
(gensim-venv)$ deactivate
  • To delete a virtual environment, just delete its folder. (In this case, it would be rm -rf gensim-venv.)
  • To install nltk into the virtualenv gensim-venv, just issue this command after you activated the gensim-venv
(gensim-venv)$ pip install -U nltk

Note: you may notice that those commands issued inside a virtualenv (i.e., after you create a virtualenv and activated it), you do not need to add sudo before the commands. Because all those commands will all be confined within that virtualenv folder you created – that is why  it will not affect other packages installed in other virtualenv or system-wide, and it will not be affected by other packages installed in another virtualenv(s). You can create as many as virtualenv you want, so please naming your virtualenv with meaning name, otherwise, over time, we will forget what was actually being installed in some virtualenv:)