Install and run IPython and Jupyter Notebook in virtualenv on Ubuntu 16.04 (Desktop and Remote Server)

This post introduces how to install IPython and Jupyter Notebook in virtualenv on Ubuntu 16.04 (both local Desktop and remote server.)

Step 0: install virtualenv and setup virtualenv environment

If you have not installed virtualenv yet, you need to do so before proceed.

Check my post for more details about how to setup python virtual environment and why it is better to install python libraries in Python virtual environment.

  • Install pip and Virtualenv for python 2.x and python 3.x:
$ sudo apt-get update
$ sudo apt-get install openjdk-8-jdk git python-dev python3-dev python-numpy python3-numpy build-essential python-pip python3-pip python-virtualenv swig python-wheel libcurl3-dev
  • Create a Virtualenv environment in the directory for python2.x and python 3.x:
#for python 2.x
virtualenv --system-site-packages -p python ~/ipy-jupyter-venv

# for python 3.x 
virtualenv --system-site-packages -p python3 ~/ipy-jupyter-venv3

(Note: To delete a virtual environment, just delete the corresponding folder.  For example, In our cases, it would be rm -rf ipy-jupyter-venv or rm -rf ipy-jupyter-venv3.)

Step 1: Install IPython

Before installing IPython and Jupyter, be sure to activate your python virtual environment first.

# for python 2.x
$ source ~/ipy-jupyter-venv/bin/activate  # If using bash
(ipy-jupyter-venv)$  # Your prompt should change

# for python 3.x
$ source ~/ipy-jupyter-venv3/bin/activate  # If using bash
(ipy-jupyter-venv3)$  # Your prompt should change

Use the following command to install IPython

#for python 2.x

(ipy-jupyter-venv) liping:~$ pip install ipython

#for python 3.x

(ipy-jupyter-venv3) liping:~$ pip3 install ipython

Step 2: Install Jupyter

Use the following command to install Jupyter Notebook

#for python 2.x 

(ipy-jupyter-venv) liping:~$ pip install jupyter

#for python 3.x 

(ipy-jupyter-venv3) liping:~$ pip3 install jupyter

Step 3: Test

#for python 2.x


(ipy-jupyter-venv) liping:~$ which python
/home/liping/ipy-jupyter-venv/bin/python


#for python 3.x
(ipy-jupyter-venv3) liping:~$ which python3

/home/liping/ipy-jupyter-venv3/bin/python3


#for python 2.x
(ipy-jupyter-venv) liping:~$ which ipython

/home/liping/ipy-jupyter-venv/bin/ipython

#for python 3.x
(ipy-jupyter-venv3) liping:~$ which ipython3

/home/liping/ipy-jupyter-venv3/bin/ipython3

#for python 2.x
(ipy-jupyter-venv) liping:~$ which jupyter-notebook

/home/liping/ipy-jupyter-venv/bin/jupyter-notebook

#for python 3.x
(ipy-jupyter-venv3) liping:~$ which jupyter-notebook

/home/liping/ipy-jupyter-venv3/bin/jupyter-notebook

Step 4: Add Kernel

The Jupyter Notebook and other frontends automatically ensure that the IPython kernel is available. However, if you want to use a kernel with a different version of Python, or in a virtualenv or conda environment, you’ll need to install that manually. 

We are using virutalenv, so we need to install IPython kernel in the virtualenv we created in Step 0 above.

(ipy-jupyter-venv) liping:~$  python -m ipykernel install --user --name myipy_jupter_env --display-name "ipy-jupyter-venv"

Installed kernelspec myipy_jupter_env in /home/liping/.local/share/jupyter/kernels/myipy_jupter_env

(ipy-jupyter-venv3) liping:~$  python3 -m ipykernel install --user --name myipy_jupter_env3 --display-name "ipy-jupyter-venv3"

Installed kernelspec myipy_jupter_env3 in /home/liping/.local/share/jupyter/kernels/myipy_jupter_env3

Step 5: Run Jupyter Notebook

#for python2.x

 (ipy-jupyter-venv) liping:~$ jupyter-notebook

#for python 3.x

 (ipy-jupyter-venv3) liping:~$ jupyter-notebook

If you are running Jupyter Notebook on a local Linux computer (not on a remote server), you can simply navigate to localhost:8888 to connect to Jupyter Notebook. If you are running Jupyter Notebook on a remote server, you will need to connect to the server using SSH tunneling as outlined in the Step 5-2 below.

At this point, you can keep the SSH connection open and keep Jupyter Notebook running or can exit the app and re-run it once you set up SSH tunneling. Let’s keep it simple and stop the Jupyter Notebook process. We will run it again once we have SSH tunneling working. To stop the Jupyter Notebook process, press CTRL+C, type Y, and hit ENTER to confirm. The following will be displayed:

[C 08:08:04.232 NotebookApp] Shutdown confirmed

[I 08:08:04.232 NotebookApp] Shutting down 0 kernels

Step 5-2: Connecting to a remote Server Using SSH Tunneling

This step is only for those who are connecting to a Jupyter Notebook installed on a remote server.

Now we will learn how to connect to the Jupyter Notebook web interface using SSH tunneling. Since Jupyter Notebook is running on a specific port on the remote server (such as :8888:8889 etc.), SSH tunneling enables you to connect to the remote server’s port securely.

Below I will describe how to create an SSH tunnel from  a Mac or Linux (Windows users can check step 4 introduced at here). Note the following instructions in this step refer to your local computer (not the remote server).

SSH Tunneling with a Mac or Linux

Open a new terminal window on your Mac or Linux.

Issue the following ssh command to start SSH tunneling:

$ ssh -L 8000:localhost:8888 your_server_username@your_server_ip

Note: The ssh command opens an SSH connection, but -L specifies that the given port on the local (client) host is to be forwarded to the given host and port on the remote server. This means that whatever is running on the second port number (i.e. 8888) on the remote server will appear on the first port number (i.e. 8000) on your local computer. You should change 8888 to the port which Jupyter Notebook is running on. Optionally change port 8000 to one of your choosing (for example, if 8000 is used by another process). Use a port greater or equal to 8000 (ie 80018002, etc.) to avoid using a port already in use by another process. 

If no error shows up after running the ssh -L command, now be sure to activate your python virtual environment first.

# for python 2.x 

$ source ~/ipy-jupyter-venv/bin/activate  # If using bash (ipy-jupyter-venv)$  # Your prompt should change 

# for python 3.x 

$ source ~/ipy-jupyter-venv3/bin/activate  # If using bash (ipy-jupyter-venv3)$  # Your prompt should change

you can run Jupyter Notebook by issuing the following command:

#for python2.x  
(ipy-jupyter-venv) liping:~$ jupyter-notebook 

#for python 3.x  
(ipy-jupyter-venv3) liping:~$ jupyter-notebook

Now, from a web browser on your local machine, open the Jupyter Notebook web interface with http://localhost:8000 (or whatever port number you chose above when you ssh -L into your remote server).

Note: see the following instructions for how to change the startup folder of your Jupyter notebook.

  • In your terminal window
  • Enter the startup folder by typing cd /some_folder_name.
  • Type jupyter notebook to launch the Jupyter Notebook App (it will appear in a new browser window or tab).

Step 6 — Using Jupyter Notebook

By this point you should have Jupyter Notebook running, and you should be connected to it using a web browser. Jupyter Notebook is very powerful and has many features. Below I will outline a few of the basic features to get you started using the notebook. Automatically, Jupyter Notebook will show all of the files and folders in the directory it is run from.

To create a new notebook file, select New > Python 3 or New > ipy-jupyter-venv3 from the top right pull-down menu (Note: this is the so called kernel we installed in Step 4 above):

This will open a notebook. We can now run Python code in the cell or change the cell to markdown. For example, change the first cell to accept Markdown by clicking Cell > Cell Type > Markdown from the top navigation bar. We can now write notes using Markdown and even include equations written in LaTeX by putting them between the $$ symbols. For example, type the following into the cell after changing it to markdown:

# Simple Equation

Let us now implement the following equation:
$$ y = x^2$$

where $x = 2$

To turn the markdown into rich text, press CTRL+ENTER, and the following should be the results:

Note: You can use the markdown cells to make notes and document your code.

Now Let’s implement that simple equation and print the result. Select Insert > Insert Cell Below to insert and cell and enter the following code:

#for python 2.x
x = 2
y = x*x
print y

#for pyton 3.x
x = 2
y = x*x
print (y)

To run the code, press CTRL+ENTER. The following should be the results:

You now can include python libraries and use the notebook as you would with any other Python development environment! 

You should be now able to write reproducible Python code and notes using markdown using Jupyter notebook running on a remote server. To get a quick tour of Jupyter notebook, select Help > User Interface Tour from the top navigation menu.

Happy learning and coding!

Step 7: Deactivate your virtualenv

Each time you would like to use iPython and Jupyter, you need to activate the virtual environment into which it installed, and when you are done using iPython and Jupyter, deactivate the environment.

# for python 2
(ipy-jupyter-venv)$ deactivate
$  # Your prompt should change back

#for python 3
(ipy-jupyter-venv3)$ deactivate
$  # Your prompt should change back

Note: To delete a virtual environment, just delete its folder. (In this case, it would be rm -rf ipy-jupyter-venv or rm -rf ipy-jupyter-venv3.)

References:

Intalling Jupyter in a virtualenv (pdf)

Running iPython cleanly inside a virtualenv (pdf)

Using a virtualenv in an IPython notebook

Installing the IPython kernel

How To Set Up a Jupyter Notebook to Run IPython on Ubuntu 16.04 (pdf)

 Running the Jupyter Notebook (Change Jupyter Notebook startup folder (OS X))

 

 

 

The Data Science Venn Diagram by Drew Conway

What is Data Science?

Data Science is a surprisingly hard definition to nail down, especially given the fact that how ubiquitous the term has become.

Vocal critics have variously dismissed the term as a superfluous label (after all, what science doesn’t involve data?)But, these critiques miss something important.

Data science, is perhaps the best label we have for the cross-disciplinary set of skills that are becoming increasingly important in many applications across industry and academia. This cross-disciplinary piece is the key.

In VanderPlas’s opinion, the best existing definition of data science is illustrated by Drew Conway’s Data Science Venn Diagram (see the figure below), first published on Drew Conway’s blog in September 2010.

The Data Science Venn Diagram above captures the essence of what people mean when they say “data science”:

it is fundamentally an interdisciplinary subject. Data science comprises three distinct and overlapping areas:

the skills of a statistician who knows how to model and summarize (big) datasets;

the skills of a computer scientist who can design and use algorithms to efficiently store, process, and visualize this data; and

the domain expertise — what we might think of as “classical” training in a subject — necessary both to formulate the right questions and to put their answers in context.

With this in mind, it would be better to think of data science not as a new domain of knowledge to learn, but as a new set of skills that you can apply within your current area of expertise.

(If you want to get started with your data science journey and apply it in your area of expertise, check out this page for some useful resources that I have collected for you.)

 

References and Further Reading List: