Dynamic GPU usage monitoring (CUDA)

To dynamically  monitor NVIDIA GPU usage, here I introduce two methods:

method 1: use nvidia-smi

in your terminal, issue the following command:

$ watch -n 1 nvidia-smi

It will continually update the gpu usage info (every second, you can change the 1 to 2 or the time interval you want the usage info to be updated).

method 2: use the open source monitoring program glances with its GPU monitoring plugin

in your terminal, issue the following command to install glances with its GPU monitoring plugin

$ sudo pip install glances[gpu]

to launch it, in your terminal, issue the following command:

 $ sudo glances

Then you should see your GPU usage etc. It also monitors the CPU, disk IO, disk space, network, and a few other things

For more commonly used Linux commands, check my other posts at here  and here .

Install and use htop on Ubuntu 16.04 Desktop and Server

This post introduces an interactive tool for visually monitoring the memory and process usages of your Ubuntu 16.04 Desktop or Server in real time.

  • What is Htop?

Htop is an interactive system-monitor process-viewer and process-manager. It is designed as an alternative to the Unix program top. It shows a frequently updated list of the processes running on a computer, normally ordered by the amount of CPU usage. Unlike top, htop provides a full list of processes running, instead of the top resource-consuming processes. Htop uses color and gives visual information about processor, swap and memory status.

  • Install Htop on Ubuntu 16.04 LTS Desktop and Server

(This works on both an Ubuntu 16.04 Desktop and  Server.)

Installing htop package on Ubuntu 16.04 (Xenial Xerus) is as easy as running the following command on terminal:

Step 1. First make sure that all your system packages are up-to-date by running these following apt-get commands in the terminal.

$ sudo apt-get update

Step 2. Installing Htop. Install htop process monitoring tool using apt-get command:

$ sudo apt-get install htop
  • Use Htop to monitor your Ubuntu 16.04 LTS Desktop and Server in real-time

Now that htop is installed on your server you’ll want to start the program by running the following in a command prompt:

$ htop

This will open the program and you’ll see something similar to the following:

Leave this terminal open, you can use CTRL + ALT + T  to open another new terminal for your other work. Then htop will help you monitor your memory usage in real-time:)  Enjoy!

tmux resources

This post provides a brief introduction to tmux and some commonly used commands and useful resources about tmux.

(Thanks Davide for recommending such a handy tool to me.)

(Stay tuned — I will update this post while I am gaining new skills about tmux.)

======What is tmux?

According to the tmux authors:

tmux is a terminal multiplexer. What is a terminal multiplexer? It lets you switch easily between several programs in one terminal, detach them (they keep running in the background) and reattach them to a different terminal. And do a lot more.

 

====== tmux sessions, windows, and panes explained

One of these features is the ability to break your session into more discreet components, called windows and panes. These are good for organizing multiple varied activities in a logical way.

Let’s look at how they relate to each other.

Nesting

tmuxnesting

tmux sessions have windows, and windows have panes. Below you can see how how they are conceptualized:

  • Sessions are for an overall theme, such as work, or experimentation, or sysadmin.
  • Windows are for projects within that theme. So perhaps within your experimentation session you have a window titled noderestapi, and one titled lua sample.
  • Panes are for views within your current project. So within your sysadmin session, which has a logs window, you may have a few panes for access logs, error logs, and system logs.

It’s also possible to create panes within a session without first creating a separate window. I do this sometimes. Hopefully it isn’t as horrible as it sounds right after reading about nesting.

Q about differences between sessions and windows and panes:

Can someone explain how they use sessions and windows and panes?
It feels like one too many levels, why both sessions AND windows?
I’m willing to believe there is a use case, but for the life of me I can’t come up with one.

A: 

I use sessions for different projects, and windows and panes within a project. For example, I’ll have a session for my puppet code, and a session for a bash script I’m working on. Within the puppet session, if I’m going to work on a new module, I open a new window. If I’m writing code in vim, I usually split off a pane so I can test running it next to it. Having the error output next to the code makes debugging really fast.

In a lot of ways, it basically acts like a tiling window manager for the terminal. The advantage over a window manager is that the whole layout can be accessed remotely. So if I’m working on a project from work, I can quickly resume working on the project from my home computer after SSH’ing in and reattaching to the tmux session.

 

====== see below for commonly used tmux comands

By default, tmux uses Ctrl-b as its shortcut activation chord, which enables you perform a number of functions quickly.

  • Switch between different windows

Ctrl – b (presee the  ctrl and b keys at the same time and release them at the same time, and then immediately press window number)  window number

  • Cursor move within an active window

Ctrl-b (release the keys, and immediately press the “[” key ) the key with “[“.

when you see the cursor becomes a solid flashing diamond, you can use arrow key to move your cursor in the active terminal window.

Note: Press “q” to exit the cursor move mode.

  • Session management
  • s list sessions
  • $ rename the current session # default session name and window name is number, can rename session names to meaningful ones
  • d detach from the current session

tmux is developed on a client-server model which means that the session is stored on the server and persist beyond ssh logout.

The following command will create a new session called mysession:

tmux new-session -s mysession

To attach to a session run:

tmux attach -t mysession

To list all session run:

tmux ls

You can kill a session using the following command:

tmux kill-session -t mysession

you can grossly kill all tmux processes with the following command:

pkill -f tmux

Frequently used sessions commands

Ctrl-b d	  Detach from the current session 
Ctrl-b (          Go to previous session
Ctrl-b )          Go to next session
Ctrl-b L          Go to previously used session
Ctrl-b s          Choose a session from the sessions list

Ctrl + D    — exit tmux from terminal.

  • Windows (tabs) Management

Each session can have multiple windows. By default all windows are numbered starting from zero.

Frequently used windows (tabs) commands

Ctrl-b 1  Switch to window 1
Ctrl-b c  Create new window
Ctrl-b w  List all windows
Ctrl-b n  Go to next window
Ctrl-b p  Go to previous window
Ctrl-b f  Find window
Ctrl-b ,  Name window
Ctrl-b w  Choose a window from the windows list
Ctrl-b &  Kill the current window

 

  • One of the handy things about tmux is how easy it is to resize panes:

Ctrl +b, followed by holding down Alt, and using the arrow keys to resize.

 

======things about how to save sessions and recover sessions.

If you reboot you computer you will lose the sessions. Sessions cannot be saved. But, they can be scripted. What most do in fact is to script some sessions so that you can re-create them.

======Installing tmux

  • Installation with sudo privilege

Installation is pretty straightforward if you have Ubuntu or any other Debian-based distribution you can install tmux with:

sudo apt-get install tmux

on CentOS/Fedora:

yum install tmux

and on MacOS:

brew install tmux

Note: check the comments for some up-to-date scripts.

 

References and further reading list

This tutorial covers installation of tmux and some commonly used commands.

This is a pretty good introduction to tmux, it includes why tmux and the comaprision between tmux and screen, as well as some tmux shortcuts

 

 

 

Get the number of CPUs/cores in Linux from the command line

This post introduces how to find out the number of CPUs/cores on Linux machines from command line.

  • method 1:

$ nproc –all

20   # this means there are 20 cores on the linux machine.

  • method 2:

$ lscpu

CPU(s):                20

On-line CPU(s) list:   0-19

Thread(s) per core:    2

Core(s) per socket:    10

 

 

Install Node.js on Ubuntu 16.04 LTS

This post provides instructions about how to install Node.js on Ubuntu 16.04 LTS. See this post for Node.js resources. (Node.js Offical Github Repo.)

NPM (Node Package Manager) is the default package manager for the JavaScript runtime environment Node.js. NPM hosts thousands of free node packages. In general, NPM is installed on your computer after you install Node.js.

There are several ways to install Node.js on Ubuntu:

  • Method #1 (our choice in this tutorial): Install Node.js with Node Version Manager (NVM) to manage multiple active Node.js versions

Using nvm, we can install multiple, self-contained versions of Node.js, which will allow us to control our environment, and get access to the newest versions of Node.js, but will also allow us to keep previous releases that our applications may depend on. (nvm is just like Virtualenv in Python, if you are familiar with it, which allows us to install multiple version of the same Python library into “virtual folders” by pip.)

This is the method we will cover later in this tutorial.

  • Method #2: Install the bundled Distro-Stable Version Node.js (version 4.2.6) – it is very simple to install, just one or two commands.

Ubuntu 16.04 contains a version of Node.js in its default repositories that can be used to easily provide a consistent experience across multiple systems. At the time of writing, the version in the repositories is version 4.2.6. This will not be the latest version, but it should be quite stable, and should be sufficient for quick experimentation with the language.

This tutorial picked the Node Version Manager (nvm) based method, because it is much more flexible.

See below for the step by step instructions. (Check out the reading list below if you need the install instructions for other methods listed above.)

Step 0: (Before we get started) Remove old Node package to avoid conflicts

Open a terminal (Ctrl + Alt + T), and type the following command. 

$ dpkg --get-selections | grep node

# If it says install in the right column, Node is on your system:
#ax25-node                                       install

#node                                            install

Step 1: Install prerequisite packages

We’ll need to get the software packages from our Ubuntu repositories that will allow us to build source packages. The nvm script will leverage these tools to build the necessary components.

First, we need to make sure we have a C++ compiler. Open a terminal window (Ctrl + Alt + T) and install the build-essential and libssl-dev packages. By default, Ubuntu does not come with these tools — but they can be installed by the following commands.

$ sudo apt-get update

$ sudo apt-get install build-essential libssl-dev

Step 2: Install nvm

Once the prerequisite packages are installed, we can install and update NVM(Node Version Manager) using cURL. (Note: to get the latest installation version link, on the page scroll down to “Install script”.) 

$ curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.33.2/install.sh | bash

Inspect the installation script with nano:

$ nano install_nvm.sh

#Note that we DO NOT need to add anything in the opened nano text editor window. We just need to create the .sh file. 
# Use Ctrl+O to save the file, and then hit Enter, and then Ctrl +X to close the file.

Run the script with bash (Note that run the following command in your terminal):

$ bash install_nvm.sh

It will install the software into a sub-directory under our home directory at ~/.nvm. It will also add some necessary lines to our ~/.profile file in order to use it.

To have access to the nvm functionality, we need to source the ~/.profile file so that our current session knows about the changes:

$ source ~/.profile

Now that we have nvm installed, we can install isolated Node.js versions.

Step 3: Install  Node.js

The following command will tell us which versions of Node.js are available for us to install:

$ nvm ls-remote
Output
...
    v6.10.3 (Latest LTS: Boron)
 v7.0.0
 v7.1.0
 v7.2.0
 v7.2.1
 v7.3.0
 v7.4.0
 v7.5.0
 v7.6.0
 v7.7.0
 v7.7.1
 v7.7.2
 v7.7.3
 v7.7.4
 v7.8.0
 v7.9.0
 v7.10.0

The newest version when I write this post is v7.10.0. We can install it by the following command:

$ nvm install 7.10.0

By default, nvm will switch to use the most recently installed version. We can explicitly tell nvm to use the version we just installed by the following command:

$ nvm use 7.10.0

When we install Node.js using nvm, the executable is called node (NOT nodejs that you may see in other tutorials). We can check the currently used Node and npm version by the following commands:

$ node -v  
# OR 
$ node --version

# Output
# v7.10.0


$ npm -v
# OR 
$ npm --version

# output
# 4.2.0

Step 4: using nvm to manage different versions of installed Node.js 

If you have multiple Node.js versions, you can see what are installed by the following command:

$ nvm ls

To set a default Node.js version to be used in any new shell, use the alias default command

$ nvm alias default 7.10.0

# This version will be automatically selected when a new session spawns. You can also reference it by the alias like this:

$ nvm use default

To learn more  about the options available to use with nvm, run the following command in your terminal:

$ nvm help

Step 5: using npm to install Node.js modules

Each version of Node.js will keep track of its own packages and has npm available to manage these.

We can use npm to install packages to the Node.js project’s ./node_modules directory by using the normal format. For example, for the express module:

$ npm install express

If you’d like to install it globally (i.e., making it available to other projects using the same Node.js version), you can add the -g flag:

$ npm install -g express

This will install the package in:

~/.nvm/node_version/lib/node_modules/package_name

Note that installing globally will allow us to run the commands from the command line, but we will have to link the package within a project in order to use it in that project:

$ npm link express

 

References and further reading list:

This tutorial covers two methods:

Method #1: Install the bundled distro specif Node.js version 4.2.6

Method #2: Install the latest version of Node.js version 6.x or 7.x

This post is very good — it covers the following ways to install Node.js:

Installing using the Official Repository
Installing using the Github Source Code Clone
Installing using Node Version Manager (NVM)

It covers How to Install Multiple Nodejs version with NVM and also covers how to remove Node.js 

This tutorial covers:

  • How To Install the Distro-Stable Version for Ubuntu
  • How To Install Using a PPA
  • How To Install Using NVM

there are a quite a few ways to get up and running with Node.js on your Ubuntu 16.04 server. Your circumstances will dictate which of the above methods is the best idea for your circumstance. While the packaged version in Ubuntu’s repository is the easiest, the nvm method is definitely much more flexible.

This is a pretty good tutorial. It covers 4 Ways to Install Node.js on Ubuntu. There are several ways to do this, but The author of this post recommended  Option 1: Node Version Manager (nvm). Here is the full list of options:

 

 

Run Java program on terminal with external library JAR on Ubuntu

This post provides the instructions how to run a Java program from Terminal with external library JAR.

When using Eclipse to code Java program, which imports some external JAR library, we can use Eclipse to compile/build/run the program.

But if we would like to run our Java program that used External library Jars from Terminal, where should we put those JAR files, and how to build and run the program.

  • For compiling the java file having dependency on a jar
$ javac -cp /path/to/jar/file Myprogram.java
  • For executing the class file
$ java -cp .:/path/to/jar/file Myprogram

Note: cp in the above commands refers to classpath, which is a parameter in Java Virtual Machine or the Java compiler that specifies the location of user-defined classes and packages. The parameter may be set either on the command-line, or through an environment variable.

For example,  if your current working directory in terminal is src/report/

$ javac -cp src/external/myImportedJarfile.jar myJavaProgram.java

$ java -cp .:src/external/myImportedJarfile.jar myJavaProgram

If you have multiple jar files a.jar,b.jar and c.jar. To add them to classpath

$javac -cp .:a.jar:b.jar:c.jar HelloWorld.java

$java -cp .:a.jar:b.jar:c.jar HelloWorld

Note: on Windows, use “;” instead of “:”

Using Java 6 or later, the classpath option supports wildcards. Note the following:

  • Use straight quotes (")
  • Use *, not *.jar

Wild cards were introduced from Java 6. Class path entries can contain the basename wildcard character *, which is considered equivalent to specifying a list of all the files in the directory with the extension .jar or .JAR.

java -cp "lib/*" -jar %MAINJAR%

If you need only specific jars, you will need to add them individually. The classpath string does not accept generic wildcards like Jar*, .jar, hiber etc.

Example

The following entry does not work:

java -cp "Halo.jar;lib/*.jar" ni.package.MainClass

Correct entry is :

java -cp "Halo.jar;lib/*" ni.package.MainClass

 

 

Install Ubuntu 16.04 on Oracle VirtualBox that runs on Windows or Mac

This post provides some notes and useful resources about installing Ubuntu 16.04 on Oracle VirtualBox that runs on your Mac or Windows.

Note: check the RAM and hard disk size of your machine before creating a virtual machine on it.

Notes about which version of Ubuntu to download and install:

For Ubuntu, it is not always a wise choice to pick the newest version. My suggestion is that (unless you are aware that you need to install a particular version), download and install the latest LTS (Long Term Support) version (see the picture below from Ubuntu wiki page). Every two years, a Ubuntu LTS version is released, which will be supported for updates for five years. For example, as of now, Ubuntu 16.04 LTS is the latest LTS version.

The two main things you need to pay attention to when you create a virtual machine:

  • Memory allocation for your virtual machine.

You can set it as half of your RAM (e.g., if your RAM is 8 G, set it as 4 G or 5G for your virtual machine should be fine.)

  • Storage type:  Select “Dynamically allocated” if you are not sure how large storage you actually will need.

There are already several very good tutorials about this along with snapshots, so I won’t create a tutorial for this. See below for some useful resources I collected. (See some notes I wrote below for some posts.)

My notes: This one is very good (with snapshots), including  Guest additions and Shared folders settings. (Note that Guest additions are required if you want to set Shared folder, so be sure to install Guest additions first).

You can use the following command to check whether Guest additions were installed on your Ubuntu virtual machine if you are not sure because you installed your Ubuntu VM a while ago. (Note: even though you may find Guest additions was installed, you will still need to install Guest additions for your newly installed VM, otherwise the Shared folders wont work for you.)

Use lsmod from the command line, as it will tell you not only if it’s installed, but properly loaded:

$ lsmod | grep vboxguest
vboxguest             282624  6 vboxsf

I have tested Shared folders instructions (with pictures) in this tutorial on my Ubuntu 16.04 VM, and it works. The only difference is that on Ubuntu 16.04 VM, after you issued the following command on your terminal and  restart the Ubuntu guest machine, you do not need to do anything as the tutorial said, the shared folder is automatically mounted each time you start you Ubuntu VM. (After you restart, click the Files icon on the task bar, and you will see the shared folder you just set just now is automatically mounted there:))

  • sudo adduser brb vboxsf   # Replace 'brb' with your account name on Ubuntu. 

One more note: Although Shared Folder setting in VM is very convenient, using VirtualBox shared folder directly for fastq data, annotation or output directory can significantly reduce the performance compared to a native (Ubuntu) system or VirtualBox native system, so my recommendation is only use the folder to transfer files between windows/mac and your Ubuntu VM.

P.S. If you see some tutorials tell you that you need to enter some command like “sudo mount -t vboxsf sharing /mnt/share” to automatically mount the shared folder each time you start your Ubuntu VM, that is outdated instructions.

Fortunately, new VirtualBox version (4.x +) has a (GUI) Auto-mount option (see pics below) when you set your shared folder. (Note that you can choose your customized folder to share, instead of using a system predefined folder such as Documents or Downloads.)

If you want to share the clipboard between your host and your virtual machine, check out the picture below.

 

Answers to some frequently asked questions:

Q: Do I need to backup my files when I upgrade my VirtualBox to newer version.

A: just install the latest version and you will have all your files in the new one. You need not have to uninstall the old virtual machine.

Q: After I install the updates of Windows 10, my VirtualBox won’t start…

A: just install the latest version and you will have all your files in the new one. You need not have to uninstall the old virtual machine.

 

My notes:  this one is very good (with snapshots) on Mac. My notes above about VM settings running on Windows work the same for VM settings running on Mac.

 

Compile a .java file on Linux

To compile a .java file on Linux (e.g., Ubuntu), first you need to install Java JDK on your computer. You can install it with the instructions at How do I install Java? , here and here.

Once you have java JDK installed, open your terminal and cd to the direcotry where your .java file located and type the following:

$ javac filename.java

To run the generated class file, use the following command

$ java filename

 

 

Change port for Apache Solr from the default port 8983 on Ubuntu 16.04

This post introduces how to change the default port on which Apache Solr runs on Ubuntu 16.04.  (See my post if you have not installed Solr on your Ubuntu.)

The default port for Solr is 8983, but there are circumstances where you may want to change this. For example, if you wish to experiment with a new release, or you want your various Sitecore development instances to hit separate instances of Solr.  See below for two options for changing the port number on Ubuntu.

Step 1: use sudo service solr status to check your Solr status and the port it is running on.

yourusername@yourservername:~$ sudo service solr status
[sudo] password for yourusername: 
● solr.service - LSB: Controls Apache Solr as a Service
 Loaded: loaded (/etc/init.d/solr; bad; vendor preset: enabled)
 Active: active (exited) since Sun 2017-04-30 11:08:43 EDT; 1 weeks 0 days ago
 Docs: man:systemd-sysv-generator(8)

Apr 30 11:08:34 yourservername systemd[1]: Starting LSB: Controls Apache Solr as a Service...
Apr 30 11:08:34 yourservername su[2655]: Successful su for solr by root
Apr 30 11:08:34 yourservername su[2655]: + ??? root:solr
Apr 30 11:08:34 yourservername su[2655]: pam_unix(su:session): session opened for user solr by (uid=0)
Apr 30 11:08:42 yourservername solr[2652]: [194B blob data]
Apr 30 11:08:42 yourservername solr[2652]: Started Solr server on port 8983 (pid=2861). Happy searching!
Apr 30 11:08:43 yourservername solr[2652]: [14B blob data]
Apr 30 11:08:43 yourservername systemd[1]: Started LSB: Controls Apache Solr as a Service.

Step 2: use sudo service solr stop to  stop your Solr first before we go ahead and change  its default port.

yourusername@yourservername:/opt/solr-6.5.1/bin$ sudo service solr stop
yourusername@yourservername:/opt/solr-6.5.1/bin$ sudo service solr status
● solr.service - LSB: Controls Apache Solr as a Service
   Loaded: loaded (/etc/init.d/solr; bad; vendor preset: enabled)
   Active: inactive (dead) since Sun 2017-05-07 15:40:57 EDT; 17s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 15132 ExecStop=/etc/init.d/solr stop (code=exited, status=0/SUCCESS)

Apr 30 11:08:42 yourservername solr[2652]: Started Solr server on port 8983 (pid=2861). Happy searching!
Apr 30 11:08:43 yourservername solr[2652]: [14B blob data]
Apr 30 11:08:43 yourservername systemd[1]: Started LSB: Controls Apache Solr as a Service.
May 07 15:40:55 yourservername systemd[1]: Stopping LSB: Controls Apache Solr as a Service...
May 07 15:40:55 yourservername su[15135]: Successful su for solr by root
May 07 15:40:55 yourservername su[15135]: + ??? root:solr
May 07 15:40:55 yourservername su[15135]: pam_unix(su:session): session opened for user solr by (uid=0)
May 07 15:40:55 yourservername solr[15132]: Sending stop command to Solr running on port 8983 ... waiting up to 180 seconds to allow
May 07 15:40:57 yourservername solr[15132]: [56B blob data]
May 07 15:40:57 yourservername systemd[1]: Stopped LSB: Controls Apache Solr as a Service.

Step 3: Change config files

Check out all the following files for the port:

  • cd to /opt/solr-6.5.1/server/solr/
#the file path: /opt/solr-6.5.1/server/solr/solr.xml
yourusernmae:/opt/solr-6.5.1/server/solr$ sudo nano solr.xml
#change port here:  ${jetty.port:8983}
  • cd to /var/
# the file path: /var/solr/data/solr.xml
yourusernmae:/var$ sudo nano /solr/data/solr.xml
# change port here:  ${jetty.port:8983}
  • cd to /etc/default/
# the file path: /etc/default/solr.in.sh
yourusernmae:/etc/default$ sudo nano solr.in.sh
# change port here:  SOLR_PORT=8983

Once you save and close the solr.in.sh file you can return to your terminal and type this command to reload the file 

yourusernmae:/etc/default$ source solr.in.sh

Step 4: Start your solr service again using  sudo service solr start, you will see your solr is now running on the new port your changed just now in the step 3 above.

yourusername@yourservername:/etc/default$ sudo service solr start
yourusername@yourservername:/etc/default$ sudo service solr status
● solr.service - LSB: Controls Apache Solr as a Service
 Loaded: loaded (/etc/init.d/solr; bad; vendor preset: enabled)
 Active: active (exited) since Sun 2017-05-07 16:11:32 EDT; 3s ago
 Docs: man:systemd-sysv-generator(8)
 Process: 16988 ExecStop=/etc/init.d/solr stop (code=exited, status=1/FAILURE)
 Process: 17121 ExecStart=/etc/init.d/solr start (code=exited, status=0/SUCCESS)

May 07 16:11:29 yourservername systemd[1]: Starting LSB: Controls Apache Solr as a Service...
May 07 16:11:29 yourservername su[17125]: Successful su for solr by root
May 07 16:11:29 yourservername su[17125]: + ??? root:solr
May 07 16:11:29 yourservername su[17125]: pam_unix(su:session): session opened for user solr by (uid=0)
May 07 16:11:32 yourservername solr[17121]: [98B blob data]
May 07 16:11:32 yourservername solr[17121]: Started Solr server on port 8985 (pid=17327). Happy searching!
May 07 16:11:32 yourservername solr[17121]: [14B blob data]
May 07 16:11:32 yourservername systemd[1]: Started LSB: Controls Apache Solr as a Service.

Now you can reference Step 5: Creating a Solr search collection in my another post to create a Solr search collection for this port.

References:

 

 

Install Apache Solr 6 on Ubuntu 16.04

This post provides the tutorial to set up Apache Solr 6 on Ubuntu 16.04. (install Solr as a service that auto-starts when (re)boot Ubuntu.)

What is Apache Solr? Apache Solr is an open source enterprise-class search platform written in Java which enables you to create custom search engines that index databases, files, and websites. It has back end support for Apache Lucene. It can, for example, be used to search in multiple websites and can show recommendations for the searched content. Solr uses an XML (Extensible Markup Language) based query and result language. There are APIs (Applications program interfaces) available for Python, Ruby and JSON (Javascript Object Notation).

Some other features that Solr provides are:

  • Full-Text Search.
  • Snippet generation and highlighting.
  • Custom Document ordering/ranking.
  • Spell Suggestions.

This tutorial will show you how to install the latest Solr version on Ubuntu 16.04 LTS. The steps will most likely work with later Ubuntu versions as well.

Before Solr 5, Solr doesn’t work alone; it needs a Java servlet container such as Tomcat or Jetty. But after Solr 5, it does not need to run on Tomcat.  

Running Solr on Tomcat (No Longer Supported)

Beginning with Solr 5.0, Support for deploying Solr as a WAR in servlet containers like Tomcat is no longer supported.

For information on how to install Solr as a standalone server, please see Installing Solr.

To give an example:

Things need to do when installing Solr version before 6.

Download and install Tomcat (or some other servlet container)
Setup Tomcat as a service
Download and unpack Solr
Create a SOLR_HOME folder with correct content
copy solr.war into tomcat/webapps
set CATALINA_OPTS=“-Dsolr.solr.home=/path/to/home -Dsolr.x.y=z…. GC-flags etc”
Setup  Tomcat as a service
service tomcat start

With Solr 6.x, we just need to do:

Download Solr and unpack the install-script
solr/bin/install_solr_service solr-6.2.0.tgz  # Install
Tune /etc/default/solr.in.sh to your likings (mem, port, solr-home, Zk etc)
service solr start (or bin/solr start [options])

Your client would talk to Solr on typically http://host.name:8983/solr/ as a standalone server, not as one out of many webapps on 8080.

Apache Solr 6 required Java 8 or greater to run.

 There had been lots of scaling improvements in Solr 6.

Now let’s get started with the installation.

 

Step 1: Update your System

Use a non-root sudo user to login into your Ubuntu server. Through this user, you will have to perform all the steps and use the Solr later.

To update your system, execute the following command to update your system with latest patches and updates.

$ sudo apt-get update 
$ sudo apt-get upgrade -y   #note that this will update your ubuntu OS, skip this if you do not want to update your system.

Step 2: Install Java 

(Apache Solr 6 required Java 8 or greater to run. If you have installed Java 8 or greater on your machine, skip this.)

Solr is a Java application, so Java needs to be installed first in order to set up Solr. See my post for detailed Java 8 installation on Ubuntu 16.04.

Check the version of Java installed by running the following command

$ java -version
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)

Step 3: (Manually) install Solr 

Solr can be installed on Ubuntu in different ways, in this tutorial,we will install the latest package.  (If you would like to install the latest package from the source, check out How to install and configure Solr 6 on Ubuntu 16.04.)

Now Let’s download the required Solr version from its official site or mirrors.

First go to this Solr Download page, click the link to the latest version.

You would probably see something looks like the pic shown below. Get the download link you prefer. (for my case, I used this one http://apache.cs.utah.edu/lucene/solr/6.5.1). Click the download link you selected, and then you would see something like the pic shown below.

#If you do not have sudo privilege
#cd /path to one folder under your account 
# and you do not need to add "sudo" in the following commands
  
cd /opt
sudo wget http://apache.cs.utah.edu/lucene/solr/6.5.1/solr-6.5.1.tgz

Now extract solr service installer shell script from the downloaded Solr archive file and run installer using following commands.

sudo tar xzf solr-6.5.1.tgz solr-6.5.1/bin/install_solr_service.sh --strip-components=2

Then install Solr as a service using the script:

sudo ./install_solr_service.sh solr-6.5.1.tgz

The output will be similar to this: [Note that this installation will make Solr as a service that auto-starts when you (re)boot Ubuntu.]

myusername@myserver:/opt$ sudo ./install_solr_service.sh solr-6.5.1.tgz
id: ‘solr’: no such user
Creating new user: solr
Adding system user `solr’ (UID 117) …
Adding new group `solr’ (GID 126) …
Adding new user `solr’ (UID 117) with group `solr’ …
Creating home directory `/var/solr’ …

Extracting solr-6.5.1.tgz to /opt

Installing symlink /opt/solr -> /opt/solr-6.5.1 …

Installing /etc/init.d/solr script …

Installing /etc/default/solr.in.sh …

Service solr installed.
Customize Solr startup configuration in /etc/default/solr.in.sh
● solr.service – LSB: Controls Apache Solr as a Service
Loaded: loaded (/etc/init.d/solr; bad; vendor preset: enabled)
Active: active (exited) since Sun 2017-04-30 11:08:43 EDT; 5s ago
Docs: man:systemd-sysv-generator(8)
Process: 2652 ExecStart=/etc/init.d/solr start (code=exited, status=0/SUCCESS)

Apr 30 11:08:34 myserver systemd[1]: Starting LSB: Controls Apache Solr as a Service…
Apr 30 11:08:34 myserver su[2655]: Successful su for solr by root
Apr 30 11:08:34 myserver su[2655]: + ??? root:solr
Apr 30 11:08:34 myserver su[2655]: pam_unix(su:session): session opened for user solr by (uid=0)
Apr 30 11:08:42 myserver solr[2652]: [194B blob data]
Apr 30 11:08:42 myserver solr[2652]: Started Solr server on port 8983 (pid=2861). Happy searching!
Apr 30 11:08:43 myserver solr[2652]: [14B blob data]
Apr 30 11:08:43 myserver systemd[1]: Started LSB: Controls Apache Solr as a Service.
myusername@myserver:/opt$


Step 4:  Start / Stop Solr Service

Use the following command to check the status of the service

$ sudo service solr status

See below for a sample output:

myusername@myserver:/opt$ sudo service solr status
● solr.service - LSB: Controls Apache Solr as a Service
   Loaded: loaded (/etc/init.d/solr; bad; vendor preset: enabled)
   Active: active (exited) since Sun 2017-04-30 11:08:43 EDT; 13min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 2652 ExecStart=/etc/init.d/solr start (code=exited, status=0/SUCCESS)

Apr 30 11:08:34 myserver systemd[1]: Starting LSB: Controls Apache Solr as a Service...
Apr 30 11:08:34 myserver su[2655]: Successful su for solr by root
Apr 30 11:08:34 myserver su[2655]: + ??? root:solr
Apr 30 11:08:34 myserver su[2655]: pam_unix(su:session): session opened for user solr by (uid=0)
Apr 30 11:08:42 myserver solr[2652]: [194B blob data]
Apr 30 11:08:42 myserver solr[2652]: Started Solr server on port 8983 (pid=2861). Happy searching!
Apr 30 11:08:43 myserver solr[2652]: [14B blob data]
Apr 30 11:08:43 myserver systemd[1]: Started LSB: Controls Apache Solr as a Service.

 

Use the following commands to Start, Stop and check status of Solr service.

$ sudo service solr stop
$ sudo service solr start
$ sudo service solr status

 

Step 5: Creating a Solr search collection

(Before we create a Solr search collection, check out this post first if you want to change the default port 8983 to another port.)

Using Solr, we can create multiple collections. Run the following command, give the name of your collection (here mysolrcollection) and specify its configurations.

$ sudo su - solr -c "/opt/solr/bin/solr create -c mysolrcollection -n data_driven_schema_configs"

Sample output:

myusername@myserver:/opt$ sudo su - solr -c "/opt/solr/bin/solr create -c mysolrcollection -n data_driven_schema_configs"
 [sudo] password for myusername:

Copying configuration to new core instance directory:
 /var/solr/data/mysolrcollection

Creating new core 'mysolrcollection' using command:
 http://localhost:8983/solr/admin/cores?action=CREATE&name=mysolrcollection&instanceDir=mysolrcollection

{
 "responseHeader":{
 "status":0,
 "QTime":1422},
 "core":"mysolrcollection"}


The new core directory for our first collection has been created. To view the default schema file, got to:

cd /opt/solr/server/solr/configsets/data_driven_schema_configs/conf

You will see some files shown in the picture below.

To view other configuration options , got to:

cd /opt/solr/server/solr/configsets/

 

Step 6: Use the Solr Web Interface (i.e., Access Solr Admin Panel)

Default Solr runs on port 8983. You can access Solr port in your web browser and you will get Solr dashboard.

The Apache Solr is now accessible on the default port, which is 8983. The admin UI should be accessible at http://your_server_ip:8983/solr. The port should be allowed by your firewall to run the links. 

(If you do not know your IP, check my post to find it out.)

For example:

http://192.168.1.100:8983/solr/

Or use your machine’s host name if you have one.

http://example.org:8983/solr/

 

Here you can view statics of created collection in previous steps named “mycollection”. Click on “Core Selector” on left sidebar and select created collection.

To see the details of the first collection that we created earlier, select the “mysolrcollection” collection in the left menu.

After you selected the “mysolrcollection” collection, select Documents in the left menu. There you can enter real data in JSON format that will be searchable by Solr. To add more data, copy and paste the following example JSON onto Document field:

{
"id": 1,
"name":"John",
"age":30,
"cars":[ "Ford", "BMW", "Fiat" ]
}

Note: You can add other formats of data such as CSV etc to Solr. (See the pic below)

Click on the submit document button after adding the data.

Status: success
Response:
{
 "responseHeader": {
 "status": 0,
 "QTime": 758
 }
}

Now we can click on Query on the left side then click on Execute Query,

We will see something like this:

Conclusion

After successfully installing the Solr Web Interface on Ubuntu, you can now insert the data or query the data with the Solr API and Web Interface.

You can write code to add a large set of documents into Solr. See my post for using Solr with Python. See this post for some useful Solr resources I collected.

 

References:

Install Tomcat & Solr (You can’t avoid this one) – This is for Solr before version 5, after Solr 5, Tomcat is not required to install Solr.

Apache Solr Reference Guide/ Installing Solr  & Running Solr  & Solr Quick Start (pdf. a very good concise intro, including some basic usages and indexing xml, json, csv files)

Configuring a schema.xml for Solr

First, rename the /opt/solr/solr/collection1 to an understandable name like apples (use whatever name you’d like). (This can be skipped if you installed it using apt-get. In that case, you can execute the following command instead: cd /usr/share/solr):

cd /opt/solr/solr
mv collection1 apples
cd apples

Also, if you installed Solr manually, open the file core.properties (nano core.properties) and change the name to the same name.

Then, remove the data directory and change the schema.xml:

rm -R data
nano conf/schema.xml

Paste your own schema.xml in here.