Anaconda vs. Miniconda vs. Virtualenv

This post briefly introduces which to choose among Anaconda, Miniconda, and Virtualenv.

If you have used pip and virtualenv in the past, you can use conda to perform all of the same operations. Pip is a package manager, and virtualenv is an environment manager; and conda is both.

Specifically, conda is a packaging tool and installer that aims to do more than what pip does; it handles library dependencies outside of the Python packages as well as the Python packages themselves. Conda also creates a virtual environment, like virtualenv does.

Both Anaconda and Miniconda uses Conda as the package manager. The difference among Anaconda and Miniconda is that Miniconda only comes the package management system. So when you install it, there is just the management system and not coming with a bundle of pre-installed packages like Anaconda does. Once Conda is installed, you can then install whatever package you need from scratch along with any desired version of Python.

Choose Anaconda if you:

  • Are new to conda or Python
  • Prefer having Python and 720+ open source certified packages automatically installed at once
  • Have the time and disk space (a few minutes and 3 GB), and/or
  • Don’t want to install each of the packages you want to use individually.

Choose Miniconda if you:

  • Know what package(s) you need to install
  • Do not have time or disk space (about 3 GB) to install over 720+ packages (many of the packages are never used and could be easily installed when needed), and/or
  • Just want fast access to Python and the conda commands, and prefer to sorting out the other packages later.

Choose Virtualenv only when you have sudo access to the machine you are working on. It is much easier to setup conda rather than virtualenv for a regular (i.e., non sudo/root) user on a linux/Mac machine.

I use Miniconda myself (because it is much more light weight than Anaconda) when I need to setup python programming environment and when I do not have sudo privilege, and I use Virtualenv when I have sudo access on the machine.

(Thanks to  Dr. Brendt Wohlberg  for introducing Miniconda — Miniconda makes me switching from pip & virtualenv to conda.)

References

 

Cool Python Tricks

This post provides some cool python tricks.

(Stay tuned, as I may update the tricks while I plow in my deep learning garden)

With increase in popularity of python, more and more features are becoming available for python coding. Using this features makes writing the code in fewer lines and cleaner. In this article we will see 10 such python tricks which are very frequently used and most useful.

Reversing a List

We can simply reverse a given list by using a reverse() function. It handles both numeric and string data types present in the list.

Example

List = ["Shriya", "Lavina","Sampreeti" ]
List.reverse()
print(List)

Output

Running the above code gives us the following result −

['Sampreeti', 'Lavina', 'Shriya']

Print list elements in any order

If you need to print the values of a list in different orders, you can assign the list to a series of variables and programmatically decide the order in which you want to print the list.

Example

List = [1,2,3]
w, v, t = List
print(v, w, t )
print(t, v, w )

Output

Running the above code gives us the following result −

(2, 1, 3)
(3, 2, 1)

Using Generators Inside Functions

We can use generators directly inside a function to writer shorter and cleaner code. In the below example we find the sum using a generator directly as an argument to the sum function.

Example

sum(i for i in range(10) )

Output

Running the above code gives us the following result −

45

*Using the zip() function

When we need to join many iterator objects like lists to get a single list we can use the zip function. The result shows each item to be grouped with their respective items from the other lists.

Example

Year = (1999, 2003, 2011, 2017)
Month = ("Mar", "Jun", "Jan", "Dec")
Day = (11,21,13,5)
print zip(Year,Month,Day)

Output

Running the above code gives us the following result −

[(1999, 'Mar', 11), (2003, 'Jun', 21), (2011, 'Jan', 13), (2017, 'Dec', 5)]

Swap two numbers using a single line of code

Swapping of numbers usually requires storing of values in temporary variables. But with this python trick we can do that using one line of code and without using any temporary variables.

Example

x,y = 11, 34
print x
print y
x,y = y,x
print x
print y

Output

Running the above code gives us the following result −

11
34
34
11

Transpose a Matrix

Transposing a matrix involves converting columns into rows. In python we can achieve it by designing some loop structure to iterate through the elements in the matrix and change their places or we can use the following script involving zip function in conjunction with the * operator to unzip a list which becomes a transpose of the given matrix.

Example

x = [[31,17],
[40 ,51],
[13 ,12]]
print (zip(*x))

Output

Running the above code gives us the following result −

[(31, 40, 13), (17, 51, 12)]

*Print a string N Times

The usual approach in any programming language to print a string multiple times is to design a loop. But python has a simple trick involving a string and a number inside the print function.

Example

str ="Point";
print(str * 3);

Output

Running the above code gives us the following result −

PointPointPoint

*Reversing List Elements Using List Slicing

List slicing is a very powerful technique in python which can also be used to reverse the order of elements in a list.

Example

#Reversing Strings
list1 = ["a","b","c","d"]
print list1[::-1]

# Reversing Numbers
list2 = [1,3,6,4,2]
print list2[::-1]

Output

Running the above code gives us the following result −

['d', 'c', 'b', 'a']
[2, 4, 6, 3, 1]

Find the Factors of a Number

When we are need of the factors of a number, required for some calculation or analysis, we can design a small loop which will check the divisibility of that number with the iteration index.

Example

f = 32
print "The factors of",x,"are:"
for i in range(1, f + 1):
   if f % i == 0:
print(i)

Output

Running the above code gives us the following result −

The factors of 32 are:
1
2
4
8
16
32

*Checking the Usage of Memory

We can check the amount of memory consumed by each variable that we declare by using the getsizeof() function. As you can see below, different string lengths will consume different amount of memory.

Example

import sys
a, b, c,d = "abcde" ,"xy", 2, 15.06
print(sys.getsizeof(a))
print(sys.getsizeof(b))
print(sys.getsizeof(c))
print(sys.getsizeof(d))

Output

Running the above code gives us the following result −

38
35
24
24

 

 

References:

https://www.tutorialspoint.com/10-interesting-python-cool-tricks (Pradeep Elance Published on 08-Aug-2019 10:22:06)

https://www.tutorialspoint.com/questions/category/Python

Timing how long a python script runs

This post introduces several ways to find out how long a python script takes to complete its execution.

  • If you are using Linux or Mac OS, in your terminal
$ time ./your_script.py
  • Several ways to do the task by adding a few lines of code in your py script.
import time
startTime = time.time()

your_func() #python3: print ("It took", time.time() - startTime, "seconds.")

See the following for an example in python 3. 

import time
import functools

startTime = time.time()

print(functools.reduce(lambda x,y: x+y, [47,11,42,13]))

#python3:
print ("It took", time.time() - startTime, "seconds.")

Another way to do the same thing:

from datetime import datetime
startTime = datetime.now()

#do something

#Python 2: 
print datetime.now() - startTime 

#Python 3: 
print(datetime.now() - startTime)

One more way to do the same thing with a nicely formatted output.

import sys
import timeit

startTime = timeit.default_timer()

#do some nice things...

stopTime = timeit.default_timer()
totalRunningTime = stopTime - startTime

# output running time in a nice format.
mins, secs = divmod(totalRunningTime, 60)
hours, mins = divmod(mins, 60)

sys.stdout.write("Total running time: %d:%d:%d.\n" % (hours, mins, secs))

If you want to compare two blocks of code / functions quickly you can do the following:

import timeit

startTime = timeit.default_timer()
your_func1()
#python3
print(timeit.default_timer() - startTime)

startTime2 = timeit.default_timer()
your_func2()
#python3
print(timeit.default_timer() - starTime2)

Using Apache Solr with Python

This post provides the instructions to use Apache Solr with Python in different ways.

======using Pysolr

Below are two small python snippets that the author of the post used for testing writing to and reading from a new SOLR server.

The script below will attempt to add a document to the SOLR server.

# Using Python 2.X
from __future__ import print_function  
import pysolr

# Setup a basic Solr instance. The timeout is optional.
solr = pysolr.Solr('http://some-solr-server.com:8080/solr/', timeout=10)

# How you would index data.
solr.add([  
    {
        "id": "doc_1",
        "title": "A very small test document about elmo",
    }
])

The snippet below will attempt to search for the document that was just added from the snippet above.

# Using Python 2.X
from __future__ import print_function  
import pysolr

# Setup a basic Solr instance. The timeout is optional.
solr = pysolr.Solr('http://some-solr-server.com:8080/solr/', timeout=10)

results = solr.search('elmo')

print("Saw {0} result(s).".format(len(results)))  

 

======GitHub repos

pysolr is a lightweight Python wrapper for Apache Solr. It provides an interface that queries the server and returns results based on the query.

install Pysolr using pip

pip install pysolr

Multicore Index

Simply point the URL to the index core:

# Setup a Solr instance. The timeout is optional.
solr = pysolr.Solr('http://localhost:8983/solr/core_0/', timeout=10)

SolrClient is a simple python library for Solr; built in python3 with support for latest features of Solr.

Components of SolrClient

 

References:

 

Lambda, map, filter, and reduce functions in python 3

After migration to Python 3 from Python 2,  lambda operator, map() and filter()  functions are still part of core Python; only reduce() function had to go, and it was moved into the module functools

This post introduces how to use lambda, map, filter, and reduce functions in Python 3 (for python 2.7 version, check the references below.)

  • Lambda operator

Some people like it, others hate it and many are afraid of the lambda operator.

The lambda operator or lambda function is a way to create small anonymous functions (i.e., functions without a name). These functions are throw-away functions (i.e., they are just needed where they have been created).

Lambda functions are mainly used in combination with the functions filter(), map() and reduce(). The lambda feature was added to Python due to the demand from Lisp programmers. 

The general syntax of a lambda function is quite simple: 

lambda argument_list: expression 

The argument list consists of a comma separated list of arguments and the expression is an arithmetic expression using these arguments. You can assign the function to a variable, so you can  use it as a function. 

The following example of a lambda function returns the sum of its two arguments:

>>> sum = lambda x, y : x + y
>>> sum(3,4)
7

>>> 

The above example might look like a game for a mathematician — A formalization that turns a straight forward operation into an abstract  formalization.

The above has the same effect by using the following conventional function definition: 

>>> def sum(x,y):
...     return x + y
... 
>>> sum(3,4)
7
>>> 

But, when you learn how to use the map() function, you will see the apparent advantages of this lambda operation.

  • The map function 

The advantage of the lambda operator will be obvious when it is used in combination with the map() function. 

map() is a function which takes two arguments: 

r = map(func, seq)

The first argument func is the name of a function and the second a sequence (e.g. a list).

seqmap() applies the function func to all the elements of the sequence seq. Before Python3, map() used to return a list, where each element of the result list was the result of the function func applied on the corresponding element of the list or tuple “seq”. In Python 3, map() returns an iterator.

The following examples illustrate how map() function works:

>>> def fahrenheit(T):

...   return ((float(9)/5)*T + 32)

... 
# hit Return/Enter to exit to the >>> in your terminal.

>>> def celsius(T):

...   return (float(5)/9)*(T-32)

... 
# hit Return/Enter to exit to the >>> in your terminal.

>>> temperatures = (36.5, 37, 37.5, 38, 39)

>>> F = map(fahrenheit, temperatures)



>>> print(F)

<map object at 0x106d1c3c8>


>>> temperatures_in_Fahrenheit = list(F)

>>> print(temperatures_in_Fahrenheit) 
[97.7, 98.60000000000001, 99.5, 100.4, 102.2]

>>> C = map(celsius, map(fahrenheit, temperatures))

>>> print(C)

<map object at 0x106d1c438>


>>> temperatures_in_Celsius = list(C)


>>> print(temperatures_in_Celsius)
[36.5, 37.00000000000001, 37.5, 38.00000000000001, 39.0]

>>> 

In the example above we haven’t used lambda. When using lambda, we do not need to define and name the functions fahrenheit() and celsius(). You can see this in the following interactive session:

>>> C = [39.2, 36.5, 37.3, 38, 37.8]

>>> F = list(map(lambda x: (float(9)/5)*x + 32, C))

>>> print(F)

[102.56, 97.7, 99.14, 100.4, 100.03999999999999]

>>> C = list(map(lambda x: (float(5)/9)*(x-32), F))

>>> print(C)

[39.2, 36.5, 37.300000000000004, 38.00000000000001, 37.8]

>>> 

map() can be applied to more than one list.

The lists don’t have to have the same length.

map() will apply its lambda function to the elements of the argument lists (i.e., it first applies to the elements with the 0th index, then to the elements with the 1st index until the n-th index is reached). See the following for an illustration example:

>>> a = [1, 2, 3, 4]
>>> b = [17, 12, 11, 10]
>>> c = [-1, -4, 5, 9]
>>> list(map(lambda x, y : x+y, a, b))
[18, 14, 14, 14]
>>> list(map(lambda x, y, z : x+y+z, a, b, c))
[17, 10, 19, 23]
>>> list(map(lambda x, y, z : 2.5*x + 2*y - z, a, b, c))
[37.5, 33.0, 24.5, 21.0]
>>>

We can see in the example above that the parameter x gets its values from the list a, while y gets its values from b, and z from list c. 

If one list has less elements than the others, map() will stop when the shortest list has been completed the mapping:

>>> a = [1, 2, 3]
>>> b = [17, 12, 11, 10]
>>> c = [-1, -4, 5, 9]
>>> 
>>> list(map(lambda x, y, z : 2.5*x + 2*y - z, a, b, c))
[37.5, 33.0, 24.5]
>>>

 

  • The filter function 

filter(function, sequence) 

offers an elegant way to filter out all the elements of a sequence “sequence”, according to the return value of the function function (i.e., if the function returns True it will be kept in the returned iterator object of the filter function). 

In other words: The function filter(f,l) needs a function f as its first argument. f has to return a Boolean value (i.e. either True or False). This function will be applied to every element of the list l. Only if f returns True will the element be produced by the iterator — which is the return value of filter function. 

In the following example, we filter out first the odd and then the even elements of the sequence of the first 10 Fibonacci numbers: 

>>> fibonacci = [0,1,1,2,3,5,8,13,21,34]
>>> odd_numbers = list(filter(lambda x: x % 2, fibonacci))
>>> print(odd_numbers)
[1, 1, 3, 5, 13, 21]
>>> even_numbers = list(filter(lambda x: x % 2 == 0, fibonacci))
>>> print(even_numbers)
[0, 2, 8]

>>> 
  • An example of combining filter() and lambda functions
>>> filter(lambda x: x % 2 == 0, list(range(10,100)))

<filter object at 0x106d1c208>

#this will return all even number between 10 and 100.
>>> list(filter(lambda x: x % 2 == 0, list(range(10,100))))

[10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98]

>>> 

 

  • The reduce() function 

reduce() had been dropped from the core of Python when migrating to Python 3. It was moved into the module functools.

reduce(func, seq) 

continually applies the function func() to the sequence seq. It returns a single value. 

If seq = [ s1, s2, s3, … , sn ], calling reduce(func, seq) works like this:

  • the first two elements of seq will be applied to func, i.e. func(s1,s2). The list on which reduce() applied to looks like this now: [ func(s1, s2), s3, … , sn ]
  • Then,  func will be applied on the previous result and the third element of the list, that is, func(func(s1, s2),s3)
    The list now looks like this: [ func(func(s1, s2),s3), … , sn ]
  • repeat the steps until just one element is left and return this element as the result of reduce() function.

If n is equal to 4 the previous explanation can be illustrated like this: Reduce

The following  simple example illustrates how reduce() works. 

>>> import functools
>>> functools.reduce(lambda x,y: x+y, [47,11,42,13])
113
>>>

The following diagram shows the intermediate steps of the calculation: 

 

See below for some examples of using reduce() function.

 

#get maximum number from a list using reduce():

>>> from functools import reduce
>>> f = lambda a,b: a if (a > b) else b
>>> reduce(f, [47,11,42,102,13])
102
>>>
# Calculating the sum of the numbers from 1 to 100:
>>> from functools import reduce
>>> reduce(lambda x, y: x+y, range(1,101))
5050

It’s very straightforward to change the previous example to calculate the product (the factorial) from 1 to a number. We just need to change the “+” into “*”:

>>> reduce(lambda x, y: x*y, range(1,5))

24

>>> 

 

References:

 

Parallel Programming using MPI in Python

This post introduces Parallel Programming using MPI in Python.

The library is mpi4py (MPI and python extensions of MPI), see here for its code repo on bitbucket.

Laurent Duchesne provides an excellent step-by-step guide for parallelizing your Python code using multiple processors and MPI. Craig Finch has a more practical example for high throughput MPI on GitHub. See here for more mpi4py examples from Craig Finch.

An example of TensorFlow using MPI can be found here.

References:

 

Overcoming frustration: Correctly using unicode in python2

>>> string = unicode(raw_input(), 'utf8')
café
>>> log = open('/var/tmp/debug.log', 'w')
>>> log.write(string)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 3: ordinal not in range(128)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 3: ordinal not in range(128)

Okay, this is simple enough to solve: Just convert to a byte str and we’re all set:

>>> string = unicode(raw_input(), 'utf8')
café
>>> string_for_output = string.encode('utf8', 'replace')
>>> log = open('/var/tmp/debug.log', 'w')
>>> log.write(string_for_output)
>>>

Deal exclusively with unicode objects as much as possible by decoding things to unicode objects when you first get them and encoding them as necessary on the way out.

If your string is actually a unicode object, you’ll need to convert it to a unicode-encoded string object before writing it to a file:

(if you are not sure what is the type of your string. use type(your string) to check it, if it is something looks like u ‘….’, it is a unicode string.)

foo = u'Δ, Й, ק, ‎ م, ๗, あ, 叶, 葉, and 말.'
f = open('test', 'w')
f.write(foo.encode('utf8'))
f.close()

When you read that file again, you’ll get a unicode-encoded string that you can decode to a unicode object:

f = file('test', 'r')
print f.read().decode('utf8')