
Advanced Python

Your One, True Love

Python Boot Camp 2010 - Session 9 - January 14

Presented by Anthony Scopatz

As you have seen by now, Python is a rather healthy language. But it is the cardamom that makes the lassi! In "Advanced Python" we'll talk about that extra behind-the-scenes spice that turns Python into your one true love. Sometimes these are simply neat syntactical tricks and sometimes they are basic to how the language works. The parts of Python that we'll talk about here are 'advanced' not because they are difficult (in fact, they are easy to learn). Rather, it is because you can usually get by perfectly well without them! However, knowing them makes your life that much easier.

Advanced Python Topics:

  1. The Zen of Python
  2. Special (Magic) Functions, Methods, & Modules
  3. Decorators (@)
  4. The lambda Expression
  5. The with Statement
  6. Scripting and subprocess

The Zen of Python

Alright, let's start this!

import this
>>> The Zen of Python, by Tim Peters
>>> 
>>> Beautiful is better than ugly.
>>> Explicit is better than implicit.
>>> Simple is better than complex.
>>> Complex is better than complicated.
>>> Flat is better than nested.
>>> Sparse is better than dense.
>>> Readability counts.
>>> Special cases aren't special enough to break the rules.
>>> Although practicality beats purity.
>>> Errors should never pass silently.
>>> Unless explicitly silenced.
>>> In the face of ambiguity, refuse the temptation to guess.
>>> There should be one-- and preferably only one --obvious way to do it.
>>> Although that way may not be obvious at first unless you're Dutch.
>>> Now is better than never.
>>> Although never is often better than *right* now.
>>> If the implementation is hard to explain, it's a bad idea.
>>> If the implementation is easy to explain, it may be a good idea.
>>> Namespaces are one honking great idea -- let's do more of those!

Yes to all of these. The above was written by Tim Peters and has been recorded since 2004 as PEP 20 (http://www.python.org/dev/peps/pep-0020/). These mantras are meant to be guiding principles rather than strict coding standards. It is a good idea to keep them in your back pocket while you write. However, some of them should be adhered to more rigidly than others.

"Special cases aren't special enough to break the rules." - In some languages, which we won't mention by name, the basic syntax is constantly overridden to perform even the simplest of tasks. Python's core language does an excellent job of avoiding special cases. You should too. As a real-world analogy, special cases are like the irregular forms of nearly every commonly used verb in English. But your computer doesn't speak English. It probably speaks something closer to Lojban (http://en.wikipedia.org/wiki/Lojban). Try to talk in Lojban to your computer.

"There should be one-- and preferably only one --obvious way to do it." - It is hard for me to stress how central this is to Python. A comparison between Python's numerical types (http://docs.python.org/library/stdtypes.html#numeric-types-int-float-long-complex) and the number types of C-based languages illustrates this point very well.

If you talk to a mathematician (http://www.mathsisfun.com/sets/number-types.html), they'll tell you about all sorts of basic numerical types. Namely there are the Counting, Integer, Rational, Irrational, Algebraic, Imaginary, and Complex numbers. However, many of the number sets are strict subsets of one of the other number types.

The pythonic way of interpreting these mathematical facts is to reduce these to only 4 types! They are int(), float(), long(), and complex(). Moreover unless you really need long() or complex(), you will probably only ever need to use int() and float(). Amazing.

On the other hand, C-based languages take the opposite approach and specify the number types explicitly. For instance, 'int' will still declare an integer but an 'unsigned int' is actually a counting number! Arguably, restricting a value to non-negative integers is no longer necessary for most modern applications. This is because computer memory is cheap and abundant, so storing a sign (+/-) is a non-issue. Additionally, in a worst-case scenario, different C compilers may use different precisions for the same numerical type. This can cause no end of headaches.

The fact that Python handles basic numbers in an intuitive, standardized way is one of its central advantages as a language.
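As a minimal sketch of this point, the built-in numeric types can be inspected directly. This assumes Python 3, where long() has been folded into int() so that integers grow to arbitrary size on their own; under the Python 2 used in this session, the large value below would be a long instead.

```python
# A quick look at Python's built-in numeric types.
# Assumption: Python 3, where int and long are a single type.
small = 7
big = 2 ** 100          # no overflow; the integer simply grows
ratio = 7 / 2.0         # dividing by a float gives a float
z = complex(1, 2)       # a complex number, 1 + 2j

for value in (small, big, ratio, z):
    print("{0!r} is a {1}".format(value, type(value).__name__))
```

Nothing here needed a declaration of signedness or precision; the type follows from the value.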


Special (Magic) Functions, Methods, & Modules

Magic (or 'special' in Python) functions are the language's way of letting you explicitly alter how the language works. Special functions always have two underscores before and after the name (__*__). They are a way for you to get your classes, functions, and methods to do exactly what you want them to without resorting to ugly kludges. More information can be found at http://docs.python.org/reference/datamodel.html#special-method-names.

NOTE: Special functions in Python are not related to the special functions we saw yesterday in SciPy. The functions present in the scipy.special module are commonly used (though not universally applicable) mathematical functions. Python special functions indicate that this is where the 'magic' happens in the language.

Special Functions Class Example: Let's see how special methods work with regards to classes.

EXAMPLE:
class MyClass: 
        def __init__(self, n, s=""):
                self.n = int(n)
                self.s = str(s)

mc = MyClass(5.0, "Five")

print(mc.n)
print(mc.s)
print(mc)
>>> 5
>>> Five
>>> <__main__.MyClass instance at 0x7fd977fd25a8>

Here, the special __init__() method acts as a constructor. It is called once when the object is initialized. All the initializer does here is assign an integer n and a string s to the new object. Of course, printing mc itself is not particularly useful. That mc is located in memory at 0x7fd977fd25a8 doesn't matter to most mortals. But what the print() function does is call str(object), or more precisely, object.__str__()!

EXAMPLE:
class MyClass: 
        def __init__(self, n, s=""):
                self.n = int(n)
                self.s = str(s)

        def __str__(self):
                return "Maybe {0} = {1}...".format(self.s, self.n)

mc = MyClass(7.0, "Ten")
print(mc)
>>> Maybe Ten = 7...

Note that in both __init__() and __str__(), the keyword self must be passed to the method so that it can access the object's other attributes! (ASIDE: Technically, self is not a keyword in Python, just a convention. However, I have never seen anyone use anything but self! http://docs.python.org/tutorial/classes.html#random-remarks) Other basic converters are defined similarly:

__str__(self) Called to implement str() conversion.
__int__(self) Called to implement int() conversion.
__float__(self) Called to implement float() conversion.
__len__(self) Called to calculate an object's length via len().
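As a rough sketch of how these converters are picked up by the built-ins, consider the class below. Temperature is a made-up name for illustration and is not part of the session files; __len__() is left out on purpose, since it is the subject of the hands-on exercise.

```python
class Temperature(object):
    "A hypothetical class holding a number of degrees."
    def __init__(self, degrees):
        self.degrees = degrees

    def __str__(self):
        # Called by str(obj) and therefore by print(obj)
        return "{0} degrees".format(self.degrees)

    def __int__(self):
        # Called by int(obj)
        return int(self.degrees)

    def __float__(self):
        # Called by float(obj)
        return float(self.degrees)

t = Temperature(98.6)
print(str(t))
print(int(t))
print(float(t))
```

Each built-in simply looks up the matching special method on the object and calls it.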

Finally, documenting in Python is easy, fast, and special. Documentation is stored as a string named object.__doc__.

EXAMPLE:
class MyClass: 
        "Hey"
        def __init__(self, n, s=""):
                "I am initializing MyClass"
                self.n = int(n)
                self.s = str(s)

        def __str__(self):
                return "Maybe {0} = {1}...".format(self.s, self.n)

        __doc__ = __doc__ + " Nonny Nonny"

mc = MyClass(7.0, "Ten")
print(mc.__doc__)
print(mc.__init__.__doc__)
>>> Hey Nonny Nonny
>>> I am initializing MyClass

Here we see two methods for defining a docstring. The first implicitly sets __doc__ by placing a string literal directly underneath the class or function definition. Alternatively, you can modify the __doc__ string explicitly.

Hands-on Example

Modify MyClass to include a special length method. This should return an integer that is already intrinsic to the class somehow...

Then use the built-in function len() to retrieve the length of one of your MyClass objects!

Special Function Numeric Type Example: Probably one of the most useful applications of special functions is that you can override, emulate, or replace numerical types and operators for your classes! Once again in the C-universe, this is known as operator overloading. Rather than letting you fundamentally change the operator (+, -, *, /, %, **, //), an operator in Python is simply a flag for calling the corresponding special function on one of the two objects (__add__(), __sub__(), etc). Say that x and y are two numerical types. The following expressions are roughly equivalent (though you'd only ever use the first one); Python tries x.__add__(y) first and falls back on y.__radd__(x) only if the first is not implemented:

EXAMPLE:
z = x + y
z = x.__add__(y)
z = y.__radd__(x)

Analogous expressions exist for all operators. You can find a full set of definitions at http://docs.python.org/reference/datamodel.html#emulating-numeric-types.
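To see this dispatch in action, here is a minimal sketch using a made-up Meters class (an assumption for illustration only). Python tries the left operand's __add__() first and falls back to the right operand's __radd__() only when the first attempt returns NotImplemented.

```python
class Meters(object):
    "A hypothetical length type used only for this illustration."
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Handles Meters + something
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        if isinstance(other, (int, float)):
            return Meters(self.value + other)
        return NotImplemented

    def __radd__(self, other):
        # Handles something + Meters, once the left operand has given up
        return self.__add__(other)

x = Meters(2)
y = Meters(3)

print((x + y).value)          # operator form
print(x.__add__(y).value)     # explicit special-method form
print((10 + x).value)         # int cannot add a Meters, so x.__radd__(10) runs
```

The last line is the case where __radd__() earns its keep: the built-in int knows nothing about Meters, so the right-hand operand gets a chance.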

For our example here recall that a metric space is simply some set M with a distance function d defined on it (http://mathworld.wolfram.com/MetricSpace.html). Next, consider the most simplistic set we can: M = {0,1}. That is to say that M is binary. Then we need an equally simple d. In words, suppose I existed in this space. Then the distance to travel from me to me (going nowhere) is zero (x=y iff d(x,y)=0). On the other hand, if I wanted to go from myself to anywhere else, the distance would be one (x!=y iff d(x,y)=1). Not terribly practical...
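Written as plain code, this distance function is tiny. The sketch below defines d on M = {0,1} and checks by brute force that it satisfies the usual metric-space requirements (identity, symmetry, and the triangle inequality); the names M and d follow the text, everything else is illustrative.

```python
from itertools import product

M = (0, 1)

def d(x, y):
    "Distance in the simple metric space: 0 to stay put, 1 to go anywhere else."
    return 0 if x == y else 1

# Check the metric requirements over every combination of points in M.
for x, y, z in product(M, repeat=3):
    assert (d(x, y) == 0) == (x == y)      # identity
    assert d(x, y) == d(y, x)              # symmetry
    assert d(x, z) <= d(x, y) + d(y, z)    # triangle inequality

print([(x, y, d(x, y)) for x, y in product(M, repeat=2)])
```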

But we can effectively reproduce our simple metric space (and number type) using special functions! First, define a simple number type:

EXAMPLE:
class SimpleNum():
        "A Simple Number in a Simple Metric Space."
        def __init__(self, x):
                if int(x) == 0:
                        self.val = 0
                else:
                        self.val = 1

        def __str__(self):
                return str(self.val)

        def __int__(self):
                return int(self.val)

        def __float__(self):
                return float(self.val)

        def __add__(self, y):
                if self.val == 1:
                        return SimpleNum(1)
                elif int(y) == 0:
                        return SimpleNum(0)
                else:
                        return SimpleNum(1)

        #__add__() could be more concisely written as:
        #def __add__(self, y):
        #       return SimpleNum(self.val + SimpleNum(y).val)
        #where __init__() takes care of the details...

        def __radd__(self, y):
                "Addition is commutative! Here, at least..."
                return self.__add__(y)

        def __sub__(self, y):
                y = SimpleNum(y)
                return SimpleNum(self.val - y.val)

        def __rsub__(self, y):
                "Since we have no negative numbers, subtraction is also commutative!"
                #self is already bound, so only y is passed along
                return self.__sub__(y)

        def __mul__(self, y):
                y = SimpleNum(y)
                return SimpleNum(self.val * y.val)

        def __rmul__(self, y):
                if isinstance(y, SimpleNum):
                        return self.__mul__(y)
                else:
                        return NotImplemented

Great! Let's test the different operators, starting with addition.

EXAMPLE:
#Define some SimpleNums
i = SimpleNum(0)
j = SimpleNum(37.0)
k = SimpleNum(True)

print("i = {0}".format(i))
print("j = {0}".format(j))
print("k = {0}".format(k))
print("")

#Define SimpleNums from additive operations 
a = i + 0
b = 0 + j
c = a + b

print("a = {0}".format(a))
print("b = {0}".format(b))
print("c = {0}".format(c))
>>> i = 0
>>> j = 1
>>> k = 1
>>> 
>>> a = 0
>>> b = 1
>>> c = 1

And now for subtraction:

EXAMPLE:
#Define SimpleNums from subtraction
p = a - b
q = k - (c + 1)
r = k - (c - 1) 

print("p = {0}".format(p)) 
print("q = {0}".format(q))
print("r = {0}".format(r))
>>> p = 1
>>> q = 0
>>> r = 1

And finally, for multiplication:

EXAMPLE:
#Define SimpleNums from multiplication
e = j * b
f = q * True
g = c * 3
print("e = {0}".format(e))
print("f = {0}".format(f))
print("g = {0}".format(g))
print("")

h = 50.0 * k
print("h = {0}".format(h))
>>> e = 1
>>> f = 0
>>> g = 1
>>> 
>>> Traceback (most recent call last):
>>>   File "PyBC_S9_simple_num.py", line 88, in <module>
>>>     h = 50.0 * k
>>> TypeError: unsupported operand type(s) for *: 'float' and 'instance'

Everything was going great! But what is this TypeError? It was thrown for h because the __rmul__() method specifically disallows right multiplication of a simple number by anything that isn't a simple number! In this way you can disallow operations between various types. For example, while division of a float by an integer usually makes sense, float division by a string rarely does.
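The mechanism can be isolated in a few lines. In this sketch, Picky is a hypothetical class (not part of PyBC_S9_simple_num.py) whose __rmul__() always returns NotImplemented. Returning NotImplemented, rather than raising an error yourself, tells Python that this operand declines; when both operands decline, Python raises the TypeError for you.

```python
class Picky(object):
    "A hypothetical class that only multiplies with its own kind."
    def __init__(self, val):
        self.val = val

    def __mul__(self, other):
        if isinstance(other, Picky):
            return Picky(self.val * other.val)
        return NotImplemented

    def __rmul__(self, other):
        # Reached for expressions such as 50.0 * Picky(1)
        return NotImplemented

try:
    50.0 * Picky(1)
    outcome = "multiplied"
except TypeError as err:
    outcome = "TypeError: {0}".format(err)

print(outcome)
print((Picky(2) * Picky(3)).val)
```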

In this short example, we have fairly soundly defined our own metric space in Python! Naturally, this can balloon as much as you need it to. You may grab the above example from PyBC_S9_simple_num.py.

Special Modules Example: Lastly, when you write a Python Package (a collection of Python Modules) there is one very special module: __init__.py. Think of this as analogous to an object's __init__() method but for files. However, rather than being called when an object is newly instantiated, __init__.py is executed when a package is first imported. Say you have the following directory structure:

pkg/
    __init__.py
    mod1.py
    mod2.py
    submodules/
               __init__.py
               submod1.py
    notinpkg/
             unseenmod1.py

In this case, the following expressions would be roughly equivalent:

EXAMPLE:
#Import the package:
import pkg
import pkg.__init__

#Import the submodules:
import pkg.submodules
import pkg.submodules.__init__

Note that since the directory pkg/notinpkg/ does not contain an __init__.py file, unseenmod1.py will never be imported by Python!

The great thing about __init__.py files is that they can be completely empty! They simply need to exist for the package to work properly. Naturally, you can put whatever code you want in them. For more information please refer to http://docs.python.org/tutorial/modules.html#packages.
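To watch an __init__.py run at import time, the following sketch builds a throwaway package in a temporary directory and imports it. The package name demo_pkg and its contents are invented for this illustration.

```python
import os
import sys
import tempfile

# Build a throwaway package on disk. The name demo_pkg is made up for this sketch.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)

with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("GREETING = 'set by __init__.py'\n")
with open(os.path.join(pkg_dir, "mod1.py"), "w") as f:
    f.write("def answer():\n    return 42\n")

# Make the temporary directory importable, then import the package.
sys.path.insert(0, root)
import demo_pkg
from demo_pkg import mod1

print(demo_pkg.GREETING)   # defined when __init__.py ran during the import
print(mod1.answer())
```

Anything assigned in __init__.py becomes an attribute of the package object itself, which is why demo_pkg.GREETING is available right after the import.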

Hands-on Example

Make a new directory called 'specialpkg' and put the MyClass and SimpleNum files inside of it. Now write your own __init__.py file that imports everything from MyClass and SimpleNum automatically when you import specialpkg.


Decorators (@)

Decorators are the Python version of macros. Or rather they have the same fundamental goal of macros. The job of a decorator is to inject some bit of code non-intrusively into or around another part of the program. However unlike macros (and preprocessor directives) in other languages, Python decorators are written in the same language as the code itself! You would think that this wouldn't be an earth-shattering idea...

A decorator for a function is placed before the function definition and is specified syntactically with the '@'-symbol. The only requirement on decorators is that they be callable (i.e. have a __call__() method). Therefore any function can be a decorator, as can any class where def __call__(self) is defined. The documentation describes decorators in the most concise fashion (http://docs.python.org/reference/compound_stmts.html#function-definitions):

@f(x)
@g
def h():
     ...
     return something

#The above is equivalent to:
def h():
     ...
     return something
h = f(x)(g(h)) 
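The equivalence can be checked with a runnable sketch. Here f and g are hypothetical stand-ins: f(x) is a decorator factory that multiplies a function's result by x, and g is a plain decorator that adds one.

```python
def f(x):
    "Hypothetical decorator factory: the returned decorator multiplies results by x."
    def decorator(func):
        def wrapper(*args, **kwargs):
            return x * func(*args, **kwargs)
        return wrapper
    return decorator

def g(func):
    "Hypothetical plain decorator: adds one to the result."
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs) + 1
    return wrapper

# Decorator syntax: the decorator closest to the def is applied first.
@f(10)
@g
def h():
    return 1

# The same thing, built by hand.
def h_manual():
    return 1
h_manual = f(10)(g(h_manual))

print(h())          # g adds one (1 + 1), then f(10) multiplies by ten
print(h_manual())
```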

Let's see how decorators work in practice. Let's define a null decorator, which for any function will return 'None'.

EXAMPLE:
def null(f):
        def null_func(*args, **kwargs):
                return None
        return null_func

def CrazyFunc(j, m, s=""):
        return "{0} wants a {1} in its {2}!".format(s, j, m)

@null
def NotSoCrazyFunc(j, m, s=""):
        return "{0} wants a {1} in its {2}!".format(s, j, m)

print(CrazyFunc("Banana", 3.14, "My Shower"))
print(NotSoCrazyFunc("Banana", 3.14, "My Shower"))
>>> My Shower wants a Banana in its 3.14!
>>> None

As you can see, when CrazyFunc is decorated by null it becomes not quite the wild time it once was. Moreover, null is portable, so whenever you wish to kill a function's output all you need to do is decorate its definition. The one subtlety here is that null() doesn't return None. Rather it returns a function, null_func() (nested within itself), that returns None. Additionally, null takes a function as an argument (the function to be decorated). It is null_func that accepts any arguments and any keyword arguments. These are the parameters that would have been passed to your original function at runtime.

On a more practical level, let's define some functions and decorators that will either add one to an integer, square it, or do nothing. Also, to avoid nesting functions, we'll use classes as decorators.

EXAMPLE:
class addone_dec:
        def __init__(self, f):
                self.f = f

        def __call__(self, *args, **kwargs):
                return self.f(*args, **kwargs) + 1

class square_dec:
        def __init__(self, f):
                self.f = f

        def __call__(self, *args, **kwargs):
                return self.f(*args, **kwargs)**2

#Add one, then square
@square_dec
def addone(x):
        return x + 1

#Square x, then add one
@addone_dec
def square(x):
        return x * x

#Do nothing, then add one, then square, 
#then add another one, then add a final one
@addone_dec
@addone_dec
@square_dec
@addone_dec
def donothing(x):
        return x

print(addone(2))
print(square(2))
print(donothing(2))
>>> 9
>>> 5
>>> 11

More documentation may be found at http://docs.python.org/reference/compound_stmts.html#function-definitions. However, a wonderful tutorial on decorators is available at http://www.artima.com/weblogs/viewpost.jsp?thread=240808. Lastly, you can grab the above examples at PyBC_S9_decorators.py.

Hands-on Example

Write a decorator that prints the value of the function AFTER it has been called. Then simply return the value of the function without altering it further. Lastly apply your new creation to some functions!


The lambda Expression

First off, let me say that lambdas are not scary! You have probably run across them in someone else's code and they may have seemed like magic. A lambda is simply a way of creating an unnamed (anonymous) function. Such a function is restricted to only one expression (so that lambdas always fit on a single line). See, that wasn't so bad...

A bit of history: Python's lambdas were inspired by an equivalent expression in LISP, a functional language. In fact, LISP has been called "the most intelligent way to misuse a computer" (http://www.wisdomandwonder.com/link/2941/lisp-is-the-smartest-way-to-misuse-a-computer). It is probably true; few languages can claim to have books that teach you using solely the Socratic Method (The Little LISPER & The Little Schemer)! Lambdas in programming are simply an implementation of the mathematical lambda calculus (http://en.wikipedia.org/wiki/Lambda_calculus).

For example, say you needed a quick function that simply tested if a number was equal to 20:

EXAMPLE:
#Declare the function
lambda x: x == 20
>>> <function <lambda> at 0x7fb2797b3140>
EXAMPLE:
#Now call the function
(lambda x: x == 20)(10)
>>> False
EXAMPLE:
#Assign it to another variable
g = lambda x: x == 20
g(20.0)
>>> True

Notice how if you were to define g as a normal function, you would have to use 'def', '()', and 'return'. All of this is captured by lambda automatically.

EXAMPLE:
#In lambda form
g = lambda x: x == 20

#In regular form
def g(x):
     return (x == 20)

Because lambda expressions evaluate to functions, and because they are so small, you can define them in-line as an argument to another function! Let's define a function that shows a string before and after a lambda filter has been applied.

EXAMPLE:
def str_filter(s, n, f):
        return s + " --> " + f(s, n)

str_filter("Da ", 3, lambda s, n: s*n )
>>> Da  --> Da Da Da 
EXAMPLE:
str_filter("Sham Wow ", 4, lambda s, n: s[n:] + s[:n]) 
>>> Sham Wow  -->  Wow Sham
EXAMPLE:
str_filter("NeverOddOrEven", 500, lambda s, n: s[::-1]) 
>>> NeverOddOrEven --> nevErOddOreveN
EXAMPLE:
str_filter("My love for the Dutch is undying", 4, lambda s, n: "The Flying " + s.split()[n] + "man")
>>> My love for the Dutch is undying --> The Flying Dutchman
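Passing a lambda in-line works just as well with built-in functions as with str_filter(). As one more sketch, the key argument of the built-in sorted() takes a one-argument function, which is a natural fit for a lambda. The word list is made up for illustration.

```python
words = ["cardamom", "lassi", "zen", "Dutch"]

# Sort by length; the lambda is defined right where it is needed.
by_length = sorted(words, key=lambda w: len(w))

# Sort alphabetically, ignoring case.
by_alpha = sorted(words, key=lambda w: w.lower())

print(by_length)
print(by_alpha)
```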

Hands-on Example

Using the lambda keyword and the str_filter(...) function above, write your own filter that replaces the middle character of a string s with n uppercase copies of that character.

Lambdas also make rather concise decorators:

EXAMPLE:
#The inner lambda must call f, or the decorated function would be ignored
addone_dec = lambda f: (lambda x: f(x) + 1)

@addone_dec
def f(x):
        return x  

f(1)
>>> 2

More on lambdas can be found in the documentation at http://docs.python.org/reference/expressions.html#lambda and http://docs.python.org/tutorial/controlflow.html#lambda-forms. You may get the above examples at PyBC_S9_lambda.py.


The with Statement

As of Python v2.5, we have had the with statement and its context managers. Basically, for an expression EXPR, possibly bound to the variable VAR, VAR = EXPR.__enter__() is called before executing the subsequent code block. Moreover, EXPR.__exit__() is called after the code block is exited for any reason, including exceptions! The syntax is built to pass exceptions within the code block on up to the next level. Using with will therefore reduce the number of try statements needed within EXPR's context. Syntactically,

with EXPR as VAR:
     #implicitly perform VAR = EXPR.__enter__()
     execute BLOCK of code
     #implicitly perform EXPR.__exit__()

This has been detailed in PEP 343 (http://www.python.org/dev/peps/pep-0343/).
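Spelled out by hand, a with statement behaves roughly like the sketch below, which is a simplification of the translation given in PEP 343. Note that __exit__() actually receives three arguments describing any exception (its type, value, and traceback), or three Nones when the block finished normally. The io.StringIO object is used here only because it is a convenient ready-made context manager.

```python
import io
import sys

mgr = io.StringIO(u"cardamom\nlassi\n")   # any context manager would do

var = mgr.__enter__()          # this is what "as VAR" receives
try:
    first_line = var.readline()            # the BLOCK of code
except BaseException:
    # __exit__() gets the exception details and may suppress it by returning True
    if not mgr.__exit__(*sys.exc_info()):
        raise
else:
    # No exception: __exit__() is still called, with three Nones
    mgr.__exit__(None, None, None)

print(first_line.strip())
print(mgr.closed)
```

For file-like objects, __exit__() closes the object, which is why mgr.closed is true afterwards.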

Probably the most common use of with is for file objects. Here, with will ensure that the file is opened and closed properly even in the face of extraneous errors. The following example will open 'afile.txt' and place each line's tenth character in a list.

EXAMPLE:
col_ten = []
with open('afile.txt', 'r') as f:
     for line in f:
          col_ten.append(line[9])

print(col_ten)

Note that the above will fail if any line has fewer than ten characters in it. But with ensures that the file will be closed BEFORE Python stops the program.
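This guarantee is easy to check. The sketch below writes a small temporary file whose second line is deliberately too short, lets the IndexError happen inside the with block, and then inspects the file object. The file contents are made up for illustration.

```python
import os
import tempfile

# Write a small throwaway file; the second line is too short on purpose.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("0123456789abc\nshort\n")

col_ten = []
try:
    with open(path, "r") as f:
        for line in f:
            col_ten.append(line[9])    # raises IndexError on the short line
except IndexError:
    print("A line was too short, but f.closed is {0}".format(f.closed))

print(col_ten)
os.remove(path)
```

The exception still propagates out of the with block (here it is caught by the surrounding try), but the file has already been closed by the time the handler runs.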

Exercise Left for the Reader

Define your own class using your own __enter__() and __exit__() methods such that it would be compatible with the with statement. Simply make sure to follow the specifications found at http://docs.python.org/reference/compound_stmts.html#the-with-statement and http://docs.python.org/reference/datamodel.html#context-managers.

For example, try making a Year context that contains months and birthdays. Then have a Year.birthdays("Month") function that returns a string naming everyone with a birthday in that month. The following should print all birthdays for each month:

EXAMPLE:
with Year({'jenny': 'May 30th', 'Dr. Cyclops': 'January 1'}) as y:
     for m in y.months:
          print(y.birthdays(m) + ' were all born in ' + m + '!')

Scripting and subprocess

Python is often called a scripting language. What this means exactly is best left to those more semantically inclined. Definitional disagreements aside, Python can be used to do what most people consider scripting (though this is not the language's primary objective). Programs that emulate Shell scripts can be written with a minimum of overhead using the 'subprocess' module.

Historically, there have been no fewer than four methods that existed simultaneously to simply open a pipe and execute a command. (Not very Zen!) As of Python v2.6, these are thankfully deprecated. What we now have is a single, robust subprocess.Popen class. However, while Popen is powerful, it is often more convenient to use the subprocess.call() function, which accepts the same arguments as Popen's initializer. Arguments to the process are given as a list of strings.

EXAMPLE:
import subprocess

#Find all instances of Tofu in our imitation meat recipes
subprocess.call(["grep", "-r", "Tofu", "Recipes/FakeMeat/"])

#Popen has the kwarg "shell=False"
#But setting shell=True will cause the first string argument 
#to be executed in the 'usual way'.
subprocess.call("grep -r Tofu Recipes/FakeMeat/", shell=True)

Great, now any commands we write in BASH we can easily wrap in Python! However, better than BASH, we get the benefit of Python's wonderful lists, strings, loops, etc. It should be noted that Gentoo's Portage package management system is effectively just the above methodology on steroids.

Still, there are some subtleties about subprocesses that need to be noted. The first is that call() does not return the stdout string! Rather it returns the return code of the process after it has finished executing. This is the equivalent of the BASH variable $?.

EXAMPLE:
import subprocess

#A successful call will return 0
#Without shell=True, the command and its arguments must be given as a list
return_code = subprocess.call(["ls", "-l", "/home/"])
print(return_code)
>>> 0
EXAMPLE:
#A failed call will return > 0
return_code = subprocess.call(["cp", "nothere.txt"])
print(return_code)
>>> 1
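The exact return codes of ls and cp depend on your platform, so here is a portable sketch of the same idea. It assumes only that sys.executable points at a working Python interpreter and uses that interpreter as the child process.

```python
import subprocess
import sys

# Run a child Python that exits cleanly, then one that exits with status 3.
ok = subprocess.call([sys.executable, "-c", "pass"])
failed = subprocess.call([sys.executable, "-c", "import sys; sys.exit(3)"])

print(ok)
print(failed)
```

Whatever status the child process exits with is exactly what call() hands back.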

However, this doesn't mean that there is no way to obtain the stdout and stderr strings. To do this, though, we have to return to the Popen object and specify that output should go to PIPE rather than None. We can then use the Popen object's communicate() method to grab the values of stdout and stderr. For example, let's make a list of the process identification numbers of all awesome programs we are currently running:

EXAMPLE:
import subprocess

sp = subprocess.Popen("ps ux | grep awesome", stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
spout, sperr = sp.communicate()

PIDs = []
#splitlines() avoids the empty string that split('\n') leaves after the final newline
for line in spout.splitlines():
     PIDs.append(line.split()[1])

print(PIDs)
>>> ['2158', '6299', '6301']
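Since the output of ps differs from machine to machine, here is a self-contained sketch of the same Popen and communicate() pattern that again uses the current Python interpreter as the child process. The universal_newlines=True argument asks for text rather than raw bytes, which matters under Python 3, where the pipes otherwise return bytes.

```python
import subprocess
import sys

# The child writes one line to each stream.
child_code = "import sys; print('to stdout'); sys.stderr.write('to stderr\\n')"

sp = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,   # ask for text rather than raw bytes
)
spout, sperr = sp.communicate()   # waits for the child and collects both streams

print(spout.strip())
print(sperr.strip())
print(sp.returncode)
```

After communicate() returns, the child has finished, so sp.returncode holds the same value that call() would have returned.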

As you can see, the subprocess module is how you make Python truly interact with your computer! For more information please see http://docs.python.org/library/subprocess.html.
