Python in a minute

This page gives a short introduction to the Python programming language, tailored to first-time users of the Python bindings of opals modules. Most of the information given on Python is based on the documentation on the official Python website.

Installing Python

While the reference implementation (a.k.a. CPython) is written in C, implementations exist in several other languages, including Java and C#. CPython is currently available in 2 production versions. As Python 3 is backwards-incompatible with Python 2, and more third-party software is compatible with the latter (opals can be used with both), installing Python 2 still seems to be advisable. Ready-to-use installers for CPython are available for download, and Python can be extended later on (for that purpose, consider the package pip). Alternatively, one may opt for a pre-packaged Python distribution that already comes with an extended set of features. Among these, pythonxy shall be mentioned, which bundles some of the most powerful Python modules, including

  • GDAL: Geospatial Data Abstraction Library
  • PyOpenCV: computer vision
  • PIL: Python Imaging Library, image processing
  • IPython: an enhanced Python shell that "feels like" the one of Matlab®
  • NumPy: multidimensional arrays and basic operations
  • SciPy: advanced math, signal processing, optimization, statistics, etc.
  • SymPy: symbolic mathematics
  • Matplotlib: 2D plotting library similar to the plotting features of Matlab®
  • Enthought Tool Suite: includes Mayavi, a powerful 2D and 3D visualization tool
  • VTK: 3D graphics toolkit

After installation, the Python interpreter may directly be invoked from a command prompt. However, several comprehensive integrated development environments are available, offering symbolic debuggers, code completion, project management, and more. Among them, PyScripter and Spyder shall be mentioned.

Basic language features and the Python standard library

Invoking the Python interpreter and subsequently importing the module "this" reveals a compact formulation of the philosophy behind Python:

C:\>python
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
>>> quit()
C:\>

Comments

Comments are introduced by a hash sign (#) and extend until the end of line.

Numeric types

  • int: signed integer, at least 4 bytes
  • bool: a subtype of int, with the predefined constants True, False
  • long: integer with unlimited precision (Python 2 only; in Python 3, int itself has unlimited precision)
  • float: real number, typically 8 bytes
  • complex: complex number consisting of 2 floats

All common arithmetic operators are supported, which adhere to their common precedence:

>>> 3 * 4 + 1
13
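
As a minimal sketch of the numeric types listed above, the following lines show operator precedence, integer and floating-point division, complex arithmetic, and bool acting as a subtype of int. It is written as a script rather than an interpreter session and is meant to run under both Python 2.7 and Python 3; the variable names are chosen for illustration only.

```python
# Multiplication binds tighter than addition.
a = 3 * 4 + 1          # 13
# Parentheses override the default precedence.
b = 3 * (4 + 1)        # 15

# Floor division and remainder on integers.
q = 7 // 2             # 3
r = 7 % 2              # 1

# Mixing an int with a float yields a float.
f = 7 / 2.0            # 3.5

# Complex literals use the suffix 'j' for the imaginary part.
c = (1 + 2j) * (1 - 2j)    # (5+0j)

# bool is a subtype of int: True behaves like 1 in arithmetic.
t = True + True        # 2
is_sub = issubclass(bool, int)   # True
```

Note that the plain division of two integers (7 / 2) behaves differently in Python 2 (floor division) and Python 3 (true division), which is why the sketch uses // and an explicit float instead.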

Collections

Collections combine zero or more objects, which may themselves be collections, into a common container. Note that collections in Python do not restrict the contained elements to be of a common data type.

Sequence types (only the relevant ones)

  • str: Strings are immutable (i.e. constant after construction). String literals are to be enclosed in either single or double quotes. The escape character is the backslash (\). When dealing with e.g. file paths containing backslashes, turn escaping off by preceding the string literal with an 'r' ("raw").
  • unicode: In contrast to str, whose character set is limited to the code page in use, unicode strings may contain practically any character of the world's writing systems. Unicode string literals are preceded with a 'u'. Kindly note that opals modules currently do not support unicode.
  • list: Lists are mutable objects - that is, their size and contents may be changed after construction. Lists are constructed using square brackets, with the items separated by commas.
  • tuple: Tuples are immutable objects and are constructed by the comma operator, with or without enclosing parentheses.

Sequence types share many common operations:

  • x in s: True if an item of s is equal to x, else False
  • x not in s: False if an item of s is equal to x, else True
  • s + t: the concatenation of s and t
  • s * n, n * s: n shallow copies of s concatenated
  • s[i]: ith item of s, where indexing is zero-based
  • s[i:j]: slice of s from i to j
  • s[i:j:k]: slice of s from i to j with step k
  • len(s): length of s
  • min(s): smallest item of s
  • max(s): largest item of s
  • s.index(i): index of the first occurrence of i in s
  • s.count(i): total number of occurrences of i in s

Strings support many functions for case conversion, searching, etc. Mutable sequence types support insertion, removal and sorting of elements via respective functions. A few examples:

# a string literal
>>> s = 'text'
# a string literal containing a '\' - precede with 'r'
# to turn escaping off
>>> p = r'D:\file.txt'
# a list of integers
>>> li = [ 1, 2, 3, 4, 5 ]
# extract the second element
>>> li[1]
2
# extract every second element
>>> li[::2]
[1, 3, 5]
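
The common sequence operations listed further above can be tried out on a string, a list and a tuple alike. The following is only a sketch with made-up values, written as a script that should run under both Python 2.7 and Python 3:

```python
s = 'opals'
li = [10, 20, 30, 20]
tu = (1, 2)

has_p = 'p' in s             # True: membership test on a string
joined = li + [40]           # [10, 20, 30, 20, 40]: concatenation
repeated = tu * 2            # (1, 2, 1, 2): repetition
middle = li[1:3]             # [20, 30]: slice from index 1 up to (not including) 3
every_second = li[::2]       # [10, 30]: slice with step 2
reverse = s[::-1]            # 'slapo': a negative step walks backwards
first_20 = li.index(20)      # 1: index of the first occurrence
n_20 = li.count(20)          # 2: number of occurrences
size = len(li)               # 4

# Strings offer many methods, e.g. for case conversion.
upper = s.upper()            # 'OPALS'

# Mutable sequences can additionally be changed in place.
li.append(5)
li.sort()                    # li is now [5, 10, 20, 20, 30]
```

Slicing and concatenation always return new objects, while append and sort modify the list itself and return None.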

Set types

Set types constitute unordered collections of distinct objects. Two set types are defined:

  • set mutable
  • frozenset immutable

Set types may be constructed by a comma-separated list of elements within curly braces (beginning with Python 2.7) and support the set operations union, intersection, difference, symmetric_difference.
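
A short sketch of the set operations named above, using made-up values. The sets are built with the set() constructor instead of curly braces, so that the lines should also work with Python versions older than 2.7:

```python
a = set([1, 2, 3, 4])
b = set([3, 4, 5])

u = a.union(b)                    # elements in a or b
i = a.intersection(b)             # elements in both a and b
d = a.difference(b)               # elements in a, but not in b
sd = a.symmetric_difference(b)    # elements in exactly one of a and b

# Duplicates are dropped on construction.
unique = set([1, 1, 2, 2, 2])

# A frozenset is immutable and hashable,
# so it may itself serve as a dictionary key.
fs = frozenset(a)
lookup = {fs: 'first set'}
```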

Mapping types

Mapping types map hashable objects to arbitrary values (all of Python's immutable built-in types are hashable). A single mapping type is built into Python: dict. dict may be initialized with a comma-separated list of key:value pairs within curly braces:

>>> tel_numbers={'Bruno':12224, 'Gottfried':12235, 'Johannes':12235}
>>> tel_numbers['Gottfried']
12235
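
Beyond construction and look-up, a dict is typically extended, queried and iterated. The following sketch reuses the entries from above; the name 'Anna' and the default value 0 are made up for illustration:

```python
tel_numbers = {'Bruno': 12224, 'Gottfried': 12235}

# Insert a new key:value pair by assignment.
tel_numbers['Johannes'] = 12235

# Test for the presence of a key.
known = 'Bruno' in tel_numbers          # True

# get() returns a default instead of raising a KeyError.
missing = tel_numbers.get('Anna', 0)    # 0

# Dictionaries are unordered here, so sort the keys for a stable order.
names = sorted(tel_numbers.keys())      # ['Bruno', 'Gottfried', 'Johannes']

# Remove an entry.
del tel_numbers['Bruno']
remaining = len(tel_numbers)            # 2
```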

Dynamic typing

Python variables are typed at the moment when a value is assigned, meaning that

  • a variable's type may change during its lifetime, and
  • there is no other way than assignment to set a variable's type

>>> var = 3
# query var's type using the built-in function 'type'
>>> type(var)
<type 'int'>
>>> var = 'text'
>>> type(var)
<type 'str'>

References vs. objects

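Variables in Python are only names (handles) that refer to objects. Assigning one variable to another never copies a value: it makes both names refer to the same object and increases that object's reference count. Whether this sharing becomes visible depends on what happens next. Assigning a new value to a name merely rebinds that name, whereas modifying a mutable object in place (e.g. replacing a list item) is seen through every name that refers to it. Numeric types are immutable, so they can only be rebound and hence appear to be copied by value.

>>> a = 1
>>> b = 2
>>> a = b
>>> b = 3
# b was rebound to another object, a still refers to 2:
>>> a
2
>>> li1 = [1, 2, 3]
>>> li2 = [4, 5, 6]
>>> li1 = li2
>>> li2[1] = 7
# the list was modified in place, li1 refers to the same list:
>>> li1
[4, 7, 6]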

However, comparing variables using operator == is done by-value: variables are guaranteed to compare equal if their values are identical, regardless of whether they refer to the same object.
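
The difference between equal values and identical objects can be made explicit with the operator is, which tests for identity. A minimal sketch with made-up variable names:

```python
li1 = [1, 2, 3]
li2 = [1, 2, 3]    # a second list with equal contents
li3 = li1          # a second name for the first list

equal_values = (li1 == li2)    # True: compared by value
same_object = (li1 is li2)     # False: two distinct objects
is_alias = (li1 is li3)        # True: both names refer to one object

# To obtain an independent list, copy it explicitly.
li4 = list(li1)    # shallow copy
li4[0] = 99        # does not affect li1
```

Note that list(li1) creates a shallow copy only: nested lists inside li1 would still be shared between the original and the copy.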

Code blocks

Code blocks group together a number of statements that shall be executed in order. In Python, code blocks must be marked by indentation (not by e.g. braces). This results in compact code, but may lead to problems if tabulators and space characters are both used for indentation (e.g. if Python code is read from file): following certain rules, the Python interpreter replaces each tabulator with 1 to 8 blanks. That is why mixing tabulators and space characters in the same file is inadvisable. Nested code blocks, as e.g. in function definitions and in the branches of conditional statements, must be introduced with a colon (:).
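
How indentation alone determines which statements belong to which block may be sketched as follows. The function name classify and its labels are made up for illustration, and 4 space characters are used per indentation level:

```python
def classify(values):
    # Body of the function: indented by one level.
    labels = []
    for v in values:
        # Body of the loop: indented by two levels.
        if v < 0:
            # Body of the if-branch: indented by three levels.
            labels.append('negative')
        else:
            labels.append('non-negative')
    # Back on function level: executed once, after the loop has finished.
    return labels


# Back on top level: the function definition has ended.
result = classify([-1, 0, 2])
```

Moving the return statement one level to the right would place it inside the loop, so the function would return after the first item already.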

Flow control

Amongst others, program flow may be controlled using if, for, and while statements. If a sequence is to be filled successively in a loop, it must be created (e.g. as an empty list) before the loop.

>>> i = 2
>>> if i == 1:
...     print "one"
... elif i == 2:
...     print "two"
... else:
...     print "unknown"
...
two
>>> a = ['cat', 'window']
>>> for x in a:
...     print x, len(x)
...
cat 3
window 6
>>> li = list()
>>> for i in 1,2:
...     li.append(i)
...
>>> li
[1, 2]
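
The while statement mentioned above repeats its code block as long as a condition holds. The following sketch, with made-up values, also shows break and a for loop over range; it is written as a script that should run under both Python 2.7 and Python 3:

```python
# Halve a value until it drops below 1, counting the steps.
value = 20.0
steps = 0
while value >= 1.0:
    value = value / 2.0
    steps += 1
# value is now 0.625, reached after 5 steps

# Fill a list that was created before the loop.
squares = []
for i in range(4):
    squares.append(i * i)
# squares is now [0, 1, 4, 9]

# break leaves the innermost loop immediately.
first_large = None
for sq in squares:
    if sq > 3:
        first_large = sq
        break
# first_large is now 4
```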

List comprehensions

List comprehensions provide a concise way to create lists. They consist of an expression followed by a for clause, then zero or more for or if clauses:

>>> vec = [2, 4, 6]
# multiply each element with 3
>>> [3*x for x in vec]
[6, 12, 18]

As stated above, list comprehensions may be nested, allowing for processing elements of nested lists (as returned from opals modules):

# create a list of lists of characters:
>>> mat = [
... ['1', '2', '3'],
... ['4', '5', '6'],
... ['7', '8', '9']
... ]
>>> mat
[['1', '2', '3'], ['4', '5', '6'], ['7', '8', '9']]
# convert each character to a float
# hint: read from right to left
>>> [ [float(val) for val in row] for row in mat ]
[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
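
The additional for and if clauses mentioned in the definition above are not covered by the preceding examples. A brief sketch with made-up values:

```python
vec = [2, 4, 6, 7]

# An if clause filters the items: only even numbers are tripled.
evens_times_3 = [3 * x for x in vec if x % 2 == 0]    # [6, 12, 18]

# Two for clauses in one comprehension flatten a nested list;
# here, the clauses are read from left to right, like nested loops.
mat = [[1.0, 2.0], [3.0, 4.0]]
flat = [val for row in mat for val in row]            # [1.0, 2.0, 3.0, 4.0]

# The same flattening, written with explicit loops.
flat_loop = []
for row in mat:
    for val in row:
        flat_loop.append(val)
```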

Ternary conditional operator

Like e.g. C, Python features a ternary conditional operator, which evaluates to one of its branches, depending on the result of evaluating its condition. The order in which the branches are specified in Python, however, may seem surprising:

  • the result, if the condition evaluates to True, followed by the keyword if,
  • the condition, followed by the keyword else,
  • the result, if the condition evaluates to False
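
The order described above reads as follows in code. The names parity and sign are made up for this sketch:

```python
i = 2

# <result if True> if <condition> else <result if False>
parity = 'even' if i % 2 == 0 else 'odd'    # 'even'


def sign(x):
    # Conditional expressions may be nested; parentheses aid readability.
    return -1 if x < 0 else (0 if x == 0 else 1)


signs = [sign(v) for v in (-5, 0, 3)]       # [-1, 0, 1]
```

Only the selected branch is evaluated, so the operator may safely guard an expression that would otherwise fail.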

Functions

Function definitions start with the keyword def, followed by

  • the function's name,
  • an opening parenthesis,
  • zero or more argument names, separated by commas,
  • a closing parenthesis,
  • a colon (:), and
  • the (indented) code block to be executed.

Adjacent to the argument names, default values may be specified, separated by an equal sign (=). Arguments with default values may be omitted in function calls. Additionally, any function argument value may be specified by argument name. Values may be returned from functions using the keyword return.

# define function 'func', having 3 arguments,
# with default values for the latter 2:
>>> def func( arg1, arg2=2, arg3=3 ):
...     print arg1, arg2, arg3
...     return arg1
...
# call 'func', specifying all arguments:
>>> res = func(5,6,7)
5 6 7
# print the result of the function call:
>>> res
5
# call 'func', specifying only the mandatory argument:
>>> res = func(1)
1 2 3
# call 'func', specifying the mandatory argument, and 'arg3' by name:
>>> res = func(1, arg3=4)
1 2 4

Error handling

Error handling in Python is performed using exceptions: exceptions are (small) objects that carry information about the occurred error via their type, and their string representation (error message). Division by zero, for example, results in a ZeroDivisionError to be raised, which is one of Python's predefined exception types:

# Raise a ZeroDivisionError.
# By default, its type, string representation
# and the call stack are printed.
>>> 1/0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: integer division or modulo by zero

Raising an exception generally makes the interpreter abort execution and return to the prompt or the calling application. However, exceptions may be handled programmatically: when iterating over a sequence, the exception may e.g. be printed on the screen, and looping may be continued.

>>> for i in range( -2, 3 ):
...     try:
...         res = 1. / i
...     except Exception as e:
...         print e
...         continue
...     print res
...
-0.5
-1.0
float division
1.0
0.5

OPALS exposes its own exception types (see opals::Exceptions) that are derived from Python's Exception, which may thus be handled specifically.
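
Handling exception types specifically means listing several except clauses, each naming one type. Since the names of the opals exception types are not given here, the following sketch uses a hypothetical class ModuleError derived from Exception as a stand-in; the function reciprocal is likewise made up for illustration.

```python
class ModuleError(Exception):
    """Hypothetical application-specific exception (not part of opals)."""


def reciprocal(x):
    # Raise the custom exception for a missing value.
    if x is None:
        raise ModuleError('no value given')
    # May raise the predefined ZeroDivisionError.
    return 1.0 / x


messages = []
for x in [2, 0, None]:
    try:
        messages.append(str(reciprocal(x)))
    except ZeroDivisionError:
        # Handles division by zero only.
        messages.append('division by zero')
    except ModuleError as e:
        # Handles the custom type; str(e) yields its error message.
        messages.append('module error: %s' % e)
```

The except clauses are tried in order, so more specific types should be listed before more general ones such as Exception.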

Importing modules

Anything not built into Python must be imported in order to be accessible. This holds true for opals modules and types, but also for those of the Python standard library. Imports are done using the import statement, optionally combined with from. Depending on the form used, all, some, or no prefixes must be specified to access the imported names: the accessible name is whatever directly follows the keyword import. One or more names following import may either be specified explicitly (separated by commas), or all names defined in a module may be imported using the wildcard *. By suffixing the import statement with as, aliases may be defined.

>>> import opals.Grid
# everything defined in Module Grid is accessible,
# but all prefixes must be specified:
>>> g = opals.Grid.Grid()
>>> from opals import Grid
# everything defined in Module Grid is accessible,
# only need to prefix with the module name:
>>> g = Grid.Grid()
>>> from opals.Grid import Grid
# only 'Grid' from Grid is accessible,
# no need for prefixes:
>>> g = Grid()
>>> from opals.Grid import *
# everything defined in Grid is accessible,
# no need for prefixes:
>>> g = Grid()
>>> gf = GridFeature()
>>> from opals import Grid as Gr
# everything in Grid is accessible,
# but with the alias 'Gr':
>>> g = Gr.Grid()
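
The same import forms can be tried without opals installed, using a module of the Python standard library. The following sketch uses os.path; the file name 'grid.tif' is made up for illustration:

```python
# Full prefix required.
import os.path
p1 = os.path.join('data', 'grid.tif')

# Only the last module name is required as prefix.
from os import path
p2 = path.join('data', 'grid.tif')

# Explicitly listed names are accessible without any prefix.
from os.path import join, basename
p3 = join('data', 'grid.tif')
name = basename(p3)               # 'grid.tif'

# An alias replaces the original prefix.
import os.path as osp
p4 = osp.join('data', 'grid.tif')
```

The wildcard form is left out on purpose: it makes it hard to tell where a name comes from, and it may silently replace names that were defined earlier.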

Getting help

Most of the help on opals modules and data types given in this HTML help is also available on-line in the Python interpreter, where it is shown using the Python built-in function help. This prints e.g. a module's description, together with all its methods and their respective descriptions. In addition to its meaning, a method's description states the number and types of its arguments and the type of the return value, if any. Note that the internally used C++ data types are given.

>>> from opals.StatFilter import *
>>> help(StatFilter)
Help on class StatFilter in module opals.StatFilter:
class StatFilter(opals._Base.Base)
| opalsStatFilter performs statistical filtering of raster images w.r.t a certain kernel
|
| The module opalsStatFilter reads a raster image in GDAL suppurted format and produces
| a statistically filtered output raster. For each output raster pixel a statistical
| feature (e.g. min, max, mean, etc.) is evaluated based on a kernel neighbourhood.
|
| Method resolution order:
| StatFilter
| opals._Base.Base
| Boost.Python.instance
| __builtin__.object
|
| Methods defined here:
|
| __getstate__(...)
| __getstate__( (object)arg1) -> tuple
|
| __init__(...)
| __init__( (object)arg1) -> None
|
| __init__( (object)arg1 [, (object)inFile=None [, (object)outFile=None [, (object)oFormat=None [, (object)feature=None [, (object)kernelShape=None [, (object)kernelSize=None [, (object)limit=None [, (object)noData=None [, (object)gridSize=None [, (object)nbThreads=None [, (object)screenLogLevel=None [, (object)fileLogLevel=None [, (object)lineBufferedScreenLog=None [, (object)logFile=None [, (object)cfgFile=None [, (object)paramMapping=None [, (object)inParamFiles=None [, (object)outParamFile=None [, (object)scope=None [, (object)deleteUselessOutFile=None]]]]]]]]]]]]]]]]]]]]) -> object
|
| __reduce__ = <unnamed Boost.Python function>(...)
|
| __setattr__(...)
| __setattr__( (object)arg1, (str)arg2, (object)arg3) -> None
|
| __setstate__(...)
| __setstate__( (object)arg1, (tuple)arg2) -> None
|
| ----------------------------------------------------------------------
| Data descriptors defined here:
|
| StatFeature -- statistical feature
| A statistical feature of the set [min,max,diff,nmin:n,nmax:n,mean,median,sum,variance,rms,pdens,pcount,quantile:p,minority,majority,entropy] is derived. For features nmin:n and nmax:n, n (>0) and for feature quantile p ([0,1]) must be specified
# <rest truncated>
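
The text shown by help is taken from documentation strings, which any Python function or class may carry. As a sketch that does not require opals, the made-up function scale below defines such a string; it is then accessed directly, and the built-in function dir is used to list the public names of the built-in type list:

```python
def scale(values, factor=2.0):
    """Return a new list with every item of values multiplied by factor."""
    return [v * factor for v in values]


# The documentation string is stored in the attribute __doc__;
# help(scale) would print it together with the function's signature.
doc = scale.__doc__

# dir() lists the names defined by an object, here the public ones of list.
public_names = [n for n in dir(list) if not n.startswith('_')]
has_append = 'append' in public_names    # True
```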

Data persistence

If variable values shall e.g. be preserved for the next Python session, the module cPickle comes in handy, as it allows for converting most Python types (including those exposed by opals) to/from a byte string and (re-)storing them to/from file:

# import the cPickle module
>>> import cPickle
# the variable to be stored
>>> var = "text"
# open a file called 'data_file'
>>> f = open('data_file','wb')
# create a Pickler-instance that uses the file
>>> pick = cPickle.Pickler(f,-1)
# convert the data and store them to disc
>>> pick.dump(var);
# close the file
>>> f.close()
# in a later session: re-open the file and restore the variable
>>> import cPickle
>>> f = open('data_file','rb')
>>> unpick = cPickle.Unpickler(f)
>>> var = unpick.load()
>>> f.close()
>>> print var
text
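
A more compact variant is sketched below. It assumes that the module-level functions dump and load are sufficient, uses the with statement to close the files automatically, and falls back to the module pickle where cPickle is not available (as in Python 3). The temporary directory and the contents of data are made up for illustration.

```python
import os
import tempfile

try:
    import cPickle as pickle    # Python 2
except ImportError:
    import pickle               # Python 3

# The data to be stored.
data = {'name': 'text', 'values': [1, 2, 3]}

# Store the file in a fresh temporary directory.
file_path = os.path.join(tempfile.mkdtemp(), 'data_file')

# Convert the data and store them to disc;
# -1 selects the highest pickle protocol available.
with open(file_path, 'wb') as f:
    pickle.dump(data, f, -1)

# Read the data back, e.g. in a later session.
with open(file_path, 'rb') as f:
    restored = pickle.load(f)
```

The restored object compares equal to the original, but is a new, independent object.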

Author: wk
Date: 26.02.2011