Table of Contents
- See Also
- opals::IAlgebra
Aim of module
Provides a generic raster algebra calculator featuring mathematical, statistical and conditional operators.
General description
Module Algebra is a powerful tool as it allows the combination of multiple input grid datasets by means of a user defined algebraic formula. Basically, Module Algebra can be used either as a selector, as a calculator or, in all generality, as a mixture of both. Module Algebra can be applied to derive:
- difference models: by subtracting two grid models (->calculator)
- grid masks: by evaluating one or more grid models and returning either true (transparent) or false (opaque) as result (->calculator)
- erosion hazard models: by combining the pixels of multiple grids in an algebraic way (e.g. Wischmeier formula, ->calculator)
- landscape dependent DSM models: by choosing the output pixel values from multiple input grid sources according to certain criteria (->selector)
- DTM mosaics: by combining multiple input grids (->selector)
- point density models: by summing up the point densities of multiple input grids (->selector+calculator)
- ...
In general, the output grid (parameter outFile) is derived by applying an algebraic formula (parameter formula) combining the values of one ore more input grid models (parameter inFile). The formula can either be specified as a generic filter or in python syntax. In the latter case the python interpreter is used evaluate the formula.
By default, the output grid is created with the same pixel size as the first input grid, but an arbitrary pixel size can be specified using parameter gridSize. The extents of the resulting grid can either be specified by the user (parameter limit (left,lower,right,upper)) or they are calculated from of the outermost pixel centers of all input grid datasets involved. If no user defined limits are specified, either the common area (parameter limit intersect), the overall area (parameter limit union) or the extents of the i'th dataset (parameter limit dataset:i) are applied (c.f. section Parameter description for more details).
Please note that Module Algebra can deal with non-coinciding models simultaneously. If necessary, the input grids/rasters are re-sampled according to the output grid structure prior to evaluating the formula (parameter resampling). To strictly avoid resampling, please specify resampling=never. This would lead to a program termination if any of the contributing input grid files would require resampling.
Formula syntax
The following sections describe in detail how arbitrary formulae can be specified in Module Algebra using either Generic filter or Python syntax.
Generic filter syntax
This section describes how to define the algebraic formula using the generic filter syntax. Due to the better performance, it is highly recommended to use the generic filter syntax whenever possible.
In addition to the main purpose of sub-selecting data, generic filters also act as a calculator. Within Algebra the grid location and the respective grid values can be used as variables in the formula. The following well-known variables are defined:
Variable name | Description |
x | x-coordinate (Easting) of the pixel location |
y | y-coordinate (Northing) of the pixel location |
r[i] | pixel value of i-th input grid model (index starting with 0) |
invalid | general void indicator |
Using the above variable names we can now combine one ore more input grid values in an algebraic expression to compute a scalar value representing the respective output grid value. For example, to compute a slope raster in % based on a given input gradient raster, the formula is:
Please note, that in the above formula only a single input grid is required which is accessed by r[0]
. In general, the grid/raster values are accessed by their index starting form 0
(first input grid) to n99
(last input grid). The result of the expression (i.e. a scalar value) is assigned to the corresponding output grid cell.
Please note that if no algebraic expression is specified, a logical result is returned rather than the value. Thus,
returns true (1) if r[0] exists, otherwise false (0). Consequently, if we simply want to return the value of the first input grid, e.g. using Algebra as a raster format converter, we would have to write:
Another simple example is to subtract two input grids:
In this case, the difference is returned for all those pixels where, both, r[0] and r[1] exist. If either of the operands is invalid (i.e. nodata) the resulting output pixel will be invalid (nodata) as the expression cannot be evaluated correctly.
A typical application of module Algebra is the derivation of (binary) grid masks. In this case either 0
or 1
is returned depending on a certain condition. Conditions can be specified in generic filters using the ternary operator:
In the above example, first the condition (r[0]*r[1]<1.0
) is evaluated and depending on the binary result of the conditional expression 0
or 1
, respectively, is returned as result of the entire formula. Please note that if one of the operands in the above condition does not exist, the entire condition evaluates to false, i.e. the else
branch of the ternary operator is returned.
Within conditional expressions it is often desired to check the existence of certain grid values as well as their respective value. Consider a formula returning the ratio between two grids:
For this formula we might want to check, if the grid value of the second input grid (r[1]
) exists and is not equal to zero (otherwise the division will fail). Generic filters allow to query the existence of grid values by internally comparing the actual grid value with the grids (inherent) nodata value. For the example at hand we can simply write:
Please note the following details:
- existence check:
r[1]
checks the existence of the grid value; i.e, the grid contains a valid value at the respective location. - concatenation of logical expressions: the existence (
r[1]
) and not-equal-null (r[1]!=0
) checks are concatenated with a logical "and" (&&
) invalid
is returned if the (entire) conditional expression is "false", which results in an invalid (nodata) output pixels.
To support the creation of grid/raster mosaics, the generic filters provide statistical operators (min, max, mean, ...) for the entire stack of grid/raster values. Consider a set of stripwise DSM grid models. In order to create a mosaic of all strip DSMs, we can simply write...
... returning the mean of all n
valid input grids cells. Pleas not, that invalid grids cells (i.e., cell value = nodata value) are not considered for the statistical calculation.
Please note, that in the context of opalsAlgebra the following restrictions w.r.t generic filters apply:
- Only raster values (r[i]) and the grid location coordinates (x,y) can be used as input variables but no additional attributes.
- Assignments are only supported to store intermediate results but don't take an effect beyond the (local) filter operation itself.
- If assignments are used, the formula returns the first non-assignment expression. If only assignments are provided, the result (i.e., right side of the assignment) of the first assignment is returned.
The only reason for using assignments in the context of an opalsAlgebra formula is to increase the readability of the formula. In the following example a "raster cell density" is calculated by first defining the number of involved rasters, secondly counting the number of valid cells and, finally, returning the cell density.
According to the above rules, the formula returns cnt/n
, where n
and cnt
are (locally) calculated within the first two assignments.
Generic filters are a very powerful tool providing a set of arithmetic, logical and conditions operators and other features. A detailed documentation can be found here.
Python syntax
An alternative way of specifying formulae in Module Algebra is to use Python syntax. In this case an internal Python interpreter is used to evaluate the formula. Although the performance is generally worse, the Python syntax offers more flexibility and should be used whenever the formula is too complex to be written in generic filter syntax. The basic syntax of Python style formulae is:
where the return value represents the output grid value. Normally, the output grid value is a function of one or more input grid values. Let's assume that our input grid contains surface slope values (tangent of the steepest slope). In order to compute a grid containing the values of steepest slope in % we need to multiply the original slope values by 100. In this case, the formula looks as follows:
where r[0] represents the grid value of the input grid. Of course, the combination formula can be much more complex, incorporating multiple grids, conditional statements and the like. A simple example based on two input grid models, which are used to create a grid mask (0|1) based on a certain threshold is:
The formula must be specified as a string in Python syntax. In fact, the Python interpreter is used to parse and evaluate the condition. Please refer to the Python documentation for a detailed description of the Python syntax. In the following, important information concerning the construction of formulas is summarized:
To access certain information of the considered grid models (pixel values, no data indicators, etc.), the following well-known variables are predefined:
Variable name | Description |
x | x-coordinate (Easting) of the pixel location |
y | y-coordinate (Northing) of the pixel location |
r[i] | pixel value of i-th input grid model (index starting with 0) |
None | general void indicator (c.f. below for details). |
None
is a constant representing a null-object in Python. It can be used to express empty or invalid objects of any type. In Module Algebra in particular, None
is used to handle invalid input or output pixels. E.g., to verify if all input pixels are valid, one can write (for better readability in multi-line style):
The very same statement can be written in more compact format, without the None
keyword:
Thus, in the first line the statement r[0]==None
is equivalent to not r[0]
(i.e. r[0]
is invalid), and in the second line it is even allowed to skip the return value, as an empty return is equivalent to returning an empty object.
To combine the specific pixel values, a series of logical and mathematical operations are provided by Python. For a complete reference please refer to the Python language reference. The most important operations are listed below:
Boolean operations (or,and,not) ordered by ascending priority):
Operation | Result |
x or y | if x is false, then y, else x |
x and y | if x is false, then x, else y |
not x | if x is false, then True, else False |
Comparison operations:
Operation | Meaning |
< | strictly less than |
<= | less than or equal |
> | strictly greater than |
>= | greater than or equal |
== | equal |
!= | not equal |
is | object identity |
is not | negated object identity |
Built-in arithmetic operations:
Operation | Result |
x + y | sum of x and y |
x - y | difference of x and y |
x * y | product of x and y |
x / y | quotient of x and y |
x // y | (floored) quotient of x and y |
x % y | remainder of x / y |
-x | x negated |
+x | x unchanged |
abs(x) | absolute value or magnitude of x |
int(x) | x converted to integer |
long(x) | x converted to long integer |
float(x) | x converted to floating point |
complex(re,im) | a complex number with real part re, imaginary part im. im defaults to zero. |
c.conjugate() | conjugate of the complex number c. (Identity on real numbers) |
divmod(x, y) | the pair (x // y, x % y) |
pow(x, y) | x to the power y |
x ** y | x to the power y |
Apart from these basic operations, conditional statements are of crucial importance for the construction of complex formulas. Conditions can be specified either as single-line or as multi-line python statements:
Single-line conditional statement:
Multi-line conditional statement:
However, since the formula must be passed as a single string argument, line-breaks have to be entered via the special \n character sequence. The correct formula string corresponding to the example above is:
Please note, that since Python is an indent-oriented language (in opposition to languages using parentheses for grouping), the blank characters before the return
statements are mandatory. The same applies to the \nelif...
sequence, where no blank is allowed in order to ensure the correct indentation.
The formulas are not restricted to conditional statements, in fact, all control statements (for-loops, while-loops, etc.) are supported. The only preconditions are:
- the formula string must be provided in correct Python syntax
- all control paths of the formula must return a value (if a control path does not return a value, the respective pixel will be void.)
Parameter description
Remarks: mandatory
The grid model files are expected in GDAL supported format. The grid values of the respective models are referred to as r[0], r[1] ... in the calculation formula.
Remarks: estimable
Path of output grid file in GDAL supported format.
Estimation rule: The current path and the name (body) of the input file are used as file name basis. Additionally, the default postfix '_algebra' and the extension corresponding to the output format are appended
Remarks: estimable
Grid file format, use GDAL driver names like GTiff, AAIGrid, USGSDEM, SCOP ... .
Estimation rule: The output format is estimated based on the extension of the output file. If neither outFile nor oFormat are specified, GeoTiff is used as default.
Remarks: mandatory
Formula to compute a single grid value as a string in in Generic Filter or Python syntax.
A formula must contain return a value (e.g. generic filter: r[0]-r[r1] or python: return r[0]-r[1])). For more details please refer to the module documentation.
Remarks: estimable
Size of output grid cell in x- and y-direction. Estimation rule: output grid size = grid size of 1st inFile
Remarks: default=9999
Value representing an undefined value in the output grid model.
Remarks: optional
If -limit is skipped, the common area (intersection) of all input datasets is used.
Remarks: default=bilinear
Possible values: nearestNeighbour ... nearest neighbour bilinear ........... bi-linear interpolation bicubic ............ bi-cubic interpolation never .............. indicator avoiding resampling
Method to be used for resampling the input grids to the specified grid structure of the output grid.
Remarks: default=500
The memory consumption scales with the number of input grids in order to provide good reading performance. An upper memory limit can be specified in order to avoid overflows in case of numerous input grids (e.g. large mosaic),
Remarks: estimable
Estimation rule: In opalsAlgebra, the output raster is processed block-wise. In general, for eachblock the pixels of all input datasets are queried, even if they only contain void pixels. The latter is typically the case in merging/mosaicing applications where the input datasets are either disjunct or only partially overlapping. To improve the run-time performance in such cases activating the skipVoidAreas flag allows ignoring blocks of all-void input pixels. Please note, that in this case the order of the input datasets is not strictly maintained which might lead to invalid results if the calculation formula explicitely uses input dataset indices (e.g. -formula r[0]?r[1]:r[2]). However, for all formulae which aggregate only the valid pixels (e.g. -formula mean(r)), activating skipVoidAreas substantially speeds up processing.
Examples
The data used in the subsequent examples are located in the $OPALS_ROOT/demo/
directory and the following preprocessing steps (data import and surface grid calculation) are required:
Example 1: Roof model
In the first example, the roofs of a digital surface model are extracted by applying a simple height threshold. All other values are set to void (no data).
Generic filter syntax:
Python syntax:
The output is written to a separate grid file in GeoTiff format (strip21_roofs[_py].tif
) and a standard color coded visualization of it is shown in the following figure:
Example 2: Grid mask
In this example, two grid models describing roughness (sigmaZ) and shadowing effects (excen), both by-products of the moving planes surface interpolation with /ref ModuleGrid, are used to create a binary grid mask (0..invalid,1..valid pixel).
Generic filter syntax:
Python syntax:
The very same condition can be written more compact as a single (binary) return statement:
Generic filter syntax:
Python syntax:
Example 3: DTM Mosaic I
A simple way of creating a DTM mosaic based on multiple input grids (e.g. each grid covering a single map sheet) is by selecting the first valid grid value from the list of input grids for each grid position using Python's for .. in .. :
statement:
Python syntax:
This commands takes three input grids (strip31.tif strip21.tif strip11.tif
), calculates the xy-extents of the output grid as the union of all input datasets (-limit union
) and, for each grid post, returns the first grid point with a valid height (-formula "... x:\n return x"
) within a loop over all input grids (-formula "for x in r[:]: ..."
).
To calculate a DTM mosaic based on the mean value of all valid pixels the following command in generic filter sysntax can be used:
The same result can be achieved using the wild card syntax for specifying the input files:
Example 4: DTM Mosaic II
In the following example a (global) point density model is created based on strip-wise point density models. The strip-wise point density models can be created with the following command:
Now, we can calculate the global point density model as:
Generic filter syntax:
Python syntax:
Contrary to the previous example, all valid point densities values are summed up and the sum is returned (if the sum is greater 0).
Example 5: Landscape dependent DSM
Finally, opalsAlgebra is used for the derivation of a landscape dependent surface model as described in Hollaus et al (2010). Hereby, the single DSM pixels are either selected from a raster map containing the highest elevation within a cell (dsm_max.tif) or from a grid model calculated via moving planes interpolation (dsm_mp.tif) depending on a roughness map (dsm_mp_sigma0.tif, by-product of moving planes interpolation).
Generic filter syntax:
Python syntax:
Figure 2 shows hill shadings of all three DSM variants derived with the following commands:
References
M. Hollaus, G. Mandlburger, N. Pfeifer, W. MÃ¼cke: Land cover dependent derivation of digital surface models from Airborne laser Scanning data; in: "IAPRS Volume XXXVIII Part 3A", (2010), ISSN: 168299750; 221 - 226.
- Date
- 26.05.2016