The generic filter is one of the Filters that OPALS provides for filtering vector data. Among them, it is a rather flexible one: it allows for evaluating arbitrarily complex expressions consisting of any of the supported attribute identifiers (see ODM predefined attributes), coordinate identifiers, raster identifiers, neighbor geometry identifiers, constants, scalar-valued unary and binary functions, and a range of arithmetic, relational, logical, conditional, and assignment operators.
For each object to filter, generic filter (generally) substitutes attribute identifiers contained in the expression for the object's corresponding attribute values, and coordinate identifiers for the object's coordinates. If the subsequent evaluation converts to true, then this object passes through the generic filter.
The expression syntax not only provides means to use attribute and coordinate values, but also to test the presence of attributes and the validity of coordinates.
Note that filtering by coordinates applies to point data only, i.e. lines will not pass. Depending on the OPALS module, the set of admissible operators may be limited.
In the following, the expression syntax is described twice: once in a rather colloquial way, including some exemplary expressions (see Informal syntax definition with examples), and once in a formal way (see Formal grammar).
Generic filter expressions constitute rooted binary trees, where the leaves are either attribute identifiers, coordinate identifiers, or constants. If the root node is an attribute identifier, then the evaluation returns the presence of the attribute, not its value. Likewise, if the root node is a coordinate identifier, then the evaluation returns the validity of the coordinate (and not its value). Thus,
Generic[SigmaZ]
is a valid generic filter consisting of a leaf only: objects that feature the specified attribute pass through, regardless of the attribute's value.
Generic[Z]
is a valid generic filter, too. Objects featuring valid z-coordinates pass through.
Leaves may be combined into arithmetic expressions using the respective operators and functions (see the table below), observing the usual operator precedence:
Generic[0.1 + 2*SigmaZ]
is a generic filter consisting of 3 leaves and 2 binary operators: as the attribute identifier is not the root node, the corresponding attribute value is evaluated. Objects that feature the specified attribute and for whose attribute value 0.1 + 2*SigmaZ converts to true (i.e. the result is different from zero) pass through. As one may expect, the attribute presence and value are evaluated first, then the multiplication is performed, followed by the addition, and finally, the resulting real number is converted to a boolean value.
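The two evaluation modes described above can be mimicked in a few lines of Python. This is only an illustrative sketch and not OPALS code: objects are modelled as plain dictionaries, a missing key stands for a missing attribute, and the function names are made up for this example.

```python
def passes_presence(obj: dict, attribute: str) -> bool:
    """Model of Generic[SigmaZ]: a lone attribute leaf tests presence only."""
    return obj.get(attribute) is not None


def passes_value(obj: dict) -> bool:
    """Model of Generic[0.1 + 2*SigmaZ]: evaluate the value, then convert to bool."""
    sigma_z = obj.get("SigmaZ")
    if sigma_z is None:
        # The attribute is missing, so the object cannot pass.
        return False
    result = 0.1 + 2 * sigma_z  # multiplication first, then addition
    return result != 0.0        # any non-zero real number converts to true
```

With this model, an object with SigmaZ equal to 0 passes both filters, although its attribute value itself is zero.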
Leaves and/or arithmetic expressions may further be combined into comparative expressions using the respective operators:
Generic[SigmaZ < 0.1]
Generic[2*SigmaX < SigmaY]
Generic[PointLabel == "Vienna"]
Generic[Z > 100]
Leaves, arithmetic expressions, and comparative expressions may be combined to logical expressions using the operators already mentioned in Filters :
Generic[SigmaZ < 0.1 AND PointLabel == "Vienna"]
lets data pass through that feature both the attributes SigmaZ and PointLabel, and for which both comparisons evaluate to true.
Generic[SigmaZ < 0.1 OR PointLabel == "Vienna"]
lets data pass through that either feature the attribute SigmaZ with a value smaller than 0.1, or feature the attribute PointLabel with the value Vienna, or meet both conditions.
Generic[Z >= 0 AND Z <= 100]
lets (point) data pass through that feature valid z-coordinates within the (inclusive) interval [0, 100].
Like the root node, logical operators show a special behaviour concerning the evaluation of attribute-leaves if applied directly as operands:
Generic[SigmaX AND SigmaY]
does not evaluate any attribute value, but combines the presence of the 2 attributes, i.e. data pass that provide both attributes, regardless of their values. In
Generic[SigmaX OR SigmaY]
objects that feature one or both of the specified attributes pass through, while in
Generic[SigmaX OR SigmaY < 0.1]
objects pass through that either provide the attribute corresponding to SigmaX, or hold SigmaY with a value smaller than 0.1. Data that meet both conditions also pass through.
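As a rough illustration of how presence tests and value comparisons mix under a logical operator, the last expression can be modelled in Python as follows. The dictionary-based object model and the function name are assumptions made for this sketch only.

```python
def sigma_x_or_small_sigma_y(obj: dict, limit: float = 0.1) -> bool:
    """Model of Generic[SigmaX OR SigmaY < 0.1]."""
    # SigmaX is a direct operand of OR: only its presence counts.
    has_sigma_x = obj.get("SigmaX") is not None
    # SigmaY is an operand of '<': it must be present, and its value is compared.
    sigma_y = obj.get("SigmaY")
    sigma_y_is_small = sigma_y is not None and sigma_y < limit
    return has_sigma_x or sigma_y_is_small
```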
Logical expressions may serve as a condition for the ternary conditional operator. This operator evaluates to its 2nd argument if the condition (1st argument) evaluates to true, and to its 3rd argument otherwise:
Generic[x >= 0 ? y : z ]
lets data pass through whose x-coordinates are greater than or equal to zero and whose y-coordinates are valid; all other data pass through if their z-coordinates are valid.
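The behaviour of this conditional filter can be sketched in Python. The sketch treats a coordinate as valid if it is a finite number, and it assumes that a condition on an invalid x-coordinate counts as false; both the helper names and that second assumption are illustrative and not taken from OPALS.

```python
import math


def is_valid(coordinate) -> bool:
    """A coordinate counts as valid here if it is a finite number."""
    return coordinate is not None and math.isfinite(coordinate)


def passes_ternary(x, y, z) -> bool:
    """Model of Generic[x >= 0 ? y : z]."""
    if is_valid(x) and x >= 0:
        # The 2nd argument becomes the result: validity of y.
        return is_valid(y)
    # Otherwise the 3rd argument becomes the result: validity of z.
    return is_valid(z)
```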
Finally, the assignment operator is supported, which may e.g. serve for setting attributes during import/export:
Generic[amplitude = 100*_myIntensity]
The assignment operator always assigns its 2nd argument to its 1st argument, regardless of the 2nd argument's validity, and it lets all data pass through. Thus, in the above example, for objects that feature the user-defined attribute _myIntensity, this user-defined attribute times 100 is assigned to the (predefined) attribute amplitude. For objects that do not feature _myIntensity, amplitude becomes invalid, no matter whether amplitude was valid before or not. Regardless of the resulting validity of amplitude, all objects pass through the filter.
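The assignment semantics can be summarised with a small Python model. Objects are again plain dictionaries and None stands for an invalid or missing attribute; these modelling choices and the function name are assumptions of this sketch.

```python
def assign_amplitude(obj: dict) -> bool:
    """Model of Generic[amplitude = 100*_myIntensity]."""
    intensity = obj.get("_myIntensity")
    # The right-hand side is assigned even if it is invalid, which
    # overwrites a previously valid amplitude with an invalid one.
    obj["amplitude"] = None if intensity is None else 100 * intensity
    # An assignment lets every object pass through.
    return True
```

Calling the function on an object that has a valid amplitude but no _myIntensity leaves the object with an invalid amplitude, yet the object still passes.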
Multiple assignments may be specified, separated by a semicolon.
For user-defined attributes, their data type may explicitly be given enclosed in parentheses, following the user-defined attribute identifier:
Generic[_myQuality(float) = pow( SigmaX*SigmaY*SigmaZ, 1/3 ) ]
For objects that feature the predefined attributes SigmaX, SigmaY, and SigmaZ, this filter assigns a valid value to the user-defined attribute _myQuality of data type float. For all other objects, it renders the attribute _myQuality invalid.
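The value computed by this filter is the geometric mean of the three sigmas. A hedged Python equivalent, using a dictionary as a stand-in for an ODM object and None for an invalid result, might look like this:

```python
import math


def my_quality(obj: dict):
    """Model of _myQuality(float) = pow(SigmaX*SigmaY*SigmaZ, 1/3)."""
    sigmas = [obj.get(name) for name in ("SigmaX", "SigmaY", "SigmaZ")]
    if any(sigma is None for sigma in sigmas):
        # A missing input attribute renders the result invalid.
        return None
    # 1/3 is a real-valued division, so the exponent is about 0.333.
    return math.pow(sigmas[0] * sigmas[1] * sigmas[2], 1 / 3)
```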
Generic filters support a set of statistical operators via the general syntax:
operator(.)
where operator is one of the identifiers that belong to rule Stat: min, max, mean, sum, stdDev, ... . With this syntax, statistical operators are applied to the valid items among the supplied ones. These items can either be the supplied rasters r, or the supplied neighbor geometries n. In the latter case, either a coordinate or an attribute identifier to be evaluated must follow, separated by a dot. The exception to this rule is count(n), which counts the number of supplied neighbor geometries, independent of their coordinates and attributes. As mentioned, only valid items among the supplied ones are considered in the statistical operations. Thus, e.g.
Generic[ _myStat(float) = max(r) ]
determines the maximum value of all valid rasters (i.e. not NODATA) and assigns that maximum to the user-defined attribute _myStat of type float, while
Generic[ _myStat(float) = max(n.z) ]
does the same for the valid z-coordinates (i.e. of finite value) of all neighbor points.
count(n) considers all supplied neighbor geometries as valid, and hence, the following assigns the number of supplied geometries to the user-defined attribute _myStat:
Generic[ _myStat(float) = count(n) ]
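The rule that statistical operators only see valid items can be illustrated with a short Python sketch. Here, None stands for NODATA or a missing value, non-finite numbers count as invalid, and an empty set of valid items yields None; the last point in particular is an assumption of this sketch, not documented OPALS behaviour.

```python
import math


def valid_items(items):
    """Keep only the items that are present and finite."""
    return [v for v in items if v is not None and math.isfinite(v)]


def stat_max(items):
    """Model of max(.): maximum of the valid items, None if there are none."""
    valid = valid_items(items)
    return max(valid) if valid else None


def stat_count(items):
    """Model of count(.): number of valid items."""
    return len(valid_items(items))


def count_neighbors(neighbors):
    """Model of count(n): every supplied neighbor geometry is counted."""
    return len(neighbors)
```

For three rasters with values 3.0, NODATA, and 7.5, stat_max yields 7.5 and stat_count yields 2, whereas count_neighbors simply returns the number of supplied geometries.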
If only a subrange of the supplied rasters shall be evaluated, then use the following syntax, which follows the indexing syntax of Python sequence types:
operator(r[beg:end])
where beg is the zero-based index of the first supplied raster to be evaluated, and end is one past the index of the last supplied raster to be evaluated. Both beg and end are optional and default to 0 and the number of supplied rasters, respectively. Negative indices are added to the number of supplied rasters, and hence refer to the end of the array of supplied rasters.
Generic[ _myStat(float) = max(r[0:2]) ]
thus determines the maximum value of the valid rasters among the first 2 of all supplied rasters and assigns that maximum to the user-defined attribute _myStat of type float. The same indexing syntax can be used to select a subrange of neighbor geometries, e.g.:
Generic[ _myStat(float) = max(n[0:-3].SigmaX) ]
assigns to _myStat the maximum of the valid SigmaX attribute values among the supplied neighbor geometries, excluding the last 3 neighbor geometries.
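Since the subrange syntax follows Python's slicing rules, the two examples can be reproduced directly with Python lists. The sample values and helper names below are invented for illustration, and None stands for an invalid item.

```python
def select(items, beg=None, end=None):
    """Model of [beg:end]: omitted bounds default to the full range."""
    return items[beg:end]


def max_valid(items):
    """Maximum of the valid (non-None) items, None if there are none."""
    valid = [v for v in items if v is not None]
    return max(valid) if valid else None


rasters = [3.0, None, 7.5, 1.0]
neighbor_sigma_x = [0.02, 0.05, 0.01, 0.09, 0.07]

# max(r[0:2]): only the first 2 rasters are considered.
first_two = max_valid(select(rasters, 0, 2))

# max(n[0:-3].SigmaX): the last 3 of the 5 neighbors are excluded.
without_last_three = max_valid(select(neighbor_sigma_x, 0, -3))
```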
Random numbers may be generated using the general syntax:
random( distribution, arg1 [, arg2] )
where distribution is one of the distribution names listed in column "Meaning" for rule Random: uniform_int, normal, ... . The distribution name must be followed by the distribution parameters, given in column "Symbol(s)", separated by commas. Hence,
random( normal, 0, 1 )
will generate real numbers distributed according to the standard normal distribution (with mean 0 and standard deviation 1). Note that combining the uniform integer distribution with a relational operator provides a simple way of random subsampling. To e.g. select on average 1 out of 10 points, use:
random( uniform_int, 1, 10 ) == 1
For a different way to do subsampling, see Serial number generator .
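The effect of this random subsampling can be tried out in plain Python. The sketch assumes that uniform_int draws integers from the closed interval [a, b], as Python's random.randint does; the function name and the fixed seed are choices made for this example only.

```python
import random


def subsample(points, one_in=10, seed=42):
    """Keep each point with probability 1/one_in, like
    random( uniform_int, 1, one_in ) == 1."""
    rng = random.Random(seed)  # seeded for reproducible runs
    return [p for p in points if rng.randint(1, one_in) == 1]


kept = subsample(list(range(10_000)))
```

Because every point is drawn independently, the number of retained points is only approximately one tenth of the input, and the retained points are not evenly spaced.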
The enumeration of non-negative integer numbers in ascending order (0, 1, 2, ...) is gained with:
serial()
This offers a way to subsample data based on its order of evaluation. To e.g. import only every 4th geometry, one may call Module Import like this: opalsImport -infile data.las -filter "Generic[fmod(serial(), 4) == 0]"
For a different subsampling method, see Random number generators.
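The following Python sketch mirrors the filter fmod(serial(), 4) == 0 on an in-memory list. It assumes that geometries are evaluated in list order, and it uses enumerate as a stand-in for the serial number generator; the function name is hypothetical.

```python
import math


def every_nth(geometries, n=4):
    """Keep the geometries whose serial number is a multiple of n."""
    return [
        geometry
        for serial, geometry in enumerate(geometries)  # serial = 0, 1, 2, ...
        if math.fmod(serial, n) == 0
    ]
```

Unlike random subsampling, this selection is deterministic and evenly spaced in the order of evaluation.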
When using this nullary function in a more complex filter tree, mind that each serial number generator holds its own position in the sequence of non-negative integers as state, and this position is advanced only when queried. This has implications when serial() occurs more than once in an expression, or in a branch that is not evaluated for every object:
serial() + serial()
generates the sequence 0, 2, 4, ...
x >= 0 ? serial() : -1
returns -1 for points with negative x-coordinates. For other points, it enumerates the sequence 0, 1, 2, ..., irrespective of how many points with negative x-coordinates are evaluated in between them.
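Both cases can be reproduced with a tiny Python model in which every occurrence of serial() is its own counter object. The class below is a hypothetical stand-in and not part of OPALS.

```python
class Serial:
    """One serial number generator with its own position as state."""

    def __init__(self):
        self._next = 0

    def __call__(self):
        value = self._next
        self._next += 1  # the position advances only when queried
        return value


# serial() + serial(): two independent generators, queried once per object.
left, right = Serial(), Serial()
sums = [left() + right() for _ in range(4)]

# x >= 0 ? serial() : -1: the generator is queried for non-negative x only.
serial = Serial()
xs = [1.0, -2.0, 3.0, -4.0, 5.0]
labels = [serial() if x >= 0 else -1 for x in xs]
```

Here sums is [0, 2, 4, 6] and labels is [0, -1, 1, -1, 2], in line with the two sequences described above.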
Generic filter divides the attribute data types supported by the ODM into 2 groups: numeric types and character strings.
Internally, generic filter converts all numeric values to real numbers, and the full range of operators shown in the table below is supported. String-type-attributes support the addition and relational operators only. Identifiers of numeric and character string attributes must not be mixed within the same arithmetic or relational expressions, but they may be combined using logical operators. Hence,
Generic[SigmaX AND PointLabel]
is a valid expression, while
Generic[SigmaX + PointLabel]
is not.
The following table comprises all tokens understood by generic filters. The precedence defines the order of evaluation: the lower the precedence number, the sooner the evaluation. Operators and functions of the same precedence are evaluated from left to right. Kindly note that all trigonometric functions use radians as angular measure.
| Group | Prec. | Rule | Symbol(s) | Meaning |
|---|---|---|---|---|
| arithmetic | 1 | Attribute | <predefined attribute identifier> <user-defined attribute identifier> | depending on the parent node, evaluates to either the presence of the corresponding data attribute, or the attribute value. User-defined attribute identifiers may be single- or double-quoted. If quoted, they may contain white space. |
|  |  | Coordinate | x, y, z | depending on the parent node, evaluates to either the validity or the value of the corresponding data coordinate. Returns invalid values for non-point data. |
|  |  | Raster | r[i] | depending on the parent node, evaluates to either the validity or the value of the corresponding raster element. Indexing starts at 0. |
|  |  | Neighbor | n[i].attribute | depending on the parent node, evaluates to either the validity or the value of the neighbor coordinate/attribute. Indexing starts at 0. |
|  |  | Self | s | used only as argument to rule NeighborBinary, to select the current geometry. This is the same as n[0] unless different processing and neighbor filters are used, as is possible with e.g. Module Normals. |
|  |  | Real | <string literal convertible to a real number> | leaf that evaluates to the passed number |
|  |  | String | <single or double quoted string literal> | leaf that evaluates to the passed string (unquoted) |
|  |  | Factor | - | unary minus |
|  |  |  | + | unary plus, evaluates to its argument |
|  |  |  | (...) | grouping operator |
|  |  | Constant | true | leaf that evaluates to the boolean value 'true' |
|  |  |  | false | leaf that evaluates to the boolean value 'false' |
|  |  |  | pi | leaf that evaluates to \(\pi\), the ratio of a circle's area to the square of its radius |
|  |  |  | invalid | leaf that evaluates to an invalid value |
|  |  | Nullary | serial() | evaluates to the enumeration of non-negative integers in ascending order: 0, 1, 2, ... See Serial number generator |
|  |  | Unary | abs(...) | unary function that evaluates to the absolute value of its argument |
|  |  |  | acos(...) | unary function that evaluates to the arc cosine of its argument |
|  |  |  | asin(...) | unary function that evaluates to the arc sine of its argument |
|  |  |  | atan(...) | unary function that evaluates to the arc tangent of its argument, return values are in the range \([-\pi/2,+\pi/2]\) |
|  |  |  | ceil(...) | unary function that evaluates to the smallest integer not less than its argument |
|  |  |  | cos(...) | unary function that evaluates to the cosine of its argument |
|  |  |  | cosh(...) | unary function that evaluates to the hyperbolic cosine of its argument |
|  |  |  | deg2rad(...) | unary function that converts its argument from degrees to radians, i.e. it evaluates to \(\pi/180\) times its argument |
|  |  |  | exp(...) | unary function that evaluates to the exponential function \(e^x\) of its argument x, i.e. Euler's number raised to the power of the argument |
|  |  |  | floor(...) | unary function that evaluates to the largest integer not greater than its argument |
|  |  |  | grad2rad(...) | unary function that converts its argument from gradians (gons) to radians, i.e. it evaluates to \(\pi/200\) times its argument |
|  |  |  | log(...) | unary function that evaluates to the natural (base \(e\)) logarithm of its argument |
|  |  |  | log10(...) | unary function that evaluates to the base 10 logarithm of its argument |
|  |  |  | rad2deg(...) | unary function that converts its argument from radians to degrees, i.e. it evaluates to \(180/\pi\) times its argument |
|  |  |  | rad2grad(...) | unary function that converts its argument from radians to gradians (gons), i.e. it evaluates to \(200/\pi\) times its argument |
|  |  |  | round(...) | unary function that evaluates to the integer closest to its argument. round(0.5) == 1, round(-0.5) == 0 |
|  |  |  | sin(...) | unary function that evaluates to the sine of its argument |
|  |  |  | sinh(...) | unary function that evaluates to the hyperbolic sine of its argument |
|  |  |  | sqrt(...) | unary function that evaluates to the square root of its argument |
|  |  |  | tan(...) | unary function that evaluates to the tangent of its argument |
|  |  |  | tanh(...) | unary function that evaluates to the hyperbolic tangent of its argument |
|  |  | Binary | atan2(..., ...) | atan2(y,x) evaluates to the arc tangent of y/x, using the signs of the arguments to compute the quadrant of the return value, which is in the range \([-\pi,+\pi]\) |
|  |  |  | fmod(..., ...) | fmod(x,y) evaluates to the remainder of x/y |
|  |  |  | ldexp(..., ...) | ldexp(num,exp) evaluates to num * 2^exp |
|  |  |  | pow(..., ...) | pow(base,exp) evaluates to base raised to the exp-th power |
|  |  | NeighborBinary | SqrDist2D(n0, n1) | evaluates to the Euclidean distance squared, considering x- and y-coordinates only |
|  |  |  | SqrDist3D(n0, n1) | evaluates to the Euclidean distance squared, considering x-, y- and z-coordinates |
|  |  |  | Dist2D(n0, n1) | evaluates to the Euclidean distance, considering x- and y-coordinates only |
|  |  |  | Dist3D(n0, n1) | evaluates to the Euclidean distance, considering x-, y- and z-coordinates |
|  |  |  | Azimuth(n0, n1) | evaluates to the angle with the positive y-axis of the difference vector of 2 neighbors projected onto the x/y plane, counted clockwise, i.e. \(atan2( n1.x - n0.x, n1.y - n0.y)\) |
|  |  |  | ZenithDist(n0, n1) | evaluates to the angle with the positive z-axis of the difference vector of 2 neighbors, i.e. \(atan2( sqrt( pow(n1.x - n0.x, 2) + pow(n1.y - n0.y, 2) ), n1.z - n0.z )\) |
|  |  |  | Quadrant(n0, n1) | evaluates to the quadrant number (1-4) of n1 with n0 as coordinate system origin. The coordinate axes are assigned to the quadrant regions in a counterclockwise manner, i.e. the positive x-axis is part of quadrant 1, the positive y-axis is part of quadrant 2, etc. Furthermore, the origin is assigned to quadrant 1. |
|  |  |  | Octant(n0, n1) | evaluates to the octant number (1-8) of n1 with n0 as coordinate system origin. The octant definition is based on the quadrant definition above. If n1.z < n0.z, then the quadrant value is increased by 4. |
|  |  | Stat | count( . ) | returns the number of valid items |
|  |  |  | countUnique( . ) | returns the number of distinct, valid items |
|  |  |  | first( . ) | returns the first valid item |
|  |  |  | min( . ) | returns the minimum of valid items |
|  |  |  | max( . ) | returns the maximum of valid items |
|  |  |  | sum( . ) | returns the sum of valid items |
|  |  |  | mean( . ) | returns the arithmetic mean of valid items |
|  |  |  | median( . ) | returns the median of valid items |
|  |  |  | rms( . ) | returns the root mean square of valid items as \(\sqrt{\sum_{i=1}^{N} r_i^2 / N}\) with \(N\) being the count of valid items |
|  |  |  | stdDev( . ) | returns the standard deviation of valid items as \(\sqrt{\frac{1}{N-1} \sum_{i=1}^{N}(r_i - \bar{r})^2 }\) with \(\bar{r}=\sum_{i=1}^{N}r_i/N\) and \(N\) being the count of valid items |
|  |  |  | stdDevMAD( . ) | returns a robust estimation of the standard deviation of valid items: the median of the absolute values of the deviations from the sample median, scaled to make it a robust, consistent estimator of the standard deviation of normally distributed data. It is computed as: \(1.4826 \cdot median_i(\left\lvert r_i - median_j(r_j)\right\rvert)\) with \(i,j=1..N\) and \(N\) being the count of valid items |
|  |  |  | minAbs( . ) | returns the minimum absolute value of all valid items: minAbs(-5,4,-1)=1 |
|  |  |  | maxAbs( . ) | returns the maximum absolute value of all valid items: maxAbs(-5,4,-1)=5 |
|  |  |  | meanAbs( . ) | returns the arithmetic mean of the absolute values of all valid items |
|  |  |  | minAbsSigned( . ) | returns the signed minimum absolute value of all valid items: minAbsSigned(-5,4,-1)=-1 |
|  |  |  | maxAbsSigned( . ) | returns the signed maximum absolute value of all valid items: maxAbsSigned(-5,4,-1)=-5 |
|  |  | Random | a, b | uniform_int |
|  |  |  | a, b | uniform_real |
|  |  |  | p | bernoulli |
|  |  |  | t, p | binomial |
|  |  |  | t, p | negative_binomial |
|  |  |  | p | geometric |
|  |  |  | \(\mu\) | poisson |
|  |  |  | \(\lambda\) | exponential |
|  |  |  | \(\alpha\), \(\beta\) | gamma |
|  |  |  | a, b | weibull |
|  |  |  | a, b | extreme_value |
|  |  |  | \(\mu\), \(\sigma\) | normal |
|  |  |  | m, s | lognormal |
|  |  |  | n | chi_squared |
|  |  |  | a, b | cauchy |
|  |  |  | m, n | fisher_f |
|  |  |  | n | student_t |
|  | 2 | Term | * | binary operator that multiplies its operands |
|  |  |  | / | binary operator that divides its left operand by its right operand |
|  | 3 | Expression | + | binary operator that adds its operands |
|  |  |  | - | binary operator that subtracts its right operand from its left operand |
| relational | 4 | Less | <= | binary operator that evaluates to true, iff its left operand is smaller than or equal to its right operand |
|  |  |  | < | binary operator that evaluates to true, iff its left operand is smaller than its right operand |
|  |  |  | >= | binary operator that evaluates to true, iff its left operand is greater than or equal to its right operand |
|  |  |  | > | binary operator that evaluates to true, iff its left operand is greater than its right operand |
|  | 5 | Equal | == | binary operator that evaluates to true, iff its left operand is equal to its right operand |
|  |  |  | != | binary operator that evaluates to true, iff its left operand is not equal to its right operand |
| logical | 6 | Inverted | ! not | unary operator that evaluates to true, iff its operand evaluates to false |
|  |  |  | (...) | logical grouping operator |
|  | 7 | And | && and | binary operator that evaluates to true, iff both of its operands evaluate to true |
|  | 8 | Or | \|\| or | binary operator that evaluates to true, iff one or both of its operands evaluate to true |
| conditional | 9 | TernaryConditional | ? : | a ? b : c evaluates to b if a evaluates to true, otherwise evaluates to c |
| assignment | 10 | Assignment | = | assignment operator. Multiple assignments must be separated by ';'. To specify the data type of user-defined attributes to be assigned to, append the data type in parentheses. |
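To make some of the less common table entries more tangible, the following Python sketch re-implements Azimuth, ZenithDist, stdDevMAD, minAbsSigned, and maxAbsSigned from the formulas given in the table. It is an illustration only: points are modelled as (x, y, z) tuples, all inputs are assumed to be valid already, and the median of an even number of items follows Python's statistics.median, which may differ from the OPALS implementation.

```python
import math
import statistics


def azimuth(n0, n1):
    """Clockwise angle from the positive y-axis, in radians."""
    return math.atan2(n1[0] - n0[0], n1[1] - n0[1])


def zenith_dist(n0, n1):
    """Angle between the difference vector and the positive z-axis."""
    horizontal = math.hypot(n1[0] - n0[0], n1[1] - n0[1])
    return math.atan2(horizontal, n1[2] - n0[2])


def std_dev_mad(values):
    """Robust sigma: 1.4826 times the median absolute deviation."""
    center = statistics.median(values)
    deviations = [abs(v - center) for v in values]
    return 1.4826 * statistics.median(deviations)


def min_abs_signed(values):
    """The item with the smallest absolute value, sign preserved."""
    return min(values, key=abs)


def max_abs_signed(values):
    """The item with the largest absolute value, sign preserved."""
    return max(values, key=abs)
```

For instance, a neighbor located due east of the reference point has an azimuth of \(\pi/2\), and a neighbor straight above it has a zenith distance of 0.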
In the following, a formal definition of the generic filter syntax is given. For possible tokens denoting predefined attributes ( "PreDefinedAttribute" in rule Attribute), see ODM predefined attributes. For the production rule for real numbers ( "Real" in rule Constant), see Filter string syntax.
The generic filter syntax can be represented as railroad diagrams; for an explanation of the symbols, see Railroad diagrams.
For the formal definition of the generic filter syntax in EBNF, the same notation as in Filter string syntax is used.