This document describes the means provided by OPALS to filter vector data. In the context of OPALS, filtering denotes the generation of a new data set based on another data set using element-wise selection criteria and / or transformations.
The OPALS datamanager (ODM) is designed to store arbitrary attributes (see ODM predefined attributes) along with vector data. Hence, OPALS modules use the ODM as the central resource of vector data. Users may want to transform the data geometrically before import, or process only a subset of the data provided by an ODM, e.g. data within a certain region only. Likewise, only a subset of the processed data may need to be exported. This is where filters come into consideration.
Filter types and their combination
For the purpose of filtering, OPALS provides a number of filter types that belong to one of two classes (except for Generic):
Selectors accept and hence append data fulfilling a certain geometric or semantic criterion to the output data set, rejecting (absorbing) them otherwise. Selectors may also split data and accept certain parts only, which is e.g. the case if an input object crosses the boundary of a geometric validation region.
Transformers accept and hence pass all data, but change their geometry or attributes.
Most filters are controlled by parameters. The table below lists the implemented filter types together with their parameters and operations.
OPALS filter types and their operations
| Name | Class | Inspects | Parameters | Operation |
|---|---|---|---|---|
| Generic | selector, transformer | attributes, point coordinates | An expression consisting of constants, any of the supported attribute identifiers, and logical / comparison / arithmetic operators. | Whole geometry objects for whose attributes the expression evaluates to true pass through. |
| Region | selector | geometry | A simple (non-self-intersecting) polygon. | Data (or datum parts) within the polygon pass through. |
| Echo | selector | attributes | A set of acceptable echo descriptors. | Points with acceptable echo attributes pass through. The given descriptors are transformed internally to valid combinations of the attributes EchoNumber and NrOfEchos. |
| Class | selector | attributes | A set of acceptable classifications. | Points with acceptable classifications (attribute Classification) pass through. |
| Synthetic | selector | attributes | – | Points whose ‘synthetic’ bit is set (within the attribute ClassificationFlags) pass through. |
| KeyPoint | selector | attributes | – | Points whose ‘keypoint’ bit is set (within the attribute ClassificationFlags) pass through. |
| Withheld | selector | attributes | – | Points whose ‘withheld’ bit is set (within the attribute ClassificationFlags) pass through. |
| Semantic | selector | attributes | A set of acceptable semantic descriptors. | Data with acceptable semantic descriptors (attribute ScopSemantic) pass through. |
| Affine | transformer | geometry | The 12 parameters of a 3d affine transformation given by a transformation matrix \( A = [a_{i,j}]_{3 \times 3}\) and a translation vector \( b = [b_{k}]_{3 \times 1}\). | The coordinates \(\{X,Y,Z\}\) of data vertices are transformed to \(\{X',Y',Z'\}\) by application of \( (X',Y',Z')^{T} = A \, (X,Y,Z)^{T} + b \). Attributes are passed unchanged. Inversion of this filter yields the application of the inverse transformation. |
| EchoWidthDepOnAmp | selector | attributes | A continuous, piecewise linear function. | The function defines the maximum echo width (attribute EchoWidth) for data to be acceptable, as a function of their amplitude (attribute Amplitude). Beyond its outermost nodes, the function is extended infinitely with the constant value of the respective node. Data featuring echo widths smaller than or equal to this adaptive threshold pass through. |
| PointSeq | selector | attributes | A set of acceptable indices. | The order of appearance (e.g. as read from file) of data is inspected. Points that appear consecutively and feature the same GPS-time (attribute GPSTime) are regarded as a sequence. Data featuring acceptable indices, i.e. positions within these sequences, pass through. Thus, as an exception to the rule that filters apply element-wise criteria, PointSeq considers data appearing before or after. |
| PointSeq2Echo | transformer | attributes | – | The order of appearance (e.g. as read from file) of data is inspected. Points that appear consecutively and feature the same GPS-time (attribute GPSTime) are regarded as a sequence. To each point in such a sequence, the attributes EchoNumber and NrOfEchos are assigned, based on the point's position within the sequence and the sequence length. |
| Remove | transformer | attributes | None, one, or more attribute identifiers. | If no attribute identifiers are given, then this filter removes all attributes from all data. This way, memory requirements may be lowered. If one or more attribute identifiers are specified, then only these attributes will be removed from all data. |
| Primitive | selector | – | A set of acceptable primitive types. | Data of the specified primitive types pass through. |
| Pass | – | – | – | All data pass through unmodified. This filter cannot be combined with others, and its specification simply underlines that no filter is used. |
Tree composition from definition strings
Filters are invertible and may be combined to composite filters i.e. arbitrarily large filter trees. OPALS filters and filter trees are created based on definition strings that adhere to a special syntax.
Module parameters containing the word 'filter' expect strings of this structure. The section Informal syntax definition with examples gives a rather colloquial definition of this syntax, together with exemplary filter definition strings. The syntax is defined formally in the section Formal grammar.
Informal syntax definition with examples
Filters are specified using their names. For filters depending on parameters, respective filter parameter definitions must be appended, enclosed in square brackets. These must adhere to filter-specific parameter syntaxes, which are given in the table below.
Generic[ SigmaZ < 0.01 ]
geometry objects that provide an attribute identified by "SigmaZ", with a value smaller than 0.01 pass through
Generic[ abs( atan2( NormalY, NormalX) * 180 / pi ) < 30 and PointLabel == "Vienna" ]
geometry objects whose planar normal vector direction deviates from the positive X-axis by less than \(30^{\circ}\) in either direction, and whose point label is "Vienna", pass through
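The second expression can be mirrored in plain Python to see which attribute combinations it accepts. This is only a sketch of the expression's logic, not of how OPALS evaluates it: the function name `passes_generic` and the dictionary representation of attributes are hypothetical, and treating a missing attribute as "does not pass" is an assumption based on the wording of the first example.

```python
from math import atan2, pi


def passes_generic(attributes):
    """Mirror of: abs( atan2( NormalY, NormalX) * 180 / pi ) < 30 and PointLabel == "Vienna"."""
    required = ("NormalX", "NormalY", "PointLabel")
    # Assumption: an object that lacks a referenced attribute does not pass.
    if any(name not in attributes for name in required):
        return False
    # Direction of the planar normal vector, measured from the positive X-axis.
    direction_deg = atan2(attributes["NormalY"], attributes["NormalX"]) * 180 / pi
    return abs(direction_deg) < 30 and attributes["PointLabel"] == "Vienna"


accepted = passes_generic({"NormalX": 1.0, "NormalY": 0.2, "PointLabel": "Vienna"})
rejected = passes_generic({"NormalX": 1.0, "NormalY": 1.0, "PointLabel": "Vienna"})
```

Here `accepted` is true (the direction is about 11 degrees), while `rejected` is false (the direction is 45 degrees).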
Region
The query or analysis to be performed (optional), followed by either two or more points in 2-D, or the name of a file from which polygons are to be loaded.
The supported queries and analyses are a subset of those defined by OGC's Simple Feature Access. Queries leave their arguments unchanged and return a truth value. Analyses perform the respective geometric operation, replace the current geometry object with its result, and return true if the result is non-empty, or false otherwise. Defaults to intersects, if not specified.
Either the name of a file from which polygons shall be loaded must follow, or two or more 2-tuples of real numbers. Each 2-tuple defines a 2-dimensional position, and the sequence of tuples defines the simple polygon that is used as the second argument of the query or analysis. Two 2-tuples define an axis-aligned rectangle. More than two 2-tuples surround the polygon's interior in counter-clockwise order.
Supported queries and analyses
| Queries | Analyses |
|---|---|
| intersects | intersection |
Note that the distinction between intersects and intersection is relevant only for 1- and 2-dimensional objects, but not for points.
Region[ -10 -10 10 10 ]
defines a selection window / axis-aligned rectangle (the first two real numbers define the lower left corner). As query/analysis defaults to intersects, geometries that intersect the window pass the filter unchanged.
Region[ intersection -10 -10 10 10 ]
performs an intersection using this window. Points either pass through unchanged or not at all. Polylines and polygons are split at the window's edges, and only the parts inside the window pass the filter.
Region[ -13.66 -3.66 3.66 -13.66 13.66 3.66 -3.66 13.66 ]
defines a quadrilateral (in this case, the window from above, rotated by -30 degrees)
Region[ outline.shp ]
lets all data pass through that intersect the union of polygons contained in the file outline.shp
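The two ways of specifying a polygon by coordinates can be checked with a few lines of Python. The sketch below expands two 2-tuples into an axis-aligned rectangle and uses the shoelace formula to confirm that the quadrilateral from the example above is given in counter-clockwise order. The function names are hypothetical and not part of OPALS.

```python
def rectangle_from_corners(p, q):
    """Expand two 2-tuples into the four corners of an axis-aligned rectangle (counter-clockwise)."""
    xmin, xmax = sorted((p[0], q[0]))
    ymin, ymax = sorted((p[1], q[1]))
    return [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]


def signed_area(polygon):
    """Shoelace formula: positive for counter-clockwise vertex order, negative for clockwise."""
    total = 0.0
    for (xa, ya), (xb, yb) in zip(polygon, polygon[1:] + polygon[:1]):
        total += xa * yb - xb * ya
    return total / 2.0


# Region[ -10 -10 10 10 ]
window = rectangle_from_corners((-10, -10), (10, 10))

# Region[ -13.66 -3.66 3.66 -13.66 13.66 3.66 -3.66 13.66 ]
quadrilateral = [(-13.66, -3.66), (3.66, -13.66), (13.66, 3.66), (-3.66, 13.66)]

is_counter_clockwise = signed_area(quadrilateral) > 0
```

Both polygons enclose (up to rounding of the printed coordinates) the same area of 400 square units, as expected for a rotated copy of the window.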
Echo
One or more echo descriptors. Valid descriptors are:
First: echo number == 0 OR echo number == 1
Last: echo number == number of echoes
Intermediate: NOT( First OR Last )
Single: First AND Last
Multiple descriptors are combined with OR.
Echo[Last]
last-echo-points only
Note that the following is equivalent:
Generic[EchoNumber == NrOfEchos]
Echo[First Last]
last-echo- or first-echo-points
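The descriptor definitions above translate directly into predicates on the attributes EchoNumber and NrOfEchos. The following Python sketch only restates these definitions; the function names are hypothetical, and it does not claim to reproduce the internal OPALS implementation.

```python
def echo_matches(descriptor, echo_number, nr_of_echos):
    """Evaluate a single echo descriptor according to the definitions given above."""
    first = echo_number in (0, 1)
    last = echo_number == nr_of_echos
    if descriptor == "First":
        return first
    if descriptor == "Last":
        return last
    if descriptor == "Intermediate":
        return not (first or last)
    if descriptor == "Single":
        return first and last
    raise ValueError(f"unknown echo descriptor: {descriptor}")


def echo_filter(descriptors, echo_number, nr_of_echos):
    """Multiple descriptors are combined with OR."""
    return any(echo_matches(d, echo_number, nr_of_echos) for d in descriptors)


# Echo[First Last] applied to the second of three echoes: the point is rejected.
second_of_three = echo_filter(["First", "Last"], 2, 3)
```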
Class
One or more classification descriptors. Valid descriptors are:
Created
Unclassified
Ground
LowVegetation
MediumVegetation
HighVegetation
Building
LowPoint
ModelKeyPoint
Water
Rail
RoadSurface
OverlapPoint
WireGuard
WireConductor
TransmissionTower
WireStructureConnector
BridgeDeck
HighNoise
Other
These textual descriptors correspond to the union of classifications defined in LAS versions 1.1 to 1.4, extended by Other, which selects everything but the classifications defined in LAS v.1.1. In addition to these textual descriptors, non-negative numbers are supported. Multiple descriptors are combined with OR.
Class[Ground Water]
terrain-data or data on water
Class[64]
user-defined class 64
Synthetic
–
Synthetic
Data whose synthetic-bit is set i.e. data that hold a ‘ClassBits’-attribute, and whose synthetic-bit within the class-bits is 1
KeyPoint
–
KeyPoint
Data whose keypoint-bit is set
Withheld
–
Not Withheld
Data whose withheld-bit is not set i.e. data that either do not hold a ‘ClassBits’-attribute, or whose withheld-bit is 0
Semantic
One or more semantics descriptors. Valid descriptors are:
Profile
GridPoint
Alignement
CrossSection
ContourLine
BulkData
SpotHeight
FormLine
BreakLine
BorderLine
EnclaveLine
FaultLine
All of these descriptors may be rendered more precisely by applying one or both of the following height descriptors to them:
2d - 2-dimensional data only
OffTerrain - data not on the terrain
Semantic[BulkData]
bulk data only
Semantic[BreakLine[2d]]
2-dimensional breaklines only
Affine
a sequence of 2, 3, 4, 6, 9, or 12 real numbers. If fewer than 12 values are supplied, then the remaining parameters are set to values that are neutral with respect to the transformation.
Affine[0.87 0.5 -0.5 0.87]
rotates the data in 2D by \(-30^\circ\) (i.e. clockwise), without translation
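The effect of a 4- or 6-parameter definition can be reproduced numerically. The Python sketch below assumes that the matrix elements are given row by row and that the translation follows the matrix; this ordering is inferred from the examples in this document (the clockwise rotation above and the translation `Affine[1. 0. 0. 1. 100 200]` used further below) and is not a statement about the other parameter counts. The function name `apply_affine_2d` is hypothetical.

```python
from math import cos, radians, sin


def apply_affine_2d(params, point):
    """Apply x' = A x + b in 2D. Assumed order: a11 a12 a21 a22 [b1 b2]."""
    if len(params) == 4:
        a11, a12, a21, a22 = params
        b1 = b2 = 0.0  # neutral translation
    elif len(params) == 6:
        a11, a12, a21, a22, b1, b2 = params
    else:
        raise ValueError("this sketch only covers 4 or 6 parameters")
    x, y = point
    return (a11 * x + a12 * y + b1, a21 * x + a22 * y + b2)


# Exact rotation by -30 degrees; Affine[0.87 0.5 -0.5 0.87] is its rounded form.
c, s = cos(radians(-30)), sin(radians(-30))
rotation = [c, -s, s, c]

# Corners of the window Region[ -10 -10 10 10 ], counter-clockwise.
window = [(-10, -10), (10, -10), (10, 10), (-10, 10)]
rotated = [apply_affine_2d(rotation, corner) for corner in window]
# The first rotated corner is approximately (-13.66, -3.66).
```

Rounded to two decimals, `rotated` reproduces the quadrilateral used in the Region example, which is a useful cross-check of the assumed parameter order.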
EchoWidthDepOnAmp
a sequence of one or more 2-tuples of real numbers, separated by commas. In each tuple, the first element specifies the amplitude for which the second element specifies the maximum admissible echo width.
EchoWidthDepOnAmp[ 150 1.7, 100 1.75, 30 1.8 ]
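To illustrate how such a definition acts as an adaptive threshold, the following Python sketch parses the node list from the example, interpolates linearly between nodes, and keeps the function constant beyond the outermost nodes, as described in the filter table. The function names are hypothetical, and sorting the nodes by amplitude is an assumption of this sketch.

```python
def parse_nodes(definition):
    """Parse 'amplitude width, amplitude width, ...' into (amplitude, width) tuples sorted by amplitude."""
    nodes = []
    for pair in definition.split(","):
        amplitude, width = (float(value) for value in pair.split())
        nodes.append((amplitude, width))
    return sorted(nodes)


def max_echo_width(nodes, amplitude):
    """Piecewise linear threshold, constant beyond the first and last node."""
    if amplitude <= nodes[0][0]:
        return nodes[0][1]
    if amplitude >= nodes[-1][0]:
        return nodes[-1][1]
    for (a0, w0), (a1, w1) in zip(nodes, nodes[1:]):
        if a0 <= amplitude <= a1:
            t = (amplitude - a0) / (a1 - a0)
            return w0 + t * (w1 - w0)


def passes(nodes, amplitude, echo_width):
    """Data with an echo width smaller than or equal to the threshold pass through."""
    return echo_width <= max_echo_width(nodes, amplitude)


nodes = parse_nodes("150 1.7, 100 1.75, 30 1.8")
threshold_at_65 = max_echo_width(nodes, 65.0)  # halfway between the nodes at 30 and 100
```

With these nodes, weak echoes (amplitude 30 or lower) are accepted up to a width of 1.8, while strong echoes (amplitude 150 or higher) must not be wider than 1.7.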
PointSeq
One or more position descriptors. Valid descriptors are:
First
Second
Third
Begin - select an arbitrary position, indexed from the start. PointSeq[First] = PointSeq[Begin[1]].
TwoBeforeLast
OneBeforeLast
Last
End - select an arbitrary position, indexed from the end. PointSeq[Last] = PointSeq[End[1]].
Multiple descriptors are combined with OR. The descriptors 'Begin' and 'End' themselves require a parameter, namely a positive integer.
PointSeq[Second]
the second datum in a sequence
PointSeq[ Begin[5] Begin[6] ]
the fifth and sixth datum in a sequence
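The selection logic of PointSeq can be sketched in Python by grouping consecutive points with equal GPS-time and picking positions counted from the start (Begin) or from the end (End) of each group. The representation of points as dictionaries and the function name `point_seq` are hypothetical; positions that do not exist in a short sequence are simply skipped, which is an assumption of this sketch.

```python
from itertools import groupby


def point_seq(points, begin=(), end=()):
    """Keep the points at the given 1-based positions from the start and / or end of each sequence."""
    selected = []
    # groupby only merges neighbours, so the order of appearance defines the sequences.
    for _, group in groupby(points, key=lambda p: p["GPSTime"]):
        sequence = list(group)
        n = len(sequence)
        keep = {i - 1 for i in begin if 1 <= i <= n}
        keep |= {n - i for i in end if 1 <= i <= n}
        selected.extend(sequence[i] for i in sorted(keep))
    return selected


points = [
    {"id": 0, "GPSTime": 1.0},
    {"id": 1, "GPSTime": 1.0},
    {"id": 2, "GPSTime": 1.0},
    {"id": 3, "GPSTime": 2.0},
    {"id": 4, "GPSTime": 3.0},
    {"id": 5, "GPSTime": 3.0},
]

# PointSeq[Second], i.e. PointSeq[ Begin[2] ]
second = [p["id"] for p in point_seq(points, begin=[2])]
```

In this sample, `second` contains the ids 1 and 5; the single-point sequence with GPS-time 2.0 has no second datum and contributes nothing.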
PointSeq2Echo
–
PointSeq2Echo
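What PointSeq2Echo assigns can be sketched in the same way. The Python snippet below groups consecutive points with equal GPS-time and derives EchoNumber and NrOfEchos from the position within the group and the group length. The dictionary representation, the function name, and the choice to start counting echo numbers at 1 are assumptions of this sketch.

```python
from itertools import groupby


def point_seq_to_echo(points):
    """Return copies of the points with EchoNumber and NrOfEchos derived from their sequences."""
    result = []
    for _, group in groupby(points, key=lambda p: p["GPSTime"]):
        sequence = list(group)
        # Assumption: echo numbers are counted from 1 within each sequence.
        for position, point in enumerate(sequence, start=1):
            result.append({**point, "EchoNumber": position, "NrOfEchos": len(sequence)})
    return result


points = [{"GPSTime": 1.0}, {"GPSTime": 1.0}, {"GPSTime": 2.0}]
echoes = [(p["EchoNumber"], p["NrOfEchos"]) for p in point_seq_to_echo(points)]
```

For the three sample points, `echoes` is `[(1, 2), (2, 2), (1, 1)]`: a first and a last echo of a two-echo sequence, followed by a single echo.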
Remove
none, one, or more predefined or custom attribute identifiers
Remove
remove all attributes
Remove[SigmaX _myAttribute]
remove the predefined attribute SigmaX and the custom attribute _myAttribute
Primitive
One or more primitive type descriptors. Valid descriptors are:
Point3
Polyline3
Polygon2_5
Window
Box
Plane
Quadric
Line2
Multiple descriptors are combined with OR.
Primitive[Point3]
only points pass through
Primitive[ Point3 Polyline3 ]
only points and polylines pass through
Composite filters
Filters may be inverted by prefixing the filter name with the unary operator NOT. Furthermore, two filters are combined logically by inserting either of the binary operators AND and OR between them. Finally, filters may be grouped by enclosing them in round brackets. Filters are generally evaluated from left to right. The operator precedence and hence the structure of filter trees will be familiar to users: NOT is evaluated first, then AND, finally OR.
For convenience, the three logical operators have aliases: ! for NOT, && for AND, and || for OR.
Following are two examples for composite filters. For the sake of better readability, linebreaks have been inserted.
Select last-echo data that either lie inside a window, or are not unclassified and carry either of two semantics:
Echo [Last]
AND
(
Region [100.00 100.00 150.00 125.00]
OR
NOT Class [Unclassified]
AND
Semantic[ FormLine BreakLine[2d] ]
)
Select data with acceptable echo width, translate them and finally remove any attributes:
EchoWidthDepOnAmp[ 150 1.7, 100 1.75, 30 1.8 ]
&&
Affine[1. 0. 0. 1. 100 200]
&&
Remove
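The precedence rules (NOT first, then AND, finally OR) determine how a definition string is grouped into a filter tree. The following Python sketch builds such a tree with a small recursive-descent parser. It is not the OPALS parser: it handles bare filter names only (no bracketed parameters), it treats the operator keywords as case-insensitive, which is an assumption, and all function names as well as the placeholder filter names in the usage example are hypothetical.

```python
import re

_TOKEN = re.compile(r"\(|\)|&&|\|\||!|\w+")
_ALIASES = {"&&": "AND", "||": "OR", "!": "NOT"}
_OPERATORS = {"AND", "OR", "NOT"}


def tokenize(text):
    """Split into brackets, operators and names; aliases are mapped to NOT / AND / OR."""
    tokens = []
    for token in _TOKEN.findall(text):
        token = _ALIASES.get(token, token)
        tokens.append(token.upper() if token.upper() in _OPERATORS else token)
    return tokens


def parse(text):
    """Return a nested tuple tree such as ('OR', 'a', ('AND', ('NOT', 'b'), 'c'))."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def parse_or():  # lowest precedence
        node = parse_and()
        while peek() == "OR":
            take()
            node = ("OR", node, parse_and())
        return node

    def parse_and():
        node = parse_not()
        while peek() == "AND":
            take()
            node = ("AND", node, parse_not())
        return node

    def parse_not():  # highest precedence, also handles grouping and names
        if peek() == "NOT":
            take()
            return ("NOT", parse_not())
        if peek() == "(":
            take()
            node = parse_or()
            if peek() != ")":
                raise ValueError("missing closing bracket")
            take()
            return node
        if peek() is None or peek() in _OPERATORS or peek() == ")":
            raise ValueError("filter name expected")
        return take()

    tree = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing input")
    return tree


def evaluate(node, truth):
    """Evaluate a tree, given a truth value per filter name."""
    if isinstance(node, str):
        return truth[node]
    if node[0] == "NOT":
        return not evaluate(node[1], truth)
    if node[0] == "AND":
        return evaluate(node[1], truth) and evaluate(node[2], truth)
    return evaluate(node[1], truth) or evaluate(node[2], truth)


# Structure of the first composite example, with placeholder names instead of parameterized filters.
tree = parse("Last AND ( InWindow OR NOT Unclassified AND FormOrBreakLine )")
```

The resulting `tree` is `('AND', 'Last', ('OR', 'InWindow', ('AND', ('NOT', 'Unclassified'), 'FormOrBreakLine')))`, which shows that NOT binds to Unclassified only and that the AND inside the brackets is grouped before the OR.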
Formal grammar
Graphical representation
Railroad diagrams
For the graphical representation of formal grammars, we use railroad diagrams. These diagrams display
terminals (characters to be typed) as rounded rectangles filled with dark yellow,
non-terminals (syntax rules) as rectangles filled with medium yellow, and
regular expressions (rules that input text must stick to) as hexagons filled with light yellow.
Filter string syntax
The generic filter syntax represented as railroad diagrams. The syntax for rules GenericFilterAttrib and Attribute can be found here: GenericFilter.
In computer science, the Extended Backus Naur Form (EBNF) is a metasyntax notation used to express context-free grammars: i.e., a formal way to describe computer programming languages and other formal languages. It is an extension of the basic Backus Naur Form (BNF) metasyntax notation.
The W3C variant of EBNF used here follows these conventions:
Rule: name::=...;
Terminal item: '...' | "..."
Non-terminal item: ...
Concatenation: (whitespace between items)
Choice: |
Optional: (...)?
Zero or more repetitions: (...)*
One or more repetitions: (...)+
Grouping: (...)
Comment: /*...*/
Exception: -
Filter string syntax
The W3C-EBNF conforming filter string syntax is defined as follows. The syntax for rules GenericFilterAttrib and Attribute can be found here: GenericFilter.
Filter ::= And ( ( "Or" | "||" ) And )* | PassFilter