Module Info
See also
opals::IInfo

Aim of module

Extracts and reports information about an ODM, vector or raster file.

General description

opalsInfo extracts statistical information (number of points/lines/polygons, bounding box, attribute information, spatial index details, etc.) from an input data set. The information provided in the file header is made available through a generic file statistics object. Since not all data formats store full statistics in the header, the module also provides an exact computation mode. In this mode, not only the header but the full content of the input file is read, which allows complete statistics to be extracted. Please note that the exact computation mode is typically much slower than the standard mode, but it reports correct information even if the header information is incomplete or incorrect.
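The difference between header-based and exact statistics can be illustrated with a small, OPALS-independent sketch. The Stats class and the exact_stats function are hypothetical names introduced for this illustration only; they are not part of the OPALS API. The sketch merely shows why a full read-through yields reliable values when a header is incomplete or wrong.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

Point = Tuple[float, float, float]


@dataclass
class Stats:
    """Hypothetical container for the values a file header may or may not provide."""
    point_count: Optional[int] = None
    minimum: Optional[Point] = None
    maximum: Optional[Point] = None


def exact_stats(points: Iterable[Point]) -> Stats:
    """Derive point count and bounding box by visiting every point once."""
    count = 0
    lo = hi = None
    for p in points:
        count += 1
        lo = p if lo is None else tuple(min(a, b) for a, b in zip(lo, p))
        hi = p if hi is None else tuple(max(a, b) for a, b in zip(hi, p))
    return Stats(count, lo, hi)


# A made-up header that reports a wrong point count and no bounding box
header = Stats(point_count=5)
points = [(0.0, 0.0, 1.0), (2.0, 1.0, 0.5), (1.0, 3.0, 2.0)]

exact = exact_stats(points)
print("header:", header)
print("exact: ", exact)
```

The exact result costs one pass over all points, which is the reason the exact computation mode is slower than reading the header alone.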

Besides standard statistical values (min, max, mean and stdDev), value frequencies of requested attributes can be retrieved for ODM and vector files. Whereas standard statistics might be extractable from the file header, determining value frequencies requires a complete read-through of the input file, which can take some time for huge files. Value frequencies are not restricted to integer attributes, but are currently limited to 1000 distinct values. As shown in Example 3, the feature can be used to get the distribution of, e.g., the classification values.

To get a quick visual overview of a vector data set, the export overview feature of Module Info can be used. Whereas for ODM files the overview rasters are directly accessible within the file header (they are incrementally updated during import), other vector formats require a complete read-through to generate the corresponding height and/or point count raster. Internally, the same code is used for ODM and vector files, which is why the raster is the same whether it is created from the original vector file or from the corresponding imported ODM, as demonstrated in Example 4.

Generation of OPALS Format Definition (OFD)

Besides deriving statistical information, Module Info can be used to generate OPALS Format Definition (OFD) files. These XML files describe the format, the attributes and which parts of the data should be imported or exported. Based on a given input file (ODM or vector format), the module creates a specific OFD file for a specific vector format. Currently, the formats las, rdb and shape are supported (the generic text and binary OFDs are not supported, since those files do not contain any header information). When an input file in one of the aforementioned formats is provided, the OFD file is automatically created for this target format (unless defined differently). If an ODM is given as inFile, the user needs to explicitly specify the target format with the generateOFD.format parameter; otherwise a las OFD file is created. The generated OFD file also includes some XML comments that should support the user in adapting the file. The created OFD file contains all attribute entries found in the given input file, since it is easier to delete irrelevant entries than to add missing ones. For completeness, it is mentioned that especially the Riegl RDB format definition might require some adaptations depending on the usage context (e.g. in Module Translate). See Example 5: Generation of an OPALS Format Definition for further details.
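The target format rules described above can be summarised in a short sketch. This is not the module's implementation: the function name resolve_ofd_format and the extension-to-format table are assumptions made for illustration (OPALS may well detect formats by other means than the file extension).

```python
from pathlib import Path
from typing import Optional

# Assumed mapping from file extension to OFD target format (illustration only)
EXTENSION_FORMAT = {".las": "las", ".laz": "las", ".rdbx": "rdb", ".shp": "shape"}
SUPPORTED_FORMATS = {"las", "rdb", "shape"}


def resolve_ofd_format(in_file: str, requested: Optional[str] = None) -> str:
    """Pick the OFD target format following the rules stated in the text."""
    # An explicitly requested format always wins
    if requested is not None:
        fmt = requested.lower()
        if fmt not in SUPPORTED_FORMATS:
            raise ValueError(f"unsupported OFD target format: {requested}")
        return fmt
    suffix = Path(in_file).suffix.lower()
    # ODM input without an explicit format falls back to las
    if suffix == ".odm":
        return "las"
    # Otherwise the format of the input file is used
    if suffix in EXTENSION_FORMAT:
        return EXTENSION_FORMAT[suffix]
    raise ValueError(f"no OFD can be generated for input file: {in_file}")


for name, fmt in [("strip11.laz", None), ("strip11.odm", None), ("strip11.odm", "rdb")]:
    print(name, fmt, "->", resolve_ofd_format(name, fmt))
```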

Parameter description

-inFile input file
Type: opals::Vector<opals::Path>
Remarks: mandatory
Currently, ODM, vector and grid files can be specified as input files.
-exactComputation exact computation mode
Type: bool
Remarks: default=0
In exact mode, the information is extracted/computed from the file content rather than from the file header. For certain formats (e.g. LAS/LAZ), more information can be presented if the exact mode is activated.
-valueFrequency extract value frequencies for defined attributes/bands
Type: opals::Vector<opals::String>
Remarks: optional
Value frequencies will be extracted for the specified attributes/bands. This option implicitly activates the exact computation mode.
-statistic file statistics
Type: opals::Vector<opals::DataSetStats>
Remarks: output
The result of the file statistics extraction, provided as a generic object. Specify -outParamFile in order to store these results in an XML file.
-exportOverview export header features
Type: opals::Vector<opals::HeaderFeature>
Remarks: optional
Exports a height and/or density overview raster of the current data set. For ODMs, those overview rasters can be directly extracted from the file header, whereas for all other vector formats the file needs to be read through completely (exactComputation will be activated).
-multiBand Export overview as multi-band raster
Type: bool
Remarks: default=0
When activated and multiple overview features are exported, a single raster file with multiple bands will be created.
-merge Create single file statistics
Type: bool
Remarks: default=0
In case of multiple input files, activating this option creates a single file statistics object.
-enduranceMode endurance mode flag
Type: bool
Remarks: default=0
If this option is activated, the module doesn't throw exceptions while processing files. Potential error messages will be stored in the error message field of the corresponding dataset statistics object. This is particularly useful when processing multiple files, as it ensures that all files are processed even if one fails.
-generateOFD IGroup: Generating OFD file options
These parameters are only needed when an OFD file should be generated.
.file OFD file path
Type: opals::Path
Remarks: optional
Specifies the path of the OFD XML file; setting it activates the OFD generation functionality.
.format Target format of OFD file
Type: opals::String
Remarks: optional
If not specified, the module uses the file format of the input file to generate the corresponding OFD file. Currently, the LAS, RDB and Shape file formats are supported. If an ODM is provided, a LAS OFD file is created.

Examples

The data used in the following examples are located in the $OPALS_ROOT/demo/ directory. The first example shows how to use opalsInfo to get an overview of the data (including their attributes) within an ODM.

Example 1:

As a prerequisite, the ALS point cloud data must be imported into the ODM. To achieve that, change to the demo directory and type:

opalsImport -inFile fullwave.fwf

Now, run the following command

opalsInfo -inFile fullwave.odm

which gives the following output:

[...]
14:24:23:
Data set statistic
Statistic Value
Filename D:\opals\demo\fullwave.odm
Type odm
Point Count 67413
Line Count 0
Polygon Count 0
Point density 3.75
SpatialReference
---
Bounding box
X Y Z
Minimum 24820.000 311160.006 275.535
Maximum 25000.000 311260.000 328.500
[...]

After a generic information block the section about the data set attributes is listed. Figure 1 captures the corresponding section from the log file.

Fig. 1: Attribute statistics of demo data set fullwave.fwf

A special feature of the ODM is the on-the-fly collection of statistical information for all attributes, which is useful for checking the correctness of imported or processed data. The last five attributes (Id, FileId, LayerId, WinputCode and StructNr) are "virtual" attributes. They exist for internal reasons but are not made persistent on disk.

With the following command

opalsNormals -infile fullwave.odm -neighbours 8 -searchRadius 1

local planes are estimated for each point of the data set. In the current example, the computation succeeds for 67297 points (and fails in 116 cases). Running opalsInfo again

opalsInfo -inFile fullwave.odm

will show the newly added attributes (cf. Figure 2). The statistics also reflect the number of successfully computed plane estimations (for NormalSigma0, even fewer values have been set, since redundancy is required for its computation).

Fig. 2: Attribute statistics after normals estimation

The last section of the module output covers information about the spatial index (cf. Figure 3). The spatial index statistics are also reported after importing a data set, since the index is crucial for processing the ODM. For further details, please refer to section Analysing the index statistics of an ODM.

Fig. 3: Spatial index information

Example 2:

When running Module Info within a Python script, direct access to the statistics of the dataset is provided via the OPALS Python API. This is exemplified in the sample script $OPALS_ROOT/demo/infoDemo.py:

import opals
from opals import Import
from opals import Info
#
# Set screen log level
#
logLevel = opals.Types.LogLevel.none
#
# Import strip11.laz dataset into an ODM
#
imp = Import.Import(inFile=["strip11.laz"], tileSize=5, screenLogLevel=logLevel)
imp.run()
#
# Run opalsInfo and query statistic object
#
inf = Info.Info(inFile=[imp.outFile], screenLogLevel=logLevel)
inf.run()
att_stat = inf.statistic[0].getAttributes()
idx_stat = inf.statistic[0].getIndices()
#
# Print attribute statistics
#
print("Attribute statistics of file {}".format(imp.outFile))
print("Name: Count/Min/Mean/Max")
for a in att_stat:
    print("Name {}: {:0d}/{:0.3f}/{:0.3f}/{:0.3f}".format(a.getName(), a.getCount(),
          a.getMin(), a.getMean(), a.getMax()))
print("-----------------------------------------------------------")
#
# Print spatial index statistics
#
print("Spatial index statistics of file {}".format(imp.outFile))
for i in idx_stat:
    print("{:0d} leaves".format(i.getCountLeaf()))
    print("Tile size: {:0.1f}".format(i.getTileSize()))
    print("Min #pts per leaf: {:0.1f}".format(i.getObjectsInLeafMin()))
    print("Mean #pts per leaf: {:0.1f}".format(i.getObjectsInLeafMean()))
    print("Max #pts per leaf: {:0.1f}".format(i.getObjectsInLeafMax()))

To run the script, type:

python infoDemo.py

The script imports the dataset strip11.laz and queries the attribute and spatial index statistics. This is achieved by accessing the statistic parameter provided by Module Info. The script uses the access functions of the complex Python type opals::DataSetStats to query (and print) the respective values.

Attribute statistics of file strip11.odm
Name: Count/Min/Mean/Max
Name Amplitude: 378712/1.000/68.774
Name EchoNumber: 378712/1.000/1.100
Name NrOfEchos: 378712/1.000/1.201
Name ClassificationFlags: 378712/0.000/0.000
Name ScanDirection: 378712/0.000/0.000
Name EdgeOfFlightLine: 378712/0.000/0.000
Name Classification: 378712/0.000/0.917
Name ScanAngle: 378712/0.000/0.000
Name UserData: 378712/0.000/0.000
Name PointSourceId: 378712/11.000/11.000
Name GPSTime: 378712/37983.419/37986.177
Name _PulseWidth: 378712/1.700/4.837
Name Id: 378712/2147483648.000/4729435047628892.000
Name FileId: 378712/1.000/1.000
Name LayerId: 378712/0.000/0.000
Name WinputCode: 378712/30.000/30.000
Name StructNr: 378712/0.000/0.000
-----------------------------------------------------------
Spatial index statistics of file strip11.odm
2090 leaves
Tile size: 5.0
Min #pts per leaf: 1.0
Mean #pts per leaf: 181.2
Max #pts per leaf: 530.0

Example 3: Value frequencies

Besides the standard statistical measures like min, max, mean and standard deviation, the module also supports the determination of value frequencies for specified attributes. This feature can be used, e.g., to retrieve all classification values that exist within a dataset. The following command

opalsInfo -inFile strip11.laz -valueFreq Classification

outputs (after the standard attribute table) a value frequency table for the specified attributes:

...
PointSourceId uint16 378712 --- 11 11 11.000 0.000
GPSTime double 378712 --- 37983.419 37988.897 37986.177 1.120
_PulseWidth double 378712 --- 1.700 31.500 4.837 0.653
12:22:52:
Value Frequencies
Name Value Absolute Frequency Relative Frequency [%]
Classification --- 0 0.00
0 205137 54.17
2 173575 45.83

Nodata values (or attributes that are marked as invalid) are printed in the first row and are therefore also visible within the table. The value frequency is not limited to integer attributes, but the computation is currently restricted to 1000 unique values. If more than 1000 values exist, the frequency histogram is not extended further.
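The capped histogram described above can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the OPALS implementation: the function names are hypothetical, and it is assumed here that values already present in the histogram keep being counted once the cap is reached while new distinct values are ignored.

```python
from collections import Counter


def value_frequencies(values, max_distinct=1000):
    """Count occurrences per value, tracking at most max_distinct distinct values."""
    counts = Counter()
    for v in values:
        # Assumption: known values are still counted after the cap is reached,
        # but the histogram is not extended by further distinct values
        if v in counts or len(counts) < max_distinct:
            counts[v] += 1
    return counts


def frequency_table(counts):
    """Return (value, absolute frequency, relative frequency in %) rows."""
    total = sum(counts.values())
    return [(v, n, 100.0 * n / total) for v, n in sorted(counts.items())]


# Small made-up classification sample (0 = unclassified, 2 = ground in the LAS standard)
classification = [0, 0, 2, 0, 2]
for value, absolute, relative in frequency_table(value_frequencies(classification)):
    print(f"{value:>5} {absolute:>8} {relative:8.2f}")
```

Because every value has to be inspected, such a histogram always requires a full pass over the data, which matches the remark that valueFrequency implicitly activates the exact computation mode.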

Example 4: Export overview raster

Using exportOverview, one can export a height and/or point count overview raster of the input vector data set. The overview rasters always have a resolution of 200-400 pixels in height and width, independent of data set size, density and bounding box. The raster file(s) are always created at the same path location and with the same time stamp as the input file to indicate their origin. It is possible to explicitly export the height raster, the point count raster, or both. The raster file names are suffixed with _overview_z and _overview_pcount unless the multiBand flag is activated. In that case, both rasters are written as two bands into one file that is suffixed with _overview only.

opalsInfo -inFile strip11.laz -exportOverview all -multiband 1

Whereas vector formats need to be read through for exporting overview rasters, this is not the case for ODM files. Since this overview information is always created during import and stored within the ODM header, writing the overview raster does not require reading any ODM geometry objects. Hence, the export runs very fast, but of course the source vector file needs to be imported first, as shown below.

opalsImport -inFile strip11.laz -outf ov_test.odm
opalsInfo -inFile ov_test.odm -exportOverview all -multiband 1

In both cases, the same code/algorithm is used, which results in exactly the same overview raster files. For completeness, it should be mentioned that overview files are also used in some workflow scripts for informational purposes (e.g. Python script preDataCheck) or for speeding up processing (e.g. Python script preCutting).

Fig. 4: Height (left) and point count (right) overview rasters
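Conceptually, a point count and a height overview raster can be accumulated point by point, which is also why they can be updated incrementally during import. The following sketch shows this idea in plain Python. It is not the OPALS algorithm: the function name overview_rasters, the fixed cell size and the choice of storing the highest z per cell are assumptions for illustration, and the source does not state how the 200-400 pixel resolution is chosen.

```python
def overview_rasters(points, cell_size):
    """Bin (x, y, z) points into a point count grid and a height grid.

    Row 0 corresponds to the minimum y, column 0 to the minimum x.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)

    def index(value, origin):
        # Offsets are never negative, so int() truncation equals floor here
        return int((value - origin) / cell_size)

    cols = index(max(xs), x0) + 1
    rows = index(max(ys), y0) + 1
    count = [[0] * cols for _ in range(rows)]
    height = [[None] * cols for _ in range(rows)]  # None marks empty cells
    for x, y, z in points:
        r, c = index(y, y0), index(x, x0)
        count[r][c] += 1
        # Assumption: the height raster keeps the highest z value per cell
        if height[r][c] is None or z > height[r][c]:
            height[r][c] = z
    return count, height


points = [(0.0, 0.0, 1.0), (0.5, 0.5, 3.0), (1.5, 0.2, 2.0)]
count, height = overview_rasters(points, cell_size=1.0)
print(count)
print(height)
```

Since each point only touches one cell, the result does not depend on whether the points come from the original vector file or from an imported ODM.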

Example 5: Generation of an OPALS Format Definition

When importing data into OPALS, usually all information provided by the data set file is preserved. Using OFD files, the imported (or exported) information can be controlled, i.e. attributes can be excluded or renamed during I/O operations. The demo strip laz files contain an extra byte attribute named PulseWidth. To skip this extra byte attribute during import, first create the file strip_ofd.xml with the following command:

opalsInfo -inFile strip11.laz -generateOFD.file strip_ofd.xml

The created file looks like this:

<?xml version="1.0" encoding="utf-8"?>
<opalsFormatDefinition>
<las versionMinor="2" pointDataRecordFormat="1">
<scaleCoordinates x="0.001" y="0.001" z="0.001"/>
<extraBytes name="_PulseWidth" lasName="PulseWidth" type="double" lasType="uint16" scale="0.10000000000000001" lasDescription="Echo pulse width" invalidValue="65535"/>
<!--Additional information about definition:
also see: https://opals.geo.tuwien.ac.at/html/nightly/ref_fmt_las.html
Possible types: bool|int8|uint8|int16|uint16|int32|uint32|int64|float|float32|double|float64|string
Minimal entry : <extraBytes name="." type="t"/>
Full entry : <extraBytes name="." lasName="." type="t" lasType="t" scale="1" offset="0" lasDescription="desc." invalidValue="val"/>
-->
</las>
</opalsFormatDefinition>
<!--This file was created by opalsInfo 2.5.0.0 (compiled on Feb 9 2024 01:08:00)-->

By simply removing (or commenting out) the corresponding extraBytes line and using the OFD file during import, the extra byte attribute PulseWidth will be skipped.

opalsImport -inf strip11.laz -iformat strip_ofd.xml
opalsInfo -inf strip11.odm
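Removing the extraBytes entry does not have to be done by hand. The following sketch uses Python's standard xml.etree.ElementTree module on an abridged copy of the OFD content shown above; the helper name drop_extra_bytes is hypothetical and not part of OPALS.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the generated OFD file from this example
OFD_XML = """<?xml version="1.0" encoding="utf-8"?>
<opalsFormatDefinition>
<las versionMinor="2" pointDataRecordFormat="1">
<scaleCoordinates x="0.001" y="0.001" z="0.001"/>
<extraBytes name="_PulseWidth" lasName="PulseWidth" type="double" lasType="uint16" scale="0.1" invalidValue="65535"/>
</las>
</opalsFormatDefinition>
"""


def drop_extra_bytes(ofd_text, las_name):
    """Return the OFD XML without extraBytes entries whose lasName matches."""
    root = ET.fromstring(ofd_text.encode("utf-8"))
    for las in root.iter("las"):
        # findall returns a list, so removing entries while looping is safe
        for entry in las.findall("extraBytes"):
            if entry.get("lasName") == las_name:
                las.remove(entry)
    return ET.tostring(root, encoding="unicode")


result = drop_extra_bytes(OFD_XML, "PulseWidth")
print(result)
```

Note that the default ElementTree parser discards XML comments, so the explanatory comments written by opalsInfo would be lost when a real OFD file is processed this way.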

To verify that the PulseWidth attribute is absent, use Module Info (as shown above) to inspect the attribute table:

Fig. 5: Attribute statistics of strip11.laz excluding PulseWidth

The OFD creation functionality is even more relevant for exporting data. OPALS is a powerful and flexible tool for extracting, computing and storing new attributes. In the following example, surface normals are computed. Besides the normal vector components, the sigma0 of the adjustment is also estimated and stored as attribute NormalSigma0. This information can be interpreted as surface roughness and might be relevant for exporting into a las file. Therefore, we create an OFD file using the generateOFD.file feature, which includes all attributes that are stored within the ODM.

opalsNormals -inf strip11.odm -neighb 8 -searchMode d3
opalsInfo -inFile strip11.odm -generateOFD.file las_with_roughness.xml

Since we are only interested in the surface roughness, we disable/remove all other extra byte entries and set the las name to Roughness (see also $OPALS_ROOT/demo/las_with_roughness.xml):

<?xml version="1.0" encoding="utf-8"?>
<opalsFormatDefinition>
<las versionMinor="4" pointDataRecordFormat="1">
<extraBytes name="NormalSigma0" lasName="Roughness" type="float"/>
</las>
</opalsFormatDefinition>

The final las file is then created by the following export command

opalsExport -inf strip11.odm -outf strip11_with_roughness.laz -oformat las_with_roughness.xml

As can be seen in the export log, the extra bytes attribute Roughness is written:

[...]
10:28:38:
Extra bytes mapping
Mapped extra bytes 0
<lasName> Roughness
<lasType> float
<internalName> NormalSigma0
<internalType> float
100%
10:28:38: finished
[...]

Alternatively, run Module Info to check for the existence of the extra bytes

opalsInfo -inf strip11_with_roughness.laz

at the end of the attribute table:

[...]
GPSTime (GPSTime) double 378712 --- --- --- --- ---
Roughness (_Roughness) float 378712 --- 0.000 0.643 --- ---
10:32:35: Running opalsInfo took: 00:00:00

As mentioned above, the OFD creation functionality can be used for rdb and shape files in a similar way.

Example 6: Multi file input

Like various other modules, Module Info also supports multi-file input, which was introduced to serve two purposes:

  • Improved performance by multi threaded reading of the specified files
  • Merging information of multiple files into a single dataset statistics object by activating the merge option

In the standard (non-merged) mode, a separate statistics object is provided for each file. Please note that no statistics objects will be provided if an error occurred for any of the specified files. This might be unwanted behavior and can be changed by activating enduranceMode. In this mode, it is ensured that all files are read, and possible file reading errors are stored in the corresponding dataset statistics object (see the ErrorMessage function of class opals::DataSetStats). Since the multi-file input case is mainly relevant for Python scripts, this feature is demonstrated by the following script:

import opals
from opals import Info
#
# Run opalsInfo and query multiple statistic object for each input file
# (output is disabled)
#
inf = Info.Info(inFile=["strip??.laz"], screenLogLevel=opals.Types.LogLevel.none)
inf.enduranceMode = True  # make sure that all files are processed even if one isn't readable
                          # (no exception is thrown)
inf.run()
stats = inf.statistic
#
# attribute statistics
#
print(f"Individual statistics of {len(stats)} files")
for a in stats:
    if a.isSetErrorMessage():
        # if an error occurred the error message member is set and dataset object will
        # most likely not contain any useful information (except filename)
        print(f"\tError occurred while reading file {a.getFilename()}")
        print(f"\t\t{a.getErrorMessage()}")
    else:
        print(f"\tDetails on file {a.getFilename()}")
        box = a.getBoundingBox()
        print(f"\t\t{a.getPointCount()} points with bounding box ({', '.join(['%.3f' % d for d in box])})")

#
# Re-run and acquire a single merged statistic object
#
inf.merge = True
inf.run()
assert len(inf.statistic) == 1  # we will only get one statistic object
merged_stats = inf.statistic[0]
print("\nMerged statistics of all files:")
print(f"\tDetails on files {merged_stats.getFilename()}")
box = merged_stats.getBoundingBox()
print(f"\t\t{merged_stats.getPointCount()} points with bounding box ({', '.join(['%.3f' % d for d in box])})")

To run the script, type the following command in the $OPALS_ROOT/demo/ directory:

python infoMultiFileDemo.py
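What a merged statistics object represents can be pictured with a small OPALS-independent sketch. Everything in it is an assumption made for illustration: the FileStats class, the merge_stats function, the bounding box layout (xmin, ymin, zmin, xmax, ymax, zmax), and the idea that merging sums the point counts and takes the union of the bounding boxes while skipping files that reported an error.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FileStats:
    """Hypothetical stand-in for a per-file statistics object."""
    filename: str
    point_count: int = 0
    bbox: Optional[Tuple[float, ...]] = None  # (xmin, ymin, zmin, xmax, ymax, zmax)
    error: Optional[str] = None


def merge_stats(stats: List[FileStats]) -> FileStats:
    """Combine per-file statistics into one object, ignoring failed files."""
    ok = [s for s in stats if s.error is None and s.bbox is not None]
    if not ok:
        raise ValueError("no readable input file to merge")
    mins = tuple(min(s.bbox[i] for s in ok) for i in range(3))
    maxs = tuple(max(s.bbox[i] for s in ok) for i in range(3, 6))
    return FileStats(
        filename=";".join(s.filename for s in ok),
        point_count=sum(s.point_count for s in ok),
        bbox=mins + maxs,
    )


per_file = [
    FileStats("a.laz", 10, (0.0, 0.0, 0.0, 1.0, 1.0, 1.0)),
    FileStats("b.laz", 5, (0.5, -1.0, 0.0, 2.0, 1.0, 3.0)),
    FileStats("c.laz", error="file not readable"),
]
merged = merge_stats(per_file)
print(merged.point_count, merged.bbox)
```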