The OPALS Generic Format defines custom file formats for point coordinates and (optionally) corresponding attributes in an OPALS Format Definition (OFD). Data may be imported and exported in either text or binary representation. All data belonging to a point record is located either on a separate line (text) or in a consecutive block of memory (binary). Which attributes are used, how they are formatted, and in which order they appear within a record are up to the user.
Due to the ability of the OPALS datamanager (ODM) to store arbitrary attributes along with vector data, it is possible to import and export almost any line-based data format into/from an ODM.
Generic Format Files are OFD files in which the XML element specifying the format is either <text> or <binary>. This element may contain a <header> element that describes the start of a file. After this optional element, one or more of the following elements must follow:
The sequence of these elements defines the format for a single point record:
Each <column> or <segment> element defines either a coordinate or an attribute to be accessed. Only the element's name attribute is mandatory in every case. Further XML attributes may be supported or even required, depending on the name, on whether the format definition is text or binary, and on whether the OFD is used only for import or also for export:
Instead of specifying the value of invalidValue directly, for numeric attributes, the following identifiers have a special meaning:
If invalidValue is not given, 0 is used for numeric and character types, and an empty string for strings.
<skip> ignores one or more (consecutive) columns when importing text files; when reading binary files, it causes the specified number of bytes to be ignored. Like any OFD file, Generic Format Files must comply with the OFD XML Schema.
As an example, consider the following simple text file:
For a successful import, the first line must be skipped. The rest may be interpreted as lines containing X, Y, and Z coordinates each, in that order.
The following XML block shows an excerpt of the corresponding text OFD XML file:
The file given above may be readily used for import and export. For export, however, one may opt to document the file content by means of a header text. Furthermore, the width and precision of exported numbers may need to be adapted:
For the complete OFD file, see $OPALS_ROOT/addons/formatdef/simpleAscii.xml.
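The effect of width and precision on exported numbers can be illustrated with a minimal Python sketch. It is not OPALS code: the function name `format_record` and its parameters are hypothetical, and it only assumes that each coordinate is written right-aligned in a fixed-width field with a fixed number of decimals.

```python
def format_record(x, y, z, width=12, precision=3):
    """Format one point record as fixed-width columns (illustrative only)."""
    # Each value is right-aligned in `width` characters with `precision` decimals.
    return "".join(f"{value:{width}.{precision}f}" for value in (x, y, z))


if __name__ == "__main__":
    # A header line documents the file content, followed by one record.
    print(f"{'X':>12}{'Y':>12}{'Z':>12}")
    print(format_record(529000.125, 5340000.5, 271.3456))
```

Choosing the width too small for the coordinate range makes adjacent columns run together, which is why both values may need adapting for a given data set.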
The ODM supports predefined and user-defined attributes. User-defined attributes are indicated by a leading "_" (underscore) character in the attribute name. While user-defined attributes require the definition of type using one of the XML representations, predefined attributes have predefined data types, and hence type must not be given. Consider the following text file that contains points and corresponding attributes:
The semantics of columns 4 and 5 match the predefined attributes GPSTime and EchoWidth. Column 6 has no semantics known to OPALS, so it may be accessed using a user-defined attribute. An appropriate OFD file could look like:
If a text file to be imported contains more columns than defined by the OFD file, then the trailing columns are ignored.
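The column mapping described above, including the rule that trailing columns are ignored, can be sketched in Python as follows. This is a hedged illustration rather than the OPALS implementation: the list `COLUMNS`, the user-defined name `_myAttribute`, and the Python types chosen for each column are assumptions made for the example.

```python
# Hypothetical record layout: three coordinates, two predefined attributes,
# and one user-defined attribute (note the leading underscore).
COLUMNS = [
    ("x", float),
    ("y", float),
    ("z", float),
    ("GPSTime", float),
    ("EchoWidth", float),
    ("_myAttribute", int),  # assumed name and type, for illustration only
]


def parse_record(line):
    """Map the leading whitespace-separated columns of a line to named values."""
    fields = line.split()
    if len(fields) < len(COLUMNS):
        raise ValueError(f"expected at least {len(COLUMNS)} columns, got {len(fields)}")
    # zip() stops after the last defined column, so trailing columns are ignored.
    return {name: convert(token) for (name, convert), token in zip(COLUMNS, fields)}


if __name__ == "__main__":
    print(parse_record("10.0 20.0 30.0 123456.789 4.2 7 extra columns"))
```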
So far, only examples of text OFD files have been given, but the definition of coordinate columns and attributes is similar for binary files. However, the Generic Format concept also supports features that are only valid for text or for binary file definitions. Those are described in the following sections, starting with special text features.
OPALS supports four additional attributes that may be defined within a <text> tag: <decimalSeparator>, <columnSeparators>, <commentInitiator>, and <skipWhiteSpace>.
Since OPALS implements appropriate defaults for all elements, their definition is optional as declared in the schema file.
The <decimalSeparator> attribute is useful for text files that were generated by localised programs (e.g. using "," instead of "." as decimal separator).
<columnSeparators> specifies one or more characters. Any of them is considered as separating coordinates or attributes in text files. If an OFD file does not define <columnSeparators>, then OPALS uses white space as column separator. In case <columnSeparators> defines more than one character, Module Export uses the first character for separation.
The <commentInitiator> attribute tells OPALS to ignore lines that start with the given character (string). Consider the situation that a text point file contains a few outliers. One could remove those points by deleting the entire line of the corresponding points. This, however, requires an additional documentation step or the outlier information is lost (or at least difficult to reproduce). Using the <commentInitiator> attribute, the corresponding points can be simply "commented out" for import. Although text formats often use the hash character ("#") for comments, OPALS does not implement a <commentInitiator> default.
Unless <skipWhiteSpace> is set to false, white space is skipped during the import of text files.
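How these text options interact can be sketched in Python. The functions `parse_text_line` and `format_line` are hypothetical and only mimic the behaviour described above: lines starting with the comment initiator are skipped, any of the separator characters splits columns (white space if none is given), a localised decimal separator is accepted, and export uses the first separator character. Dropping empty tokens is a simplification of this sketch, not documented OPALS behaviour.

```python
import re


def parse_text_line(line, column_separators=None, decimal_separator=".",
                    comment_initiator=None):
    """Return the numeric columns of a line, or None for a commented-out line."""
    if comment_initiator and line.startswith(comment_initiator):
        return None
    if column_separators is None:
        tokens = line.split()  # default: white space separates columns
    else:
        # Any one of the given characters acts as a column separator.
        pattern = "[" + re.escape(column_separators) + "]"
        tokens = [token.strip() for token in re.split(pattern, line.strip())]
    # Normalise the decimal separator before converting to float.
    return [float(token.replace(decimal_separator, ".")) for token in tokens if token]


def format_line(values, column_separators=None):
    """Join values for export using the first separator character."""
    separator = column_separators[0] if column_separators else " "
    return separator.join(str(value) for value in values)


if __name__ == "__main__":
    print(parse_text_line("# 1,5;2,5;3,0", ";|", ",", "#"))  # commented-out point
    print(parse_text_line("1,5;2,5|3,0", ";|", ","))
    print(format_line([1.5, 2.5, 3.0], ";|"))
```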
When accessing binary files, it is essential to specify the correct <typeFile> using one of the supported XML type representations: one needs to know beforehand whether, for example, point coordinates are stored on file as single-precision (float) or double-precision (double) real numbers. By default, OPALS assumes <typeFile> to be identical to <type>.
The binary representation of numeric data types is generally not portable across different processor architectures. As an advanced feature, OPALS therefore allows for the specification of the byte representation to be used on file (<endian>), while using native endianness by default.
The file can be found in the $OPALS_ROOT/addons/formatdef/ directory.
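Why the on-file type and the byte order matter for binary records can be shown with Python's standard `struct` module. This is a sketch under stated assumptions, not the OPALS reader: the function `read_points`, the option values `"float"`/`"double"` and `"native"`/`"little"`/`"big"`, and the record layout of exactly three coordinates are chosen for illustration only.

```python
import struct

TYPE_CODES = {"float": "f", "double": "d"}                 # assumed type names
ENDIAN_PREFIX = {"native": "=", "little": "<", "big": ">"}  # assumed option values


def read_points(data, type_file="double", endian="native"):
    """Decode consecutive x, y, z records from a bytes object."""
    record = struct.Struct(ENDIAN_PREFIX[endian] + 3 * TYPE_CODES[type_file])
    # Step through the buffer one record at a time; incomplete trailing bytes are ignored.
    return [record.unpack_from(data, offset)
            for offset in range(0, len(data) - record.size + 1, record.size)]


if __name__ == "__main__":
    # Two points written as big-endian single-precision values.
    data = struct.pack(">3f", 1.0, 2.0, 3.0) + struct.pack(">3f", 4.0, 5.0, 6.0)
    print(read_points(data, type_file="float", endian="big"))
    # Declaring the wrong on-file type silently yields one meaningless record.
    print(len(read_points(data, type_file="double", endian="big")))
```

The second call illustrates that a wrong type declaration does not necessarily raise an error; the bytes are simply reinterpreted, so the result must be checked against known coordinates.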