How to select the best Geospatial Data? Step 4

GeoBI Concepts

Spatial Data Quality

Step 4: Quality assessment

To view or review step 3 of this series, follow this link: Content assessment of selected datasets

Before beginning the detailed quality assessment of your dataset, verify in the geospatial data licensing terms whether the rights and restrictions are adequate for the planned use of the data. This information can also be found in the MD_ (Metadata Constraints) section of the ISO-19115 Standard. These rights may include permission to modify and improve the data, to distribute it, and to manufacture and distribute derivative products.

The first step of this assessment consists of translating the user needs (the Why) into internal quality elements. Each Object Class of the Conceptual Data Model that captures the user needs (built at step 1) can be evaluated against the quality criteria described in the ISO-19113 Standard and summarized below.

Data Quality Elements and Sub-Elements (ISO-19113)

Completeness: “presence and absence of features, their attributes and relationships;” (ISO-19113)
  Ex.: Verify that at least 90% of the buildings are present. Verify that the numbers of all numbered roads are present.
  Sub-elements:
  - commission: “excess data present in a dataset” (ISO-19113)
  - omission: “data absent from a dataset” (ISO-19113)

Logical consistency: “degree of adherence to logical rules of data structure, attribution and relationships (data structure can be conceptual, logical or physical);” (ISO-19113)
  Ex.: Buildings with an area greater than 100 m² should be represented by a polygon instead of a point.
  Sub-elements:
  - conceptual consistency: “adherence to rules of the conceptual schema” (ISO-19113)
  - domain consistency: “adherence of values to the value domains” (ISO-19113)
  - format consistency: “degree to which data is stored in accordance with the physical structure of the dataset” (ISO-19113)
  - topological consistency: “correctness of the explicitly encoded topological characteristics of a dataset” (ISO-19113)

Positional accuracy: “accuracy of the position of features;” (ISO-19113)
  Ex.: Fire hydrants should be located with ±1 m accuracy.
  Sub-elements:
  - absolute or external accuracy: “closeness of reported coordinate values to values accepted as or being true” (ISO-19113)
  - relative or internal accuracy: “closeness of the relative positions of features in a dataset to their respective relative positions accepted as or being true” (ISO-19113)

Temporal accuracy: “accuracy of the temporal attributes and temporal relationships of features;” (ISO-19113)
  Ex.: The roads must be up to date.
  Sub-elements:
  - accuracy of a time measurement: “correctness of the temporal references of an item (reporting of error in time measurement)” (ISO-19113)
  - temporal consistency: “correctness of ordered events or sequences” (ISO-19113)
  - temporal validity: “validity of data with respect to time” (ISO-19113)

Thematic accuracy: “accuracy of quantitative attributes and the correctness of non-quantitative attributes and of the classifications of features and their relationships.” (ISO-19113)
  Ex.: Verify that all roads are correctly classified.
  Sub-elements:
  - classification correctness: “comparison of the classes assigned to features or their attributes to a universe of discourse (e.g. ground truth or reference dataset)” (ISO-19113)
  - non-quantitative attribute correctness
  - quantitative attribute accuracy

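
To make two of these criteria concrete, here is a minimal Python sketch of how the completeness example (at least 90% of the buildings present) and the logical consistency example (large buildings stored as polygons) could be checked. It is only an illustration under stated assumptions: the function names, the dictionary keys (`id`, `area_m2`, `geometry_type`) and the idea of matching features by identifier against a reference list are hypothetical and are not prescribed by ISO-19113.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompletenessReport:
    omission_rate: float    # share of reference features missing from the dataset
    commission_rate: float  # share of dataset features absent from the reference

    def meets(self, min_presence: float) -> bool:
        """True when at least `min_presence` of the reference features are present."""
        return (1.0 - self.omission_rate) >= min_presence


def completeness(dataset_ids: set, reference_ids: set) -> CompletenessReport:
    """Compare feature identifiers with a reference (assumed to be the ground truth)."""
    if not dataset_ids or not reference_ids:
        raise ValueError("both identifier sets must be non-empty")
    omitted = reference_ids - dataset_ids      # omission: data absent from the dataset
    committed = dataset_ids - reference_ids    # commission: excess data in the dataset
    return CompletenessReport(
        omission_rate=len(omitted) / len(reference_ids),
        commission_rate=len(committed) / len(dataset_ids),
    )


def oversized_point_buildings(buildings: list, max_point_area_m2: float = 100.0) -> list:
    """Return ids of buildings above the area threshold that are stored as points."""
    return [
        b["id"]
        for b in buildings
        if b["area_m2"] > max_point_area_m2 and b["geometry_type"] == "Point"
    ]


# Illustrative data only (hypothetical identifiers and values).
reference = {f"B{i}" for i in range(1, 21)}          # B1..B20 in the reference
dataset = {f"B{i}" for i in range(1, 20)} | {"X1"}   # B20 missing, X1 in excess
report = completeness(dataset, reference)
print(report, report.meets(0.90))

buildings = [
    {"id": "B1", "area_m2": 250.0, "geometry_type": "Point"},
    {"id": "B2", "area_m2": 80.0, "geometry_type": "Point"},
    {"id": "B3", "area_m2": 300.0, "geometry_type": "Polygon"},
]
print(oversized_point_buildings(buildings))
```

In practice the matching between dataset and reference features is rarely a simple identifier comparison, so the set difference used here would have to be replaced by whatever matching rule fits the data.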
Once we know the expected quality, we have to choose the dataset quality evaluation method according to the user requirements, the time available and the budget allocated to the project. The ISO-19114 standard proposes three evaluation methods grouped into two main classes, direct and indirect:
  1. Indirect quality evaluation method
  2. Internal direct quality evaluation method
  3. External direct quality evaluation method

The method used can vary from one object class to another. For example, the indirect method alone may be used for watercourses while a direct method is used for road segments.

The direct methods can be applied to the whole dataset (full inspection) or to a sample. They can also be carried out automatically, semi-automatically or manually.
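
The difference between full inspection and sampling can be sketched in a few lines of Python. This is a rough illustration only: simple random sampling is an assumption made here for brevity (the post does not name a sampling strategy), and `draw_sample`, `conformance_rate` and the `passes_check` callback are hypothetical names.

```python
import random


def draw_sample(feature_ids: list, sample_size: int, seed: int = 0) -> list:
    """Simple random sample without replacement.

    Falls back to full inspection when the requested size covers the population.
    """
    if sample_size >= len(feature_ids):
        return list(feature_ids)
    return random.Random(seed).sample(feature_ids, sample_size)


def conformance_rate(inspected_ids: list, passes_check) -> float:
    """Share of inspected features that pass a quality check (a callable)."""
    if not inspected_ids:
        raise ValueError("nothing to inspect")
    passed = sum(1 for fid in inspected_ids if passes_check(fid))
    return passed / len(inspected_ids)


# Illustrative data only: 100 hypothetical road segments.
road_ids = [f"R{i}" for i in range(100)]
sample = draw_sample(road_ids, 10, seed=42)   # inspection by sampling
everything = draw_sample(road_ids, 1000)      # full inspection
print(len(sample), len(everything))
```

The `passes_check` callable can wrap an automatic test or simply record the verdict of a manual inspection, which covers the automatic, semi-automatic and manual cases mentioned above.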

The indirect quality evaluation method consists of evaluating the dataset using knowledge that resides outside the dataset itself. Any pertinent information about the data quality criteria or about the dataset's lineage (the processes used to produce it) can help evaluate its quality. This information can be found in the dataset’s documentation, such as metadata, catalogs and specifications. Unfortunately, as mentioned in a previous post I wrote, the metadata and specifications are not always provided or complete. Most suppliers provide only the 32 ISO-19115 CORE metadata, which are far too generic to allow a proper evaluation of the data.
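
A first pass of the indirect method can be as simple as listing which quality-related pieces of information the supplier's documentation actually contains. The sketch below assumes the metadata has already been loaded into a plain dictionary; the field names are hypothetical placeholders chosen for this example and are not ISO-19115 element names.

```python
# Hypothetical field names: adapt them to the metadata actually delivered.
REQUIRED_QUALITY_FIELDS = (
    "lineage",
    "positional_accuracy",
    "completeness",
    "temporal_extent",
    "use_constraints",
)


def missing_quality_fields(metadata: dict, required=REQUIRED_QUALITY_FIELDS) -> list:
    """Return the required fields that are absent, None, or blank in the metadata."""
    missing = []
    for field in required:
        value = metadata.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field)
    return missing


# Illustrative record only.
supplier_metadata = {
    "lineage": "Digitized from orthophotos",
    "positional_accuracy": "",
    "completeness": None,
}
gaps = missing_quality_fields(supplier_metadata)
print(gaps)
```

Every field reported as missing is a quality element that cannot be judged indirectly and is therefore a candidate for direct evaluation.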

Furthermore, the metadata describe only the feature types and not the relationships between them, such as the spatial integrity constraints permitted between geographic object types. If the evaluation cannot be completed using the indirect method, we must examine the data itself to evaluate its quality and, ideally, compare it with another data source such as an orthophoto. Because many data quality elements, such as positional accuracy and temporal accuracy, vary with location, acquisition technique and object class, evaluating the data with a direct method is often unavoidable.
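
As an illustration of an external direct evaluation, the sketch below compares feature coordinates with reference coordinates (for example, positions read from an orthophoto) and reports the horizontal root-mean-square error. Several things are assumptions made for this example: RMSE as the accuracy measure, coordinates expressed in metres in a projected system, features already paired with their reference points, and the 1 m tolerance borrowed from the fire hydrant example.

```python
import math


def horizontal_rmse(measured: list, reference: list) -> float:
    """Root-mean-square of the horizontal distances between paired (x, y) points."""
    if not measured or len(measured) != len(reference):
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_errors = [
        (mx - rx) ** 2 + (my - ry) ** 2
        for (mx, my), (rx, ry) in zip(measured, reference)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative coordinates only, in metres.
hydrants_dataset = [(0.0, 0.0), (10.0, 10.0)]
hydrants_reference = [(0.6, 0.8), (10.0, 10.0)]   # first point is 1 m away

rmse = horizontal_rmse(hydrants_dataset, hydrants_reference)
tolerance_m = 1.0
print(f"RMSE = {rmse:.3f} m, within tolerance: {rmse <= tolerance_m}")
```

Because positional accuracy varies with location and acquisition technique, it is usually more informative to compute such a figure per object class or per area than once for the whole dataset.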

Once the analysis is complete, we have to choose the best source. This step will be the subject of my next post: Step 5 – Choose the data sources.

References

ISO-TC/211, 2002. Geographic Information – Quality principles 19113.
ISO-TC/211, 2003. Geographic Information – Quality evaluation procedures 19114. http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=26019
ISO-TC/211, 2003. Geographic information – Metadata 19115.
Written by SLarrivee

INTELLI3