How to choose the best Geospatial Data? Step 1

GeoBI Concepts

Geospatial data sources

Step 1: Needs assessment

This post is the first in a five-step series on how to choose the best geospatial data.

Before you start searching for geospatial datasets, you first have to define your needs and/or your client’s needs.

The questions below help identify those needs. The examples come from the transportation domain.

What geographic entities are required?

  • Roads, ports, railway stations, infrastructure, merchandise, etc.

What are the required features’ properties or attributes?

  • Speed limit, pavement, clearance height, merchandise weight, merchandise type, etc.

What will the data be used for, and what analyses do I need to perform with it? The answer helps identify the content, the required quality level, and the type of analysis tools you will need: GIS, CAD, SOLAP.

  • Optimal path computation, address geocoding, shipped tonnage per year/per region, etc.

What positional accuracy do I require?

  • Positional accuracy (should the road center be accurate to 0.5 m, 1 m, 5 m, ...?) and
    shape accuracy (are buildings represented by a center point, a rectangle, or a detailed foundation shape?)

What is the required level of completeness to satisfy my requirements?

  • I absolutely need all the ports and railway stations; I only need major roads and highways; the rivers and lakes are only shown as contextual layers in my application, so I only require the main ones.

What semantic accuracy do I need? Do attribute values have to conform 100% to reality?

  • Must the road classification (highways, primary roads, secondary roads) be precise? Can I accept classification mismatches, such as a local road classified as a primary road?
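
One way to put a number on this question is to compare a candidate dataset against a small sample you trust. The sketch below is only an illustration: the segment ids, the road classes and the `classification_agreement` helper are all hypothetical, and it assumes both sources identify road segments with the same ids.

```python
from collections import Counter


def classification_agreement(reference, candidate):
    """Return (agreement rate, Counter of (reference, candidate) mismatches)."""
    shared = reference.keys() & candidate.keys()
    if not shared:
        raise ValueError("no road segments in common")
    mismatches = Counter(
        (reference[seg], candidate[seg])
        for seg in shared
        if reference[seg] != candidate[seg]
    )
    agreement = 1 - sum(mismatches.values()) / len(shared)
    return agreement, mismatches


# Hypothetical sample: segment id -> road class
reference = {"r1": "highway", "r2": "primary", "r3": "local", "r4": "secondary"}
candidate = {"r1": "highway", "r2": "primary", "r3": "primary", "r4": "secondary"}

agreement, mismatches = classification_agreement(reference, candidate)
print(f"agreement: {agreement:.0%}")
print(mismatches)
```

Here one segment out of four (a local road classified as a primary road) disagrees with the reference. Whether that rate is acceptable depends on the analysis the data has to support.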

What accuracy do I need for the measures stored in the attributes? The accuracy of these measures is very important in multidimensional databases because the values are aggregated across several levels. If they are wrong at the start, the whole system will be wrong.

  • What accuracy is desirable for: merchandise tonnage, decline rate, price, surface in square meters, length in meters, etc.?
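
To see why errors in the base measures matter, here is a minimal sketch of how the worst-case error grows as values are rolled up. The shipments, the ±0.5 t figure and the `roll_up` helper are assumptions made for the example, not values from a real dataset.

```python
from collections import defaultdict

# Hypothetical shipments: (region, port, tonnage).
# Assumption: each tonnage is accurate to +/- 0.5 t.
shipments = [
    ("East", "Port A", 120.0),
    ("East", "Port A", 80.0),
    ("East", "Port B", 200.0),
    ("West", "Port C", 50.0),
]
PER_RECORD_ERROR = 0.5


def roll_up(records, key):
    """Sum tonnage per group and track the worst-case error of each sum."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for record in records:
        group = key(record)
        totals[group][0] += record[2]          # aggregated measure
        totals[group][1] += PER_RECORD_ERROR   # worst case: errors add up
    return {group: tuple(values) for group, values in totals.items()}


by_port = roll_up(shipments, key=lambda r: (r[0], r[1]))
by_region = roll_up(shipments, key=lambda r: r[0])
overall = roll_up(shipments, key=lambda r: "all")

print(by_port)
print(by_region)
print(overall)
```

Each level of aggregation inherits the uncertainty of every record below it, so the bound on the overall total is the sum of the bounds on the individual records.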

Must the data be up-to-date (data topicality or temporal accuracy)?

  • For entities that evolve slowly, like waterways, data published in 2000 can be quite satisfactory. For other entities, the required temporal accuracy may vary depending on the location. For example, a municipality whose territory is fully occupied will not see many changes in its road network. On the other hand, in a new residential development, new streets are added frequently.

In which format do I want the data to be delivered?

  • Shapefile, MID/MIF, Oracle Database, KML, GML? This question may become irrelevant if I have an ETL tool, since I can then transform the data into the required format.

What is my budget for acquiring and processing data?

These questions are only a subset of the questions we need to ask ourselves to define our needs in terms of geospatial data. Capturing those needs in a conceptual data model, such as a UML class diagram, will greatly facilitate communication with the client. Once we have a good idea of what we are looking for, we can go on to the next step. Our understanding of the needs will become a lot clearer along the way, and new needs may appear during the process.
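
If a class diagram feels too formal for a first pass, the same needs can be written down as a small data structure. The sketch below is one possible shape, not a prescribed model: the `EntityRequirement` class, its fields, the `screen` helper and the sample values (including the 2020 cut-off for roads) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EntityRequirement:
    """One required geographic entity and the quality expected of it."""
    name: str
    attributes: List[str] = field(default_factory=list)
    positional_accuracy_m: Optional[float] = None  # None: not critical
    oldest_acceptable_year: Optional[int] = None   # None: any vintage is fine
    complete_coverage: bool = False                # True: every instance is needed


def screen(requirement, accuracy_m, published_year):
    """List the reasons a candidate dataset misses a requirement (empty if none)."""
    problems = []
    limit = requirement.positional_accuracy_m
    if limit is not None and accuracy_m > limit:
        problems.append(f"accuracy {accuracy_m} m is coarser than {limit} m")
    oldest = requirement.oldest_acceptable_year
    if oldest is not None and published_year < oldest:
        problems.append(f"published in {published_year}, before {oldest}")
    return problems


# Hypothetical needs for a transportation project
roads = EntityRequirement("road", ["speed_limit", "pavement"], 1.0, 2020)
waterways = EntityRequirement("waterway", oldest_acceptable_year=2000)

road_problems = screen(roads, accuracy_m=5.0, published_year=2000)
waterway_problems = screen(waterways, accuracy_m=10.0, published_year=2000)
print(road_problems)
print(waterway_problems)
```

With the needs written down this way, the same dataset published in 2000 is rejected for roads and accepted for waterways, which mirrors the answers given to the questions above.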

Follow me in my next post, Step 2 of How to select the best Geospatial Data?, on geospatial data search.
