Standardized Geolocation Datasets Released into the Public Domain
OpenGeoCode.org, the open Big Data project, has announced its new CSV Universal Data Exchange (CUDE) standard for geolocation datasets and has released accompanying datasets into the public domain.
The organization has announced the first public release of the CSV Universal Data Exchange (CUDE) standard and tools, along with an initial twenty datasets derived from the data repositories of the US 2010 Census, the US Department of Education, the British Columbia (Canada) Department of Education, and the CIA World Factbook.
“We are unveiling our new standard, the CSV Universal Data Exchange (CUDE), for flattening the geospatial databases of national and international governments and standards organizations. The standard and tools we are rolling out will give developers unprecedented, easy access to geospatial databases worldwide”, explains Andrew Ferlitsch, co-founder of OpenGeoCode.org.
The organization has pioneered a database flattening method and data representation for extracting data from diverse sources and compiling it into a simplified text format for exchange and import across geolocation systems. The technique is especially useful for entities that need to integrate datasets from multiple sources.
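The release does not publish the CUDE field layout itself, but the flattening idea can be illustrated. The sketch below is hypothetical: the field names, the nested source record, and the helper function are invented for illustration, not taken from the CUDE specification. It shows nested records from one source being mapped onto a shared flat schema and emitted as CSV.

```python
import csv
import io

# Hypothetical flat schema -- CUDE's actual 46-field layout is not given
# in this release, so these column names are invented for illustration.
FIELDS = ["name", "admin_division", "country", "latitude", "longitude"]

def flatten_census_record(rec):
    """Map a nested (invented) census-style record onto the flat schema."""
    return {
        "name": rec["place"]["name"],
        "admin_division": rec["place"]["state"],
        "country": "US",
        "latitude": rec["geo"]["lat"],
        "longitude": rec["geo"]["lon"],
    }

def to_csv(rows):
    """Serialize flattened records as a simple CSV exchange payload."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

census = {"place": {"name": "Portland", "state": "OR"},
          "geo": {"lat": 45.52, "lon": -122.68}}
print(to_csv([flatten_census_record(census)]))
```

Once every source is mapped onto the same flat schema, the resulting CSV rows can be exchanged and imported by any system that understands the shared column set.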
The CUDE format consists of 46 distinct fields of information, which the co-founders have found effective in describing high-value attributes of a diverse set of entities: political and administrative divisions, locations, geographic features, habitats, population demographics, buildings, businesses, economic activity, and roads and bridges.
According to Mr. Ferlitsch, “The system incorporates a number of elements from the International Organization for Standardization (ISO) and the National Geospatial-Intelligence Agency (NGA), along with our proprietary method of sequencing the fields and qualifiers”.
This system allows the 46 fields to act as a common base from which over a thousand distinct attributes can be represented.
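The release does not describe how fields and qualifiers are actually sequenced, but the combinatorial idea can be sketched. In the hypothetical example below, the field names and qualifier codes are invented; it only illustrates how a small base field set multiplied by qualifier codes yields many distinct attributes.

```python
# Invented base fields and qualifier codes -- not CUDE's actual vocabulary.
BASE_FIELDS = ["population", "area"]
QUALIFIERS = {"T": "total", "U": "urban", "R": "rural", "M": "male", "F": "female"}

def attribute_name(field, qualifier):
    """Combine a base field with a qualifier code into one distinct attribute."""
    return f"{field}.{QUALIFIERS[qualifier]}"

# 2 base fields x 5 qualifiers already gives 10 distinct attributes; with 46
# fields and enough qualifiers, the combinations exceed a thousand.
all_attrs = [attribute_name(f, q) for f in BASE_FIELDS for q in QUALIFIERS]
print(len(all_attrs))  # 10
print(attribute_name("population", "U"))  # population.urban
```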
“Duplicates and bad data have been the Achilles' heel of integrating diverse data sources. Our data reduction method catches duplicates at extraordinary rates, and our machine learning is showing great promise in detecting data points that don’t belong in a set”, according to Mr. Mayall.
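The release does not disclose OpenGeoCode.org's actual data reduction method, so the sketch below is only one common approach to the problem it describes: reducing each record to a normalized key (accent-stripped, lowercased name plus rounded coordinates) so that near-identical entries arriving from different sources collapse to a single record.

```python
import unicodedata

def dedup_key(name, lat, lon):
    """Normalize a record into a comparison key: strip accents and case from
    the name, and round coordinates so tiny positional differences match."""
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    return (ascii_name.strip().lower(), round(lat, 3), round(lon, 3))

def deduplicate(records):
    """Keep the first record for each normalized key; drop later duplicates."""
    seen, unique = set(), []
    for rec in records:
        key = dedup_key(rec["name"], rec["lat"], rec["lon"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

records = [
    {"name": "Montréal", "lat": 45.5019, "lon": -73.5674},
    {"name": "Montreal ", "lat": 45.5017, "lon": -73.5673},  # same place, different source
]
print(len(deduplicate(records)))  # 1
```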
The organization plans to continue releasing into the public domain CUDE datasets and metadata for high-value data from national and international sources. The CUDE specification and the toolsets for extracting, exchanging, and maintaining CUDE data are free for academic institutions to use and may be licensed for commercial use.