Fulltext Geocoding Versus Spatial Metadata for Large Text Archives: Towards a Geographically Enriched Wikipedia

Bibliographic Details
Published in: D-Lib Magazine, Vol. 18, no. 9/10
Main Author: Leetaru, Kalev H.
Format: Journal Article
Language: English
Published: 01.09.2012
ISSN: 1082-9873
DOI: 10.1045/september2012-leetaru

More Information
Summary: Presents the basic workflow of a fulltext geocoding system, which uses software algorithms to parse a document, identify textual mentions of locations, and convert those mentions into mappable geographic coordinates using gazetteers (databases of places and their approximate locations). Gives an overview of the United States National Geospatial-Intelligence Agency's (NGA) GEOnet Names Server (GNS) and the United States Geological Survey's Geographic Names Information System (GNIS), the gazetteers that lie at the heart of nearly every global geocoding system. Provides a case study comparing manually specified geographic indexing terms with fulltext geocoding on the English-language edition of Wikipedia to demonstrate the significant advantages of automated approaches, including the finding that previous studies of Wikipedia's spatial focus, which relied on its human-provided spatial metadata, erroneously identified Europe as its focal point because of bias in the underlying metadata. Source: National Library of New Zealand Te Puna Matauranga o Aotearoa, licensed by the Department of Internal Affairs for re-use under the Creative Commons Attribution 3.0 New Zealand Licence.
Bibliography: Includes illustration, maps, references
Includes links to related electronic resources
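
Note: The workflow described in the summary (parse a document, identify location mentions, look them up in a gazetteer, return coordinates) can be illustrated with a minimal Python sketch. This is not the article's implementation. The gazetteer here is a hypothetical three-entry dictionary with approximate coordinates standing in for GNS or GNIS, the function name `geocode_fulltext` is an assumption made for illustration, and candidate mentions are found with a naive capitalized-word pattern rather than a real place-name recognizer.

```python
import re

# Hypothetical toy gazetteer (assumption): lowercase place name -> approximate
# (latitude, longitude). A real system would query GNS or GNIS instead.
TOY_GAZETTEER = {
    "paris": (48.8566, 2.3522),
    "cairo": (30.0444, 31.2357),
    "wellington": (-41.2866, 174.7756),
}


def geocode_fulltext(text, gazetteer=TOY_GAZETTEER):
    """Return (mention, lat, lon) for each capitalized word found in the gazetteer."""
    matches = []
    # Step 1: parse the text and pick out candidate mentions (capitalized words).
    for token in re.findall(r"\b[A-Z][a-z]+\b", text):
        # Step 2: look the candidate up in the gazetteer.
        coords = gazetteer.get(token.lower())
        # Step 3: keep only candidates that resolve to coordinates.
        if coords is not None:
            matches.append((token, coords[0], coords[1]))
    return matches


if __name__ == "__main__":
    for mention, lat, lon in geocode_fulltext("The delegation flew from Cairo to Paris."):
        print(mention, lat, lon)
```

The sketch deliberately omits disambiguation (for example, telling Paris, France from Paris, Texas) and multi-word place names, both of which a production geocoder would need to handle.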