CountryInfo.txt: Country names, codes, places and leaders

CountryInfo.txt is a general-purpose file intended to facilitate natural language processing of news reports and political texts. It was originally developed to identify states for the text filtering system used in the development of MID4, then extended to incorporate CIA World Factbook and WordNet...

Bibliographic Details
Main Author: Schrodt, Philip
Format: Dataset
Language: unknown
Published: 2015
Subjects: Social Sciences
Online Access: https://search.dataone.org/view/sha256:d6d15fe697a6bf3c214f4570fc5e4fde0873fd931bcc9221fafc24b640dfcf1f
id dataone:sha256:d6d15fe697a6bf3c214f4570fc5e4fde0873fd931bcc9221fafc24b640dfcf1f
record_format openpolar
spelling dataone:sha256:d6d15fe697a6bf3c214f4570fc5e4fde0873fd931bcc9221fafc24b640dfcf1f 2024-06-03T18:46:52+00:00 CountryInfo.txt: Country names, codes, places and leaders Schrodt, Philip 2015-05-07T00:00:00Z https://search.dataone.org/view/sha256:d6d15fe697a6bf3c214f4570fc5e4fde0873fd931bcc9221fafc24b640dfcf1f unknown Social Sciences Dataset 2015 dataone:urn:node:HD 2024-06-03T18:07:25Z Dataset Greenland Unknown Greenland
institution Open Polar
collection Unknown
op_collection_id dataone:urn:node:HD
language unknown
topic Social Sciences
spellingShingle Social Sciences
Schrodt, Philip
CountryInfo.txt: Country names, codes, places and leaders
topic_facet Social Sciences
description CountryInfo.txt is a general-purpose file intended to facilitate natural language processing of news reports and political texts. It was originally developed to identify states for the text filtering system used in the development of MID4, then extended to incorporate CIA World Factbook and WordNet information for the development of TABARI dictionaries.

The file contains about 32,000 lines, covering about 240 countries and administrative units (e.g. American Samoa, Christmas Island, Hong Kong, Greenland). It is internally documented and almost, but not quite, XML: the major fields are delimited with tags of the form ... but the elements inside them are delimited with line feeds. Converting this to strict XML would be a relatively simple programming exercise for anyone who should be working with the file in the first place. The file is UTF-8 with Unix line feeds and will need to be converted if used on a Windows system.

Fields include:
Country name in English
Adjectival forms and synonyms of the country name, including some non-English versions of the name
ISO-3166 numeric, alpha2 and alpha3 codes, FIPS-10 code, IMF code, COW alpha and numeric codes
Capital city
Cities with populations over 1 million
Regions and geographical features (WordNet meronyms)
Leaders, 1960-2008 (rulers.org)
Members of government, 2003-2010 (CIA World Leaders)

The beginning of the file has fairly extensive documentation on the formats used.
format Dataset
author Schrodt, Philip
author_facet Schrodt, Philip
author_sort Schrodt, Philip
title CountryInfo.txt: Country names, codes, places and leaders
title_short CountryInfo.txt: Country names, codes, places and leaders
title_full CountryInfo.txt: Country names, codes, places and leaders
title_fullStr CountryInfo.txt: Country names, codes, places and leaders
title_full_unstemmed CountryInfo.txt: Country names, codes, places and leaders
title_sort countryinfo.txt: country names, codes, places and leaders
publishDate 2015
url https://search.dataone.org/view/sha256:d6d15fe697a6bf3c214f4570fc5e4fde0873fd931bcc9221fafc24b640dfcf1f
geographic Greenland
geographic_facet Greenland
genre Greenland
genre_facet Greenland
_version_ 1800872326378553344
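
The description in this record says that major fields are delimited with tags while the elements inside them are separated by line feeds, and that the file uses Unix line feeds. The following is a minimal Python sketch of how such a layout could be read, not the author's own tooling. The tag names (`Synonyms`, `Cities`), the placeholder values, and the function names `parse_blocks` and `to_windows_newlines` are all hypothetical; the real tag names are documented at the beginning of CountryInfo.txt itself. The sketch also assumes blocks are not nested, which may not hold for the actual file.

```python
import re

# Matches one tag-delimited block: <Tag> ... </Tag>.
# Assumption: blocks are flat; anything nested inside is treated as plain lines.
BLOCK = re.compile(r"<(?P<tag>\w+)>(?P<body>.*?)</(?P=tag)>", re.DOTALL)


def parse_blocks(text):
    """Return a list of (tag, elements) pairs, one per tag-delimited block.

    Elements are the non-empty lines found between the opening and closing tag.
    """
    text = text.replace("\r\n", "\n")  # tolerate files already converted for Windows
    blocks = []
    for match in BLOCK.finditer(text):
        lines = match.group("body").split("\n")
        elements = [line.strip() for line in lines if line.strip()]
        blocks.append((match.group("tag"), elements))
    return blocks


def to_windows_newlines(text):
    """Convert Unix line feeds to CRLF without doubling existing CRLF pairs."""
    return text.replace("\r\n", "\n").replace("\n", "\r\n")


# Hypothetical sample; tag names and values are placeholders, not real file content.
SAMPLE = "<Synonyms>\nalpha\nbeta\n</Synonyms>\n<Cities>\ngamma\n</Cities>\n"

if __name__ == "__main__":
    for tag, elements in parse_blocks(SAMPLE):
        print(tag, elements)
```

When reading the real file, opening it with `open(path, encoding="utf-8")` matches the UTF-8 encoding stated in the description. Emitting each element as its own child node would be one way to approach the strict-XML conversion the description mentions.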