CountryInfo.txt: Country names, codes, places and leaders
CountryInfo.txt is a general-purpose file intended to facilitate natural language processing of news reports and political texts. It was originally developed to identify states for the text filtering system used in the development of MID4, then extended to incorporate CIA World Factbook and WordNet...
Main Author: | Schrodt, Philip |
---|---|
Format: | Dataset |
Language: | unknown |
Published: | 2015 |
Subjects: | Social Sciences |
Online Access: | https://search.dataone.org/view/sha256:d6d15fe697a6bf3c214f4570fc5e4fde0873fd931bcc9221fafc24b640dfcf1f |
id |
dataone:sha256:d6d15fe697a6bf3c214f4570fc5e4fde0873fd931bcc9221fafc24b640dfcf1f |
---|---|
record_format |
openpolar |
spelling |
dataone:sha256:d6d15fe697a6bf3c214f4570fc5e4fde0873fd931bcc9221fafc24b640dfcf1f 2024-06-03T18:46:52+00:00 CountryInfo.txt: Country names, codes, places and leaders Schrodt, Philip 2015-05-07T00:00:00Z https://search.dataone.org/view/sha256:d6d15fe697a6bf3c214f4570fc5e4fde0873fd931bcc9221fafc24b640dfcf1f unknown Social Sciences Dataset 2015 dataone:urn:node:HD 2024-06-03T18:07:25Z Dataset Greenland Unknown Greenland |
institution |
Open Polar |
collection |
Unknown |
op_collection_id |
dataone:urn:node:HD |
language |
unknown |
topic |
Social Sciences |
spellingShingle |
Social Sciences Schrodt, Philip CountryInfo.txt: Country names, codes, places and leaders |
topic_facet |
Social Sciences |
description |
CountryInfo.txt is a general-purpose file intended to facilitate natural language processing of news reports and political texts. It was originally developed to identify states for the text filtering system used in the development of MID4, then extended to incorporate CIA World Factbook and WordNet information for the development of TABARI dictionaries. The file contains about 32,000 lines, covering about 240 countries and administrative units (e.g. American Samoa, Christmas Island, Hong Kong, Greenland). It is internally documented and almost but not quite XML: the major fields are delimited with tags of the form ... but elements inside are delimited with line feeds. Converting this to strict XML would be a relatively simple programming exercise for anyone who should be working with the file in the first place. The file is UTF-8 with Unix line feeds and will need to be converted if used on a Windows system. Fields include: country name in English; adjectival forms and synonyms of the country name, including some non-English versions of the name; ISO-3166 numeric, alpha2 and alpha3 codes, FIPS-10 code, IMF code, and COW alpha and numeric codes; capital city; cities with populations over 1 million; regions and geographical features (WordNet meronyms); leaders, 1960-2008 (rulers.org); and members of government, 2003-2010 (CIA World Leaders). The beginning of the file has fairly extensive documentation on the formats used. |
format |
Dataset |
author |
Schrodt, Philip |
author_facet |
Schrodt, Philip |
author_sort |
Schrodt, Philip |
title |
CountryInfo.txt: Country names, codes, places and leaders |
title_short |
CountryInfo.txt: Country names, codes, places and leaders |
title_full |
CountryInfo.txt: Country names, codes, places and leaders |
title_fullStr |
CountryInfo.txt: Country names, codes, places and leaders |
title_full_unstemmed |
CountryInfo.txt: Country names, codes, places and leaders |
title_sort |
countryinfo.txt: country names, codes, places and leaders |
publishDate |
2015 |
url |
https://search.dataone.org/view/sha256:d6d15fe697a6bf3c214f4570fc5e4fde0873fd931bcc9221fafc24b640dfcf1f |
geographic |
Greenland |
geographic_facet |
Greenland |
genre |
Greenland |
genre_facet |
Greenland |
_version_ |
1800872326378553344 |
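
The description notes that the file uses Unix line feeds and that its tagged blocks hold line-delimited elements. The following is a minimal Python sketch of both points, not the author's tooling. The record elides the actual tag form, so the `<Name> ... </Name>` pattern, the function names `to_windows_newlines` and `split_blocks`, and the sample text are all assumptions made for illustration; the documentation at the top of CountryInfo.txt is the authority on the real format.

```python
import re


def to_windows_newlines(text: str) -> str:
    """Convert bare LF line endings to CRLF, leaving existing CRLF pairs alone."""
    return re.sub(r"(?<!\r)\n", "\r\n", text)


# ASSUMPTION: a block opens with <Name> and closes with </Name>.
# The catalog record does not show the real tag form.
BLOCK = re.compile(r"<(?P<tag>\w+)>\n?(?P<body>.*?)</(?P=tag)>", re.DOTALL)


def split_blocks(text: str) -> list[tuple[str, list[str]]]:
    """Return (tag, lines) pairs for each top-level tagged block.

    Elements inside a block are taken to be one per line; blank lines
    are dropped. Nested blocks are not descended into.
    """
    blocks = []
    for match in BLOCK.finditer(text):
        lines = [ln.strip() for ln in match.group("body").split("\n") if ln.strip()]
        blocks.append((match.group("tag"), lines))
    return blocks


# Invented sample in the assumed layout, not an excerpt from the file.
sample = "<Capital>\nNuuk\n</Capital>\n<Cities>\nAlpha\nBeta\n</Cities>\n"

if __name__ == "__main__":
    for tag, lines in split_blocks(sample):
        print(tag, lines)
```

Reading the file with `open(path, encoding="utf-8", newline="")` keeps Python from translating line endings on input, so the conversion stays explicit.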