1117 Russian cities with city name, region, geographic coordinates and 2020 population estimate


Bibliographic Details
Main Authors: Pogrebnyak, Evgeniy; Artemov, Kirill
Format: Dataset
Language: Russian
Published: Zenodo 2021
Subjects: cities, Russia
Online Access: https://dx.doi.org/10.5281/zenodo.5148692
https://zenodo.org/record/5148692
id ftdatacite:10.5281/zenodo.5148692
record_format openpolar
institution Open Polar
collection DataCite Metadata Store (German National Library of Science and Technology)
op_collection_id ftdatacite
language Russian
topic cities, Russia
topic_facet cities, Russia
description 1117 Russian cities with city name, region, geographic coordinates and 2020 population estimate.

How to use

    from pathlib import Path
    import requests
    import pandas as pd

    url = ("https://raw.githubusercontent.com/"
           "epogrebnyak/ru-cities/main/assets/towns.csv")

    # save file locally
    p = Path("towns.csv")
    if not p.exists():
        content = requests.get(url).text
        p.write_text(content, encoding="utf-8")

    # read as dataframe
    df = pd.read_csv("towns.csv")
    print(df.sample(5))

Files:
  towns.csv - city information
  regions.csv - list of Russian Federation regions
  alt_city_names.json - alternative city names

Columns (towns.csv):

Basic info:
  city - city name (several cities have alternative names listed in alt_city_names.json)
  population - city population, thousand people, Rosstat estimate as of 1.1.2020
  lat, lon - city geographic coordinates

Region:
  region_name - subnational region (oblast, republic, krai or AO)
  region_iso_code - ISO 3166 code, e.g. RU-VLD
  federal_district - federal district, e.g. Центральный

City codes:
  okato
  oktmo
  fias_id
  kladr_id

Data sources

The city list and city population were collected from the Rosstat publication Регионы России. Основные социально-экономические показатели городов (Regions of Russia. Main socio-economic indicators of cities) and parsed from the publication's Microsoft Word files. City list corresponds to this Wikipedia article. An alternative dataset is the wiki-based Dadata city dataset (no population data).

Comments

City groups
  Ханты-Мансийский and Ямало-Ненецкий autonomous regions are excluded to avoid duplication, since they are parts of Тюменская область.
  Several notable towns are classified as administrative parts of larger cities (Сестрорецк is a municipality of Saint Petersburg, Щербинка is part of Moscow). They are not reported in this dataset.

By individual city
  Белоозерский was not found in the Rosstat publication, but should be considered a city as of 1.1.2020.

Alternative city names
  The letter "ё" is replaced with "е" in the city column of towns.csv: the file has Орел, not Орёл. This affected: Белоозёрский, Королёв, Ликино-Дулёво, Озёры, Щёлково, Орёл.
  Дмитриев and Дмитриев-Льговский are the same city.
  assets/alt_city_names.json contains these names.

Tests

    poetry install
    poetry run python -m pytest

How to replicate dataset

1. Base dataset
Run:
  download data using rar/get.sh
  convert Саратовская область.doc to docx
  run make.py
Creates:
  _towns.csv
  assets/regions.csv

2. API calls
Note: do not attempt this unless you have to. It takes a while to run and puts load on third-party APIs. The resulting files are already in the repository, so you probably do not need to run these scripts.
Run:
  cd geocoding
  run coord_dadata.py (needs token)
  run coord_osm.py
Creates:
  coord_dadata.csv
  coord_osm.csv

3. Merge data
Run:
  run merge.py
Creates:
  assets/towns.csv

See code at GitHub: https://github.com/epogrebnyak/ru-cities
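
Because the description says the letter "ё" is suppressed in the city column of towns.csv, a lookup by the usual spelling (Орёл) may miss the stored one (Орел). The snippet below is a minimal sketch of normalizing a name before lookup. The helper names `normalize_city_name` and `find_city` are hypothetical and not part of the dataset or its repository, and the small `stored` set is only an illustrative stand-in for the city column, assuming the affected names are stored with "е".

```python
from typing import Iterable, Optional


def normalize_city_name(name: str) -> str:
    """Trim whitespace and replace 'ё'/'Ё' with 'е'/'Е'.

    Assumption: this mirrors the spelling used in the city column
    of towns.csv, as stated in the dataset description.
    """
    return name.strip().replace("ё", "е").replace("Ё", "Е")


def find_city(name: str, known_names: Iterable[str]) -> Optional[str]:
    """Return the stored spelling of `name`, or None if it is not listed."""
    key = normalize_city_name(name)
    return key if key in set(known_names) else None


# Illustrative stand-in for the city column (hypothetical sample,
# built from names the description lists as affected).
stored = {"Орел", "Королев", "Щелково"}

print(find_city("Орёл", stored))
print(find_city("Королёв", stored))
print(find_city("Москва", stored))  # not in the sample set above
```

With the dataframe from the usage example, the same normalization could be applied as `df[df["city"] == normalize_city_name("Орёл")]`. Names with a genuinely different alternative form (such as Дмитриев and Дмитриев-Льговский) are not covered by this character replacement and would need alt_city_names.json.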
format Dataset
author Pogrebnyak, Evgeniy
Artemov, Kirill
author_facet Pogrebnyak, Evgeniy
Artemov, Kirill
author_sort Pogrebnyak, Evgeniy
title 1117 Russian cities with city name, region, geographic coordinates and 2020 population estimate
publisher Zenodo
publishDate 2021
url https://dx.doi.org/10.5281/zenodo.5148692
https://zenodo.org/record/5148692
genre Ненец*
genre_facet Ненец*
op_relation https://dx.doi.org/10.5281/zenodo.5148693
https://dx.doi.org/10.5281/zenodo.5151423
op_rights Open Access
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
cc-by-4.0
info:eu-repo/semantics/openAccess
op_rightsnorm CC-BY
op_doi https://doi.org/10.5281/zenodo.5148692
https://doi.org/10.5281/zenodo.5148693
https://doi.org/10.5281/zenodo.5151423
_version_ 1766239073564884992