Machine Learning Applied to Reach Classification in a Northern Sweden Catchment
An accurate, fine-resolution classification of river systems improves the assessment and monitoring of watercourses, as stressed by the European Commission’s Water Framework Directive. Being able to attribute classes using remotely obtained data can be advantageous to perform ex...
Main Author: | |
---|---|
Format: | Bachelor Thesis |
Language: | English |
Published: |
Umeå universitet, Institutionen för ekologi, miljö och geovetenskap
2021
|
Subjects: | |
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-184140 |
id |
ftumeauniv:oai:DiVA.org:umu-184140 |
---|---|
record_format |
openpolar |
spelling |
ftumeauniv:oai:DiVA.org:umu-184140 2023-10-09T21:54:38+02:00 Machine Learning Applied to Reach Classification in a Northern Sweden Catchment dos Santos Toledo Busarello, Mariana 2021 application/pdf http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-184140 eng eng Umeå universitet, Institutionen för ekologi, miljö och geovetenskap http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-184140 info:eu-repo/semantics/openAccess machine learning geomorphology random forest channel type Computer Sciences Datavetenskap (datalogi) Natural Sciences Naturvetenskap Oceanography Hydrology and Water Resources Oceanografi hydrologi och vattenresurser Computer and Information Sciences Data- och informationsvetenskap Student thesis info:eu-repo/semantics/bachelorThesis text 2021 ftumeauniv 2023-09-22T13:53:58Z An accurate, fine-resolution classification of river systems improves the assessment and monitoring of watercourses, as stressed by the European Commission’s Water Framework Directive. Attributing classes from remotely obtained data makes it possible to classify reaches extensively without field work, and some methods also identify which features best describe each process domain. In this work, data from two Swedish sub-catchments above the highest coastline were used to train a Random Forest classifier, a machine learning algorithm. The resulting model provided class predictions and an analysis of the most important features. Each study area was analyzed separately, then the two were combined. In the combined case, the analysis was run with and without lakes in the data to check how their presence affected the predictions. The results showed that the accuracy of the estimator was reliable; however, due to data complexity and imbalance, rapids were harder to classify accurately, and the slow-flowing class was overpredicted. Combining the datasets and including lakes lessened the shortcomings of the data imbalance. Using the feature importance and permutation importance methods, the three most important features identified were the channel slope, the median of the roughness in the 100-m buffer, and the standard deviation of the planform curvature in the 100-m buffer. This finding is supported by previous studies, but other variables expected to contribute strongly, such as lithology and valley confinement, were not relevant, most likely because of the coarseness of the available data. The most frequent errors were also mapped, showing some overlap between error hotspots and areas that had been restored in 2010. Bachelor Thesis Northern Sweden Umeå University: Publications (DiVA) |
institution |
Open Polar |
collection |
Umeå University: Publications (DiVA) |
op_collection_id |
ftumeauniv |
language |
English |
topic |
machine learning geomorphology random forest channel type Computer Sciences Datavetenskap (datalogi) Natural Sciences Naturvetenskap Oceanography Hydrology and Water Resources Oceanografi hydrologi och vattenresurser Computer and Information Sciences Data- och informationsvetenskap |
spellingShingle |
machine learning geomorphology random forest channel type Computer Sciences Datavetenskap (datalogi) Natural Sciences Naturvetenskap Oceanography Hydrology and Water Resources Oceanografi hydrologi och vattenresurser Computer and Information Sciences Data- och informationsvetenskap dos Santos Toledo Busarello, Mariana Machine Learning Applied to Reach Classification in a Northern Sweden Catchment |
description |
An accurate, fine-resolution classification of river systems improves the assessment and monitoring of watercourses, as stressed by the European Commission’s Water Framework Directive. Attributing classes from remotely obtained data makes it possible to classify reaches extensively without field work, and some methods also identify which features best describe each process domain. In this work, data from two Swedish sub-catchments above the highest coastline were used to train a Random Forest classifier, a machine learning algorithm. The resulting model provided class predictions and an analysis of the most important features. Each study area was analyzed separately, then the two were combined. In the combined case, the analysis was run with and without lakes in the data to check how their presence affected the predictions. The results showed that the accuracy of the estimator was reliable; however, due to data complexity and imbalance, rapids were harder to classify accurately, and the slow-flowing class was overpredicted. Combining the datasets and including lakes lessened the shortcomings of the data imbalance. Using the feature importance and permutation importance methods, the three most important features identified were the channel slope, the median of the roughness in the 100-m buffer, and the standard deviation of the planform curvature in the 100-m buffer. This finding is supported by previous studies, but other variables expected to contribute strongly, such as lithology and valley confinement, were not relevant, most likely because of the coarseness of the available data. The most frequent errors were also mapped, showing some overlap between error hotspots and areas that had been restored in 2010. |
format |
Bachelor Thesis |
author |
dos Santos Toledo Busarello, Mariana |
title |
Machine Learning Applied to Reach Classification in a Northern Sweden Catchment |
title_sort |
machine learning applied to reach classification in a northern sweden catchment |
publisher |
Umeå universitet, Institutionen för ekologi, miljö och geovetenskap |
publishDate |
2021 |
url |
http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-184140 |
genre |
Northern Sweden |
op_relation |
http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-184140 |
op_rights |
info:eu-repo/semantics/openAccess |
_version_ |
1779318284731023360 |
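
The workflow summarized in the description (train a Random Forest classifier on reach features, then rank the features by impurity-based importance and by permutation importance) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis code: it uses scikit-learn, the data are synthetic, the feature names are hypothetical labels loosely based on the description, and the `class_weight="balanced"` setting is an added guess at one way to handle class imbalance.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Hypothetical feature names; the last one is pure noise for comparison.
FEATURES = [
    "channel_slope",
    "roughness_median_100m",
    "planform_curv_std_100m",
    "noise",
]


def make_synthetic_reaches(n=600, seed=0):
    """Build a synthetic, imbalanced two-class dataset (assumption, not real data)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, len(FEATURES)))
    # Invented rule: steeper and rougher reaches score higher.
    score = 2.0 * X[:, 0] + 1.0 * X[:, 1] + 0.5 * X[:, 2]
    score += rng.normal(scale=0.5, size=n)
    # Class 1 stands in for "rapids" (minority), class 0 for "slow-flowing".
    y = (score > 1.5).astype(int)
    return X, y


def fit_and_rank(X, y, seed=0):
    """Train a Random Forest and return test accuracy plus two importance rankings."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    model = RandomForestClassifier(
        n_estimators=200,
        class_weight="balanced",  # assumption: reweight classes to offset imbalance
        random_state=seed,
    )
    model.fit(X_train, y_train)

    # Impurity-based importance, computed from the training data.
    impurity = dict(zip(FEATURES, model.feature_importances_))

    # Permutation importance: mean accuracy drop when one column is shuffled,
    # evaluated on the held-out data.
    perm = permutation_importance(
        model, X_test, y_test, n_repeats=10, random_state=seed
    )
    permutation = dict(zip(FEATURES, perm.importances_mean))

    return model.score(X_test, y_test), impurity, permutation


X, y = make_synthetic_reaches()
accuracy, impurity, permutation = fit_and_rank(X, y)

print(f"test accuracy: {accuracy:.2f}")
for name in sorted(impurity, key=impurity.get, reverse=True):
    print(f"{name:24s} impurity={impurity[name]:.3f} permutation={permutation[name]:.3f}")
```

Comparing the two rankings is useful because impurity-based importance is derived from the training data, while permutation importance is measured on held-out data; features that rank high in both are the more trustworthy candidates.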