The Segmentation Problem in Morphology Learning

this paper, I briefly discuss some experiments on learning morphological forms in languages with much richer morphological paradigms. Such languages are common throughout much of the globe (from Latin and Greek to Inuit and Cashinahua or Anmajere and Kayardild - to finish with some Australian exampl...

Bibliographic Details
Main Author: Christopher D. Manning
Other Authors: The Pennsylvania State University CiteSeerX Archives
Format: Text
Language: English
Published: 1998
Subjects:
Online Access: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.14.5054
http://acl.ldc.upenn.edu/W/W98/W98-1240.pdf
id ftciteseerx:oai:CiteSeerX.psu:10.1.1.14.5054
record_format openpolar
spelling ftciteseerx:oai:CiteSeerX.psu:10.1.1.14.5054 2023-05-15T16:55:16+02:00 The Segmentation Problem in Morphology Learning Christopher D. Manning The Pennsylvania State University CiteSeerX Archives 1998 application/pdf http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.14.5054 http://acl.ldc.upenn.edu/W/W98/W98-1240.pdf en eng http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.14.5054 http://acl.ldc.upenn.edu/W/W98/W98-1240.pdf Metadata may be used without restrictions as long as the oai identifier remains attached to it. http://acl.ldc.upenn.edu/W/W98/W98-1240.pdf text 1998 ftciteseerx 2016-01-07T14:53:17Z Text inuit Unknown
institution Open Polar
collection Unknown
op_collection_id ftciteseerx
language English
description this paper, I briefly discuss some experiments on learning morphological forms in languages with much richer morphological paradigms. Such languages are common throughout much of the globe (from Latin and Greek to Inuit and Cashinahua or Anmajere and Kayardild - to finish with some Australian examples). Attempting to learn morphology in languages with rich morphology raises quite different problems from those discussed in the work above, issues discussed - if rather naively and unsatisfactorily from a computational viewpoint - in earlier work such as Pinker (1984), MacWhinney (1978) and Peters (1983). Foremost among these is the segmentation problem of how one cuts the complex morphological forms into bits with meanings identified. Note that I assume here that the child has already figured out the meanings of words. This is a big assumption, but it is reasonable for a model to focus on one aspect of the learning problem - and at any rate the learning task is still much broader and more realistic than that attempted by the recent English past tense literature. It may not even be unrealistic; see Pinker (1984:29-30) for a general defense of assuming some form of "semantic bootstrapping" and MacWhinney (1978:70-71) for arguments for the learning of word meanings before gaining a productive understanding of them ("it appears that the use of inflections in amalgams is stabilized semantically before these amalgams are analyzed morphologically"). Thus the learning task which I am attempting to address could be stated thus: Given a set of words and a representation of their meanings, determine an internalized representation that will allow heard and (regular) unheard forms to be successfully predicted and parsed.
author2 The Pennsylvania State University CiteSeerX Archives
format Text
author Christopher D. Manning
spellingShingle Christopher D. Manning
The Segmentation Problem in Morphology Learning
author_facet Christopher D. Manning
author_sort Christopher D. Manning
title The Segmentation Problem in Morphology Learning
title_short The Segmentation Problem in Morphology Learning
title_full The Segmentation Problem in Morphology Learning
title_fullStr The Segmentation Problem in Morphology Learning
title_full_unstemmed The Segmentation Problem in Morphology Learning
title_sort segmentation problem in morphology learning
publishDate 1998
url http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.14.5054
http://acl.ldc.upenn.edu/W/W98/W98-1240.pdf
genre inuit
genre_facet inuit
op_source http://acl.ldc.upenn.edu/W/W98/W98-1240.pdf
op_relation http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.14.5054
http://acl.ldc.upenn.edu/W/W98/W98-1240.pdf
op_rights Metadata may be used without restrictions as long as the oai identifier remains attached to it.
_version_ 1766046239290294272