Towards a Malay derivational lexicon: learning affixes using expectation maximization

We propose an unsupervised training method to guide the learning of Malay derivational morphology from a set of morphological segmentations produced by a naïve morphological analyzer. Using a morphology-based language model, we first estimate the probability of a given segmentation. We train the...

Bibliographic Details
Main Authors: Sulaiman, Suriani; Gasser, Michael; Kubler, Sandra
Format: Conference or Workshop Item
Language: English
Published: 2011
Subjects: QA75 Electronic computers. Computer science
Online Access: http://irep.iium.edu.my/32082/
http://irep.iium.edu.my/32082/1/W11-3005.pdf
id iium-32082
recordtype eprints
spelling iium-32082 2013-12-26T03:12:22Z Sulaiman, Suriani and Gasser, Michael and Kubler, Sandra (2011) Towards a Malay derivational lexicon: learning affixes using expectation maximization. In: 2nd Workshop on South and Southeast Asian Natural Language Processing (WSSANLP), IJCNLP 2011, 8th-13th Nov. 2011, Chiang Mai, Thailand. PeerReviewed application/pdf en http://aclweb.org/anthology//W/W11/W11-3005.pdf
repository_type Digital Repository
institution_category Local University
institution International Islamic University Malaysia
building IIUM Repository
collection Online Access
language English
topic QA75 Electronic computers. Computer science
description We propose an unsupervised training method to guide the learning of Malay derivational morphology from a set of morphological segmentations produced by a naïve morphological analyzer. Using a morphology-based language model, we first estimate the probability of a given segmentation. We train the model with EM to find the segmentation that maximizes the probability of each morpheme. We extract the set of affix patterns produced by our algorithm and evaluate them against two references: a list of affix patterns extracted from our hand-segmented derivational wordlist and a derivational history produced by a stemmer.
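
The idea in the description (score candidate segmentations with a morpheme model, then re-estimate morpheme probabilities with EM) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simple unigram morpheme model, and the function name `em_segment`, the `candidates` dictionary, and the toy word list with hand-made candidate splits are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from math import prod


def em_segment(candidates, iterations=10):
    """Toy EM over candidate segmentations (assumed unigram morpheme model).

    candidates maps each word to a list of segmentations, each a tuple of
    morphemes. Returns the morpheme probabilities and the best segmentation
    per word.
    """
    # Start from a uniform distribution over every morpheme in any candidate.
    morphemes = {m for segs in candidates.values() for seg in segs for m in seg}
    prob = {m: 1.0 / len(morphemes) for m in morphemes}

    for _ in range(iterations):
        counts = defaultdict(float)
        for segs in candidates.values():
            # E-step: posterior of each candidate segmentation of this word.
            scores = [prod(prob[m] for m in seg) for seg in segs]
            total = sum(scores)
            for seg, score in zip(segs, scores):
                for m in seg:
                    counts[m] += score / total
        # M-step: re-estimate morpheme probabilities from expected counts.
        norm = sum(counts.values())
        prob = {m: counts[m] / norm for m in morphemes}

    # Pick the highest-scoring candidate for each word under the final model.
    best = {
        word: max(segs, key=lambda seg: prod(prob[m] for m in seg))
        for word, segs in candidates.items()
    }
    return prob, best


# Hypothetical input: three Malay words ending in the suffix -an, each with
# two hand-made candidate splits (one correct, one deliberately wrong).
candidates = {
    "makanan": [("makan", "an"), ("maka", "nan")],
    "minuman": [("minum", "an"), ("minu", "man")],
    "bacaan": [("baca", "an"), ("bac", "aan")],
}
prob, best = em_segment(candidates)
print(best["makanan"])  # ('makan', 'an')
```

Because `an` is shared by one candidate of every word, its expected count grows at each iteration and the splits containing it win. Every candidate here has the same number of morphemes on purpose: a plain unigram product favors segmentations with fewer morphemes, so a realistic model would need some form of length normalization or prior, which this sketch leaves out.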