Identifying the Dominant Language of Web Page Using Supervised N-grams

Natural language processing is an emerging technology in the linguistic industry and an aid to human-computer interaction in computer science. Language identification, on the other hand, is a form of pattern recognition that helps to identify the predefined language of a web page and to predict the unknow...

Bibliographic Details
Main Authors: Ng, Choon-Ching; Siau-Chuin, Liew; Wan Muhammad Syahrir, Wan Hussin; Tutut, Herawan
Format: Article
Language: English
Published: Conference Publishing Services (CPS) 2013
Subjects: QA76 Computer software
Online Access: http://umpir.ump.edu.my/id/eprint/6869/
http://umpir.ump.edu.my/id/eprint/6869/1/dentifying_the_Dominant_Language_of_Web_Page_Using_Supervised_N-grams.pdf
id ump-6869
recordtype eprints
spelling ump-6869
2018-04-27T01:15:23Z
http://umpir.ump.edu.my/id/eprint/6869/
Identifying the Dominant Language of Web Page Using Supervised N-grams
Ng, Choon-Ching
Siau-Chuin, Liew
Wan Muhammad Syahrir, Wan Hussin
Tutut, Herawan
QA76 Computer software
Natural language processing is an emerging technology in the linguistic industry and an aid to human-computer interaction in computer science. Language identification, on the other hand, is a form of pattern recognition that helps to identify the predefined language of a web page and to predict the unknown language of a particular text. Written texts are built from common features such as characters, words and n-grams, and these features are distinctive for each language. In the experiments, the supervised n-gram method identifies languages accurately and outperforms the distance measurement on Arabic-script web pages.
Conference Publishing Services (CPS)
2013
Article
PeerReviewed
application/pdf
en
http://umpir.ump.edu.my/id/eprint/6869/1/dentifying_the_Dominant_Language_of_Web_Page_Using_Supervised_N-grams.pdf
Ng, Choon-Ching and Siau-Chuin, Liew and Wan Muhammad Syahrir, Wan Hussin and Tutut, Herawan (2013) Identifying the Dominant Language of Web Page Using Supervised N-grams. 2012 International Conference on Advanced Computer Science Applications and Technologies. pp. 344-348.
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6516378
10.1109/ACSAT.2012.74
repository_type Digital Repository
institution_category Local University
institution Universiti Malaysia Pahang
building UMP Institutional Repository
collection Online Access
language English
topic QA76 Computer software
description Natural language processing is an emerging technology in the linguistic industry and an aid to human-computer interaction in computer science. Language identification, on the other hand, is a form of pattern recognition that helps to identify the predefined language of a web page and to predict the unknown language of a particular text. Written texts are built from common features such as characters, words and n-grams, and these features are distinctive for each language. In the experiments, the supervised n-gram method identifies languages accurately and outperforms the distance measurement on Arabic-script web pages.
format Article
author Ng, Choon-Ching
Siau-Chuin, Liew
Wan Muhammad Syahrir, Wan Hussin
Tutut, Herawan
author_sort Ng, Choon-Ching
title Identifying the Dominant Language of Web Page Using Supervised N-grams
title_sort identifying the dominant language of web page using supervised n-grams
publisher Conference Publishing Services (CPS)
publishDate 2013
url http://umpir.ump.edu.my/id/eprint/6869/
http://umpir.ump.edu.my/id/eprint/6869/1/dentifying_the_Dominant_Language_of_Web_Page_Using_Supervised_N-grams.pdf
first_indexed 2023-09-18T22:03:02Z
last_indexed 2023-09-18T22:03:02Z
_version_ 1777414525275865088
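
The abstract in this record names character n-grams as language-specific features and mentions a distance measurement as the baseline. The record does not describe either method in detail, so the following is only a minimal sketch of the general idea, not the authors' implementation: it builds ranked character-trigram profiles from labelled samples and picks the label with the smallest rank-order ("out-of-place") distance. The function names, the `TRAINING` samples, and the parameters `n` and `top_k` are all assumptions made for illustration.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of the lowercased, space-padded text."""
    padded = f" {' '.join(text.lower().split())} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def build_profile(text, n=3, top_k=300):
    """Map each of the top_k most frequent n-grams to its frequency rank (0 = most frequent)."""
    counts = Counter(char_ngrams(text, n))
    ranked = [gram for gram, _ in counts.most_common(top_k)]
    return {gram: rank for rank, gram in enumerate(ranked)}


def out_of_place(doc_profile, lang_profile, max_penalty):
    """Sum rank differences; n-grams absent from lang_profile cost max_penalty."""
    return sum(
        abs(rank - lang_profile[gram]) if gram in lang_profile else max_penalty
        for gram, rank in doc_profile.items()
    )


def identify(text, training, n=3, top_k=300):
    """Return the training label whose n-gram profile is closest to the text's profile."""
    profiles = {
        label: build_profile(sample, n, top_k) for label, sample in training.items()
    }
    doc = build_profile(text, n, top_k)
    return min(profiles, key=lambda label: out_of_place(doc, profiles[label], top_k))


# Hypothetical toy training samples (assumption); a real system needs far more text.
TRAINING = {
    "english": "the dominant language of a web page can be identified from its text",
    "arabic": "اللغة العربية لغة جميلة ويتحدث بها كثير من الناس",
}

if __name__ == "__main__":
    print(identify("the language of this page", TRAINING))
    print(identify("هذه اللغة جميلة", TRAINING))
```

The toy samples use two different scripts, so the profiles share no trigrams and the decision is trivial. Separating languages that share the Arabic script, which is the case the abstract reports on, would need much larger labelled corpora per language.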