Identifying the Dominant Language of Web Page Using Supervised N-grams
Natural language processing is an emerging technology in linguistic industry and an aid to human-computer interaction in computer science. Language identification, on the other hand, is a form of pattern recognition that helps to identify predefined language of a web page and to predict the unknow...
| Main Authors: | Ng, Choon-Ching; Siau-Chuin, Liew; Wan Muhammad Syahrir, Wan Hussin; Tutut, Herawan |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Conference Publishing Services (CPS) 2013 |
| Subjects: | QA76 Computer software |
| Online Access: | http://umpir.ump.edu.my/id/eprint/6869/ http://umpir.ump.edu.my/id/eprint/6869/1/dentifying_the_Dominant_Language_of_Web_Page_Using_Supervised_N-grams.pdf |
| id | ump-6869 |
|---|---|
| recordtype | eprints |
| spelling | ump-6869 2018-04-27T01:15:23Z http://umpir.ump.edu.my/id/eprint/6869/ Identifying the Dominant Language of Web Page Using Supervised N-grams Ng, Choon-Ching Siau-Chuin, Liew Wan Muhammad Syahrir, Wan Hussin Tutut, Herawan QA76 Computer software Natural language processing is an emerging technology in linguistic industry and an aid to human-computer interaction in computer science. Language identification, on the other hand, is a form of pattern recognition that helps to identify predefined language of a web page and to predict the unknown language of one particular text. Written texts are constructed by common features such as character, word and n-gram and these characteristics are unique among languages. From the experiment result, the performance of the supervised n-gram produces an accurate identification value and outperforms the distance measurement on Arabic script web pages. Conference Publishing Services (CPS) 2013 Article PeerReviewed application/pdf en http://umpir.ump.edu.my/id/eprint/6869/1/dentifying_the_Dominant_Language_of_Web_Page_Using_Supervised_N-grams.pdf Ng, Choon-Ching and Siau-Chuin, Liew and Wan Muhammad Syahrir, Wan Hussin and Tutut, Herawan (2013) Identifying the Dominant Language of Web Page Using Supervised N-grams. 2012 International Conference on Advanced Computer Science Applications and Technologies. pp. 344-348. http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6516378 10.1109/ACSAT.2012.74 |
| repository_type | Digital Repository |
| institution_category | Local University |
| institution | Universiti Malaysia Pahang |
| building | UMP Institutional Repository |
| collection | Online Access |
| language | English |
| topic | QA76 Computer software |
| spellingShingle | QA76 Computer software Ng, Choon-Ching Siau-Chuin, Liew Wan Muhammad Syahrir, Wan Hussin Tutut, Herawan Identifying the Dominant Language of Web Page Using Supervised N-grams |
| description | Natural language processing is an emerging technology in linguistic industry and an aid to human-computer interaction in computer science. Language identification, on the other hand, is a form of pattern recognition that helps to identify predefined language of a web page and to predict the unknown language of one particular text. Written texts are constructed by common features such as character, word and n-gram and these characteristics are unique among languages. From the experiment result, the performance of the supervised n-gram produces an accurate identification value and outperforms the distance measurement on Arabic script web pages. |
| format | Article |
| author | Ng, Choon-Ching Siau-Chuin, Liew Wan Muhammad Syahrir, Wan Hussin Tutut, Herawan |
| author_facet | Ng, Choon-Ching Siau-Chuin, Liew Wan Muhammad Syahrir, Wan Hussin Tutut, Herawan |
| author_sort | Ng, Choon-Ching |
| title | Identifying the Dominant Language of Web Page Using Supervised N-grams |
| title_short | Identifying the Dominant Language of Web Page Using Supervised N-grams |
| title_full | Identifying the Dominant Language of Web Page Using Supervised N-grams |
| title_fullStr | Identifying the Dominant Language of Web Page Using Supervised N-grams |
| title_full_unstemmed | Identifying the Dominant Language of Web Page Using Supervised N-grams |
| title_sort | identifying the dominant language of web page using supervised n-grams |
| publisher | Conference Publishing Services (CPS) |
| publishDate | 2013 |
| url | http://umpir.ump.edu.my/id/eprint/6869/ http://umpir.ump.edu.my/id/eprint/6869/1/dentifying_the_Dominant_Language_of_Web_Page_Using_Supervised_N-grams.pdf |
| first_indexed | 2023-09-18T22:03:02Z |
| last_indexed | 2023-09-18T22:03:02Z |
| _version_ | 1777414525275865088 |