Lexical scoring system of lexical chain for Quranic document retrieval
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | Penerbit Universiti Kebangsaan Malaysia 2018 |
Online Access: | http://journalarticle.ukm.my/13770/ http://journalarticle.ukm.my/13770/1/25370-76265-1-PB.pdf |
Summary: | An Information Retrieval (IR) system aims to extract information from an extensive
collection of text in response to a user's query on a particular subject. In IR, the user submits
a query as keywords or as words to match, and the system retrieves the relevant information.
In the Al-Quran, verses on the same or comparable topics are scattered
throughout the text in different chapters, and it is therefore difficult for users to remember the
many keywords of the verses. Therefore, in such situations, retrieving information using
semantically related words is useful. In well-composed documents, semantic integrity
(coherence) exists between the words of the text. Lexical cohesion results from chains of
related words that contribute to the continuity of lexical meaning within the text; these chains
are a direct result of the text being about the same thing (i.e., the same topic). This indicates
that combining an IR system with lexical chains is a useful and appropriate way to represent
documents by concepts rather than by terms, so that retrieval can succeed on the basis of
semantic relations. Therefore, this study proposes a new Lexical Scoring System and determines
the semantic relations that exist between words, using WordNet as the semantic knowledge
base. The proposed scoring system helped to retrieve 86.58% of the
total relevant documents in the Al-Quran based on the relevance judgment, using the lexical
chain approach. Based on the findings, the study concludes that the proposed approach of
representing verses using lexical chains is appropriate and suitable for a Quranic IR system. |
---|
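
The idea in the summary, representing each verse by chains of semantically related words and retrieving by concept rather than by exact term, can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: `TOY_CONCEPTS` is a tiny hand-written stand-in for WordNet, the verse texts are invented placeholders, and `score` is a simple chain-length count, not the Lexical Scoring System proposed in the article (the summary does not specify it). The `recall` helper shows the kind of measure behind a figure such as "86.58% of the total relevant documents".

```python
from typing import Dict, List, Set

# Hypothetical stand-in for WordNet: each word maps to one concept label.
TOY_CONCEPTS: Dict[str, str] = {
    "charity": "giving", "alms": "giving", "donate": "giving",
    "prayer": "worship", "worship": "worship", "prostrate": "worship",
}

def build_chains(tokens: List[str]) -> Dict[str, Set[str]]:
    """Group a document's words into lexical chains keyed by shared concept."""
    chains: Dict[str, Set[str]] = {}
    for tok in tokens:
        word = tok.lower()
        concept = TOY_CONCEPTS.get(word)
        if concept is not None:
            chains.setdefault(concept, set()).add(word)
    return chains

def score(query: List[str], chains: Dict[str, Set[str]]) -> int:
    """Illustrative score: total length of the chains whose concept the query mentions."""
    query_concepts = {
        TOY_CONCEPTS[q.lower()] for q in query if q.lower() in TOY_CONCEPTS
    }
    return sum(len(words) for c, words in chains.items() if c in query_concepts)

def retrieve(query: List[str], docs: Dict[str, List[str]]) -> List[str]:
    """Return ids of documents with a positive score, best first."""
    scored = {doc_id: score(query, build_chains(toks)) for doc_id, toks in docs.items()}
    hits = [doc_id for doc_id, s in scored.items() if s > 0]
    return sorted(hits, key=lambda d: (-scored[d], d))

def recall(retrieved: Set[str], relevant: Set[str]) -> float:
    """Fraction of the relevant documents that were retrieved."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0

# Invented placeholder documents, not actual verse text.
DOCS: Dict[str, List[str]] = {
    "v1": "give alms and charity".split(),
    "v2": "establish prayer and prostrate".split(),
    "v3": "travel through the land".split(),
}

if __name__ == "__main__":
    hits = retrieve(["donate"], DOCS)
    print(hits, recall(set(hits), {"v1"}))
```

In this sketch the query word "donate" never occurs in `v1`, yet `v1` is retrieved because both belong to the same toy concept, which is the behaviour the summary attributes to retrieval based on semantic relations.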