The development of automated segmentation for the FTMSK official letters in XML / Muhammad Muhaimin Mohd Isa
Main Author: | Muhammad Muhaimin Mohd Isa |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2005 |
Subjects: | |
Online Access: | http://ir.uitm.edu.my/id/eprint/1703/ http://ir.uitm.edu.my/id/eprint/1703/1/TD_MUHAMMAD%20MUHAIMIN%20MD%20ISA%20CS%2005_5%20P01.pdf |
Summary: | Letter documents have their own format, which consists of many parts. To
process such documents, the project developed a prototype that supports
content-based documents. This is important for dividing the document into
smaller, recognized labels that are intensive and flexible for managing,
editing, and extracting. The aim of this thesis is to apply the official
letter standard in the system and to develop an algorithm that segments
letter documents and converts them to XML documents. The main software used
is Visual Basic 6.0. The project was evaluated by comparing the contents of
each letter document with the output XML document. The prototype was tested
by manual checking. About 20 sample letters were tested to verify the
prototype's efficiency. The results of the experiment show that 30% of the
output cannot be viewed because of runtime errors. The findings of the
research include the interviews and surveys conducted, and the study of
related topics and problems from the internet, books, and other information
media. The data was collected from the FTMSK administration department, and
the testing was done manually using the researcher's own workstations and
hardware. |
---|
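
The summary describes segmenting a letter into labelled parts, converting it to XML, and checking that the XML output matches the letter's contents. The Python sketch below illustrates that general idea only. It is not the thesis's algorithm, which was written in Visual Basic 6.0 and is not reproduced in this record. The part labels (`reference`, `date`, `salutation`, `closing`, `paragraph`), the matching patterns, the function names, and the sample letter are all hypothetical, and the letter is assumed to be plain text with parts separated by blank lines.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical labels and patterns, chosen only for illustration;
# the record does not list the parts or rules the thesis uses.
RULES = [
    ("reference", re.compile(r"^(ref|our ref|your ref)\b", re.I)),
    ("date", re.compile(r"^\d{1,2} \w+ \d{4}$")),
    ("salutation", re.compile(r"^dear\b", re.I)),
    ("closing", re.compile(r"^yours\b", re.I)),
]


def classify(block):
    """Return the first label whose pattern matches the block, else 'paragraph'."""
    for label, pattern in RULES:
        if pattern.search(block):
            return label
    return "paragraph"


def segment_letter(text):
    """Split a letter on blank lines and wrap each block in a labelled XML element."""
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text.strip()) if b.strip()]
    root = ET.Element("letter")
    for block in blocks:
        ET.SubElement(root, classify(block)).text = block
    return root


def normalise(s):
    """Collapse all runs of whitespace so only the wording is compared."""
    return " ".join(s.split())


def content_preserved(text, root):
    """Check that the XML elements, read in order, carry the same wording as the letter."""
    xml_text = " ".join(child.text for child in root)
    return normalise(text) == normalise(xml_text)


sample = """Ref: 100-FTMSK (1/2)

12 March 2005

Dear Sir,

Please find the attached report.

Yours faithfully,
A. Writer"""

letter = segment_letter(sample)
print(ET.tostring(letter, encoding="unicode"))
print(content_preserved(sample, letter))
```

A rule list keeps the labelling easy to extend, and the whitespace-insensitive comparison is one simple stand-in for the manual similarity check the summary mentions.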