Detecting misleading news titles by word similarity / Nur Hazwani Ghazali

Bibliographic Details
Main Author: Ghazali, Nur Hazwani
Format: Student Project
Language: English
Published: Faculty of Computer and Mathematical Sciences 2015
Online Access: http://ir.uitm.edu.my/id/eprint/14574/
http://ir.uitm.edu.my/id/eprint/14574/1/PPb_NUR%20HAZWANI%20GHAZALI%20CS%2015_5.pdf
Description
Summary: More and more people prefer to read news online as the Internet continues to grow into a major news source. Online newspapers have become widespread as an interactive and efficient way to present information. Reading news online offers several benefits: readers get faster access to information, and it is cost-effective because they save money. However, the relevance of news articles and the relatedness of an article to its headline have been called into question. A news title can be misleading when the story is not relevant to it, and readers can misinterpret an article by looking at the title alone. This project aims to develop and test the functionality of a system that detects misleading news titles by word similarity. For that purpose, a web scraping technique is used to extract a news article from a news web page. The occurrences of each word in the article are then counted, and the ten most frequent words are visualized. The similarity between these ten words and the title is computed using a word similarity technique. From these steps, the degree of similarity between the article content and the title is obtained, and a conclusion is drawn from the result: the lower the degree of similarity between the common words and the title, the lower the relevance of the news article to its title.
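
The counting and comparison steps described in the summary can be illustrated with a minimal Python sketch. This is not the project's implementation. The summary does not name the word similarity technique, so the sketch assumes character-level matching from the standard library's difflib as a stand-in. The stop-word list, the function names (tokenize, top_words, word_similarity, title_relevance), and the averaging rule are likewise hypothetical. The web scraping step is left out, and the article text is assumed to be already available as a string.

```python
import re
from collections import Counter
from difflib import SequenceMatcher

# Assumption: a tiny illustrative stop-word list; the summary does not say
# whether the project filters such words before counting.
STOPWORDS = {"a", "an", "and", "as", "at", "by", "for", "in", "is", "it",
             "of", "on", "that", "the", "to", "was", "with"}


def tokenize(text):
    """Lowercase the text and return its alphabetic words, minus stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOPWORDS]


def top_words(article, n=10):
    """Return the n most frequent words in the article, most frequent first."""
    return [word for word, _ in Counter(tokenize(article)).most_common(n)]


def word_similarity(a, b):
    """Similarity of two words in [0, 1] (stand-in measure, see lead-in)."""
    return SequenceMatcher(None, a, b).ratio()


def title_relevance(title, article, n=10):
    """Average, over the top n article words, of the best match in the title."""
    title_words = tokenize(title)
    common = top_words(article, n)
    if not title_words or not common:
        return 0.0
    best = [max(word_similarity(c, t) for t in title_words) for c in common]
    return sum(best) / len(best)


if __name__ == "__main__":
    article = ("Flood waters rose in the city on Monday. The flood forced "
               "the city to close roads, and flood warnings remain.")
    print(top_words(article))
    print(title_relevance("Flood hits city", article))
    print(title_relevance("Celebrity wedding photos", article))
```

A lower score from title_relevance corresponds to the summary's conclusion that the title is less relevant to the article. Any threshold for calling a title misleading would have to be chosen separately.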