On the optimal window shape for genomic signal processing


Bibliographic Details
Main Author: Gunawan, Teddy Surya
Format: Conference or Workshop Item
Language: English
Published: 2008
Online Access: http://irep.iium.edu.my/2340/
http://irep.iium.edu.my/2340/1/On_the_Optimal_Window_Shape_for_Genomic.pdf
Description
Summary: Segments of the DNA molecule called genes carry useful information in their protein-coding regions (exons), which are responsible for protein synthesis. The most popular frequency-domain technique for gene identification uses a sliding-window DFT to locate the well-known period-3 property in DNA sequences. This paper investigates the use of parametric and nonparametric window types with the DFT-based period-3 detection method to identify coding and noncoding regions. The HMR195 dataset was selected for the performance comparison, using the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) as measures. Results showed that the Bartlett window (nonparametric) and the Gaussian window with alpha=2.2401 (parametric) provide the optimal window shapes, compared with 13 other window types, for identifying coding and noncoding regions.
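
The sliding-window DFT period-3 measure described in the summary can be illustrated with a minimal Python sketch. It is not the paper's implementation. The function names, the binary indicator-sequence mapping of the four bases, the window length, and the Gaussian parameterization w[k] = exp(-0.5 (alpha k / ((N-1)/2))^2) are all assumptions made for illustration; in particular, the record does not say which convention the reported alpha=2.2401 follows.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def indicator_sequences(seq):
    """Map a DNA string to four binary indicator sequences (assumed mapping)."""
    seq = seq.upper()
    return {b: np.array([1.0 if c == b else 0.0 for c in seq]) for b in "ACGT"}


def gaussian_window(n, alpha):
    """Gaussian window of length n > 1 (assumed parameterization, see lead-in)."""
    k = np.arange(n) - (n - 1) / 2.0          # sample index centered on zero
    return np.exp(-0.5 * (alpha * k / ((n - 1) / 2.0)) ** 2)


def period3_power(seq, window):
    """Windowed DFT power at frequency 1/3 for every window position.

    Returns an array of length len(seq) - len(window) + 1, summing the
    squared magnitudes over the four indicator sequences.
    """
    n = len(window)
    # Window samples multiplied by the DFT basis function at frequency 1/3.
    kernel = window * np.exp(-2j * np.pi * np.arange(n) / 3.0)
    power = np.zeros(len(seq) - n + 1)
    for u in indicator_sequences(seq).values():
        # Each row of the view is one window position; the matrix-vector
        # product evaluates the windowed DFT coefficient at that position.
        coeffs = sliding_window_view(u, n) @ kernel
        power += np.abs(coeffs) ** 2
    return power


if __name__ == "__main__":
    toy = "ATG" * 50                           # toy sequence with exact period 3
    bartlett = np.bartlett(30)                 # nonparametric window
    gauss = gaussian_window(30, 2.2401)        # parametric window
    print(period3_power(toy, bartlett)[:3])
    print(period3_power(toy, gauss)[:3])
```

Peaks in the returned power track suggest coding regions, and thresholding that track at different levels yields the ROC curve used for comparison. The window length of 30 here is only chosen to keep the toy example small.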