Geometry based lip reading system using Multi Dimension Dynamic Time Warping

This paper describes an automatic lip reading system consisting of two main modules: 1) a pre-processing module that extracts lip geometry information from the video sequence and 2) a classification module that identifies the visual speech from dynamic lip movements. The recognition performance of...


Bibliographic Details
Main Authors: Ibrahim, M. Z.; Mulvaney, D. J.
Format: Conference or Workshop Item
Language: English
Published: IEEE 2012
Subjects: TK Electrical engineering. Electronics Nuclear engineering
Online Access: http://umpir.ump.edu.my/id/eprint/26933/
http://umpir.ump.edu.my/id/eprint/26933/1/Geometry%20based%20lip%20reading%20system%20using%20Multi%20Dimension%20Dynamic%20Time%20Warping.pdf
id ump-26933
recordtype eprints
spelling ump-26933 2020-03-20T01:57:01Z http://umpir.ump.edu.my/id/eprint/26933/ PeerReviewed pdf en Ibrahim, M. Z. and Mulvaney, D. J. (2012) Geometry based lip reading system using Multi Dimension Dynamic Time Warping. In: IEEE Visual Communications and Image Processing (VCIP 2012), 27-30 November 2012, San Diego, CA, United States. pp. 1-6. ISBN 978-146734405-0 https://doi.org/10.1109/VCIP.2012.6410805
repository_type Digital Repository
institution_category Local University
institution Universiti Malaysia Pahang
building UMP Institutional Repository
collection Online Access
language English
topic TK Electrical engineering. Electronics Nuclear engineering
description This paper describes an automatic lip reading system consisting of two main modules: 1) a pre-processing module that extracts lip geometry information from the video sequence and 2) a classification module that identifies the visual speech from dynamic lip movements. The recognition performance of the proposed system was assessed on the English digits 0 to 9 as spoken by the speakers in the video sequences of the CUAVE database. Lip geometry features were extracted using a combination of a skin color filter, a border following algorithm and a convex hull approach. The proposed method was compared with the popular 'snake' technique and was found to improve lip shape extraction performance for the database studied. Lip geometry features including height, width, ratio, area and perimeter, along with various combinations of these features, were evaluated to determine which best represents speech in the visual domain under three separate classification methods, namely optical flow, Dynamic Time Warping (DTW) and a new approach termed Multi-Dimensional DTW. Experiments show that the proposed system achieves a recognition performance of 68% using only lip height, lip width and the ratio of these features, demonstrating that the system has the potential to be incorporated in a multimodal speech recognition system for use in noisy environments.
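The record does not give the paper's exact formulation of Multi-Dimensional DTW, so the following is only a minimal sketch of one common variant: the per-frame features (here assumed to be lip height, lip width and their ratio) are treated as a single vector, the local cost is the Euclidean distance between vectors, and an utterance is assigned the label of its nearest template. The function names md_dtw_distance and classify, and all feature values, are hypothetical and chosen for illustration.

```python
import math


def md_dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Each element of a and b is one frame, e.g. (height, width, ratio).
    All dimensions are warped together (assumed variant, not confirmed
    by the source): the local cost is the Euclidean distance per frame pair.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # frame of a repeated against b[j-1]
                cost[i][j - 1],      # frame of b repeated against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def classify(query, templates):
    """Return the label of the template with the smallest DTW cost."""
    return min(templates, key=lambda label: md_dtw_distance(query, templates[label]))


# Hypothetical templates: one short (height, width, ratio) sequence per digit.
templates = {
    "one": [(1.0, 2.0, 0.5), (3.0, 4.0, 0.75)],
    "two": [(5.0, 5.0, 1.0), (6.0, 3.0, 2.0)],
}
# The same shape as "one", spoken more slowly (first frame held twice).
query = [(1.0, 2.0, 0.5), (1.0, 2.0, 0.5), (3.0, 4.0, 0.75)]
label = classify(query, templates)
```

Because the warping path may repeat frames, the slower utterance in the example aligns with its template at no extra cost, which is the property that makes DTW suited to speech spoken at varying rates.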
format Conference or Workshop Item
author Ibrahim, M. Z.
Mulvaney, D. J.
title Geometry based lip reading system using Multi Dimension Dynamic Time Warping
publisher IEEE
publishDate 2012
url http://umpir.ump.edu.my/id/eprint/26933/
http://umpir.ump.edu.my/id/eprint/26933/1/Geometry%20based%20lip%20reading%20system%20using%20Multi%20Dimension%20Dynamic%20Time%20Warping.pdf