Repository at Hanyang University: 등장 인물의 얼굴 인식 기반 비디오 인덱싱 시스템

Title: 등장 인물의 얼굴 인식 기반 비디오 인덱싱 시스템

Other Titles: Video Indexing System Based on Face Recognition of Characters

Author: 김형준

Alternative Author(s): Kim, Hyoung-Joon

Advisor(s): 김회율

Issue Date: 2008-02

Publisher: Hanyang University

Degree: Doctor

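Abstract: This dissertation proposes a video indexing system based on analysis of the intervals in which characters appear. Without any prior information about the video, the system uses only the face images detected in the video to automatically generate a list of characters, and it analyzes and displays the appearance intervals of each. This lets viewers quickly and easily find scenes in which a particular character appears. The character-based video indexing system relies on face/eye detection and face recognition. For real-time face detection in video, the face detection method proposed by Viola and Jones is used. To normalize detected faces based on eye positions, an eye detection method using Zernike moments and a support vector machine (SVM) is proposed: eye and non-eye images are represented by Zernike moments and classified by the SVM. Because the proposed eye detection method exploits the rotation invariance of Zernike moments, it can also detect eyes in rotated faces, which was confirmed through experiments on ORL images and images captured from video. For face recognition, a method using DCT and LDA is proposed. From the DCT coefficients extracted from the input face image, the method selects only those with high discriminative power, that is, those that separate different classes well, and applies LDA to them. In addition, a method using edge information and facial components is proposed for recognition that is robust to changes in pose, facial expression, and illumination. Experiments on the MPEG-7 face dataset confirmed that it outperforms existing methods. Building on these processes, the video indexing system generates the character list by clustering the detected faces. The faces detected in a given video are merged into clusters of similar face images by competitive agglomeration clustering. At this stage, face images of different people may end up in the same cluster. The clusters are therefore refined using the DCT/LDA face recognition method and a leave-one-out test, which moves wrongly clustered face images into the appropriate cluster. Once the list of characters appearing in the video has been generated through this clustering process, video indexing is carried out by analyzing each character's appearance intervals. Applying the proposed system to real broadcast videos confirmed that the character list and each character's appearance intervals were analyzed correctly and that character-based video indexing performed well. The implemented system can serve as a large-scale video management system for broadcast multimedia providers and can be used in a variety of applications for general users, such as PVRs and portable multimedia players.; This dissertation proposes a video indexing system based on characters' appearances. Using face information without any a priori knowledge, the proposed system automatically generates a list of characters and analyzes the appearance intervals of each character. The indexing result is displayed on a timeline so that a user can efficiently browse the video and play shots containing a specific character of interest. In a character-based video indexing system, face/eye detection and face recognition play a key role.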
More specifically, the system first has to detect the image frames that contain faces of characters in the video; the face region is then segmented from each frame and normalized before recognition. Viola and Jones's method was adopted for real-time face detection in videos. To normalize the detected faces based on eye positions, an eye detection method using the support vector machine (SVM) with Zernike moments is proposed. Eye/non-eye patterns are represented in terms of the magnitudes of Zernike moments and then classified by the SVM. Because the magnitudes of Zernike moments are rotation invariant, the eye detection method is robust against rotation, which is demonstrated using rotated images from the ORL database as well as frame images captured from TV drama videos. To recognize faces, a face recognition method based on the discrete cosine transform (DCT) and linear discriminant analysis (LDA) with a novel feature selection technique is proposed. Since not all DCT coefficients contribute to classifying faces, we propose a method of selecting the optimum coefficients, those that yield the highest recognition rate. LDA is then applied to the selected coefficients. Moreover, an edge image and facial components are utilized to improve recognition performance under face variations. Experiments on the MPEG-7 face dataset show improved recognition accuracy compared with traditional methods. Based on the processes above, the video indexing system generates a list of characters by clustering the detected faces. After faces are detected and tracked in the given video, similar faces are merged using the competitive agglomeration clustering algorithm. Here, some faces may be misclassified: faces of different characters may end up in the same cluster.
To correct misclassified faces, cluster refinement based on the leave-one-out test with the DCT/LDA-based face recognition method was performed to move misclassified faces into the correct cluster. Once the character list is generated, the video is annotated with the time information of each character's appearance. The experiments demonstrate that the proposed video indexing system worked well on several broadcast videos. The system can be applied to various applications, including video management systems for broadcast multimedia suppliers, and personal video recorders and portable multimedia players for general consumers.
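The final step described in the abstract, annotating the video with each character's appearance times, can be illustrated with a minimal sketch. This is not the dissertation's implementation. It assumes that every detected face has already been assigned to a character cluster and carries a timestamp. The function names `appearance_intervals` and `who_appears_at` and the `max_gap` tolerance are hypothetical and introduced only for illustration.

```python
from collections import defaultdict


def appearance_intervals(detections, max_gap=1.0):
    """Merge per-face timestamps into appearance intervals per character.

    detections: iterable of (timestamp_seconds, character_id) pairs, one per
                detected face that has already been assigned to a cluster.
    max_gap:    hypothetical tolerance in seconds; detections of the same
                character no further apart than this count as one appearance.
    """
    times = defaultdict(list)
    for t, character in detections:
        times[character].append(t)

    intervals = {}
    for character, ts in times.items():
        ts.sort()
        merged = [[ts[0], ts[0]]]
        for t in ts[1:]:
            if t - merged[-1][1] <= max_gap:
                merged[-1][1] = t      # extend the current interval
            else:
                merged.append([t, t])  # start a new interval
        intervals[character] = [tuple(iv) for iv in merged]
    return intervals


def who_appears_at(intervals, t):
    """Return the sorted list of characters whose intervals contain time t."""
    return sorted(
        character
        for character, ivs in intervals.items()
        if any(start <= t <= end for start, end in ivs)
    )


# Illustrative input only: made-up timestamps and character labels.
detections = [
    (0.0, "A"), (0.5, "A"), (1.0, "A"), (5.0, "A"), (5.5, "A"),
    (0.5, "B"), (6.0, "B"),
]
index = appearance_intervals(detections, max_gap=1.0)
print(index["A"])
print(who_appears_at(index, 0.5))
```

Storing intervals per character keeps the index small and maps directly onto a timeline display. A lookup such as `who_appears_at` then answers which characters are on screen at a given playback position.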

URI: https://repository.hanyang.ac.kr/handle/20.500.11754/147346 http://hanyang.dcollection.net/common/orgView/200000408463

Appears in Collections: GRADUATE SCHOOL[S](대학원) > DEPARTMENT OF ELECTRICAL & COMPUTER ENGINEERING(전자통신전파공학과) > Theses (Ph.D.)
