IJCVSP: Vol. 10 1001

Multi-Level Feature and Context Pyramid Network for Object Detection

Xia Wang, Inner Mongolia University, China
Yingdong Ma, Inner Mongolia University, China
Abstract:
Robust object detection requires both fine details that represent object structure and high-level semantic knowledge extracted from deep feature maps. Contextual information is also important for accurately localizing objects at multiple scales. However, it is difficult to meet these demands simultaneously in a top-down CNN structure. In this work, we present the Multi-Level Feature and Context Pyramid Network (MLFCP Net) to tackle this problem. The proposed MLFCP Net consists of two main modules. To exploit the advantages of features from multiple levels, the Multi-level Feature Fusion (MFF) module combines feature maps from different layers to form enhanced multi-level features. The Context Pooling Aggregation module combines local and global context features to further improve detection accuracy. Our method achieves 84.9% mAP on the PASCAL VOC2007 test set at 16.7 FPS with 320x320 input and 42.5% AP on MS COCO. Experimental results demonstrate the effectiveness of the proposed feature fusion method and the context aggregation scheme.
Full Paper (in PDF)

Lip Reading using Facial Expression Features

Tasuya Shirakata, Kyushu Institute of Technology, Japan
Takeshi Saitoh, Kyushu Institute of Technology, Japan
Abstract:
Lip-reading technology, which estimates speech content from visual data alone without using audio data, is expected to serve as a next-generation interface. It can be used even in situations where audio recognition is difficult, such as in a high-noise environment or for a person with a speech disorder. Conventional methods use only features around the lips. This paper proposes a method that integrates two facial expression features, an expression-based feature and an action unit-based feature, into the lip-reading method. Evaluation experiments are conducted on three public databases: OuluVS, CUAVE, and CENSREC-1-AV. The results confirm that integrating the facial expression features improves recognition accuracy on all three databases.
Full Paper (in PDF)

Data Driven Analysis of the Behaviour of Elderly People Using k-Means and Home Automation and Power Consumption Sensors

Bjorn Friedrich, Carl von Ossietzky University of Oldenburg, Germany
Enno-Edzard Steen, Carl von Ossietzky University of Oldenburg, Germany
Hirohiko Suwa, Nara Institute of Science and Technology, Japan
Andreas Hein, Carl von Ossietzky University of Oldenburg, Germany
Keiichi Yasumoto, Nara Institute of Science and Technology, Japan
Abstract:
Information about the mental and physical condition of elderly people is essential for assessing their ability to live alone in their own homes. Usually, this information is collected using questionnaires and geriatric assessments. However, both methods have their limitations. People might not answer personal questions honestly, and geriatric assessments only measure capacity at a specific point in time. Moreover, questionnaires and assessments are limited in capturing variability. Elderly people who have similar scores in questionnaires and assessments can still differ considerably in their mental and physical condition. To get a differentiated impression of a person's condition, long-term monitoring is needed. In this article we show that the behaviour of elderly people is so distinctive that they can be identified by applying k-Means clustering to a dataset comprising motion sensor data and power consumption sensor data. Moreover, we show that the results for a combined dataset differ from those obtained by considering each participant separately. We applied the algorithm to three participants of a real-world study. Even though two of the participants have similar questionnaire and assessment scores, they can be clearly distinguished from each other as well as from the third participant, who has different scores.
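As a rough illustration of the clustering idea described in this abstract, the following minimal sketch runs plain k-means on made-up daily feature vectors. The feature pairs (motion events per day, energy use in kWh), the sample values, and the helper names are hypothetical and are not taken from the paper. A deterministic farthest-point initialisation is swapped in here only to keep the example reproducible; the abstract does not state how the authors initialised or configured the algorithm.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_point_init(points, k):
    """Deterministic start: first point, then repeatedly the point farthest
    from all centroids chosen so far (an assumption made for this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, iterations=50):
    """Plain Lloyd's k-means on a list of numeric tuples."""
    centroids = farthest_point_init(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical data: three days per participant, (motion events, kWh).
days = [
    (12, 1.0), (14, 1.2), (13, 0.9),   # participant A
    (40, 3.0), (42, 3.3), (41, 2.8),   # participant B
    (25, 6.0), (27, 6.4), (26, 5.8),   # participant C
]
labels, centroids = kmeans(days, k=3)
print(labels)
```

If each participant's days end up sharing one cluster label, the participant can be identified from the sensor data alone. With real data the features would normally be standardised first so that no single sensor dominates the distance.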
Full Paper (in PDF)

Hybrid Text Summarizer for Bangla Document

Mahimul Islam, Ahsanullah University of Science and Technology, Bangladesh
Fariha Nuzhat Majumdar, Ahsanullah University of Science and Technology, Bangladesh
Asadullahhil Galib, Ahsanullah University of Science and Technology, Bangladesh
Md Moinul Hoque, Ahsanullah University of Science and Technology, Bangladesh
Abstract:
Automatic text summarization concisely extracts a small subset of a large text, selecting the sentences that are more significant than the other sentences in that text. Although there have been many approaches to English text summarization, very little work has been done on automatic Bengali text summarization. For evaluation purposes, a dataset was built from scratch using Bengali news documents from two reputed newspapers. The evaluation dataset was divided into four classes, with a benchmark summary text generated for each document by a group of random human contributors. The current work presents a hybrid approach to summarizing Bengali text documents. The hybrid model is introduced with the goal of improving the overall accuracy of summary text generation. The proposed model generates a summary text based on keyword scoring, sentiment analysis, and the interconnection of sentences. Evaluated on the existing dataset, the proposed system achieves an average recall of 0.77, an average precision of 0.57, and an average F-measure of 0.64. Empirical comparison with other similar systems shows that the proposed model can be used as an alternative system for the text summarization problem in Bengali documents.
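For readers unfamiliar with the reported metrics, here is a minimal sketch of how recall, precision, and F-measure can be computed for one extractive summary against one reference summary. It assumes sentence-level overlap between the two, which the abstract does not confirm; the function name `summary_scores` and the sentence identifiers are hypothetical.

```python
def summary_scores(system_sentences, reference_sentences):
    """Return (precision, recall, f_measure) from sentence-level overlap.

    Assumption for this sketch: a system sentence counts as correct only
    if the same sentence also appears in the reference summary.
    """
    system = set(system_sentences)
    reference = set(reference_sentences)
    overlap = len(system & reference)

    precision = overlap / len(system) if system else 0.0
    recall = overlap / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical example: sentences are referred to by identifier.
system_summary = ["s1", "s2", "s3", "s5"]
reference_summary = ["s1", "s2", "s4"]
precision, recall, f_measure = summary_scores(system_summary, reference_summary)
print(precision, recall, f_measure)
```

Here two of the four selected sentences are in the reference, so precision is 2/4, recall is 2/3, and the F-measure is 4/7. Averaging such per-document scores over a dataset yields figures of the kind quoted in the abstract.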
Full Paper (in PDF)

Vol. 10, No. 1