ChAff

Summary

Imagine a robot clumsily interrupts a meeting, and the speaker responds by angrily chastising it. Further imagine that the robot could react by apologizing and changing its behavior. To realize this scenario, the goal of the ChAff project is to design an FPGA-based system that classifies speech in real time according to prosodic information.

Our approach builds on existing prosody-related features. By simulating real-time speech analysis, we identify algorithms that are practical for real-time use. Following simulation, register-transfer-level (RTL) representations of the prosody classifiers are synthesized and run on an FPGA.

Currently, the system computes real-time estimates of speaking rate (syllables per second), pitch (fundamental frequency), and loudness (in dB). Future work centers on classifying the resulting trajectories in rate-pitch-loudness space.
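As a rough illustration of two of these per-frame estimates, the sketch below computes loudness as RMS level in dB and pitch from the peak of the autocorrelation function. It is a minimal software sketch under stated assumptions, not the ChAff implementation: the function names, the 75 to 400 Hz search range, the 8 kHz sample rate, and the 50 ms frame are all chosen here for illustration, and the speaking-rate estimate is omitted.

```python
import math


def rms_db(frame, eps=1e-12):
    """Loudness of one frame in dB relative to full scale (amplitude 1.0).

    eps (an assumed small constant) avoids log10(0) on silent frames.
    """
    mean_square = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(mean_square + eps)


def autocorr_pitch(frame, sample_rate, f_min=75.0, f_max=400.0):
    """Estimate fundamental frequency in Hz from the autocorrelation peak.

    Only lags corresponding to f_min..f_max (assumed search range) are
    examined. Returns 0.0 if no lag has positive correlation.
    """
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0


# Synthetic test frame (hypothetical values): 50 ms of a 200 Hz tone
# with amplitude 0.5, sampled at 8 kHz.
SAMPLE_RATE = 8000
frame = [0.5 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(400)]

pitch_hz = autocorr_pitch(frame, SAMPLE_RATE)
loudness_db = rms_db(frame)
print(f"pitch = {pitch_hz:.1f} Hz, loudness = {loudness_db:.2f} dB")
```

A hardware version would more likely use fixed-point arithmetic and a streaming formulation; the floating-point loops here only show what is being estimated for each frame.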

Implementation of the enrate speaking-rate measure in Simulink.
Xilinx Virtex-4 development board.

Reference

  1. Reynolds, C., Ishikawa, M. and Tsujino, H. (2006) Realizing Affect in Speech Classification in Real-Time. Aurally Informed Performance: Integrating Machine Listening and Auditory Presentation in Robotic Systems, In conjunction with AAAI Fall Symposia, October 13-15, 2006, Washington, D.C., USA.
Ishikawa Watanabe Laboratory, Department of Information Physics and Computing, Department of Creative Informatics,
Graduate School of Information Science and Technology, University of Tokyo
Copyright © 2008 Ishikawa Watanabe Laboratory. All rights reserved.