Paper Title

A.I. based Embedded Speech to Text Using Deepspeech

Authors

Firmansyah, Muhammad Hafidh; Paul, Anand; Bhattacharya, Deblina; Urfa, Gul Malik

Abstract

DeepSpeech is useful for developing IoT devices that need voice recognition. It is an open-source speech recognition system from Mozilla that uses a neural network to convert speech spectrograms into text transcripts. This paper shows the implementation process of speech recognition on a low-end computational device. English-language speech recognition, for which many datasets are available, is a good starting point. The models used are the pre-trained models provided with each version of DeepSpeech, without any changes to the released models. Using a Raspberry Pi as an end-to-end speech recognition device has further benefits: the user can change and modify the speech recognition, and DeepSpeech can run as a standalone device that does not need a continuous internet connection to process speech. The paper also shows that TensorFlow Lite can make a significant difference to DeepSpeech inference compared with non-Lite TensorFlow. The experiments use DeepSpeech versions 0.1.0, 0.1.1, and 0.6.0; version 0.6.0 shows some improvement, processing speech-to-text faster on older Raspberry Pi 3 B+ hardware.
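The speed comparison between versions described above could be measured with a small timing harness. The following is a minimal sketch, not the authors' benchmark: it assumes 16 kHz, 16-bit mono WAV input, and `transcribe` is a hypothetical stand-in for whatever wraps a DeepSpeech model's speech-to-text call. It reports wall-clock inference time and the real-time factor (inference time divided by audio duration; values below 1 mean faster than real time).

```python
import io
import time
import wave
from typing import Callable, Tuple


def make_silent_wav(seconds: float, sample_rate: int = 16000) -> bytes:
    """Build an in-memory 16-bit mono WAV of silence (placeholder test input)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)          # mono
        w.setsampwidth(2)          # 16-bit samples
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buf.getvalue()


def wav_duration(wav_bytes: bytes) -> float:
    """Return the duration of a WAV clip in seconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return w.getnframes() / w.getframerate()


def benchmark(
    transcribe: Callable[[bytes], str], wav_bytes: bytes
) -> Tuple[str, float, float]:
    """Time one transcription; return (text, elapsed seconds, real-time factor)."""
    start = time.perf_counter()
    text = transcribe(wav_bytes)
    elapsed = time.perf_counter() - start
    return text, elapsed, elapsed / wav_duration(wav_bytes)


def dummy_transcribe(wav_bytes: bytes) -> str:
    """Hypothetical placeholder; swap in a real model wrapper on the device."""
    return ""


if __name__ == "__main__":
    clip = make_silent_wav(2.0)
    text, elapsed, rtf = benchmark(dummy_transcribe, clip)
    print(f"elapsed={elapsed:.4f}s  real-time factor={rtf:.4f}")
```

Running the same clip through a wrapper for each DeepSpeech version, or for the TensorFlow Lite and non-Lite builds, would give directly comparable real-time factors on a given board.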
