Paper title
Representing 'how you say' with 'what you say': English corpus of focused speech and text reflecting corresponding implications
Authors
Abstract
In speech communication, how something is said (paralinguistic information) is as crucial as what is said (linguistic information). As a type of paralinguistic information, English speech uses sentence stress, the heaviest prominence within a sentence, to convey emphasis. While different placements of sentence stress communicate different emphatic implications, current speech translation systems return the same translations if the utterances are linguistically identical, losing paralinguistic information. Concentrating on focus, a type of emphasis, we propose mapping paralinguistic information into the linguistic domain within the source language using lexical and grammatical devices. This method enables us to translate the paraphrased text representations instead of the transcription of the original speech and obtain translations that preserve paralinguistic information. As a first step, we present the collection of an English corpus containing speech that differed in the placement of focus along with the corresponding text, which was designed to reflect the implied meaning of the speech. Also, analyses of our corpus demonstrated that mapping of focus from the paralinguistic domain into the linguistic domain involved various lexical and grammatical methods. The data and insights from our analysis will further advance research into paralinguistic translation. The corpus will be published via LDC and our website.