February 14, 2022


Although Seq2Seq models have led to significant advances in the automated learning of certain aspects of conversation, they remain limited in several respects. Generally, the generated responses are syntactically well formed, but they fail to take the overall context into account. They may also be bland, uninformative, and lacking in emotion, or they may be semantically inconsistent. There have been various efforts to address these deficiencies. However, context in this sense applies only to the immediate history of the current utterance, whereas the context of previous conversations may also be relevant. Other types of context include the physical environment in which the conversation is taking place, as well as shared knowledge between the participants about entities, relationships, and events in the real world external to the conversation.


Semantic inconsistency arises when the model produces an utterance that contradicts a previous utterance. This problem has been addressed by incorporating a persona-based model that captures individual characteristics such as background information, language behavior, and interaction style. Such models have been found to outperform baseline Seq2Seq systems in terms of BLEU scores, perplexity, and human evaluations. One obstacle to creating persona-based models is the lack of speaker-specific conversational data for training. There has been a long tradition of user modeling within the dialogue systems community in which the user model is represented explicitly, often in a logic-based framework. Work within the Seq2Seq approach, in contrast, trains persona vectors from conversational data and embeds them directly into the decoder.
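The idea of embedding a persona vector directly into the decoder can be sketched as follows. This is a minimal, hypothetical illustration (not any specific published model): a learned per-speaker vector is concatenated with the previous word's embedding at every decoder step, so the same recurrent weights produce speaker-conditioned next-word distributions. All parameter names and sizes here are illustrative assumptions, with random initialization standing in for trained values.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB, EMBED, PERSONA, HIDDEN = 50, 8, 4, 16

# Hypothetical parameters: word embeddings, one persona vector per speaker,
# recurrent weights, and an output projection (randomly initialized here,
# where a real system would learn them from conversational data).
word_emb = rng.normal(size=(VOCAB, EMBED))
persona_emb = rng.normal(size=(3, PERSONA))            # 3 speakers
W_h = rng.normal(size=(HIDDEN, HIDDEN)) * 0.1
W_x = rng.normal(size=(HIDDEN, EMBED + PERSONA)) * 0.1
W_out = rng.normal(size=(VOCAB, HIDDEN)) * 0.1

def decode_step(prev_word, speaker, h):
    """One decoder step: the input is the previous word's embedding
    concatenated with the speaker's persona vector, so the predicted
    next-word distribution depends on who is speaking."""
    x = np.concatenate([word_emb[prev_word], persona_emb[speaker]])
    h = np.tanh(W_h @ h + W_x @ x)
    logits = W_out @ h
    probs = np.exp(logits - logits.max())              # softmax over the vocabulary
    return probs / probs.sum(), h

h = np.zeros(HIDDEN)
probs, h = decode_step(prev_word=1, speaker=0, h=h)
```

Because the persona vector enters at every time step, two speakers given the same conversational history will in general yield different next-word distributions, which is the mechanism that discourages self-contradictory answers to persona-related questions.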


There is an extensive literature on the importance of affect in human communication and on how to endow conversational agents with the ability to recognize and display emotions. It has been shown that endowing conversational agents with emotional characteristics can enhance user satisfaction and lead to fewer conversational breakdowns. Recently, researchers in neural dialogue have started to explore how to integrate information about affect into their models. One approach predicts the next word in the output conditioned not only on the previous words but also on an affective category that encodes the emotional content of the words. In this way the model is able to generate expressive text with varying degrees of emotional strength.
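The affect-conditioned prediction described above can be illustrated with a minimal sketch. This is an assumed, simplified formulation (not a specific published model): the next-word logits combine the decoder's hidden state with a learned bias vector for the chosen affect category, and a scalar `beta` scales that bias to control emotional strength. All names and shapes are illustrative, with random values standing in for trained parameters.

```python
import numpy as np

rng = np.random.default_rng(1)
VOCAB, HIDDEN, AFFECTS = 50, 16, 4

# Hypothetical parameters: an output projection from the decoder state,
# plus one vocabulary-sized bias vector per affect category (e.g. joy,
# anger, sadness, neutral), randomly initialized for this sketch.
W_out = rng.normal(size=(VOCAB, HIDDEN)) * 0.1
affect_bias = rng.normal(size=(AFFECTS, VOCAB))

def next_word_probs(h, affect, beta):
    """Next-word distribution from the decoder state plus a scaled
    affective bias; beta controls the degree of emotional strength."""
    logits = W_out @ h + beta * affect_bias[affect]
    e = np.exp(logits - logits.max())                  # softmax
    return e / e.sum()

h = rng.normal(size=HIDDEN)
neutral = next_word_probs(h, affect=2, beta=0.0)       # beta=0 ignores affect
strong = next_word_probs(h, affect=2, beta=2.0)        # larger beta, stronger emotion
```

Setting `beta` to zero recovers an ordinary unconditioned decoder, while increasing it shifts probability mass toward words associated with the chosen affect category, which is one simple way to realize "various degrees of emotional strength" at generation time.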















