These notes summarize only the parts I needed from the Dialog Systems and Chatbots chapter of Dan Jurafsky's Speech and Language Processing.

All figures used in these notes are taken from Speech and Language Processing.

Speech and Language Processing is available at the link below.

Ch.26 Dialog Systems and Chatbots

conversational agents, or dialogue systems

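Programs that communicate with users in natural language (text, speech, or both).

Two categories

Task-oriented dialogue agents
- Help the user complete a task (finding a restaurant, making a reservation, controlling appliances, etc.).
Chatbots
- Systems designed for extended conversations.
- Set up to mimic the unstructured conversations or 'chats' characteristic of human-human interaction.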

26.1 Properties of Human Conversation

Conversation between humans is an intricate and complex joint activity.

Turns

A dialogue is a sequence of turns, each a single contribution to the dialogue.
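A turn can range from a single word to several sentences.
end pointing or endpoint detection
- The system has to know when to stop talking and when to start talking.
- People can (most of the time) detect when the other person is about to finish talking, so they begin their own turn almost immediately after the other person finishes. Spoken dialogue systems must likewise detect whether the user has finished speaking, so that they can process the utterance and respond.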
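
As a rough illustration of endpoint detection, the sketch below treats the user's turn as finished once some speech has been heard and is then followed by a run of consecutive low-energy frames. The function name, the energy threshold, and the frame count are all assumptions made for this example, not something from the book; real systems generally rely on trained classifiers rather than a fixed threshold.

```python
from typing import Optional, Sequence


def detect_endpoint(
    frame_energies: Sequence[float],
    silence_threshold: float = 0.1,  # assumed: energy below this counts as silence
    min_silence_frames: int = 3,     # assumed: this many silent frames end the turn
) -> Optional[int]:
    """Return the index of the first frame of the closing silence, or None.

    The turn only counts as finished after speech has been heard and is then
    followed by `min_silence_frames` consecutive silent frames.
    """
    heard_speech = False
    silent_run = 0
    for i, energy in enumerate(frame_energies):
        if energy >= silence_threshold:
            heard_speech = True
            silent_run = 0  # a short pause inside the turn is forgiven
        elif heard_speech:
            silent_run += 1
            if silent_run == min_silence_frames:
                return i - min_silence_frames + 1
    return None


# Made-up per-frame energies: leading silence, speech with one short pause,
# then a long enough silence to end the turn.
energies = [0.0, 0.0, 0.6, 0.8, 0.05, 0.7, 0.02, 0.01, 0.03, 0.0]
endpoint = detect_endpoint(energies)
print(endpoint)
```

The single quiet frame in the middle of the speech does not end the turn, because the silent run resets as soon as speech resumes.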

Speech Acts

The key insight about conversation is that each utterance in a dialogue is a kind of action being performed by the speaker.
These actions are commonly called speech acts or dialogue acts.
The speech act expresses an important component of the intention of the speaker (or writer) in saying what they said.

Four types (Bach and Harnish, 1979)

Constatives (statements)
- Commit the speaker to something's being the case.
- answering, claiming, confirming, denying, disagreeing, stating
Directives (instructions)
- Attempts by the speaker to get the addressee to do something.
- For example, making a direct request, or asking a question to elicit an answer.
- advising, asking, forbidding, inviting, ordering, requesting
Commissives
- Commit the speaker to some future course of action.
- promising, planning, vowing, betting, opposing
Acknowledgements
- Express the speaker's attitude regarding the hearer with respect to some social action.
- apologizing, greeting, thanking, accepting an acknowledgment

Grounding

Dialogue

Not a series of independent speech acts, but a collective act performed by the speaker and the hearer.

common ground

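The set of things that all participants have established they agree on.
Speakers establish it by grounding each other's utterances.

Grounding

Acknowledging that the hearer has understood the speaker, like an ACK in data communications.
People need grounding for non-linguistic actions as well.
- When an elevator button lights up after being pressed, that acknowledges that the elevator has indeed been called.
Humans constantly ground each other's utterances.
Ways to ground include explicitly saying "OK", repeating what the other person said, or starting the next turn with "and".
Grounding an utterance ⇒ signals that it was successfully understood.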

Subdialogues and Dialogue Structure

the local structure between speech acts discussed in the field of conversational analysis

adjacency pairs

Composed of a first pair part and a second pair part.
QUESTIONS set up an expectation for an ANSWER.
PROPOSALS are followed by ACCEPTANCE or REJECTION.
These expectations help the system decide what action to take.
However, the two parts of a pair do not always follow each other immediately (other material can come in between).
In that case the two parts are separated by a side sequence or subdialogue.
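
One way to picture adjacency pairs with an intervening side sequence is a stack of open first pair parts: a new first pair part is pushed, and a matching second pair part closes the most recent open one. The class name, the act labels used as strings, and the example utterances in the comments are assumptions for illustration only.

```python
from typing import Dict, List, Optional, Set

# Assumed, simplified mapping from a first pair part to its valid second pair parts.
EXPECTED_SECOND_PARTS: Dict[str, Set[str]] = {
    "QUESTION": {"ANSWER"},
    "PROPOSAL": {"ACCEPTANCE", "REJECTION"},
}


class AdjacencyPairTracker:
    """Tracks open first pair parts so side sequences can be nested."""

    def __init__(self) -> None:
        self.pending: List[str] = []  # stack of first pair parts still awaiting a reply

    def observe(self, act: str) -> Optional[str]:
        """Record a dialogue act; return the first pair part it closes, if any."""
        if act in EXPECTED_SECOND_PARTS:
            self.pending.append(act)
            return None
        if self.pending and act in EXPECTED_SECOND_PARTS[self.pending[-1]]:
            return self.pending.pop()
        return None  # acts that match nothing are ignored in this sketch

    def expected_next(self) -> Set[str]:
        """Second pair parts expected for the most recent open first pair part."""
        if not self.pending:
            return set()
        return EXPECTED_SECOND_PARTS[self.pending[-1]]


tracker = AdjacencyPairTracker()
tracker.observe("PROPOSAL")                    # e.g. "Shall I book the 9am flight?"
tracker.observe("QUESTION")                    # side sequence: "Is it a direct flight?"
after_side_question = tracker.expected_next()
closed_first = tracker.observe("ANSWER")       # closes the side sequence
closed_second = tracker.observe("ACCEPTANCE")  # closes the original proposal
print(after_side_question, closed_first, closed_second)
```

The expectation returned by `expected_next` is what a system could use to decide how to interpret or respond to the next turn.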

Initiative

conversational initiative

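Who controls the conversation.
ex: a reporter in an interview ⇒ the conversation is completely controlled by one participant (the reporter)
In ordinary conversation, however, the initiative shifts back and forth between the participants. ⇒ mixed initiative

mixed initiative

Normal in ordinary conversation, but very difficult for dialogue systems to achieve.
It’s much easier to design dialogue systems to be passive responders.
user-initiative systems → the user issues a query and the system answers.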

Inference and Implicature

Inference is also important in dialogue understanding.

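conversational implicature

Example
- System: What day in May did you want to travel?
- User: I need to be there for a meeting that's from the 12th to the 15th.
- The client does not actually answer the agent's question.
- The client merely mentions a meeting at a certain time.
In this way, the speaker expects the hearer to draw certain inferences.
That is, the utterance conveys more information than the uttered words themselves.

Implicature

particular class of licensed inferences
Hearers are able to draw these inferences thanks to a set of maxims (general heuristics).

relevance

maxim of relevance
Speakers do not just produce random speech acts; they try to be relevant.
The hearer always asks what relevance the other person's utterance has (why did they say that?).
So in the example above, from the user's statement about attending a meeting from the 12th to the 15th, the system should infer that the user needs to fly on the 11th so as to be there by the 12th.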

Subtle characteristics of human conversations

turns, speech acts, grounding, dialogue structure, initiative, and implicature
These are among the reasons it is difficult to build dialogue systems that can carry on natural conversations with humans.
Many of these challenges are active areas of dialogue systems research.

26.3 GUS: Simple Frame-based Dialogue Systems

GUS architecture for task-based dialogue.

All modern task-based dialogue systems (simple GUS or sophisticated dialogue state architectures) are based around frames.

Frame

A knowledge structure representing the kinds of intentions the system can extract from user sentences.
Consists of a collection of slots, each of which can take a set of possible values.
set of frames == domain ontology

slots in task-based dialogue frame

Slots specify what the system needs to know.
The filler of each slot is constrained to values of a particular semantic type.
ex) in the travel domain, slots might be of type CITY, DATE, AIRLINE, or TIME
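
A frame can be pictured as a small typed record. The sketch below is one possible representation, not the book's implementation: the class names, the slot names (ORIGIN, DESTINATION, DEPARTURE_DATE, AIRLINE), and the idea of passing the semantic type alongside each value are assumptions chosen to show how a filler is constrained by its slot's type.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Slot:
    name: str
    semantic_type: str            # e.g. CITY, DATE, AIRLINE, TIME
    value: Optional[str] = None   # None until the slot is filled


@dataclass
class Frame:
    name: str
    slots: Dict[str, Slot] = field(default_factory=dict)

    def add_slot(self, name: str, semantic_type: str) -> None:
        self.slots[name] = Slot(name, semantic_type)

    def fill(self, slot_name: str, value: str, semantic_type: str) -> None:
        """Fill a slot, rejecting values whose semantic type does not match."""
        slot = self.slots[slot_name]
        if slot.semantic_type != semantic_type:
            raise ValueError(
                f"{slot_name} expects {slot.semantic_type}, got {semantic_type}"
            )
        slot.value = value

    def unfilled(self) -> List[str]:
        """Names of slots the system still needs to ask about."""
        return [name for name, slot in self.slots.items() if slot.value is None]


# Hypothetical flight frame for the travel domain.
flight_frame = Frame("FLIGHT")
for slot_name, slot_type in [
    ("ORIGIN", "CITY"),
    ("DESTINATION", "CITY"),
    ("DEPARTURE_DATE", "DATE"),
    ("AIRLINE", "AIRLINE"),
]:
    flight_frame.add_slot(slot_name, slot_type)

flight_frame.fill("DESTINATION", "San Francisco", "CITY")
print(flight_frame.unfilled())
```

In a real system the semantic type of a value would come from the NLU component rather than being passed in by hand.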

26.3.1 Control structure for frame-based dialogue

The control architecture for frame-based dialogue systems

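Designed around the frame.
ex: Apple’s Siri, Amazon’s Alexa, and the Google Assistant
Goal: fill the slots in the frame with the fillers the user intends, then perform the relevant action for the user (answering a question, booking, ..)
To fill the slots, the system asks the user questions.
That is, rather than filling slots from free-form conversation, the system asks questions aimed at filling slots and fills them from the answers.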
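
The control structure described above can be sketched as a loop that asks the question attached to each empty slot until every slot has a filler. This is a deliberately minimal sketch: the slot names, the question wording, and the `ask` callback are assumptions, and each non-empty reply is stored verbatim instead of being parsed.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical slots, each paired with the question that elicits its filler.
SLOT_QUESTIONS: List[Tuple[str, str]] = [
    ("ORIGIN", "What city are you leaving from?"),
    ("DESTINATION", "Where are you going?"),
    ("DEPARTURE_DATE", "What day would you like to leave?"),
]


def run_frame_dialogue(ask: Callable[[str], str]) -> Dict[str, str]:
    """Ask about each unfilled slot until the whole frame is filled.

    `ask` sends a question to the user and returns the reply text.
    """
    frame: Dict[str, str] = {}
    for slot_name, question in SLOT_QUESTIONS:
        while slot_name not in frame:
            reply = ask(question).strip()
            if reply:                # an empty reply means we ask again
                frame[slot_name] = reply
    return frame


# Scripted user for demonstration; a live system could pass `input` instead.
scripted_replies = iter(["Boston", "", "San Francisco", "May 11"])
asked: List[str] = []


def scripted_ask(question: str) -> str:
    asked.append(question)
    return next(scripted_replies)


filled_frame = run_frame_dialogue(scripted_ask)
print(filled_frame)
```

Once the frame is full, the system would go on to perform the action (for example, querying a flight database), which is outside this sketch.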

26.3.2 Natural language understanding for filling slots in GUS

What the NLU component in the frame-based architecture must extract from the user's utterances

1. domain classification

Not needed in single-domain systems, but required in multi-domain dialogue, where it is a 1-of-n classification task.

2. intent determination

What general task or goal is the user trying to accomplish?
Find a Movie, Show Flight, ..

3. slot filling

Extract from the user's utterance the particular slots and fillers that express the user's intent.
example
- input: "Wake me tomorrow at 6"
- output:
  
  DOMAIN: ALARM-CLOCK
  INTENT: SET-ALARM
  TIME: 2017-07-01 0600-0800
Many industrial dialogue systems use supervised machine learning for slot-filling.
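
To make the input/output shape of the example concrete, here is a hand-written rule for the alarm utterance. It is only a toy: the regular expression, the function name, and the reference date of 2017-06-30 (picked so the date matches the example) are assumptions, bare hours are read as 24-hour clock values, and the output is a single time rather than the "0600-0800" range shown above. It is also rule-based, whereas the sentence above notes that many industrial systems learn slot filling from labeled data.

```python
import re
from datetime import date, timedelta
from typing import Dict, Optional

# Assumed pattern covering only utterances like "Wake me (up) tomorrow at 6".
WAKE_PATTERN = re.compile(
    r"\bwake me\b(?: up)?\s+(?P<day>today|tomorrow)\s+at\s+(?P<hour>\d{1,2})\b",
    re.IGNORECASE,
)


def parse_alarm_request(utterance: str, today: date) -> Optional[Dict[str, str]]:
    """Return domain, intent, and a TIME filler, or None if the rule does not match."""
    match = WAKE_PATTERN.search(utterance)
    if match is None:
        return None
    hour = int(match.group("hour"))
    if not 0 <= hour <= 23:
        return None
    day_offset = 1 if match.group("day").lower() == "tomorrow" else 0
    alarm_date = today + timedelta(days=day_offset)
    return {
        "DOMAIN": "ALARM-CLOCK",
        "INTENT": "SET-ALARM",
        "TIME": f"{alarm_date.isoformat()} {hour:02d}00",
    }


result = parse_alarm_request("Wake me tomorrow at 6", today=date(2017, 6, 30))
print(result)
```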

26.4 The Dialogue-State Architecture

Task-based dialogue systems build on a version of the frame-based architecture called the dialogue-state or belief-state architecture.

components

NLU component

to extract slot fillers from the user’s utterance
Uses machine learning rather than rules.

dialogue state tracker

maintains the current state of the dialogue
- Includes the user's most recent dialogue act, plus the entire set of slot-filler constraints the user has expressed so far.

dialogue policy

Decides what the system should do or say next.
ex: asking a question about an empty slot.

natural language generation

Generates the response.
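
The way the components hand data to one another can be sketched as three small functions: a state tracker that merges new slot fillers into the dialogue state, a policy that picks the next system action, and a template-based generator. Everything here is an assumption for illustration (the names, the REQUEST/CONFIRM action labels, the slot list, the templates), and the NLU output is supplied by hand instead of being predicted by a model.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

REQUIRED_SLOTS = ("ORIGIN", "DESTINATION", "DEPARTURE_DATE")  # hypothetical


@dataclass
class DialogueState:
    last_user_act: Optional[str] = None                   # most recent user dialogue act
    slots: Dict[str, str] = field(default_factory=dict)   # all fillers expressed so far


def track_state(
    state: DialogueState, user_act: str, new_slots: Dict[str, str]
) -> DialogueState:
    """Dialogue state tracker: fold the latest NLU output into the state."""
    merged = dict(state.slots)
    merged.update(new_slots)
    return DialogueState(last_user_act=user_act, slots=merged)


def choose_action(state: DialogueState) -> Tuple[str, Optional[str]]:
    """Dialogue policy: request the first empty slot, else confirm."""
    for slot in REQUIRED_SLOTS:
        if slot not in state.slots:
            return ("REQUEST", slot)
    return ("CONFIRM", None)


def generate(action: Tuple[str, Optional[str]]) -> str:
    """Natural language generation from fixed templates."""
    act, slot = action
    if act == "REQUEST" and slot is not None:
        return f"Please tell me the {slot.lower().replace('_', ' ')}."
    return "Shall I go ahead with these details?"


state = DialogueState()
state = track_state(state, "INFORM", {"DESTINATION": "San Francisco"})
first_action = choose_action(state)
first_response = generate(first_action)

state = track_state(state, "INFORM", {"ORIGIN": "Boston", "DEPARTURE_DATE": "May 11"})
second_action = choose_action(state)
print(first_response, second_action)
```

Keeping the state immutable per turn (a new `DialogueState` each time) is a design choice of this sketch that makes each turn easy to inspect.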

26.4.1 Dialogue Acts

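Dialogue acts combine speech acts and grounding into a single representation that captures the interactive function of a turn or sentence.
The tagset is defined for each particular task.

The examples in the textbook show slot fillers (the content of the dialogue) being conveyed.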

26.4.2 Slot Filling

A special case of the task of supervised semantic parsing.

simple method

Train a sequence model to map from input word representations to slot fillers, domain, and intent (BIO tagging).
By using the final position of the sentence (the end-of-sentence token in the textbook's figure), domain classification and the like can be done at the same time.
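
The decoding half of BIO tagging, turning per-token tags back into slot fillers, can be sketched as follows. The tags are assumed to come from an already trained sequence model, which is not shown, and the label names DES and DEPTIME as well as the function name are illustrative assumptions.

```python
from typing import Dict, List, Optional


def bio_to_slots(tokens: List[str], tags: List[str]) -> Dict[str, str]:
    """Collect B-/I- tagged token spans into slot fillers.

    A "B-X" tag starts a new span for slot X, "I-X" continues the open X span,
    and "O" (or an I- tag with no matching open span) closes any open span.
    If a slot label occurs in two spans, the later span replaces the earlier one.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    spans: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            current = tag[2:]
            spans[current] = [token]
        elif tag.startswith("I-") and current == tag[2:]:
            spans[current].append(token)
        else:
            current = None
    return {name: " ".join(words) for name, words in spans.items()}


tokens = "I want to fly to San Francisco on Monday afternoon please".split()
tags = ["O", "O", "O", "O", "O", "B-DES", "I-DES", "O",
        "B-DEPTIME", "I-DEPTIME", "O"]
slots = bio_to_slots(tokens, tags)
print(slots)
```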

26.7 Summary

Conversational agents are crucial speech and language processing applications that are already widely used commercially.

In human dialogue

speaking is a kind of action.
- These acts are called speech acts or dialogue acts.
Speakers try to achieve common ground by acknowledging that they have understood each other.

Chatbots

Conversational agents designed to mimic informal human conversation.

Task-based dialogue

dialogue systems use the GUS or frame-based architecture
The designer must specify the frames (sets of slots) that the system has to fill from the user's answers in order to perform the task.
