
Natural Language Processing

Introduction to NLP

Communication with People

Communication with Computers

image

  • Making machines understand human language
    • Communication with humans
    • Access the wealth of information about the world
  • Automation of natural languages (NL)
    • Analysis: NL => Representation (R)
    • Generation: R => NL
    • Acquisition of R from knowledge and data

NLP Key Idea

Representation

Human Brain

  • How is the language/knowledge expressed (in the brain)?

image

Computer

  • How is the human language/knowledge expressed (in a computer)?

image

  • How is the meaning of words expressed (in a computer)?

image

Distributional Hypothesis

The meaning of a word is its use in the language [Wittgenstein, PI §43, 1953]
If A and B have almost identical environments we say that they are synonyms [Harris 1954]
You shall know a word by the company it keeps [Firth 1957]

Question) Meaning of a word - Tesgüino
Situation) – No dictionary – Never seen before – We only know that the word is used in the following contexts

  1. A bottle of ____ is on the table.
  2. Everybody likes ____.
  3. Don’t have ____ before you drive.
  4. We make ____ out of corn.

Word        Context 1   Context 2   Context 3   Context 4
Tesgüino    Yes         Yes         Yes         Yes
Loud        No          No          No          No
Tortillas   No          Yes         No          Yes
Wine        Yes         Yes         Yes         No

image

  • Tesgüino is an artisanal corn beer produced by several Yuto-Aztec people.

Q) How is the meaning of words expressed (in a computer)?
A) Each word = one vector

  • Similar words are nearby in space
  • Vectors are learned from the contexts in which words are used in large text data

Word        Context 1   Context 2   Context 3   Context 100,000,000
Tesgüino    Yes         Yes         Yes         0.731
Loud        No          No          No          0.273
Tortillas   No          Yes         No          0.276
Wine        Yes         Yes         Yes         0.836
-           -           -           -
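
To make "similar words are nearby in space" concrete, the four-context Tesgüino table can be read as binary vectors (Yes = 1, No = 0) and compared with cosine similarity. This is only a minimal sketch: the choice of cosine similarity and the function names are assumptions for illustration, and real word vectors are learned from far more contexts.

```python
import math

# One row per word, one column per context sentence (1 = the word fits, 0 = it does not).
context_vectors = {
    "tesgüino":  [1, 1, 1, 1],
    "loud":      [0, 0, 0, 0],
    "tortillas": [0, 1, 0, 1],
    "wine":      [1, 1, 1, 0],
}

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def most_similar(word, vectors):
    """Rank every other word by cosine similarity to `word`, highest first."""
    target = vectors[word]
    scores = {
        other: cosine_similarity(target, vec)
        for other, vec in vectors.items()
        if other != word
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

ranking = most_similar("tesgüino", context_vectors)
for other, score in ranking:
    print(f"{other}: {score:.3f}")
```

With these four contexts, wine ranks closest to tesgüino (about 0.87), tortillas second (about 0.71), and loud last (0.0), which matches the intuition that tesgüino is a drink.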

NLP Applications

Applications

  • Machine Translation
  • Information Retrieval
  • Question Answering
  • Dialogue Systems
  • Information Extraction
  • Summarization
  • Sentiment Analysis

Machine Translation

The task of translating a sentence x from one language (the source language) to a sentence y in another language (the target language)

image

image

Information Retrieval

The task of finding information that people want

image

image

Question Answering

The task of answering a (natural language) question

One of the oldest NLP tasks (punched card systems in 1961)

Idea: run dependency parsing on the question and search for the most similar answer
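
The "search for the most similar answer" half of this idea can be sketched without a parser. The example below swaps dependency parsing for a much cruder signal, word overlap (Jaccard similarity), so it is not the method described above, only an illustration of similarity-based answer selection. The candidate sentences and function names are made up for the example.

```python
import re

def tokens(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))

def jaccard(a, b):
    """Size of the intersection divided by size of the union of two token sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def best_answer(question, candidates):
    """Return the candidate sentence with the highest word overlap with the question."""
    question_tokens = tokens(question)
    return max(candidates, key=lambda c: jaccard(question_tokens, tokens(c)))

# Hypothetical candidate answers, for illustration only.
candidates = [
    "Tesgüino is a corn beer.",
    "Tortillas are made out of corn.",
    "Wine is made from grapes.",
]

answer = best_answer("What is tesgüino?", candidates)
print(answer)
```

A parser-based system would compare syntactic structure rather than raw word sets, which helps when the question and the answer share few surface words.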

image

image

Dialogue Systems

The task of generating responses in order to hold a conversation with a human

Turing test

image

The ability to understand and generate language is treated as a sign of intelligence: “Can machines think?”

Information Extraction

The task of extracting structured information from unstructured documents

image

Named Entity Recognition

  • Find entities in text
  • Classify entities in text
  • Example)

image

image
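
The two steps listed under Named Entity Recognition, finding entity mentions and assigning each one a type, can be illustrated with a simple dictionary lookup. This is a toy sketch: the gazetteer, the label names, and the example sentence are assumptions made for illustration, and practical systems usually rely on trained sequence-labeling models instead.

```python
import re

# Hypothetical gazetteer mapping known entity names to entity types.
GAZETTEER = {
    "Alan Turing": "PERSON",
    "London": "LOCATION",
}

def find_entities(text, gazetteer):
    """Return (start, end, mention, label) for every gazetteer entry found in text."""
    entities = []
    for name, label in gazetteer.items():
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, text):
            entities.append((match.start(), match.end(), name, label))
    return sorted(entities)  # order by position in the text

entities = find_entities("Alan Turing was born in London.", GAZETTEER)
for start, end, mention, label in entities:
    print(f"{mention} [{start}:{end}] -> {label}")
```

A lookup like this cannot handle names it has never seen or names whose type depends on context, which is why the classification step is normally learned from data.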

Summarization

The task of creating a summary that represents the most important or relevant information within the original text
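
One simple way to approximate "most important" is extractive summarization by word frequency: score each sentence by how frequent its words are in the whole text and keep the top-scoring sentences. The notes do not specify a method, so this technique, the function name, and the sample text are assumptions chosen for illustration.

```python
import re
from collections import Counter

def summarize(text, num_sentences=1):
    """Pick the sentences whose words are, on average, most frequent in the text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    frequency = Counter(re.findall(r"\w+", text.lower()))

    def score(sentence):
        words = re.findall(r"\w+", sentence.lower())
        if not words:
            return 0.0
        return sum(frequency[w] for w in words) / len(words)

    top = sorted(sentences, key=score, reverse=True)[:num_sentences]
    # Keep the selected sentences in their original order.
    return [s for s in sentences if s in top]

# Made-up sample text, for illustration only.
text = "Corn beer is made from corn. Everybody likes it. Corn is a crop."
summary = summarize(text, num_sentences=1)
print(summary)
```

This sketch does no stop-word filtering, so common function words inflate the scores; it also only copies existing sentences, whereas abstractive summarization generates new text.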

image

image

Sentiment Analysis

The task of classifying emotions in subjective data

image

  • Fine-grained sentiment analysis
    • Identify a category of sentiment
    • Ex) very positive, positive, neutral, negative, very negative
  • Emotion detection
    • Identify emotions
    • Ex) happiness, frustration, anger, sadness, …
  • Aspect-based sentiment analysis
    • Identify a category of sentiment in terms of aspect
    • Ex) “The CPU is fast. The battery runs fast.”
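
The CPU and battery example suggests why sentiment has to be tied to an aspect: the same opinion word ("fast") can be praise for one aspect and a complaint for another. The sketch below assumes that reading, that is, that the battery sentence is meant as negative, and encodes it in a tiny hand-written lexicon keyed by (aspect, opinion word). The lexicon and function name are hypothetical.

```python
import re

# Hypothetical aspect-specific lexicon: polarity depends on the (aspect, opinion word) pair.
ASPECT_LEXICON = {
    ("cpu", "fast"): "positive",
    ("battery", "fast"): "negative",  # assumed reading: the battery drains quickly
}

def aspect_sentiments(review):
    """Return (aspect, polarity) pairs found sentence by sentence."""
    results = []
    for sentence in re.split(r"(?<=[.!?])\s+", review.strip()):
        words = re.findall(r"\w+", sentence.lower())
        for (aspect, opinion), polarity in ASPECT_LEXICON.items():
            if aspect in words and opinion in words:
                results.append((aspect, polarity))
    return results

result = aspect_sentiments("The CPU is fast. The battery runs fast.")
print(result)
```

A document-level classifier would see the word "fast" twice and could not separate the two judgments; splitting by aspect is what makes the two opposite polarities recoverable.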

image

Challenges in NLP

Learn bias

Lack of Reasoning

Generate rude responses

Why is NLP hard?

  • Ambiguity
  • Expressivity
  • Scale
  • Variation
  • Sparsity
  • Unmodeled variables
  • Unknown representations




References

  • https://www.parentmap.com/article/mind-boggling-new-discoveries-about-the-brain
  • https://dictionary.cambridge.org/dictionary/english/word
  • https://en.wikipedia.org/wiki/Tesg%C3%BCino
