Natural-language question answering refers to computer systems that
accept questions phrased in a natural human language and return good
answers. The systems discussed here take written natural-language
questions; speech understanding is not covered.
There are two main methods of designing natural-language question-answering
systems. The AI method uses artificial intelligence and linguistic
methods to analyze a question and build an "understanding" of it in
the computer. The template method matches questions
against question templates produced by humans. Both methods can
produce very good answers which give an impression of "computer
intelligence" to the user.
The AI method requires complex and advanced linguistic analysis
programs.
The template method requires careful human design of the templates
for each question. The intelligence, for template-based methods,
lies in the minds of the humans who write the templates.
The templates can either be specific templates for single questions,
or general templates for a group of questions. For example, the
questions "What is the population of Sweden?" and "What is the
population of Italy?" might be answered by the same template, which
is used to access a data base.
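The idea of one general template serving a group of questions can be
sketched as follows. This is only an illustration, not the design of
any system mentioned here: the regular expression, the names
POPULATION_DB, POPULATION_TEMPLATE and answer, and the placeholder
answer strings are all assumptions made for the example.

```python
import re

# Hypothetical stand-in for the data base; the answers are placeholders,
# not real figures.
POPULATION_DB = {
    "sweden": "<population figure for Sweden>",
    "italy": "<population figure for Italy>",
}

# One general template: the country name is a slot filled from the question.
POPULATION_TEMPLATE = re.compile(
    r"what is the population of (?P<country>[a-z ]+?)\s*\?*$",
    re.IGNORECASE,
)


def answer(question):
    """Return the stored answer for a population question, or None."""
    match = POPULATION_TEMPLATE.search(question.strip())
    if match is None:
        return None  # the question does not fit this template
    # Use the captured country name as the key into the data base.
    return POPULATION_DB.get(match.group("country").lower())
```

Both example questions pass through the same template; only the
captured country name, and therefore the data base key, differs.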
Both methods require careful testing with users, adjustment, and
new user testing, before a system which gives good user satisfaction
can be achieved. The best-known template-based natural-language
question-answering system is Ask Jeeves [http://www.ask.com], which
is a large commercial system with answers to hundreds of thousands
of questions. Since it is a commercial service, details about its
design are not public.
FAQ Search Systems
A common use of natural-language question answering is to search
data bases of answers to Frequently Asked Questions (FAQs).
Our System
Eriks Sneiders has constructed a template-based natural-language
question-answering system. You can test the system on a data base
of answers about HTML at [http://dsv.su.se/html/].
A template must match many different variants of the same question.
For example, "What is the population of Sweden", "What is the number
of people in Sweden" and "How many people live in Sweden" are just
three of the many variants which should return the same answer.
A simple template for this question in his system might be specified
as:
popula* [number many much # people* person* inhabitant* human*]
; Swede* Sverige* Schwede* SuÈde*
This template means:
1. A question must contain a word or phrase matching the text before
   the ";" and a word matching the text after the ";", as described
   by items 2 and 3 below.
2. The matching word after the ";" is any word beginning with
   "Swede", "Sverige", "Schwede" or "SuÈde".
3. The matching phrase before the ";" can be either
   - any word beginning with "popula", or
   - first any of the words "number", "many" and "much", and then
     any word beginning with "people", "person", "inhabitant" or
     "human". The "#" indicates that other words are allowed to
     intervene between the words before and after the "#".
In this way, arbitrarily complex templates can be constructed:
"[...]" phrases can be nested inside each other to arbitrary depth.
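The matching rules for the example template can be sketched in a few
lines of Python. This is a hand translation of that single template,
not a parser for the template syntax and not Sneiders's
implementation. The function names are hypothetical, and the sketch
assumes that matching is case-insensitive and that the order of the
parts before and after the ";" does not matter, neither of which is
stated above.

```python
import re


def words(text):
    """Lowercase the question and split it into words."""
    return re.findall(r"\w+", text.lower())


def has_prefix(tokens, prefixes):
    """True if any token begins with one of the prefixes ("popula*" style)."""
    return any(tok.startswith(p) for tok in tokens for p in prefixes)


def has_phrase(tokens, first_words, second_prefixes):
    """True if a word from first_words is followed, possibly after other
    intervening words (the "#"), by a word with one of second_prefixes."""
    for i, tok in enumerate(tokens):
        if tok in first_words and has_prefix(tokens[i + 1:], second_prefixes):
            return True
    return False


def matches_population_of_sweden(question):
    """Hand-coded equivalent of the example template."""
    tokens = words(question)
    # Part before the ";": "popula*" or the bracketed phrase.
    topic = has_prefix(tokens, ["popula"]) or has_phrase(
        tokens,
        {"number", "many", "much"},
        ["people", "person", "inhabitant", "human"],
    )
    # Part after the ";": any of the country name prefixes, lowercased.
    country = has_prefix(tokens, ["swede", "sverige", "schwede", "suède"])
    return topic and country
```

Under these assumptions, all three question variants quoted earlier
match, while a question about another country or another topic does
not.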
Note:
This technology is only used in the German, English and Swedish Web4Health web sites, and partially in the Italian web site.