How does Web4Health find the answers to questions which people write in the search field on the Web4Health home page? What are the principles of the natural-language question-answering system used in Web4Health?
Natural-language question-answering systems are computer systems which accept questions written in natural human language and give good answers. The systems discussed here take written questions; understanding of speech is not covered. There are two main methods of designing natural-language question-answering systems. The AI method uses artificial intelligence and linguistic methods to analyze a question and create an "understanding" of it in the computer. The template method matches questions against question templates produced by humans. Both methods can produce very good answers, which give the user an impression of "computer intelligence".
The AI method requires complex and advanced linguistic analysis programs.
The template method requires careful human design of the templates for each question. With template-based methods, the intelligence lies in the minds of the humans who write the templates. A template can be either specific, covering a single question, or general, covering a group of questions. For example, the questions "What is the population of Sweden?" and "What is the population of Italy?" might be answered by the same template, used to access a data base.
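To illustrate what a general template could look like, here is a minimal Python sketch. It is not taken from any real system: the regular expression, the function name `answer` and the dictionary `POPULATION_DB` are assumptions made for this example, and the stored values are placeholders rather than real statistics.

```python
import re

# Hypothetical stand-in for the data base.
# The values are placeholders, not real population figures.
POPULATION_DB = {
    "sweden": "population record for Sweden",
    "italy": "population record for Italy",
}

# One general template: the country name is a slot taken from the question.
POPULATION_TEMPLATE = re.compile(
    r"what is the population of (?P<country>[a-z ]+?)\s*\??$",
    re.IGNORECASE,
)


def answer(question):
    """Return the data base entry for the country in the question, or None."""
    match = POPULATION_TEMPLATE.search(question.strip())
    if match is None:
        return None  # the question does not fit this template
    return POPULATION_DB.get(match.group("country").lower())


print(answer("What is the population of Sweden?"))
print(answer("What is the population of Italy?"))
```

The same template serves both questions; only the value looked up in the data base differs. A question which does not fit the template, or which names a country missing from the data base, gives `None`.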
Both methods require careful testing with users, adjustment, and renewed user testing before a system which gives good user satisfaction can be achieved. The best-known template-based natural-language question-answering system is Ask Jeeves [http://www.ask.com], which is a large commercial system with answers to hundreds of thousands of questions. Since it is a commercial service, detailed information about its design is not public.
FAQ Search Systems
A common use of natural-language question-answering is to search data bases of answers to Frequently Asked Questions (FAQs).
Eriks Sneiders has constructed a template-based natural-language question-answering system. You can test the system on a data base of answers about HTML at [http://dsv.su.se/html/]. A template must match many different variants of the same question. For example, "What is the population of Sweden", "What is the number of people in Sweden" and "How many people live in Sweden" are just three of the many variants which should return the same answer. A simple template for this question in his system might be specified as:
popula* [number many much # people* person* inhabitant* human*] ; Swede* Sverige* Schwede* Suéde*
This template means:
- A question must contain one word or phrase matching the text before the ";" and one word matching the text after the ";", as described in the items below.
- A matching word after the ";" is any word beginning with "Swede", "Sverige", "Schwede" or "Suéde".
- A matching phrase before the ";" can be either
  - any word beginning with "popula", or
  - first any of the words "number", "many" and "much", and then any word beginning with "people", "person", "inhabitant" or "human". The "#" indicates that other words are allowed to intervene between the words before and after the "#".
In this way, arbitrarily complex templates can be constructed: "[...]" phrases can be nested inside each other to arbitrary levels.
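The matching rules for the example template can be sketched in Python as follows. This is only an illustration of the rules as described above, not the actual implementation of the system: the template is hard-coded instead of being parsed from the template syntax, all function names are invented for this example, and ignoring upper and lower case is an assumption.

```python
import re


def tokenize(question):
    """Lower-case the question and split it into words (assumed behaviour)."""
    return re.findall(r"\w+", question.lower())


def word_matches(pattern, word):
    """A trailing '*' means 'any word beginning with'; otherwise exact match."""
    if pattern.endswith("*"):
        return word.startswith(pattern[:-1])
    return word == pattern


def positions(patterns, words):
    """Positions of the words which match at least one of the patterns."""
    return [i for i, word in enumerate(words)
            if any(word_matches(p, word) for p in patterns)]


def phrase_with_gap(first, second, words):
    """The '#' case: a word matching `first` comes before a word matching
    `second`, and other words may stand between them."""
    firsts = positions(first, words)
    seconds = positions(second, words)
    return bool(firsts) and bool(seconds) and min(firsts) < max(seconds)


def matches_population_of_sweden(question):
    """Hard-coded version of the example template."""
    words = tokenize(question)
    # Text before the ";": "popula*" or the bracketed phrase.
    topic = bool(positions(["popula*"], words)) or phrase_with_gap(
        ["number", "many", "much"],
        ["people*", "person*", "inhabitant*", "human*"],
        words,
    )
    # Text after the ";": any of the country name variants.
    country = bool(positions(
        ["swede*", "sverige*", "schwede*", "suéde*"], words))
    return topic and country


for q in ["What is the population of Sweden",
          "What is the number of people in Sweden",
          "How many people live in Sweden",
          "What is the capital of Sweden"]:
    print(q, "->", matches_population_of_sweden(q))
```

The first three questions are the variants mentioned earlier and all match the template; the last one contains a matching country name but no matching word or phrase for the part before the ";", so it does not match.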
Note: This technology is only used in the German, English and Swedish Web4Health web sites, and partially in the Italian web site.