WikiQA - Wikipedia-based Question Answering System - CITEC - KnowCIT


What is WikiQA:

WikiQA is a German open-domain question answering (QA) system that uses Wikipedia as a knowledge base to answer natural language questions. It has been developed by the KnowCIT project (Artificial Intelligence Group) within CITEC at Bielefeld University.

Open-domain encyclopedic information, such as that provided by the Wikipedia project, has recently captured the attention of QA researchers as a knowledge base. However, most of the proposed Wikipedia-based QA systems focus primarily on the document collection of Wikipedia for answer retrieval. They thus disregard the complex hierarchical representation of knowledge in its category taxonomy, which can also be valuable in the context of QA systems.

The WikiQA system approaches the Wikipedia collection from a different point of view. It uses the Wikipedia category taxonomy as a reference point for identifying the broader topic of a user's question, and then deduces a set of expected answer candidates from that topic. More precisely, it accesses and activates only those areas of the knowledge base that are topically relevant to the question's subject.
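
The idea of restricting answer candidates to the topical neighbourhood of a question can be illustrated with a minimal sketch. This is not the WikiQA implementation: the toy taxonomy, the article names, and the functions `ancestors`, `question_topics`, and `candidate_articles` are all hypothetical, and the single-token title matching stands in for whatever linguistic analysis the real system performs.

```python
# Hypothetical toy data standing in for the Wikipedia category graph.
ARTICLE_CATEGORIES = {
    "Bielefeld": {"Cities in North Rhine-Westphalia"},
    "Cologne": {"Cities in North Rhine-Westphalia"},
    "Rhine": {"Rivers of Germany"},
    "Elbe": {"Rivers of Germany"},
}
PARENT_CATEGORIES = {
    "Cities in North Rhine-Westphalia": {"Cities in Germany"},
    "Cities in Germany": {"Geography of Germany"},
    "Rivers of Germany": {"Geography of Germany"},
}


def ancestors(category, max_depth):
    """Return the category itself plus its ancestors up to max_depth levels."""
    seen = {category}
    frontier = {category}
    for _ in range(max_depth):
        parents = {p for c in frontier for p in PARENT_CATEGORIES.get(c, ())}
        frontier = parents - seen
        seen |= frontier
    return seen


def question_topics(question, max_depth=1):
    """Collect the categories (and ancestors) of articles named in the question.

    Assumption: an article is 'named' if its single-word title appears as a token.
    """
    tokens = {t.strip("?.,!").lower() for t in question.split()}
    topics = set()
    for article, categories in ARTICLE_CATEGORIES.items():
        if article.lower() in tokens:
            for category in categories:
                topics |= ancestors(category, max_depth)
    return topics


def candidate_articles(question, max_depth=1):
    """Return only those articles whose category neighbourhood overlaps the question topic."""
    topics = question_topics(question, max_depth)
    candidates = set()
    for article, categories in ARTICLE_CATEGORIES.items():
        article_topics = set()
        for category in categories:
            article_topics |= ancestors(category, max_depth)
        if article_topics & topics:
            candidates.add(article)
    return sorted(candidates)


if __name__ == "__main__":
    print(candidate_articles("Which river flows through Cologne?", max_depth=1))
    print(candidate_articles("Which river flows through Cologne?", max_depth=2))
```

In this sketch, a larger `max_depth` climbs further up the taxonomy and therefore activates a broader part of the toy knowledge base; a smaller value keeps the candidate set closer to the question's immediate topic.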



What is KnowCIT:


In the KnowCIT project we extend the conversational abilities of the conversational agent MAX by equipping him with access to collaboratively constructed knowledge drawn from the online encyclopedia Wikipedia. By means of this crowd-sourced knowledge resource, the agent is able to identify, label, track, and continue the topic of a dialog as the interlocutor of a human dialog partner. This allows him to answer questions, to detect topic changes, and to react meaningfully to the dynamics of a dialog.

The KnowCIT project aims to build interactive technology that enables artificial agents to explore crowd-sourced knowledge resources generated by large communities of web users. From a theoretical point of view, we aim to tackle the grounding problem studied in cognitive science by interfacing artificial cognitive agents with social ontologies. In this way, artificial agents become beneficiaries of crowdsourcing, and their human users benefit in turn from the agents' increased communicative competence.

This work is in line with efforts to utilize social tagging systems such as Wikipedia, wikimanuals, and other special wikis, which provide large resources of encyclopedic knowledge. In this context, we plan to exploit object knowledge as well as linguistic and metalinguistic knowledge (for example, from so-called wiktionaries) in a way that enables virtual agents to identify, label, track, and continue the topic of a dialog in which they participate as the interlocutor of a human user.




For the evaluation of the WikiQA system we used 200 questions from the CLEF-2007 monolingual QA task, with German as the target language. Note that we evaluated the answers manually and at the sentence level only. That is, the exact answer was not extracted, but had to be included in the answer sentence selected by the system.
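
The evaluation described here was carried out manually. As a rough illustration of the sentence-level criterion only, the following sketch counts an answer sentence as correct when it contains one of the expected answer strings. The helper names `normalize`, `sentence_is_correct`, and `sentence_accuracy`, the lenient token matching, and the sample data are assumptions made for this example and are not taken from the WikiQA evaluation.

```python
import re


def normalize(text):
    """Lowercase the text and keep only word tokens, joined by single spaces."""
    return " ".join(re.findall(r"\w+", text.lower()))


def sentence_is_correct(answer_sentence, gold_answers):
    """True if any expected answer occurs as a whole-token span in the sentence."""
    padded = f" {normalize(answer_sentence)} "
    return any(
        f" {normalize(gold)} " in padded
        for gold in gold_answers
        if normalize(gold)  # skip empty expected answers
    )


def sentence_accuracy(results):
    """results: list of (answer_sentence, gold_answers) pairs; returns a fraction."""
    if not results:
        return 0.0
    correct = sum(sentence_is_correct(s, g) for s, g in results)
    return correct / len(results)


if __name__ == "__main__":
    # Hypothetical sample data, not taken from CLEF-2007.
    sample = [
        ("Bielefeld liegt in Nordrhein-Westfalen.", ["Nordrhein-Westfalen"]),
        ("Der Rhein ist ein Fluss.", ["Elbe"]),
    ]
    print(sentence_accuracy(sample))
```

String containment of this kind is stricter in some cases and more lenient in others than a human judge, so it should be read as an approximation of the manual procedure, not a replacement for it.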

The knowledge base of WikiQA is built from the German Wikipedia dump (version 10/2010). More precisely, it comprises 1,063,772 articles and 88,883 categories. The entire corpus was linguistically analyzed and subdivided into 30,890,452 sentences.


Artificial Intelligence Group
KnowCIT: Knowledge Enhanced Embodied Cognitive Interaction Technology
CITEC: Center of Excellence Cognitive Interaction Technology

Universität Bielefeld
Universitätsstrasse 25
D-33615 Bielefeld

Tel: 0521 106-2924
Fax: 0521 106-2962

Prof. Dr. Ipke Wachsmuth
Head of Project

Dr. Ulli Waltinger

Alexa Breuing
