Answer Aggregation and Verbalization for Complex Question Answering

Master's thesis



University of Zurich — MSc Informatics (Data Science). Supervisor: Prof. Abraham Bernstein. Defence: Nov 2023.

Short abstract: This thesis investigates end-to-end language-model approaches for complex Knowledge Graph Question Answering (KGQA). It focuses on two complementary problems: (1) generating SPARQL queries from natural language questions that involve aggregation, comparison, and multi-hop reasoning; and (2) verbalizing retrieved answers into context-aware, user-friendly natural language responses. A taxonomy and template-based baseline were developed, and BART-based end-to-end models were evaluated as a modern alternative to template systems.
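For illustration, here is a hypothetical example of the kind of question/query pair the thesis targets. The pairing is my own illustrative assumption, not an example drawn from the thesis data; the Wikidata properties P50 (author) and P40 (child) are real identifiers:

```python
# Hypothetical illustration: a complex natural-language question and a
# target SPARQL query involving aggregation (COUNT) and multi-hop
# reasoning over Wikidata. Not taken from the thesis's actual dataset.

question = "How many children did the author of 'War and Peace' have?"

sparql = """
SELECT (COUNT(?child) AS ?count) WHERE {
  ?work rdfs:label "War and Peace"@en .
  ?work wdt:P50 ?author .        # P50 = author
  ?author wdt:P40 ?child .       # P40 = child
}
"""

def query_features(q: str) -> dict:
    """Crude surface check for which complex-QA phenomena a query exhibits."""
    return {
        "aggregation": any(f in q for f in ("COUNT(", "SUM(", "AVG(")),
        "multi_hop": q.count("wdt:") >= 2,  # at least two property hops
    }

print(query_features(sparql))  # both phenomena present
```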

Building on the methods developed in this work, I participated in the Scholarly QALD challenge at ISWC 2023 (DBLP-QUAD: question answering over scholarly knowledge graphs), applying a model-based approach derived from the thesis pipeline and achieving strong results. See the NLQxform project page for full details and related publications.


Key contributions

  • Built tooling to visualize question/query types using Neo4j, aiding error analysis and system introspection.
  • Defined a taxonomy for complex KGQA phenomena and implemented a template-based baseline for systematic evaluation.
  • Developed and evaluated BART-based end-to-end text→SPARQL models capable of handling aggregations and advanced constraints.
  • Designed an answer verbalization module producing context-sensitive, human-readable responses instead of raw KG tuples.

Methods & datasets

  • Datasets & KGs: LC-QuAD 2.0, with Wikidata as the target knowledge graph.
  • Modeling: BART fine-tuning for generation tasks.
  • Infrastructure: PyTorch, HuggingFace Transformers, Neo4j visualization; standard evaluation metrics and customised error analysis.

Conclusions

  • Template-based approach: Provided a systematic baseline but performed poorly, mainly due to weaknesses in existing entity- and relation-linking tools, frequent errors in question-type recognition and SPARQL construction, and overall inefficiency (time-consuming with low accuracy).
  • BART-based approach: Achieved significantly better results than the template baseline. However, challenges remain with unseen or implicit entities/relations. Incorporating external augmentation information improves robustness, and the models demonstrated strong transferability across datasets.
  • Answer verbalization: By taking the original question context into account, the system can generate more natural and flexible responses, which are both satisfying and more useful for end users compared to raw outputs.
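The idea behind context-aware verbalization can be sketched as follows. The heuristics here are my own minimal assumptions about the approach, not the thesis's implementation:

```python
# Minimal sketch of context-aware answer verbalization: instead of
# returning raw KG tuples (e.g. bare entity IDs), the response echoes
# the question's wording. Simplified placeholder, not the thesis module.

def verbalize(question: str, answers: list[str]) -> str:
    """Render raw KG answers as a sentence conditioned on the question."""
    q = question.rstrip("?").strip()
    if not answers:
        return f"No answer was found for: {q}."
    if len(answers) == 1:
        return f"Regarding '{q}': the answer is {answers[0]}."
    listed = ", ".join(answers[:-1]) + f" and {answers[-1]}"
    return f"Regarding '{q}': the answers are {listed}."

print(verbalize("Who wrote War and Peace?", ["Leo Tolstoy"]))
```

Even this toy version shows the contrast with raw outputs: a KG endpoint would return an unadorned binding, while the verbalizer produces a sentence a user can read in context.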

Artifacts & downloads

Defence slides

Concise overview of problem, method and selected results

View / Download slides (PDF)
Related code & demo

The NLQxform codebase and interactive UI were developed as follow-up work; see the NLQxform project page for links to the code and the ISWC/SIGIR outputs.

NLQxform — project & links