More generally, this kind of task is called "Knowledge Base Question Answering" (KBQA). The authors observe that many benchmarks have been published for it over the last decade, and that recently, the KBQA community has shifted toward using Wikidata as the underlying knowledge base for KBQA datasets. However, they criticize those existing benchmarks as either contain[ing] only simple questions [...] or synthetically generated complex logical forms that are not representative enough of real-world queries. To remedy this, they "introduce the SPINACH dataset, an expert-annotated KBQA dataset collected from forum discussions on Wikidata's 'Request a Query' forum with 320 decontextualized question-SPARQL pairs. Much more complex than existing datasets, SPINACH calls for strong KBQA systems that do not rely on training data to learn the KB schema, but can dynamically explore large and often incomplete schemas and reason about them."

The paper's second contribution is an LLM-based system, also called "SPINACH", that on the authors' own dataset outperforms all baselines, including the best GPT-4-based KBQA agent by a large margin, and also achiev[es] a new state of the art on several existing KBQA benchmarks, although on it narrowly remains behind the aforementioned WikiSP model on the WikiWebQuestions dataset (both also out of Lam's lab).

no comments (yet)

sorted by: hot top controversial new old

there doesn't seem to be anything here