OpBerg: Discovering Causal Sentences Using Optimal Alignments

Justin Wood, Nicholas Matiasz, Alcino Silva, William Hsu, Alexej Abyzov, Wei Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


The biological literature is rich with sentences that describe causal relations. Methods that automatically extract such sentences can help biologists to synthesize the literature and even discover latent relations that had not been articulated explicitly. Current methods for extracting causal sentences are based on either machine learning or a predefined database of causal terms. Machine learning approaches require a large set of labeled training data and can be susceptible to noise. Methods based on predefined databases are limited by the quality of their curation and are unable to capture new concepts or mistakes in the input. We address these challenges by adapting and improving a method designed for a seemingly unrelated problem: finding alignments between genomic sequences. This paper presents a novel method for extracting causal relations from text by aligning the part-of-speech representations of an input set with those of known causal sentences. Our experiments show that when applied to the task of finding causal sentences in biological literature, our method improves on the accuracy of other methods in a computationally efficient manner.
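The general idea of scoring a sentence by aligning its part-of-speech sequence against a known causal sentence can be illustrated with a minimal sketch. This is not the OpBerg algorithm itself: it uses a plain Smith-Waterman local alignment as a stand-in, and the function name, scoring parameters, and hand-written tag sequences are all assumptions made for illustration.

```python
def local_alignment_score(query, reference, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment score between two tag sequences.

    The scoring values are illustrative assumptions, not taken from the paper.
    """
    rows, cols = len(query) + 1, len(reference) + 1
    # score[i][j] is the best score of a local alignment that ends
    # at query[i-1] and reference[j-1]; row 0 and column 0 stay at 0.
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            same = query[i - 1] == reference[j - 1]
            diag = score[i - 1][j - 1] + (match if same else mismatch)
            up = score[i - 1][j] + gap      # gap in the reference
            left = score[i][j - 1] + gap    # gap in the query
            # Flooring at 0 lets an alignment restart anywhere (local alignment).
            score[i][j] = max(0, diag, up, left)
            best = max(best, score[i][j])
    return best


# Hypothetical, hand-written Penn Treebank style tags; a real pipeline
# would obtain these from a part-of-speech tagger.
known_causal = ["NN", "VBZ", "NN"]            # e.g. "Stress causes inflammation"
candidate = ["DT", "NN", "VBZ", "JJ", "NN"]   # e.g. "The drug induces rapid apoptosis"
unrelated = ["PRP", "VBD", "RB"]              # e.g. "We arrived late"

print(local_alignment_score(candidate, known_causal))
print(local_alignment_score(unrelated, known_causal))
```

A higher score means the candidate's tag sequence contains a region that closely resembles the known causal pattern. A classifier built on this idea would still need a threshold and a set of reference sentences, neither of which is specified here.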

Original language: English (US)
Title of host publication: Big Data Analytics and Knowledge Discovery - 24th International Conference, DaWaK 2022, Proceedings
Editors: Robert Wrembel, Johann Gamper, Gabriele Kotsis, Ismail Khalil, A Min Tjoa
Publisher: Springer Science and Business Media Deutschland GmbH
Number of pages: 14
ISBN (Print): 9783031126697
State: Published - 2022
Event: 24th International Conference on Big Data Analytics and Knowledge Discovery, DaWaK 2022 - Vienna, Austria
Duration: Aug 22 2022 → Aug 24 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13428 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 24th International Conference on Big Data Analytics and Knowledge Discovery, DaWaK 2022


Keywords

  • Causality extraction
  • Sequence alignments
  • Zero-shot learning

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

