R/pubtator_sentences.R
pubtator_sentences.Rd

Splits abstract text into sentences and assigns each PubTator3 entity annotation to its containing sentence via character-offset overlap. When available, the PubTator3 passage text and offsets are used directly.
pubtator_sentences(pubtations)

pubtations
A data.table returned by get_records(endpoint = "pubtations").

A data.table with annotation columns plus integer
sentence_id, sentence, sentence_start, and
sentence_end. sentence_start and sentence_end are
zero-based, end-exclusive entity offsets within sentence. PubTator passage metadata
columns are used for mapping but are not returned. Only passage annotations
that can be assigned to a sentence are returned.
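
The offset convention above can be illustrated with a small base R sketch. This is not the package's implementation: the text, the regex-based sentence splitter, the entity table, and the doc_start/doc_end column names are all hypothetical toy stand-ins, and full containment is used here as a simple form of overlap. It only shows how document-level offsets become zero-based, end-exclusive offsets within a sentence.

```r
# Toy input (hypothetical, not real PubTator3 output).
text <- "BRCA1 is a tumor suppressor. Mutations raise breast cancer risk."

# Naive sentence splitter, for illustration only: a run of characters that
# starts with a non-space and ends at the first ., ! or ?.
m <- gregexpr("[^ .!?][^.!?]*[.!?]", text)[[1]]

# Sentence spans as zero-based, end-exclusive document offsets.
sentences <- data.frame(
  sentence_id = seq_along(m),
  doc_start   = as.integer(m) - 1L,
  doc_end     = as.integer(m) - 1L + attr(m, "match.length")
)
sentences$sentence <- substring(text, sentences$doc_start + 1L, sentences$doc_end)

# Entity annotations with zero-based, end-exclusive document offsets.
entities <- data.frame(
  mention = c("BRCA1", "breast cancer"),
  start   = c(0L, 45L),
  end     = c(5L, 58L)
)

# Assign each entity to the single sentence that fully contains it.
idx <- vapply(seq_len(nrow(entities)), function(i) {
  hit <- which(sentences$doc_start <= entities$start[i] &
               entities$end[i] <= sentences$doc_end)
  if (length(hit) == 1L) hit else NA_integer_
}, integer(1))

# Drop entities that could not be assigned, then shift offsets so they are
# relative to the start of the containing sentence.
keep <- !is.na(idx)
mapped <- data.frame(
  mention        = entities$mention[keep],
  sentence_id    = sentences$sentence_id[idx[keep]],
  sentence       = sentences$sentence[idx[keep]],
  sentence_start = entities$start[keep] - sentences$doc_start[idx[keep]],
  sentence_end   = entities$end[keep] - sentences$doc_start[idx[keep]]
)

# Recover each mention from its sentence. R strings are one-based and
# end-inclusive, so add 1 to the start and use the end as-is.
recovered <- substring(mapped$sentence, mapped$sentence_start + 1L, mapped$sentence_end)
```

Because the offsets are zero-based and end-exclusive, sentence_end - sentence_start equals the mention length in characters, and the one-based substring() call needs only the start shifted by one.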
if (FALSE) { # \dontrun{
pmids <- search_pubmed('"Biomarkers Consortium"')
pubtations <- get_records(pmids, endpoint = "pubtations")
mapped <- pubtator_sentences(pubtations)
} # }