puremoe retrieves PubTator3 entity mentions and relation
pairs with one call, then reshapes them locally into sentence context,
entity co-occurrence counts, relation networks, and edge-level evidence.
Everything after the initial fetch is a local transform on the retrieved
tables and makes no further API calls.
Throughout, we use a biomedical corpus that is richly annotated with entities and relations.
library(puremoe)
library(data.table)
pmids <- search_pubmed('"doxorubicin"[TiAb] AND "cardiotoxicity"[TiAb]')
pt <- get_records(head(pmids, 25L), endpoint = "pubtator", cores = 1)
names(pt)
pt$entities
pt$relations

pubtator_context() preserves PubTator entity spans, adds
sentence IDs and sentence-relative spans to entities, adds readable
entity labels and sentence anchors to relations, and returns a sentence
lookup table.
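To make "sentence-relative spans" concrete, here is a minimal sketch on toy data, assuming mentions and sentences both carry character offsets into the same passage. The column names (`sent_start`, `start`, `end`, `rel_start`, `rel_end`) and the offsets are hypothetical, chosen for illustration; they are not taken from puremoe's actual output.

```r
library(data.table)

# Toy sentence table: start offset of each sentence within one passage.
# Column names are hypothetical, not puremoe's schema.
sentences <- data.table(
  sentence_id = 1:2,
  sent_start  = c(0L, 42L)
)

# Toy entity mentions with passage-level character offsets.
entities <- data.table(
  text  = c("doxorubicin", "cardiotoxicity"),
  start = c(0L, 50L),
  end   = c(11L, 64L)
)

# findInterval() returns, for each mention, the index of the last sentence
# whose start offset is <= the mention's start offset.
idx <- findInterval(entities$start, sentences$sent_start)

# Attach the sentence ID and shift the span so it is relative to the sentence.
entities[, sentence_id := sentences$sentence_id[idx]]
entities[, rel_start := start - sentences$sent_start[idx]]
entities[, rel_end := end - sentences$sent_start[idx]]

entities
```

The passage-level `start` and `end` columns are left untouched, which mirrors the idea that the original spans are preserved while the sentence-relative ones are added alongside them.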
ctx <- pubtator_context(pt)
ctx$entities
ctx$relations
ctx$sentences

Use pubtator_cooccurrence() for count-only summaries
over sentence windows. The function accepts either the full context
object or ctx$entities. window = 0 counts
entities in the same sentence; larger windows reach across adjacent
sentences within the same passage.
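The window logic can be illustrated with a small stand-alone sketch on toy data. This is not puremoe's implementation: the helper `cooc_sketch()`, the column names, and the choice to count unordered mention pairs are all assumptions made for the example.

```r
library(data.table)

# Toy mentions: one article, one passage, three sentences.
# Column names are hypothetical, not puremoe's schema.
ents <- data.table(
  pmid        = 1L,
  passage     = c(1L, 1L, 1L, 1L),
  sentence_id = c(1L, 1L, 2L, 3L),
  type        = c("Chemical", "Disease", "Gene", "Disease"),
  entity      = c("doxorubicin", "cardiotoxicity", "TOP2B", "heart failure")
)

# Hypothetical helper: count unordered mention pairs whose sentences lie
# at most `window` apart within the same passage.
cooc_sketch <- function(ents, window = 0L, by = "type") {
  x <- data.table(
    pmid        = ents$pmid,
    passage     = ents$passage,
    sentence_id = ents$sentence_id,
    label       = ents[[by]],
    row         = seq_len(nrow(ents))
  )
  # Self-join within passage to enumerate candidate pairs.
  pairs <- merge(x, x, by = c("pmid", "passage"),
                 allow.cartesian = TRUE, suffixes = c("_a", "_b"))
  # Keep each pair once and apply the sentence-distance window.
  pairs <- pairs[row_a < row_b & abs(sentence_id_a - sentence_id_b) <= window]
  # Sort the two labels so that A-B and B-A collapse into one row.
  pairs[, .N, by = .(a = pmin(label_a, label_b), b = pmax(label_a, label_b))]
}

cooc_sketch(ents, window = 0L, by = "type")
cooc_sketch(ents, window = 1L, by = "entity")
```

With `window = 0L` only the two mentions in sentence 1 pair up; with `window = 1L` pairs across neighbouring sentences are added, while mentions two sentences apart stay unpaired.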
# entity-type pairs in the same sentence
pubtator_cooccurrence(ctx, window = 0, by = "type")
# specific entity pairs, within one sentence of each other
pubtator_cooccurrence(ctx, window = 1, by = "entity")

pubtator_network() turns contextualized PubTator
relations into a directed entity network plus a lean evidence table.
Nodes are specific entities keyed on their PubTator
identifier when present, falling back to
type:text; edges are directed PubTator relation assertions;
evidence rows map each edge back to pmid,
relation_id, and the shared sentence when the endpoint
mentions occur in the same sentence.
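The node-key fallback described above can be sketched as follows. The table, its column names, and the identifier value are made up for illustration, and treating an empty string as "no identifier" is an assumption rather than documented puremoe behaviour.

```r
library(data.table)

# Toy mentions; "X:001" is a placeholder, not a real PubTator identifier.
mentions <- data.table(
  identifier = c("X:001", NA, ""),
  type       = c("Chemical", "Disease", "Gene"),
  text       = c("doxorubicin", "cardiotoxicity", "TOP2B")
)

# Use the identifier when present; otherwise fall back to "type:text".
mentions[, node_id := fifelse(
  !is.na(identifier) & nzchar(identifier),
  identifier,
  paste(type, text, sep = ":")
)]

mentions
```

Keying on the identifier merges different surface forms of the same concept into one node, while the `type:text` fallback keeps unnormalized mentions in the network instead of dropping them.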
rel_net <- pubtator_network(ctx)
rel_net

puremoe takes on no graphing dependency itself, but the
graph tables pass straight to a graph package such as igraph:
igraph::graph_from_data_frame(rel_net$edges, vertices = rel_net$nodes)
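
graph_from_data_frame() reads the first two columns of the edge table as endpoints and the first column of the vertex table as vertex names, and it stops with an error if an endpoint is missing from the vertex table. A quick pre-flight check along those lines, shown here on toy tables with hypothetical column names (the actual layout of rel_net$edges and rel_net$nodes should be inspected first), might look like this:

```r
# Toy stand-ins for an edge table and a node table; column names are
# hypothetical and do not reflect puremoe's actual output.
edges <- data.frame(
  from     = c("a", "b"),
  to       = c("b", "c"),
  relation = c("rel_1", "rel_2")
)
nodes <- data.frame(
  node_id = c("a", "b", "c"),
  type    = c("Chemical", "Disease", "Gene")
)

# Every endpoint in the first two edge columns must appear in the first
# node column, otherwise graph construction would fail.
endpoints <- unique(c(edges[[1]], edges[[2]]))
missing_nodes <- setdiff(endpoints, nodes[[1]])

if (length(missing_nodes) > 0L) {
  stop("Edge endpoints missing from node table: ",
       paste(missing_nodes, collapse = ", "))
}
```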