puremoe retrieves PubTator3 entity mentions and relation pairs with one call, then reshapes them locally into sentence context, entity co-occurrence counts, relation networks, and edge-level evidence. Everything after the initial fetch is a local transform on the retrieved tables and makes no further API calls.

The examples throughout use a small corpus of PubMed records on doxorubicin cardiotoxicity, chosen because it carries rich entity and relation annotation.

library(puremoe)
library(data.table)

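# PMIDs whose title or abstract mention both terms
pmids <- search_pubmed('"doxorubicin"[TiAb] AND "cardiotoxicity"[TiAb]')

# the single API call: PubTator3 annotations for the first 25 PMIDs
pt <- get_records(head(pmids, 25L), endpoint = "pubtator", cores = 1)

# inspect the returned tables
names(pt)
pt$entities
pt$relations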

Sentence context

pubtator_context() preserves the PubTator entity spans and layers sentence structure on top of them. Entities gain sentence IDs and sentence-relative spans, relations gain readable entity labels and sentence anchors, and a sentence lookup table is returned alongside both.

ctx <- pubtator_context(pt)

ctx$entities
ctx$relations
ctx$sentences
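
To make "sentence-relative spans" concrete, the following is a minimal base R sketch of the idea on a two-sentence toy passage. It is not puremoe's implementation, and every column name in it (start, end, sentence_id, sent_start, sent_end) is an assumption made for illustration; the actual columns in ctx$entities may differ.

```r
# Toy passage and its sentences (illustrative data, not package output)
passage   <- "Doxorubicin is an effective chemotherapy. It can cause cardiotoxicity."
sent_text <- c("Doxorubicin is an effective chemotherapy.",
               "It can cause cardiotoxicity.")

# 0-based offset of the first match of x inside the passage
offset_of <- function(x) {
  vapply(x, function(s) regexpr(s, passage, fixed = TRUE)[1] - 1L,
         integer(1), USE.NAMES = FALSE)
}

sent_offset <- offset_of(sent_text)

# Entity mentions with passage-level spans (hypothetical column names)
mention  <- c("Doxorubicin", "cardiotoxicity")
entities <- data.frame(text = mention, start = offset_of(mention),
                       stringsAsFactors = FALSE)
entities$end <- entities$start + nchar(entities$text)

# Assign each mention to the sentence whose start offset precedes it
entities$sentence_id <- findInterval(entities$start, sent_offset)

# Re-express the span relative to the start of that sentence
entities$sent_start <- entities$start - sent_offset[entities$sentence_id]
entities$sent_end   <- entities$end   - sent_offset[entities$sentence_id]

entities
```

With sentence-relative spans, a mention can be recovered from its sentence alone, for example with substr(sent_text[sentence_id], sent_start + 1, sent_end), without needing the full passage.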

Co-occurrence counts

Use pubtator_cooccurrence() for count-only summaries over sentence windows. The function accepts either the full context object or ctx$entities. window = 0 counts entities in the same sentence; larger windows reach across adjacent sentences within the same passage.

# entity-type pairs in the same sentence
pubtator_cooccurrence(ctx, window = 0, by = "type")

# specific entity pairs, within one sentence of each other
pubtator_cooccurrence(ctx, window = 1, by = "entity")
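
The window logic can be sketched on toy data in base R. This is an illustration of the idea only, under several assumptions: count_pairs() is a hypothetical helper, the column names are invented, pairs are counted once per document, and documents stand in for passages. pubtator_cooccurrence() may count and group differently.

```r
# Toy entity table: one row per mention (illustrative columns)
ents <- data.frame(
  pmid        = c(1L, 1L, 1L, 2L, 2L),
  sentence_id = c(1L, 1L, 2L, 1L, 3L),
  entity      = c("doxorubicin", "cardiotoxicity", "heart failure",
                  "doxorubicin", "cardiotoxicity"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: number of documents in which two entities
# occur within `window` sentences of each other
count_pairs <- function(ents, window = 0L) {
  # pair every mention with every other mention in the same document
  pairs <- merge(ents, ents, by = "pmid", suffixes = c("_a", "_b"))
  # keep each unordered pair once, and only pairs inside the window
  keep <- pairs$entity_a < pairs$entity_b &
    abs(pairs$sentence_id_a - pairs$sentence_id_b) <= window
  pairs <- unique(pairs[keep, c("pmid", "entity_a", "entity_b")])
  out <- aggregate(pmid ~ entity_a + entity_b, data = pairs, FUN = length)
  names(out)[3] <- "n_docs"
  out
}

count_pairs(ents, window = 0L)  # same-sentence pairs only
count_pairs(ents, window = 2L)  # also pairs up to two sentences apart
```

In the toy data, document 2 mentions doxorubicin and cardiotoxicity two sentences apart, so that pair is counted for document 2 only once the window reaches 2.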

Relation networks

pubtator_network() turns contextualized PubTator relations into a directed entity network plus a lean evidence table. Nodes are specific entities, keyed on their PubTator identifier when present and on type:text otherwise. Edges are directed PubTator relation assertions. Evidence rows map each edge back to pmid and relation_id, and to the shared sentence when both endpoint mentions occur in the same sentence.

rel_net <- pubtator_network(ctx)

rel_net
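
The node-keying and evidence rules described above can be sketched on toy tables. Everything here is an assumption for illustration: the column names are invented, the identifiers are placeholders rather than real PubTator IDs, and the package's own logic may differ in detail.

```r
# Toy mentions; identifiers are placeholders, one is missing
mentions <- data.frame(
  type       = c("Chemical", "Disease", "Gene"),
  text       = c("doxorubicin", "cardiotoxicity", "TOP2B"),
  identifier = c("CHEM:001", NA, "GENE:002"),
  stringsAsFactors = FALSE
)

# Node key: the identifier when present, otherwise type:text
has_id <- !is.na(mentions$identifier) & mentions$identifier != ""
mentions$node_id <- ifelse(has_id, mentions$identifier,
                           paste(mentions$type, mentions$text, sep = ":"))

# Toy directed relations with the sentence of each endpoint mention
relations <- data.frame(
  pmid          = c(101L, 101L),
  relation_id   = c("R1", "R2"),
  from          = c("CHEM:001", "CHEM:001"),
  to            = c("Disease:cardiotoxicity", "GENE:002"),
  from_sentence = c(1L, 1L),
  to_sentence   = c(1L, 3L),
  stringsAsFactors = FALSE
)

# Evidence keeps a sentence only when both endpoints share it
relations$sentence_id <- ifelse(relations$from_sentence == relations$to_sentence,
                                relations$from_sentence, NA_integer_)

mentions[, c("type", "text", "node_id")]
relations[, c("pmid", "relation_id", "from", "to", "sentence_id")]
```

Falling back to type:text keeps entities without an identifier in the network as distinct nodes, at the cost of not merging spelling variants of the same entity.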

puremoe takes on no graphing dependency itself, but the graph tables pass straight to a graph package such as igraph:

igraph::graph_from_data_frame(rel_net$edges, vertices = rel_net$nodes)