R/pubtator_cooccurrence.R
pubtator_cooccurrence.Rd
Counts pairs of biomedical entities that co-occur in the same sentence
(window = 0) or within window sentences of each other, using
the contextualized entity table returned by pubtator_context.
Co-occurrence is computed within each pmid/tiab passage; title
and abstract sentence IDs are not compared to one another.
pubtator_cooccurrence(x, window = 0L, by = c("type", "entity"))
x
A PubTator context list returned by pubtator_context,
or a contextualized entity data.frame with pmid, tiab,
type, identifier, text, and sentence_id.
window
Non-negative integer sentence distance. 0 counts
entities in the same sentence; n counts entities whose sentences are
at most n apart within the same pmid/tiab passage.
by
One of "type" (default) or "entity". "type"
aggregates counts by entity-type pair; "entity" aggregates by the
specific (type, identifier, text) pair.
Returns a data.table. With by = "type", the columns are type_x,
type_y, n (co-occurrence instances), and n_pmids
(distinct documents), ordered by n. With by = "entity", the result has
the same columns plus identifier_x, text_x, identifier_y, and text_y.
Entities are de-duplicated to one mention per sentence before pairing, and
pairs of the same entity (identical type, identifier, and
text) are dropped.
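The pairing rules described above can be illustrated with a minimal base-R sketch on toy data. This is not the package's implementation: the helper name count_type_pairs, the toy mentions table, and the choice to count each qualifying sentence pair as one instance are all assumptions made for illustration. Only the column names follow the documented input.

```r
# Hypothetical toy input using the documented column names.
mentions <- data.frame(
  pmid        = c(1, 1, 1, 1, 1),
  tiab        = c("abstract", "abstract", "abstract", "abstract", "title"),
  type        = c("Gene", "Disease", "Gene", "Chemical", "Disease"),
  identifier  = c("672", "D001943", "672", "D013629", "D001943"),
  text        = c("BRCA1", "breast cancer", "BRCA1", "tamoxifen", "breast cancer"),
  sentence_id = c(1, 1, 1, 2, 1),
  stringsAsFactors = FALSE
)

# Illustrative helper (an assumption, not part of the package).
count_type_pairs <- function(mentions, window = 0L) {
  cols <- c("pmid", "tiab", "type", "identifier", "text", "sentence_id")
  # De-duplicate to one mention per entity per sentence.
  m <- unique(mentions[, cols])
  # Integer id per distinct (type, identifier, text) entity.
  key <- paste(m$type, m$identifier, m$text, sep = "\r")
  m$entity_id <- match(key, unique(key))
  # Self-join within each pmid/tiab passage, so title and abstract
  # sentences are never compared to one another.
  pairs <- merge(m, m, by = c("pmid", "tiab"), suffixes = c("_x", "_y"))
  # Keep each unordered pair once, drop same-entity pairs, apply the window.
  keep <- pairs$entity_id_x < pairs$entity_id_y &
    abs(pairs$sentence_id_x - pairs$sentence_id_y) <= window
  pairs <- pairs[keep, , drop = FALSE]
  if (nrow(pairs) == 0) {
    return(data.frame(type_x = character(0), type_y = character(0),
                      n = integer(0), n_pmids = integer(0)))
  }
  # Put each type pair in a canonical order before aggregating.
  tx <- pmin(pairs$type_x, pairs$type_y)
  ty <- pmax(pairs$type_x, pairs$type_y)
  pairs$type_x <- tx
  pairs$type_y <- ty
  groups <- split(pairs, list(pairs$type_x, pairs$type_y), drop = TRUE)
  out <- do.call(rbind, lapply(groups, function(g) {
    data.frame(type_x = g$type_x[1], type_y = g$type_y[1],
               n = nrow(g), n_pmids = length(unique(g$pmid)),
               stringsAsFactors = FALSE)
  }))
  out <- out[order(-out$n), , drop = FALSE]
  rownames(out) <- NULL
  out
}

res0 <- count_type_pairs(mentions, window = 0L)
res1 <- count_type_pairs(mentions, window = 1L)
```

In this toy table, window = 0L leaves only the Gene/Disease pair from abstract sentence 1: the duplicate BRCA1 mention is collapsed, and the title mention of breast cancer is not paired with anything in the abstract. With window = 1L, tamoxifen in abstract sentence 2 also pairs with both sentence-1 entities.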
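if (FALSE) { # \dontrun{
# Search PubMed, fetch PubTator records, and build the context table
pmids <- search_pubmed('"biomarker"[TiAb] AND "cancer"[TiAb]')
ctx <- pmids |>
  get_records(endpoint = "pubtator") |>
  pubtator_context()
# Same-sentence co-occurrence, aggregated by entity-type pair
ctx |> pubtator_cooccurrence(window = 0, by = "type")
# Co-occurrence within one sentence of each other, aggregated by entity pair
ctx |> pubtator_cooccurrence(window = 1, by = "entity")
} # }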