R/util_fetch_embeddings.R
Builds a numeric matrix of embeddings with one row per text unit. Row names come from
the by columns (data frame input) or from names(corpus), falling back to the strings themselves (character vector input).
Pass the result to search_vector for semantic search.
util_fetch_embeddings(
corpus,
by = NULL,
api_token,
api_url = "https://router.huggingface.co/hf-inference/models/BAAI/bge-small-en-v1.5"
)
corpus: A data frame with text and by columns, or a character vector of texts. If the character vector is named, the names become row names; if it is unnamed, the strings themselves are used as row names.
by: Character vector naming the identifier columns. Required when corpus is a data frame, where it supplies the row names; ignored when corpus is a character vector.
api_token: Hugging Face API token.
api_url: Inference endpoint URL. Defaults to the BAAI/bge-small-en-v1.5 endpoint.
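The row-name rules described for corpus and by can be checked without calling the API. The sketch below is a hypothetical helper, not part of the package; in particular, joining several by columns with "_" is an assumption, since the documentation does not say how multiple identifier columns are combined.

```r
# Hypothetical helper (not part of the package): mirrors the documented
# row-name rules so they can be inspected offline.
expected_row_names <- function(corpus, by = NULL) {
  if (is.data.frame(corpus)) {
    if (is.null(by)) stop("`by` is required when `corpus` is a data frame")
    # Assumption: several identifier columns are joined with "_".
    return(do.call(paste, c(unname(as.list(corpus[by])), sep = "_")))
  }
  # Character vector: names if present, otherwise the strings themselves.
  if (is.null(names(corpus))) corpus else names(corpus)
}

named   <- c(doc1 = "The cat sat.", doc2 = "Stocks fell.")
unnamed <- c("The cat sat.", "Stocks fell.")
df      <- data.frame(id = c("a", "b"), text = unnamed)

expected_row_names(named)         # names of the vector
expected_row_names(unnamed)       # the strings themselves
expected_row_names(df, by = "id") # values of the id column
```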
A numeric matrix with one row per text unit; the row names are the unit ids.
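A minimal usage sketch follows. It assumes the package that provides util_fetch_embeddings is already loaded and that the token is stored in an environment variable named HF_API_TOKEN (the variable name is illustrative, not something the package requires). The call is guarded so the snippet does nothing when either assumption fails.

```r
# Assumptions: util_fetch_embeddings() is available in the session, and the
# token lives in the HF_API_TOKEN environment variable (illustrative name).
docs <- c(
  cats    = "Cats sleep for most of the day.",
  markets = "Stock markets closed lower on Friday."
)
token <- Sys.getenv("HF_API_TOKEN")

if (exists("util_fetch_embeddings") && nzchar(token)) {
  # api_url is left at its default endpoint.
  emb <- util_fetch_embeddings(docs, api_token = token)
  stopifnot(is.matrix(emb), is.numeric(emb))
  print(rownames(emb))  # per the documentation, taken from names(docs)
  print(dim(emb))       # one row per text unit
}
```

Reading the token from the environment keeps it out of scripts and version control.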