Call Hugging Face API for Embeddings — api_huggingface

Retrieves embeddings for text data using Hugging Face's API. It can process a batch of texts or a single query. Intended mainly for demonstration purposes.

api_huggingface_embeddings(
  tif,
  text_hierarchy,
  api_token,
  api_url = NULL,
  query = NULL,
  dims = 384,
  batch_size = 250,
  sleep_duration = 1,
  verbose = TRUE
)

Arguments

tif: A data frame containing text data.
text_hierarchy: A character vector indicating the columns used to create row names.
api_token: Token for accessing the Hugging Face API.
api_url: The URL of the Hugging Face API endpoint. If NULL (the default), the endpoint for all-MiniLM-L6-v2 is used.
query: An optional single text query for which embeddings are required.
dims: The dimension of the output embeddings.
batch_size: Number of rows in each batch sent to the API.
sleep_duration: Duration in seconds to pause between processing batches.
verbose: A logical value indicating whether to display a progress bar.
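
The interaction between batch_size and sleep_duration can be pictured with a minimal sketch. It is not the package's implementation: `batch_rows()` and `embed_batch()` are hypothetical helpers introduced here for illustration, and `embed_batch()` returns a zero matrix instead of calling the API. Whether the real function also pauses after the final batch is not stated above, so the sketch pauses only between batches.

```r
# Hypothetical helper: group row indices into consecutive batches
batch_rows <- function(n_rows, batch_size = 250) {
  idx <- seq_len(n_rows)
  split(idx, ceiling(idx / batch_size))
}

# Hypothetical stand-in for the API call: one row of zeros per text
embed_batch <- function(texts, dims = 384) {
  matrix(0, nrow = length(texts), ncol = dims)
}

tif <- data.frame(
  doc_id = as.character(1:600),
  text = rep("Hello world.", 600),
  stringsAsFactors = FALSE
)

batches <- batch_rows(nrow(tif), batch_size = 250)
results <- vector("list", length(batches))

for (i in seq_along(batches)) {
  results[[i]] <- embed_batch(tif$text[batches[[i]]], dims = 384)
  # Pause between batches; shortened here, the documented default is 1 second
  if (i < length(batches)) Sys.sleep(0.1)
}

# Stack the per-batch matrices so each row corresponds to one input text
embeddings <- do.call(rbind, results)
rownames(embeddings) <- tif$doc_id
```

With 600 rows and a batch size of 250, the sketch produces three batches (250, 250, and 100 rows) and a final matrix with 600 rows and 384 columns.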

Value

A matrix containing embeddings, with each row corresponding to a text input.

Examples

if (FALSE) { # \dontrun{
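# Assumes the token is stored in an environment variable; the variable name is illustrative
api_token <- Sys.getenv("HUGGINGFACE_API_TOKEN")
tif <- data.frame(doc_id = c('1'), text = c("Hello world."))
embeddings <- api_huggingface_embeddings(tif,
                                         text_hierarchy = 'doc_id',
                                         api_token = api_token)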
} # }