API and naming

Removed

  • In-package embedding generation (e.g. via the Hugging Face API). Use your own embedding pipeline instead and pass the resulting embedding matrix as the argument to .
  • Legacy names: web_search, wiki_search, wiki_find_references, web_scrape_urls, ner_extract_entities, sem_nearest_neighbors / sem_search_corpus (replaced by search_vector and search_regex).

Documentation

  • README revamped around the API map and a single “golden path” workflow.
  • DESCRIPTION and package help updated for the four-stage pipeline; version set to 1.1.0.

  • Initial release: URL fetching, URL content reading, NLP processing (split, tokenize, index), and corpus/search utilities.