Open-Retrievals Documentation

Open-Retrievals is an easy-to-use, flexible, and scalable framework that supports state-of-the-art embedding, retrieval, and reranking for information retrieval and RAG. It is built on PyTorch and Transformers.

  • Embeddings fine-tuned with contrastive learning

  • Embeddings from LLMs

Installation

Install the prerequisites:

  • transformers

  • peft

  • faiss-cpu

Once the prerequisites are installed, install the package:

# install the basic module
pip install open-retrievals

# install with evaluation support (quoted so shells such as zsh do not expand the brackets)
pip install "open-retrievals[eval]"

Examples

Run a simple example:

from retrievals import AutoModelForEmbedding

sentences = ["Hello NLP", "Open-retrievals is designed for retrieval, rerank and RAG"]
model_name_or_path = "sentence-transformers/all-MiniLM-L6-v2"
model = AutoModelForEmbedding.from_pretrained(model_name_or_path, pooling_method="mean")
sentence_embeddings = model.encode(sentences, normalize_embeddings=True, convert_to_tensor=True)
print(sentence_embeddings)
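
The example above requests normalized embeddings. Assuming that normalization means scaling each vector to unit L2 length, the cosine similarity between two sentences reduces to a plain dot product. The following is a minimal sketch of that idea using only the standard library; the 3-dimensional vectors and the helper names `l2_normalize` and `dot` are hypothetical stand-ins, not model output or part of the open-retrievals API.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical stand-ins for two raw sentence embeddings.
raw = [[3.0, 4.0, 0.0], [4.0, 3.0, 0.0]]

# After normalization every vector has length 1 ...
unit = [l2_normalize(v) for v in raw]

# ... so the dot product of two vectors is their cosine similarity.
similarity = dot(unit[0], unit[1])
print(round(similarity, 4))
```

With real output from the example above, and assuming `sentence_embeddings` is a 2-D PyTorch tensor, `sentence_embeddings @ sentence_embeddings.T` would give the full pairwise similarity matrix in one step.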

Open-retrievals also makes it easy to fine-tune embedding models, reranking models, and LLMs for custom use cases.
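
To give a feel for what contrastive fine-tuning of an embedding model optimizes, here is a sketch of an in-batch-negative InfoNCE loss in plain Python. This is a generic illustration under stated assumptions, not the loss that open-retrievals necessarily uses: the function name `info_nce_loss`, the `temperature` value, and the toy 2-dimensional vectors are all hypothetical.

```python
import math


def info_nce_loss(queries, positives, temperature=0.05):
    """Mean InfoNCE loss with in-batch negatives.

    positives[i] is the matching passage for queries[i]; every other
    row of `positives` acts as a negative for that query.
    Inputs are assumed to be unit-length vectors.
    """
    losses = []
    for i, query in enumerate(queries):
        # Similarity of this query to every passage in the batch.
        logits = [
            sum(q * p for q, p in zip(query, passage)) / temperature
            for passage in positives
        ]
        # Numerically stable log-sum-exp over the batch.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        # Cross-entropy with the matching passage as the target class.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Hypothetical toy batch: each query points the same way as its positive.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = info_nce_loss(queries, aligned)
loss_swapped = info_nce_loss(queries, swapped)
print(loss_aligned < loss_swapped)
```

The loss is small when each query is closest to its own positive and large when the pairs are mismatched, which is the signal that pulls matching texts together during fine-tuning.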

More dataset examples

Contributing

If you want to contribute to the project, please refer to our contribution guidelines.