Open-Retrievals Documentation

GitHub: https://github.com/LongxingTan/open-retrievals

Open-Retrievals is an easy-to-use, flexible, and scalable framework that supports state-of-the-art embedding, retrieval, and reranking models for information retrieval and retrieval-augmented generation (RAG).

  • Embedding fine-tuning with point-wise, pairwise, listwise, and contrastive learning objectives, including LLM-based embedding models.

  • Reranking fine-tuning with cross-encoder, ColBERT, and LLM-based rerankers.

  • Modular RAG pipelines that integrate with Transformers, LangChain, and LlamaIndex.

Installation

Install the prerequisites:

  • transformers

  • peft (optional, for LoRA fine-tuning)

  • faiss-cpu (optional, for Faiss retrieval)

Then install the package:

# install the basic package
pip install open-retrievals

# install with evaluation support (quoted so shells such as zsh do not expand the brackets)
pip install "open-retrievals[eval]"

Or install from source:

python -m pip install -U git+https://github.com/LongxingTan/open-retrievals.git
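
To confirm which of the packages above are available in the current environment, a small check along these lines may help. This is a minimal sketch using only the standard library; the helper name `check_installed` is hypothetical and not part of Open-Retrievals. It relies on the import names differing from the pip names in two cases: `open-retrievals` is imported as `retrievals` and `faiss-cpu` as `faiss`.

```python
from importlib.util import find_spec

# Map pip package names to the module names used in import statements.
# The two can differ: faiss-cpu is imported as "faiss",
# and open-retrievals is imported as "retrievals".
PACKAGES = {
    "open-retrievals": "retrievals",
    "transformers": "transformers",
    "peft": "peft",
    "faiss-cpu": "faiss",
}


def check_installed(packages=PACKAGES):
    """Hypothetical helper: map each pip name to True if its module is importable."""
    return {
        pip_name: find_spec(module_name) is not None
        for pip_name, module_name in packages.items()
    }


if __name__ == "__main__":
    for pip_name, found in check_installed().items():
        print(f"{pip_name}: {'found' if found else 'missing'}")
```

A missing `peft` or `faiss-cpu` only matters if you plan to use LoRA fine-tuning or Faiss retrieval, respectively.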

Examples

Run a simple example:

from retrievals import AutoModelForEmbedding

sentences = ["Hello NLP", "Open-retrievals is designed for retrieval, rerank and RAG"]
model_name_or_path = "sentence-transformers/all-MiniLM-L6-v2"
model = AutoModelForEmbedding.from_pretrained(model_name_or_path, pooling_method="mean")
sentence_embeddings = model.encode(sentences, normalize_embeddings=True)
print(sentence_embeddings)
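
Because the embeddings above are requested with normalize_embeddings=True, the dot product of two embeddings equals their cosine similarity, which is the usual basis for ranking documents against a query. The following is a minimal sketch of that idea in plain Python. The 3-dimensional vectors are toy stand-ins for real model output, and `l2_normalize`, `dot`, and `rank_by_similarity` are hypothetical helpers, not part of the Open-Retrievals API.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    """Dot product; for unit-length vectors this is the cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_index, score) pairs sorted from most to least similar."""
    scores = [(i, dot(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for the output of model.encode (assumption).
query = l2_normalize([1.0, 0.0, 1.0])
docs = [
    l2_normalize([0.0, 1.0, 0.0]),  # orthogonal to the query
    l2_normalize([1.0, 0.0, 0.9]),  # nearly parallel to the query
    l2_normalize([1.0, 1.0, 1.0]),  # partially aligned
]

ranking = rank_by_similarity(query, docs)
for index, score in ranking:
    print(f"doc {index}: {score:.3f}")
```

With real embeddings the same ranking step is typically done with a matrix product or a vector index such as Faiss rather than a Python loop.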

Open-Retrievals supports fine-tuning embedding models, reranking models, and LLMs for custom use cases.
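
To give a sense of what contrastive fine-tuning of an embedding model optimizes, here is a generic sketch of an in-batch-negatives contrastive (InfoNCE-style) loss in plain Python. It is an illustration of the general technique only, not the loss implementation used by Open-Retrievals; the function name `info_nce_loss` and the temperature value of 0.05 are assumptions chosen for the example.

```python
import math


def info_nce_loss(query_vecs, doc_vecs, temperature=0.05):
    """Mean in-batch contrastive loss.

    doc_vecs[i] is treated as the positive for query_vecs[i];
    every other document in the batch acts as a negative.
    """
    losses = []
    for i, q in enumerate(query_vecs):
        # Similarity of this query to every document, scaled by temperature.
        logits = [sum(a * b for a, b in zip(q, d)) / temperature for d in doc_vecs]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
        # Cross-entropy with the matching document as the target class.
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy unit vectors (assumption): each query matches exactly one document.
aligned = [[1.0, 0.0], [0.0, 1.0]]
print(info_nce_loss(aligned, aligned))        # close to zero: positives score highest
print(info_nce_loss(aligned, aligned[::-1]))  # large: positives score lowest
```

The loss is small when each query is most similar to its own positive and grows as negatives score higher, which is the signal that pulls matching pairs together during fine-tuning.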

Use cases

Contributing

If you want to contribute to the project, please refer to our contribution guidelines.