Open-Retrievals Documentation
Retrievals is an easy-to-use, flexible, and scalable framework that supports state-of-the-art embedding, retrieval, and reranking for information retrieval and RAG.
Embeddings fine-tuned with point-wise, pairwise, listwise, or contrastive learning, including LLM-based embedding models.
Rerankers fine-tuned as Cross-Encoder, ColBERT, or LLM-based models.
Modular RAG pipelines that are easy to build and integrate with Transformers, LangChain, and LlamaIndex.
Installation
Install the prerequisites
transformers
peft # for LoRA fine-tuning, if needed
faiss-cpu # for Faiss retrieval, if needed
Once the prerequisites are in place, install the package:
# install the base package
pip install open-retrievals
# install with evaluation support (the quotes keep shells such as zsh from expanding the brackets)
pip install "open-retrievals[eval]"
Or install from source:
python -m pip install -U git+https://github.com/LongxingTan/open-retrievals.git
Examples
Run a simple example:
from retrievals import AutoModelForEmbedding
sentences = ["Hello NLP", "Open-retrievals is designed for retrieval, rerank and RAG"]
model_name_or_path = "sentence-transformers/all-MiniLM-L6-v2"
model = AutoModelForEmbedding.from_pretrained(model_name_or_path, pooling_method="mean")
sentence_embeddings = model.encode(sentences, normalize_embeddings=True)
print(sentence_embeddings)
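To make the two options in the example more concrete, the following is a minimal conceptual sketch of what mean pooling and embedding normalization generally do. It uses plain Python and toy numbers. The helper names `mean_pool`, `l2_normalize`, and `dot` are hypothetical and are not part of the open-retrievals API, and the library's internal implementation may differ.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1, skipping padding."""
    dim = len(token_vectors[0])
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def l2_normalize(vector):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Toy token vectors (hypothetical values); the last token of sentence A is padding.
tokens_a = [[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]]
mask_a = [1, 1, 0]
tokens_b = [[0.0, 2.0], [0.0, 4.0]]
mask_b = [1, 1]

emb_a = l2_normalize(mean_pool(tokens_a, mask_a))
emb_b = l2_normalize(mean_pool(tokens_b, mask_b))

# For unit-length vectors, the dot product equals the cosine similarity.
similarity = dot(emb_a, emb_b)
print(similarity)
```

Because normalized embeddings have unit length, a plain dot product can be used to rank candidates by cosine similarity, which is one common reason to normalize before retrieval.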
Open-retrievals also supports fine-tuning embedding models, reranking models, and LLMs for custom use cases.
Use cases
Contributing
If you want to contribute to the project, please refer to our contribution guidelines.