Retrieval

1. Pipeline

generate data -> train -> eval

pretrained encoding -> build hard negatives -> train -> eval -> indexing -> retrieval

pretrain -> fine-tuning -> distill
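The "build hard negatives" step in the second pipeline can be sketched as follows. This is a minimal illustration, assuming that embeddings from the pretrained encoder are already available as plain lists of floats and that hard negatives are the highest-scoring documents that are not labeled positive. The function name `mine_hard_negatives`, its parameters, and the toy vectors are hypothetical and not taken from the pipeline above.

```python
from typing import Dict, List, Set


def dot(u: List[float], v: List[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mine_hard_negatives(
    query_vecs: Dict[str, List[float]],
    doc_vecs: Dict[str, List[float]],
    positives: Dict[str, Set[str]],
    n_neg: int = 2,
) -> Dict[str, List[str]]:
    """For each query, keep the n_neg best-scoring documents that are not positives."""
    hard: Dict[str, List[str]] = {}
    for qid, q in query_vecs.items():
        # Rank every document by similarity to the query, most similar first.
        ranked = sorted(doc_vecs, key=lambda d: dot(q, doc_vecs[d]), reverse=True)
        # Drop known positives; what remains at the top are the "hard" negatives.
        known = positives.get(qid, set())
        hard[qid] = [d for d in ranked if d not in known][:n_neg]
    return hard


# Toy embeddings (assumed values, for illustration only).
query_vecs = {"q1": [1.0, 0.0]}
doc_vecs = {
    "d1": [0.9, 0.1],
    "d2": [0.8, 0.3],
    "d3": [0.1, 0.9],
    "d4": [0.5, 0.5],
}
positives = {"q1": {"d1"}}

hard = mine_hard_negatives(query_vecs, doc_vecs, positives)
print(hard)
```

Exhaustive scoring like this only suits small corpora; at scale the ranking step would typically be delegated to an approximate nearest-neighbor index.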

2. Offline indexing

3. Retrieval

Faiss retrieval

BM25 retrieval
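As a rough illustration of how BM25 scoring works, the sketch below scores a tokenized query against a small in-memory corpus. It assumes whitespace tokenization, a non-empty corpus, the common defaults `k1 = 1.5` and `b = 0.75`, and the non-negative `log(1 + ...)` form of IDF. The function name `bm25_scores` and the example corpus are hypothetical; a production setup would more likely rely on an existing library or search engine.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(
    query: List[str],
    docs: List[List[str]],
    k1: float = 1.5,
    b: float = 0.75,
) -> List[float]:
    """Return one BM25 score per document for the given tokenized query."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates with k1 and is normalized by document length via b.
            denom = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


corpus = [
    "faiss builds a vector index".split(),
    "bm25 ranks documents by term overlap".split(),
    "reciprocal rank fusion merges ranked lists".split(),
]
scores = bm25_scores("bm25 term ranking".split(), corpus)
best = max(range(len(corpus)), key=scores.__getitem__)
print(best, scores)
```

Because matching is purely lexical, "ranking" in the query does not match "ranks" or "ranked" in the corpus; stemming or a dense retriever would be needed to close that gap.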

Elasticsearch retrieval

Ensemble retrieval

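We can use RRF_fusion (reciprocal rank fusion) to combine the ranked results of multiple retrievers, for example Faiss and BM25. Because it relies only on ranks, it needs no score calibration across retrievers, and it can improve retrieval performance over any single retriever.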
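A minimal sketch of reciprocal rank fusion, not necessarily how `RRF_fusion` itself is implemented: each document receives the sum of `1 / (k + rank)` over every ranked list that contains it, with ranks starting at 1. The function name `rrf_fuse`, the constant `k = 60` (a commonly used default), and the example document IDs are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[Hashable]], k: int = 60) -> List[Hashable]:
    """Fuse several ranked lists of document IDs into one list, best first."""
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # A document missing from a list simply contributes nothing for that list.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda d: scores[d], reverse=True)


dense_hits = ["d3", "d1", "d7"]   # e.g. top results from a Faiss index
sparse_hits = ["d1", "d9", "d3"]  # e.g. top results from BM25
fused = rrf_fuse([dense_hits, sparse_hits])
print(fused)  # ['d1', 'd3', 'd9', 'd7']
```

Documents that appear near the top of several lists (`d1`, `d3`) rise above documents that appear in only one, which is the effect the ensemble relies on.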