NSIO: Neural Search Indexing Optimization

NSIO (Neural Search Indexing Optimization) investigates how to make the Differentiable Search Index (DSI), a model that memorises a document corpus and retrieves documents by generating their IDs, more accurate, efficient, and memory-friendly.

Background: Differentiable Search Index

Traditional search systems use two separate components: an indexer (inverted index or dense vector store) and a retrieval model. DSI collapses both into a single sequence-to-sequence transformer: given a query, it directly generates the ID of the most relevant document, with no separate index needed. This approach (Tay et al., 2022) shows strong results on Natural Questions but is expensive to fine-tune at scale.
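To make "retrieval by generating an ID" concrete, here is a minimal sketch of greedy decoding constrained to valid document IDs. It is an illustration, not the DSI implementation: the character-level docids, the prefix trie, and the toy scorer `make_toy_scorer` (a stand-in for the transformer's next-token scores) are all assumptions made for this example.

```python
from typing import Callable, Sequence

END = "$"  # hypothetical end-of-docid marker used only in this sketch


def build_trie(docids: Sequence[str]) -> dict:
    """Build a prefix trie over docid strings; END marks a complete docid."""
    root: dict = {}
    for docid in docids:
        node = root
        for ch in docid:
            node = node.setdefault(ch, {})
        node[END] = {}
    return root


def greedy_decode(trie: dict, score: Callable[[str, str], float]) -> str:
    """Generate a docid token by token, choosing only tokens the trie allows.

    score(prefix, token) stands in for the seq2seq model's next-token score.
    """
    prefix, node = "", trie
    while True:
        # Candidates are restricted to valid continuations of the current prefix.
        best = max(node, key=lambda tok: score(prefix, tok))
        if best == END:
            return prefix
        prefix += best
        node = node[best]


def make_toy_scorer(target: str) -> Callable[[str, str], float]:
    """Toy scorer that prefers tokens spelling out `target` (assumption)."""
    def score(prefix: str, tok: str) -> float:
        if tok == END:
            return 1.0 if prefix == target else -1.0
        return 1.0 if target.startswith(prefix + tok) else 0.0
    return score


corpus = ["120", "213", "217"]
trie = build_trie(corpus)
print(greedy_decode(trie, make_toy_scorer("213")))  # prints 213
```

Because every step is limited to the trie's children, the decoder can only emit an ID that exists in the corpus, even when the scorer would prefer an ID that does not.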

Optimisations

Data Augmentation

  • Num2Word: convert numeric tokens to their word forms, reducing out-of-vocabulary issues.
  • Stopwords Removal: reduce noise in document representations.
  • POS-MLM: Part-of-Speech guided masked language modelling to generate diverse query variants.
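The first two augmentations can be sketched in plain Python as follows. This is a simplified illustration rather than the project's code: the tiny `STOPWORDS` set, the helper names, and the six-digit limit are assumptions, and decimals are not treated specially. POS-MLM needs a POS tagger and a masked language model, so it is not covered here.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

# Deliberately tiny stopword list, for illustration only (assumption).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "is", "are", "to", "and"}


def num2word(n: int) -> str:
    """Spell out an integer in the range 0..999999."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return ONES[hundreds] + " hundred" + (" " + num2word(rest) if rest else "")
    thousands, rest = divmod(n, 1000)
    return num2word(thousands) + " thousand" + (" " + num2word(rest) if rest else "")


def expand_numbers(text: str) -> str:
    """Replace standalone integers of up to six digits with their word forms."""
    return re.sub(r"\b\d{1,6}\b", lambda m: num2word(int(m.group())), text)


def remove_stopwords(text: str) -> str:
    """Drop whitespace-separated tokens that appear in STOPWORDS."""
    return " ".join(w for w in text.split() if w.lower() not in STOPWORDS)


print(expand_numbers("released in 2022 with 3 models"))
print(remove_stopwords("The index of the corpus is large"))
```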

Parameter-Efficient Fine-Tuning (PEFT)

Avoid full model fine-tuning by adapting only a small number of parameters:

Method      Description
LoRA        Low-rank adapters injected into attention weight matrices
QLoRA       LoRA with a 4-bit quantised base model; drastically reduces GPU memory
AdaLoRA     Adaptive rank allocation; higher rank for more important layers
ConvoLoRA   Novel convolutional LoRA variant for capturing local patterns
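The core idea behind LoRA can be illustrated without any framework. A frozen weight W of shape (d_out, d_in) is combined with a trainable low-rank update, giving W + (alpha / r) * B A, where B has shape (d_out, r) and A has shape (r, d_in). The sketch below uses plain Python lists. The helper names and the 768 x 768 example shape are assumptions chosen for illustration, not the project's configuration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w, a, b, alpha, r):
    """Return W + (alpha / r) * B @ A, the weight with the adapter merged in."""
    delta = matmul(b, a)
    scale = alpha / r
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d_out, d_in, r):
    """Compare trainable parameter counts for one weight matrix."""
    return {"full": d_out * d_in, "lora": r * (d_out + d_in)}


# B is initialised to zero in LoRA, so the adapted weight starts equal to W.
w = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   # frozen weight, 2 x 3
a = [[0.5, -0.5, 0.25]]                  # trainable A, r x d_in with r = 1
b = [[0.0], [0.0]]                       # trainable B, d_out x r
print(lora_weight(w, a, b, alpha=2, r=1) == w)

# Hypothetical 768 x 768 attention projection with rank 8.
print(trainable_params(768, 768, 8))
```

For the hypothetical 768 x 768 projection, rank 8 trains 12,288 parameters instead of 589,824, which is where the savings of the LoRA-family methods in the table come from.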

Evaluation

All experiments are evaluated on the MS MARCO document ranking benchmark. Metrics: MRR@10 and Recall@100. QLoRA achieves near full-fine-tune accuracy at a fraction of the memory footprint.
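For reference, the two metrics can be computed as in the sketch below. The input format (a dict from query ID to a ranked list of docids, and a dict from query ID to the set of relevant docids) is an assumption for illustration, as is the choice to average over every query in `ranked` and score queries without relevant documents as zero.

```python
def mrr_at_k(ranked, relevant, k=10):
    """Mean reciprocal rank of the first relevant docid within the top k."""
    total = 0.0
    for qid, docs in ranked.items():
        gold = relevant.get(qid, set())
        for rank, docid in enumerate(docs[:k], start=1):
            if docid in gold:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked)


def recall_at_k(ranked, relevant, k=100):
    """Mean fraction of each query's relevant docids found in the top k."""
    total = 0.0
    for qid, docs in ranked.items():
        gold = relevant.get(qid, set())
        if gold:
            total += len(gold & set(docs[:k])) / len(gold)
    return total / len(ranked)


ranked = {"q1": ["d3", "d1", "d2"], "q2": ["d9", "d8"]}
relevant = {"q1": {"d1"}, "q2": {"d7"}}
print(mrr_at_k(ranked, relevant))     # prints 0.25
print(recall_at_k(ranked, relevant))  # prints 0.5
```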

Technology

Python, Hugging Face Transformers + PEFT library, PyTorch, Jupyter Notebooks.