Building Multimodal Search and RAG with Gemini Vision model

Build smarter search and RAG applications for multimodal retrieval and generation.

  • Learn how multimodality works by implementing contrastive learning, and see how it can be used to build modality-independent embeddings for seamless any-to-any retrieval.
  • Build multimodal RAG systems that retrieve multimodal context and reason over it to generate more relevant answers.
  • Implement industry applications of multimodal search and build multi-vector recommender systems.


QLoRA Finetuning & DPO Aligning Google's Gemma with Hugging Face

We discuss supervised fine-tuning and a powerful alignment technique called Direct Preference Optimisation (DPO), which was used to train Zephyr and is rapidly becoming the de facto method to boost the performance of open chat models. By the end of this session, attendees will:

  • Understand the steps involved in fine-tuning LLMs for chat applications.
  • Learn the theory behind Direct Preference Optimisation and how to apply it in practice with the Hugging Face TRL library.
  • Know what metrics to consider when evaluating chat models.