Learn Data With Mark

Explore the code and scripts behind the Learn Data With Mark YouTube channel.

Large Language Models

Topic Resources

Retrieval Augmented Generation
In this video, we’ll learn how to use Retrieval Augmented Generation with Chroma and LangChain to provide an OpenAI/GPT LLM prompt with more data to effectively answer our questions about the Wimbledon 2023 tennis tournament.

Video
Code

Consistent JSON with OpenAI/GPT
In this video, we’ll learn how to return a consistent/predictable/valid JSON response to a sentiment analysis prompt using OpenAI.

Video
Code

Running Mixtral with Ollama
In this video, we’ll learn about Mixtral, the latest large language model from Mistral AI. Mixtral employs a mixture-of-experts approach, with eight models and a router to manage queries, enhancing the AI’s response quality. We’re going to run Mixtral on our own machine using the awesome Ollama tool. We’ll then compare Mixtral with the original Mistral model on a variety of tasks including sentiment analysis, summarisation, suggesting prompts to review books, and updating Python code.

Video
Code

Constraining LLMs with Guidance AI
In this video, we’ll learn how to use the Guidance library to control and constrain text generation by large language models, specifically integrating it with the llama.cpp library and the Mistral 7B model. We’ll build an emotion detector with help from functions like select, which restricts generation to an array of values, and gen, which can be controlled by regular expressions. We’ll also learn how to create reusable components and output results in JSON format.

Video
Code

LLaVA: A large multi-modal language model
In this video, we’ll learn about LLaVA (Large Language and Vision Assistant), a multimodal model that integrates a CLIP vision encoder and the Vicuna LLM. We’ll see how well it gets on describing a cartoon cat, a photo of me with AI-generated parrots, and a bunch of images created by the Midjourney generative AI tool. And most importantly, we’ll find out whether it knows who Cristiano Ronaldo is!

Video
Code

Topic Resources

Introduction to Vector Search
In this video, we’re going to learn about vector search using scikit-learn.

Video
Code
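
To give a flavour of what vector search means, here is a minimal, dependency-free sketch of exact (brute-force) nearest-neighbour search. It is not the code from the video and it does not use scikit-learn; the function names and the sample vectors are made up for illustration.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbours(vectors, query, k=2):
    """Return the k (index, distance) pairs closest to the query, nearest first."""
    scored = [(i, euclidean(v, query)) for i, v in enumerate(vectors)]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Hypothetical 2D vectors, invented for this example.
vectors = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
result = nearest_neighbours(vectors, query=(0.9, 1.2), k=2)
print(result)
```

Every stored vector is compared with the query, so the result is exact but the cost grows linearly with the size of the collection.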

FAISS Approx Nearest Neighbours
In this video, we will learn about the capabilities of Facebook’s FAISS library in the context of vector search. We will discuss the technical framework of Approximate Nearest Neighbours and its implementation using cell-probe methods. We will illustrate this with a visualization of 10,000 2D vectors and detail how the vector space is partitioned. Additionally, we’ll explain the role of the K-means algorithm in FAISS’s partitioning process, the steps to train your index, and methods to identify centroids that denote the cells.

Video
Code
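
The partitioning idea described above can be sketched in plain Python. This is a toy illustration under our own assumptions, not FAISS itself and not the code from the video: K-means finds centroids ("training"), each vector is assigned to the cell of its nearest centroid, and a query only scans the closest cells. The names `kmeans`, `build_cells`, `search` and `n_probe` are hypothetical, and we use 1,000 random 2D vectors rather than 10,000 to keep it quick.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance (enough for ranking by closeness)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_cell(v, centroids):
    """Index of the centroid closest to v."""
    return min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))


def kmeans(vectors, n_cells, n_iter=10, seed=0):
    """Toy K-means: returns n_cells centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, n_cells)
    for _ in range(n_iter):
        members = [[] for _ in range(n_cells)]
        for v in vectors:
            members[nearest_cell(v, centroids)].append(v)
        # Move each centroid to the mean of its members; keep it if the cell is empty.
        centroids = [
            tuple(sum(dim) / len(m) for dim in zip(*m)) if m else centroids[c]
            for c, m in enumerate(members)
        ]
    return centroids


def build_cells(vectors, centroids):
    """Assign every vector index to the cell of its nearest centroid."""
    cells = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        cells[nearest_cell(v, centroids)].append(i)
    return cells


def search(vectors, centroids, cells, query, k=3, n_probe=1):
    """Scan only the n_probe cells whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))
    candidates = [i for c in order[:n_probe] for i in cells[c]]
    return sorted(candidates, key=lambda i: sq_dist(vectors[i], query))[:k]


rng = random.Random(42)
vectors = [(rng.random(), rng.random()) for _ in range(1000)]
centroids = kmeans(vectors, n_cells=8)
cells = build_cells(vectors, centroids)

query = (0.5, 0.5)
approx = search(vectors, centroids, cells, query, k=3, n_probe=1)
exhaustive = search(vectors, centroids, cells, query, k=3, n_probe=len(centroids))
print(approx, exhaustive)
```

Probing a single cell is fast but can miss a true neighbour that sits just across a cell boundary; probing every cell gives the same answer as a brute-force scan.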

Other Topics

Topic Resources

TrueSkill Rating Algorithm
In this video, we’ll learn about TrueSkill, the algorithm used to rate players on Xbox Live. We’ll cover how it works, its API, and then how to use it to find my favourite tennis players.

Video
Code
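
As a rough idea of how a rating update works, here is a simplified sketch of the TrueSkill update for one match between two players where draws are impossible. It is not the library API or the code from the video: the function name `update_after_win` is hypothetical, the dynamics factor that TrueSkill normally adds to the uncertainty before each match is left out, and the starting values are the commonly cited defaults (mu = 25, sigma = 25/3, beta = 25/6).

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1

MU = 25.0         # default skill estimate
SIGMA = MU / 3    # default uncertainty about that estimate
BETA = SIGMA / 2  # performance variability from game to game


def update_after_win(winner, loser, beta=BETA):
    """Update two (mu, sigma) ratings after `winner` beats `loser`.

    Simplified: two players, no draws, no dynamics factor.
    """
    mu_w, sigma_w = winner
    mu_l, sigma_l = loser

    # Total uncertainty about the difference in performance.
    c = math.sqrt(2 * beta**2 + sigma_w**2 + sigma_l**2)
    t = (mu_w - mu_l) / c

    # v scales the shift in the means, w scales the shrinkage of the variances.
    v = STD_NORMAL.pdf(t) / STD_NORMAL.cdf(t)
    w = v * (v + t)

    new_winner = (
        mu_w + sigma_w**2 / c * v,
        sigma_w * math.sqrt(1 - sigma_w**2 / c**2 * w),
    )
    new_loser = (
        mu_l - sigma_l**2 / c * v,
        sigma_l * math.sqrt(1 - sigma_l**2 / c**2 * w),
    )
    return new_winner, new_loser


new_winner, new_loser = update_after_win((MU, SIGMA), (MU, SIGMA))
print(new_winner, new_loser)
```

The winner’s mean goes up, the loser’s goes down, and both uncertainties shrink. An upset (a low-rated player beating a high-rated one) makes `t` negative, which increases `v` and so produces a bigger shift.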