Thumbnail for Exploring and comparing different LLMs [Pt 2] | Generative AI for Beginners by Microsoft Developer

Exploring and comparing different LLMs [Pt 2] | Generative AI for Beginners

Microsoft Developer

21m 0s3,405 words~18 min read
AI audio transcription
Transcript source

AI audio transcription

This transcript was generated from the video's audio because no usable YouTube caption track was available. The transcript below is server-rendered so it can be read, searched, cited, and shared without opening the original YouTube player.

Timestamped outline
Pull quotes
[0:09]I am Pablo Lopes, Cloud Advocate in .NET in Artificial Intelligence, and I'm with here with Carlotta.
[0:50]So, going from here, you're going to see a little bit of foundation models, language models and later Carlotta, you'll show a little bit on Azure AI.
[1:02]How to classify a language model because we have a lot of classifications for them.
[2:00]So, you can put like a multimodal data source to train it so it can have teaching materials, videos, interaction and the subject matters.
Use this transcript
Related transcript hubs

[0:09]Hey folks, welcome back to Generative AI for Beginners here at Microsoft Learn. I am Pablo Lopes, Cloud Advocate in .NET in Artificial Intelligence, and I'm with here with Carlotta. Say hi, Carlotta.

[0:21]Hi. Hi everyone. I'm Carlotta Castelluccio. I'm a cloud advocate focused on artificial intelligence.

[0:28]That's incredible. So, today let's talk about a little bit on the basics. You're starting, you start on the introduction. Now, let's go deeper. Let's see how can we explore and compare different LLM's.

[0:40]Of course, now that we have all the Azure AI, our LLMs. But, if you want to take a look at that, we can took a lot later. But, we're going to talk a little bit about the comparisons on language models.

[0:50]So, going from here, you're going to see a little bit of foundation models, language models and later Carlotta, you'll show a little bit on Azure AI. So, what are you going to learn today?

[1:02]You may have heard of foundation models. So, let's understand foundation models versus LLMs and language models. How to classify a language model because we have a lot of classifications for them. And then Carlotta will show a little bit of Azure AI.

[1:17]Let's go with foundation models versus LLMs. But, to understand this, let's start with foundation models themselves. Why? Because you may have seen a lot of, I heard about that.

[1:30]Foundation models, LLMs, et cetera. You may have been confused on how those work. Let's think about it a little bit. The foundation models, they do the following.

[1:41]So, usually it's the base, is the bases of you're deploying something new. So, what do you mean about it, Pablo?

[1:51]Okay. Let's think about like this. Imagine that I have a foundation model that I want to do multiple tasks. So, here I have a teaching one.

[2:00]So, you can put like a multimodal data source to train it so it can have teaching materials, videos, interaction and the subject matters.

[2:10]And then we have like tasks and goals so it can have multiple assist students, educators, facilitate learning, teaching and their subject matter.

[2:17]So, you can see that we have a lot of then, but here we have a bunch of things that we're training with and they, we can have a lot of tasks and goals. But, you see that it does a lot of things, right?

[2:30]But, it may not do them perfectly. We may need to guide them in how to do X or Y. So, assist educators, how to assist them, right?

[2:39]Because we don't have like instructions to, you know, understand how is helping it. Then it's foundation. Why? Because it's the basis of our constructing new solutions. So, that's why you're calling it foundation models.

[2:50]And let's bring this graph as well. So, let's think about this, and foundation models were brought by Stanford researchers, researchers, so that's why we're using like that.

[3:00]So, the foundation models here are very simple. They have some prerequisites. They need to be pretrained, generalized, adaptable, large and self-supervised.

[3:11]So, you may see right here that foundation models can cover a lot of things. And you might may know as well if you know a little bit about LLMs that they fulfill all those five.

[3:22]So, yes, LLMs are foundational models because they fulfill all those five. However, not all foundation models are LLMs because they can they don't use language sometimes.

[3:36]As I said, they use a multimodal way to, you know, learn. So, they have a multimodal way to, as to answer, how to interpret. They don't use tokenizers sometimes. So, that's the difference.

[3:48]So, yes, foundation models are very important because our LLMs are foundational models, but not all foundation models are LLMs.

[3:59]That's incredible. But, now that we understood those, we're going to start talk a little bit of our language models themselves, right?

[4:09]So, then it brings a classic, right? Open source versus proprietary. So, let's think about open source a little.

[4:17]So, when you have open source, nowadays we had like a big expansion, right? A lot of companies are doing open source language models, which are incredible.

[4:28]But, let's think about it a little bit. So, what are those means, right? So, open source, usually, they are open sourcing some of the parts that they use to, you know, or train.

[4:38]The LLM. So, imagine that you have like the code that I used to train. Some provides the weights. If you don't know, the weights are basically they fine tuning of the the language model, so it can like go inside and try to investigate.

[4:51]Or even some provide like the full model that you can just download and go and, you know, some having fun with it, and you can tweak and do everything. So, these are incredible.

[5:01]But, here's the thing. A lot of them are provided by, you know, research groups or NGOs. So, those usually, what they have, they don't can have a lot of resources to be supported and to be updated with new features.

[5:20]Or, you know, new updates and security, especially if you think about prompt injection, et cetera. So, you may see here that, yes, open source is incredible, but they can have some faults here and there.

[5:30]Or maybe their license aren't permissive enough with what we want to do. So, open source opens a lot of small issues here and there that you need to think about if you want to solve.

[5:40]Proprietary, usually they are much easier. They are already provided, a lot of those, of course, they are provided by APIs if you go to cloud services.

[5:51]But, what is good about proprietary is that you know you're going to get, you know, a lot of things done correctly. So, you're going to have like updates constantly in the models.

[6:01]Of course, it limits you that opportunity to fine tune the model, but usually it means that you're going to have a lot of things that you can already use and ready made solution that you can go and grab and go.

[6:12]Okay. So, you learn a little bit of the open source models and about the proprietary. If you ask open source, you have a bunch of them, right?

[6:22]You have Falcon, we have Lama, we have O Lama as well. So, we have a lot much here. And on proprietary, you can ask about, you know, Chat GPT.

[6:31]So, why I'm saying this? Because, you notice that I didn't talk about embeddings, right? So, what are the embedding one? So, let's take an Open AI Adaba.

[6:44]So, let's talk about here about their categorizations. You're noticing that they have like very strict categorizations here. So, let's go to a language model that converts everything to embeddings.

[6:52]So, we have a string, we have a prompt. You need some way to communicate. Imagine that, you know, I have a format that, you know, I can communicate with different models, right?

[7:05]So, getting is to convert to embeddings are great because then I can easily imagine that I can use it for systems like Rack that I can store all the embeddings.

[7:18]And I can search and I got my information back. So, embeddings are very important for systems today on generative AI. And it's not only that.

[7:28]They are used to communicate multiple types of language systems, language models. So, it's just very simple.

[7:33]Compare a string, convert to embedding that everyone can understand. Then we have language models to image generation. That's everyone knows Dolly.

[7:42]Everyone already played. Even we have as well on web co-pilot. So, I have a prompt here. So, I have like in quotes and then you're going to put in your generative AI neural network, and then boom, you're going to have an image.

[7:55]So, here you have a painting of a flying dog, and here you have a, you know, this incredible image. Then, the classics. Who didn't already use text generation, right?

[8:05]Everyone here already had some fun with it. So, you go to your class co-pilot web as well, chat GPT, you can go as well. If you even go to hugging face, right, you can access on hugging face, a lot of other models that have Falcon or Lama.

[8:21]We have Mistral, and those are incredible. So, basically, put a prompt, and then you have prompt engineer that you're going to see later in this course. And then the generative AI will start to write it.

[8:32]And then it generates a text. And it can be text, it can be code. We have, of course, some that are optimized for code. But, you can see here that is how it works. So, we have, usually, those are the three more defined ones.

[8:45]Well, of course, if you want to break it down, you can do like specialized, you can do a lot of, you know, extra things here and there. Remember, you can always, you know, interact with this those language models and do what you need for, you know, your business, your hobby or anything you want to.

[9:00]Great. Let's talk about service versus model. Remember what I talked before that you can easily access some of these incredible models on the cloud? Exactly. Why I said it?

[9:14]Because those are on the service. What it means? Those already, you know, stored in the cloud. You cannot change your yourself without, you know, going in more inside and try to just fine tune it.

[9:30]But, basically those are stored in there, and they are easy. Usually, API calls, you just, you know, send it, send your prompt and then you get an answer. You need to deal with scalability. The scalability the cloud already uses a few.

[9:40]The security is defined on how you use your security on the cloud as well. So, everything is tight and niche. And not only that, you can actually integrate with all your other services. So, that's the good part about having the service, right?

[9:52]You don't It's easier to manipulate and do sometimes on if you want to change a little bit. But, of course, we're going to fine tune it about your services. But, that's important for you to know.

[10:04]But, after this, you can have like an easy way to interact with this. The model is a little bit harder because you need to download the model, you need to set up. Imagine that you have a server, right?

[10:13]So, you need to take care of the server, take care about scalability. Imagine that you have multiple clients coming in. So, the model is for you to interact directly. It makes easy to interact directly.

[10:27]It goes, it's goes much more easy to try and to understand how this is working and the pipeline. But, it makes Imagine that you need to interact with your cloud service, and you need to try something. It makes much harder without the ecosystem.

[10:40]But, you have more full control, but you need to deal with infrastructure costs, et cetera. So, now I'm very interested. Carlotta, can you tell me how Azure AI Studio, you know, how it works? I actually don't know.

[10:52]Of course. Uh, thanks Pablo. Um, yeah, I'm happy to walk you through, you know, how to use foundation models in Azure and specifically in Azure AI studio.

[11:04]And the first question I want to cover here is why Azure, right? Uh, so in the ever evolving landscape of foundation models, uh, selecting the right candidate for your specific scenario is just the beginning of the journey.

[11:16]So, once you have identified your top choices, it's time to put them to test, uh, on your use case.

[11:24]Uh, so Azure Studio is your one-stop platform for developing, testing and managing the entire life cycle of your AI applications.

[11:32]In fact, this platform integrates Microsoft data technologies for optimized storage and search in databases, uh, a wide range of proprietary and open source large language models, um, that Pablo just presented.

[11:46]Uh, for example, from the OpenAI family, but also from partners like Meta, Hugging Face or Mistral. Uh, tools that enable ensuring responsible and secure development of AI applications.

[11:58]But also prompt engineering and evaluation facilities, as well as monitoring assets for generative AI applications. Now, recalling our educational startup scenario, let's imagine that

[12:11]uh, after extensive research, our startup has explored the current large language models landscape and have pinpointed some strong contenders for their unique scenario.

[12:22]Uh, now, the real fun begins because testing these models involves an iterative process using experiments and precise measures to ensure they meet the mark.

[12:32]And, and where they can, you know, um, uh, where where they can test these models? They can do that using, uh, the model catalog in Azure AI Studio.

[12:43]Um, Azure Studio provides a seamless experience with, you know, a user-friendly interface. So, here's what you can do in the model catalog. You can easily find the foundation model of your interest by using different filters such as you can search for, um, the model provider.

[13:00]They can be Azure OpenAI, it can be Meta, it can be Hugging Face and so on and so forth. You can search for inference or fine tuning tasks, for example, you can search for Q&A or summarization task or object detection for computer vision kind of scenarios.

[13:18]Um, and also you can filter by license, for example. And you can even look for a specific model, of course, by searching its name in the in the search box.

[13:27]Um, now, once you, you know, um, uh, once you select a model before you proceed with the deployment of the model itself, it's always a good idea to get to know your model a bit.

[13:39]Um, so the model card in the model catalog provides a comprehensive view, um, complete with detailed description of use cases and training data of the model you selected.

[13:52]And for some models, you can also find some code samples providing, uh, providing us with a sense of how inputs and outputs look like for real-time inference.

[14:04]Um, also, sometimes a bit of fine tuning is necessary, and Azure Studio empowers you to improve your models performance with custom training data for a selected subset of models in the model catalog.

[14:18]Um, like Llama to, uh, 70B that you see here in this slide. Uh, once your model is, you know, primed and ready, it's time to deploy it.

[14:27]Uh, whether it's the original pre-trained model or your final fine-tuned version, Azure Studio has has you covered.

[14:36]In terms of deployments for a few models like Llama 2 here, um, uh, or the ones of the meta collection in general, you have a couple of options.

[14:46]You can go with the standard deployment approach, which is real-time endpoint. Uh, so you deploy your model in your Azure subscription, and you manage the infrastructure used for inference.

[14:59]Um, uh, or you can use the recent pay as you go uh type of deployment that has been introduced, um, and this means that you can consume for in this case Llama 2, uh, model as a REST API without caring about, you know, the underlying infrastructure, uh, with an experience similar to using Azure AI service uh APIs, um,

[15:27]so, just consuming, uh, it as, uh, through a REST API. Uh, and this is what we call model as a service.

[15:31]Now, another interesting feature of Azure Studio is the model benchmark. Um, here, you can basically compare different models in the catalog, uh, using filters to determine the specific subset, uh, against some predefined performance metrics such as accuracy, fluency, coherence, uh, and so on and so forth.

[15:50]Um, and there are, uh, several test data set you can choose, um, to use for for the comparison.

[16:00]When it comes to deploying large language models into production, uh, businesses have a word of options, each with its own set of complexities, of course, costs, but also quality levels.

[16:12]Um, let's dive into these approaches and see which one suits, you know, different requirements.

[16:19]The first approach is leveraging prompt engineering with context. Uh, now, pre-trained large language models excel in handling general language task, and you can simply feed them a short prompt like a question or an incomplete sentence, and they work like a charm, right?

[16:39]Um, and we call this zero-shot learning. But here's the catch. Uh, the more context you provide, the better the large language model understands your request.

[16:47]And when you include detail, uh, detailed examples and and requests, uh, this approach is called one-shot learning, if you're using a single example or few-shot learning if you're using multiple examples.

[17:00]In the case of a conversation, you can even use the prompt to describe the personality of the assistant, for example, the style and the tone of the responses, or pass the conversation history into the prompt.

[17:13]Uh, this approach is cost-effective and and a great starting point. Then approach two, uh, is using retrieval augmented generation, which is a pattern, um, I would say, um, a specific technique, prompt engineering technique.

[17:31]Um, in fact, large language models have their limitations, and because they only know what they were trained on and can't access post-training information or private company data, for example.

[17:43]And to bridge this gap, uh, we use this pattern, retrieval augmented generation that adds external data in a form of documents chance to your prompt, effectively expanding its knowledge.

[17:55]So, we are passing data through the prompt, but before doing that, we are going to use some, uh, some search pipeline to look for the data to add to our, um, context.

[18:08]Um, in Azure AI platform, this is powered by vector database tools like Azure AI search. Uh, rug is available approach when you, you know, you lack the data, the time, or the resources to fine tune your large language model, but you want to boost its performance and minimize the risk of incorrect information or harmful content.

[18:28]The third approach I would like to cover here is fine tuning. Fine tuning is a process to customize an LLM for a specific task.

[18:38]So, it generates a new model with updated weights and biases, making it it ideal if you have, you know, strict latency requirements, so you cannot really have a huge context in the prompt.

[18:50]Or you possess high-quality data and ground proof labels and you can maintain them over time. Uh, so fine tuning with respect to rug requires additional competition, computational resources to adjust the weights, um, of the model.

[19:04]Now, I want to say that, uh, these techniques I have presented here, so prompt engineering, retrieval augmented generation and fine tuning, are not, um, uh, mutually exclusive.

[19:17]They are complementary. So, there are cases in which you are going to use, for example, prompt engineering and fine tuning. Other cases in which you are going to use the three of them.

[19:27]The last approach I want to, um, to cover is, uh, training your own large language model. Now, training an LLM from scratch is a huge undertaking.

[19:39]Demanding vast amounts of data, of high-quality data, skilled professionals and serious computational power. You consider this option only if you have a very domain specific use case and also an abundance of domain-centric data.

[19:54]Um, in the word of large language employment, there's no one-size-fits-all solution. So, the right approach depends on your unique requirements, resources and goals.

[20:17]Also, the techniques we explored, um, as I said, are not always mutually exclusive. So, there are cases in which you need to evaluate, uh, if you need to combine, um, a few of them.

[20:30]In the next episode, you'll discover what this mean. But for now, I would like to wrap up. And Pablo, hey, I want to bring you back just to say goodbye everyone.

[20:40]Folks, it was a pleasure to talk more about Generative AI Carlotta is incredible as well to learn more and I'm really excited to, you know, talk more with you folks.

[20:50]I'm going to come back soon as well for more lessons on Generative AI for Beginners. Thank you so much and learn more with Microsoft Learn.

[20:59]Awesome. Thanks everyone.

Need another transcript?

Paste any YouTube URL to get a clean transcript in seconds.

Get a Transcript