[0:00]There's a problem with the way most of us use AI right now, and once you see it, you're not going to be able to unsee it. When you upload documents to something like ChatGPT or NotebookLM and ask a question, the AI searches through your files, pulls out some relevant pieces and gives you an answer. That works, but here's the thing: ask a similar question tomorrow and the AI does all of that work again from scratch. Nothing was saved, nothing was built up, every single question starts from zero. Andrej Karpathy, one of the biggest names in AI, co-founder of OpenAI, former AI director at Tesla, recently shared an idea that fixes this problem. He calls it the LLM Wiki, and honestly, once you understand what it does, the old way of working with documents starts to feel broken. In this video, I'm going to walk you through exactly what the LLM Wiki is, why it matters, and then we're going to build one together from scratch, step by step. You don't need to be technical; if you can create a folder on your computer, you can do this. Hi, I'm Jamie and welcome to Teachers Tech. So let me explain the problem a bit more clearly, because this is important. The way most AI tools handle your documents right now is called RAG, retrieval-augmented generation. You upload some files, you ask a question, the AI searches through those files, grabs the chunks that seem relevant and generates an answer. That's fine for simple questions, but what if your question requires connecting ideas across five different documents? The AI has to find all those pieces and stitch them together every single time. There's no memory, there's no accumulation, nothing compounds. Think about it like this: imagine you're a researcher and you've been reading papers on a topic for weeks. With RAG, every time you ask the AI a question, it's like it's never read any of those papers before. It starts fresh every time. That's the bottleneck. Karpathy's idea flips this completely.
Instead of searching raw documents every time you ask a question, you have the AI read your documents once and build a structured wiki out of them: a real, persistent knowledge base made of interlinked markdown files. So when you add a new source, say a PDF or an article, the AI doesn't just store it for later, it actually reads it, extracts the key ideas and integrates them into the Wiki. It updates existing pages, it creates new pages for new concepts, it links related ideas together, and if the new source contradicts something already in the Wiki, it flags that too. So over time, the Wiki keeps growing and getting richer. The connections are already there, the synthesis is already done. When you ask a question, the AI isn't starting from scratch, it's working from a pre-built, organized knowledge base. Here's how Karpathy describes it: think of Obsidian as the IDE, the LLM as the programmer, and the Wiki as the codebase. You rarely write the Wiki yourself; the AI does the writing and organizing, and you focus on what goes in and what questions to ask. Now, the whole system has three layers, and they're very simple. Layer one: your raw sources. These are your original documents, like PDFs, articles, meeting notes, whatever you're working with. The important thing is that these are read-only. The AI reads them but never changes them. This is your source of truth. Layer two is the Wiki itself. This is a folder of markdown files that the AI creates and maintains: an index page, concept pages, entity pages, summary and comparison pages, all interlinked, all organized by the AI. Layer three: the schema. This is basically a rules document. It tells the AI how to structure the Wiki, how to handle new sources, how to format everything. If you're using Claude Code, this would be your CLAUDE.md file. If you've followed my other Claude Code series, you already know what that is.
If you're new to Claude Code, I'll put the link to my beginner's video right up there. All right, let's get into the setup. Here's what we're going to need. First, Obsidian. This is a free note-taking app that works with plain markdown files; it's going to be our viewer. You can download it at Obsidian.md, and I'll put the link down below in the description. Don't worry if you've never used Obsidian before, I'll walk you through the parts that matter. Second, an AI coding agent. I'm going to be using Claude Code for this because it's what I've been using in my series and it works really well here. But you could also use OpenAI Codex, Cursor or other tools that can read and write files on your computer. Now, I just want to point out, I'm using Obsidian because it has the graph view that makes the connections really visual. But this is just a folder of markdown files; you could use VS Code or any text editor, whatever you're most comfortable with. Once you've got Obsidian installed, go ahead and open it, and the first thing I'm going to do is create a new vault, you'll see it right here. So I'm going to go create, and I'm going to call this one LLM Wiki, and I'm just going to save it somewhere simple, I'm just going to put it into my documents here, you'll see there, and you can put it where you'd like. I'm going to go and hit create. Now we need to set up a folder structure. I'm going to create three folders. The first one's going to be raw. I'm just clicking right up here, new folder, and I'm going to call it raw. The AI will read from this but never change anything in here. The second folder is going to be Wiki. This is where the AI will build and maintain all of its pages. And the third folder is going to be called templates. This templates folder is optional. If you wanted to manually create notes in Obsidian with a consistent format, you could put a template right in here.
But since Claude is going to be creating all of our Wiki pages for us, we don't need it for this tutorial, it's just here as a point to tell you about. So here's what our structure looks like: we have our Wiki, templates and raw folders. Nothing too complicated with this. Now, here's the important part. We need to create the schema file, the rules document that tells the AI how to operate the Wiki. If you're using Claude Code, you're going to create a file called CLAUDE.md in the root of your vault. This is the file that Claude Code reads automatically when it opens a project. I'm going to give you a starter template that you can copy, it's linked down below in the description, but let me walk you through what's in it. First of all, I'm just going to bring the CLAUDE.md file into the root right here. So I'm just going to drop it, so we can have it here. You can see my other folders are here, but here's the CLAUDE.md file. If I click on it, you're going to be able to see what's in it. Now, first, the purpose right here. This is the purpose of the Wiki: what's the knowledge base about? In our template, I've set this to planning a trip to Japan because this is what we're going to do in the demo today. But when you download this file, this is the one line you change to match whatever you're going to be building a Wiki about. If you're researching renewable energy, change it to that. If you want to track books you're learning from, change it to that. Everything else in the template works as is; the purpose line is the only thing you really need to customize to get started. Second, the folder structure: where are the raw sources, where's the Wiki output, what goes where. Third, the ingest workflow. When you add a new source document, what should the AI do? The basic steps are: read the document, extract key concepts, create or update Wiki pages, update the index and log what changed. Fourth, page formatting rules.
Things like every page should have a summary at the top, every claim should reference its source.
[7:44]And fifth, the question-answering behavior. When you ask the AI a question, it should consult the Wiki first, cite its sources and tell you when something is uncertain. Now, don't overthink this. The template I'm giving you is a solid starting point, and you can always refine it as you go. That's actually part of the process: the schema evolves as the Wiki grows. I'm also going to add this Obsidian extension here, it's a web clipper, so I'm going to go ahead and add it to Chrome. It's free, and what it does is convert any web article into a markdown file, so it's super handy. All right, now here comes the fun part. Let's feed the Wiki its first document. I'm going to drop an article into the raw folder, and for this demo, I'm going to be planning a trip to Japan. So I'm going to start with a travel blog post about visiting Tokyo, things to do, neighborhoods to explore, that kind of thing. I'm going to save this as markdown with the extension we just installed. So if I go up top, you can see I have the extension here, and I could send it directly to Obsidian, copy-paste it over, or download it; I'm going to hit save as. Now, I'm going to hop back over to Obsidian. That file's just in my downloads folder right now, so I can see it, I can drag it over, and where do I want it? I want to put it in my raw folder. So if I click, here it is now, here's that article, all that information right here in my raw folder. And by the way, your sources don't have to be markdown files. If you have PDFs, just drag them straight into the raw folder. Claude Code can read PDFs natively, same with text files, same with markdown; whatever format your documents are in, just drop them in and Claude will handle it. Now, I want to go and open Claude, but before I do that, I need to make sure that I'm pointing at where we have this all set up.
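To make those five schema sections concrete, here's a rough sketch of what a starter CLAUDE.md could look like. This is my own illustration of the structure, so the exact wording in the downloadable template may differ:

```markdown
# LLM Wiki — Schema

## Purpose
This wiki is a knowledge base for planning a trip to Japan.
(Change this one line to match your own topic.)

## Folder structure
- `raw/` — source documents (PDFs, markdown, text). Read-only: never modify these.
- `wiki/` — the wiki pages you create and maintain.
- `templates/` — optional manual note templates. Ignore during ingestion.

## Ingest workflow
When a new file appears in `raw/`:
1. Read the document in full.
2. Extract the key concepts and entities.
3. Create new wiki pages or update existing ones.
4. Link related pages with [[wikilinks]].
5. Flag any contradictions with existing pages.
6. Update the index page and log what changed.

## Page format
- Every page starts with a short summary.
- Every claim references its source file in `raw/`.

## Answering questions
- Consult the wiki first, not the raw sources.
- Cite the wiki pages used.
- Say explicitly when something is uncertain.
```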
So I'm going to just change my directory to this right through here. You can see it, I put it in my documents and this is what it's called. So we have our directory changed. Now, I'm going to go ahead and open up Claude.
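Just so it's clear what's on disk at this point, the vault is a plain folder, roughly like this (the article filename here is just a placeholder):

```
Documents/LLM Wiki/
├── CLAUDE.md      <- the schema, rules for the AI
├── raw/           <- source documents, read-only
│   └── tokyo-article.md
├── templates/     <- optional, unused in this tutorial
└── wiki/          <- the AI builds and maintains pages here
```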
[10:15]Okay, so now I'm going to tell it to ingest the new source. I'm just going to say: I just added a new source to the raw folder. Please read it and update the Wiki. And watch what's happening. Claude is reading the article and creating the Wiki pages. There's the summary of the article, and here are the pages for different neighborhoods, all through here. You can see all the different Wiki pages it's planning to create. If this looks good, I'll tell it to go ahead, but you can see how I could adjust the scope as well. I'm just going to say, go ahead. Okay, you can see after about three minutes, it's all done. But let's go check out Obsidian and what's happening over there. All right, let's open up the Wiki. Look at this, we have structured pages, and if we click on any of these, we have links to all of them. And if I go over to graph view, take a look at this. You can see the connections forming. This is one document. Imagine what this looks like after 20 sources. Now, this is where it gets really interesting. Let me add a second source. We're going to do this food guide to Japan here. I'm going to do what I did last time: I'm just going to make sure that I save it, and I'll bring it back over to Obsidian. Then I'll tell Claude to ingest it.
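To give you a feel for the shape of these generated pages, a neighborhood page might look something like this. This is hypothetical content I'm sketching for illustration, not the exact output from the demo:

```markdown
# Shibuya

**Summary:** A busy central-Tokyo neighborhood known for the scramble
crossing, shopping and nightlife. (Source: raw/tokyo-article.md)

## Highlights
- Shibuya Crossing and the Hachiko statue
- Walking distance to [[Harajuku]] and [[Yoyogi Park]]

## Related
- [[Tokyo Neighborhoods]] · [[Getting Around Tokyo]]
```

Notice the summary at the top, the source reference, and the [[wikilinks]], all of which come straight from the page formatting rules in the schema.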
[11:54]Let's say the exact same thing: I just added a new source to the raw folder. Please read it and update the Wiki. Now, look at this. Claude isn't just creating new pages, it's also updating the neighborhood pages it had already made. And you can look specifically at the details of how it's making those adjustments to each of these. This is the Wiki doing its job. Now, look at the graph view: more nodes, more connections, the Wiki is getting smarter with every source we add. Next, let me ask a question that requires information from both sources. What neighborhood should I stay in if I want to be close to the best food and still near the major temples? And look at the answer. Claude's not searching the raw articles, it's pulling from the Wiki: from the neighborhood pages, the food pages, the temple page. It's connecting dots that were spread across completely different sources, and it's citing specific Wiki pages. This is completely different from what you get with a basic RAG setup. One more thing I want to show you that I think is really clever. Karpathy talks about this idea of linting your Wiki: just like a code linter checks your code for problems, you can periodically ask the AI to audit the whole Wiki. It'll look for things like contradictions between pages, claims that might be outdated, pages that have no links pointing to them, like orphan pages, and concepts that are mentioned but don't have a page of their own yet. We can just say something like: please lint the Wiki. Now, Claude's going to go through everything and give you a report. This is how you keep your Wiki healthy as it grows. And look what it gives me back here: the different checks, from orphan pages to broken links. It's telling me the Wiki is structurally sound, which I expected, since we only have two articles in it. It even offers to fix the citation issues. So what would you actually use this for? Here are some ideas.
If you're a student or a researcher, build a Wiki as you read papers and articles on a topic. By the end, you have a structured knowledge base, not just a pile of highlighted PDFs. If you're a teacher, feed in curriculum documents, professional development materials and articles, and build a personal teaching Wiki that grows over time. If you're a business, feed in meeting notes, customer call transcripts and project documents, so new team members can browse an organized Wiki instead of digging through Slack history. If you're just a curious person who reads a lot, use it to track what you learn from books, podcasts and articles. It's like building your own personal encyclopedia. The pattern works anywhere you're accumulating knowledge over time and you want it organized rather than scattered. Okay, let me be straight about the limitations, because this isn't magic. First, this works best at personal scale. Karpathy talks about Wikis of around 100 articles. If you're trying to build something with tens of thousands of pages, you're going to want more infrastructure than a folder of markdown files. Second, garbage in, garbage out. The Wiki is only as good as the sources you feed it; you still need to curate what goes in. Third, you do need a coding agent to make this work. Obsidian by itself doesn't do any of this. The AI is the engine, so you need access to something like Claude Code, Codex or a similar tool. And fourth, the AI can make mistakes. It might miscategorize something or miss a connection. That's why the lint feature exists, and why you'll want to review what it builds, especially early on. But with all that said, this is genuinely one of the most practical AI workflows I've seen. It solves real problems, it's free to set up, and your data stays on your computer in plain text files that you own.
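If you're curious what the structural part of that lint pass actually involves, here's a minimal sketch in Python. It assumes a flat wiki folder of markdown files using Obsidian-style [[wikilinks]], and it only covers the mechanical checks, orphan pages and broken links; the judgment calls, like contradictions and outdated claims, are exactly the part you need the AI itself for:

```python
import re
from pathlib import Path

# Obsidian-style [[wikilink]] targets, ignoring aliases ([[Page|label]])
# and heading anchors ([[Page#Section]])
LINK = re.compile(r"\[\[([^\]|#]+)")

def lint(wiki_dir: Path) -> dict:
    """Report broken [[wikilinks]] and orphan pages in a flat wiki folder."""
    pages = {p.stem: p.read_text(encoding="utf-8") for p in wiki_dir.glob("*.md")}
    links = {name: set(LINK.findall(text)) for name, text in pages.items()}
    # links that point at pages that don't exist yet
    broken = {t for targets in links.values() for t in targets if t not in pages}
    # pages no other page links to (the index is allowed to be unlinked)
    linked_to = {t for targets in links.values() for t in targets}
    orphans = {n for n in pages if n not in linked_to and n != "Index"}
    return {"broken_links": sorted(broken), "orphan_pages": sorted(orphans)}
```

In practice you'd just ask Claude to lint the Wiki, but this shows why the structural checks are cheap enough to run after every ingest.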
And that's the LLM Wiki: a personal knowledge base that the AI builds and maintains for you, one that actually gets better over time instead of starting from scratch on every question. I'll have the schema template and all the links you need in the description below. If you want to learn Claude Code so you can use it for yourself, my beginner's guide is down there too. Thanks for watching this time on Teachers Tech. I'll see you next week with more tech tips and tutorials.



