[0:01]Hello friends, in this video we will be talking about the three different types of data. Now we've already seen what data is. Data are raw facts that have not been processed to explain their meaning. Now the data is stored in the database, and a database management system manages the data, that is, it stores and updates the data in the database, and it retrieves the data from the database to provide answers to the users. There are three different types of data: structured data, semi-structured data, and unstructured data. Now let's see what structured data is. Structured data is the kind of data that is stored in a tabular format, that means in the form of rows and columns. Structured data is very clearly defined and it is stored in a pre-defined data model. For example, Excel files or SQL databases. We've all seen Excel, the sheets are in a tabular format and they are very well organized in a very disciplined manner. Now structured data is stored in the form of rows and columns, but it is very important that the rows and columns are related to each other.
[1:36]That means, the overall data has to be related to each other. When the data is related to one another, we get a proper view and understanding of the data. Hence, relation is the most important part of structured data. Suppose we visit the website of an airline; we can take the example of Emirates Airlines for flights from Dubai to Paris. As you can see, the data is available in a proper tabular format, like what is the place of departure, what is the time of departure, how much time it is going to take, that means, what is the time span of the journey, and what time we are going to reach our destination. So, we get a proper view and understanding of the data whenever the data is in this disciplined structured format. So, we need to store the data in a structured format and the data has to be related to one another. So for this kind of data storage, we need a database management system that can manage the data, that is, it stores, updates and retrieves it in the related format. So to manage this kind of related data in a structured format, we need an RDBMS, that is, a relational database management system. So, structured data is stored in relational databases because they manage the relationship between the data, and hence a relational database management system is used to manage relational databases. So that is all about structured data. Now let's move ahead to unstructured data. This is one of the most important kinds of data. As the name suggests, unstructured data has no pre-defined structure, that is, it is not available in any form of pre-defined data model. That is why unstructured data is irregular and ambiguous. Irregular because it does not have a pre-defined format or structure, and ambiguous because it is uncertain in nature, that means it can be anything from text to numbers, to images, to videos, to audio, to anything. That is why it is said to be irregular and ambiguous. For the same reason, it is not easy to extract information from unstructured data.
Also, almost 80 to 90% of the data available is unstructured data. Now the examples of unstructured data, as I've already said, can be text, numbers, audio, video, images, messages, social media posts; all of these are unstructured data. For example, Facebook, Instagram, YouTube, Google, they are all made up of unstructured data. Now we all know survey forms, we might have filled in survey forms at some point in our lives. So survey forms also contain unstructured data. Why? Because they contain two kinds of questions. First, there can be questions where we need to check some boxes to give our choice of answers, and surveys may also require us to answer some open-ended questions. Suppose, for example, we are conducting a survey about coffee, okay. So, suppose the question is, what is your favorite kind of coffee? So there can be some options and we may have to check whichever is our favorite kind of coffee.
[5:35]And how many cups of coffee do you have in a day? We can again have checkboxes to answer this kind of question. But, how does coffee make you feel? Please elaborate. Now this cannot be a question where we have a checkbox to tick and provide the answer. We may have to write the answer in a few words, we have to elaborate the answer. So this kind of data is very difficult to analyze. But, now we have artificial intelligence. And artificial intelligence is very useful in analyzing these kinds of surveys and extracting information from this kind of unstructured data. Okay? Like again, for example, there's face recognition that is used by Google. How does Google identify the picture and give us the answers? Because of artificial intelligence. Now Facebook, Instagram, these analyze the data; how do they analyze the data? If you've observed, on Facebook, on Instagram, on YouTube, whenever we watch a certain kind of video or post, we get suggestions to watch more of those kinds of posts, okay? How do we get the suggestions? Like, how do they come to know what we are looking at? So, all of these social media sites use the post title, the post description and the comments to analyze the data, and they try to understand what kind of things are interesting to people these days, okay? So this concept is extremely intelligent and yes, it is a complex task to analyze this data. But, nowadays, it is very useful in understanding how people are feeling or thinking about certain aspects, okay? So previously, only structured data was used extensively because it was very easy to extract information from that kind of data, to analyze that kind of data. But now, with the help of artificial intelligence, unstructured data has also come into the picture, and thus unstructured data has become the most useful kind of data these days and it provides a lot of information.
And you know it yourself because we are the users, the people who are constantly in need of social media or different websites for, you know, gathering information. Also, just as structured data is stored in a relational database and managed by a relational database management system, unstructured data is stored in a data lake. Now, let's switch to semi-structured data. Now, as the name suggests, semi-structured data falls between structured and unstructured data, that means it is a combination of both, partly structured data and partly unstructured data. For example, emails. Now we have all seen the basic format of an email. Part of the information in the email is structured and part of it is unstructured. For example, the name of the sender, or the email ID of the sender, the date, the time; there is a proper way of mentioning this basic information. However, the content of the email can be images, it can be text, it can be numbers, it can be PDFs. That portion of the email is unstructured data. Another example of semi-structured data is XML files, or the World Wide Web. Like, on the World Wide Web, we have all kinds of data. Some of it is structured, some of it is unstructured. That means it is a combination of both and it is semi-structured data. So that was all for this video. I hope that was helpful. Thank you so much for watching.
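
As a rough illustration of the email example from the video, here is a minimal Python sketch that splits one message into its structured part (the header fields) and its unstructured part (the free-text body). The message itself, the addresses and the variable names `raw_message`, `structured_part` and `unstructured_part` are invented for this sketch; only the standard library `email` module is used.

```python
from email import message_from_string

# Hypothetical message, invented for illustration only.
raw_message = """\
From: sender@example.com
To: receiver@example.com
Subject: Coffee survey
Date: Mon, 01 Jan 2024 09:30:00 +0000

Hi, coffee makes me feel awake and happy.
I usually have two cups a day.
"""

msg = message_from_string(raw_message)

# Structured part: header fields have fixed names and can be looked up directly.
structured_part = {
    "sender": msg["From"],
    "subject": msg["Subject"],
    "date": msg["Date"],
}

# Unstructured part: the body is free text with no pre-defined fields.
unstructured_part = msg.get_payload()

print(structured_part)
print(unstructured_part)
```

The header values could go straight into the columns of a table, while the body would need some kind of text analysis before it could answer a question like "how does coffee make you feel?".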



