Yahoo Finance 24/7 Stream: Daily Market Coverage & more

Yahoo Finance

10m 13s · 1,318 words · ~7 min read
Transcript source: AI audio transcription

This transcript was generated from the video's audio because no usable YouTube caption track was available.

[0:00] Hello, welcome back to another video. This is the fourth episode of our Python for Data Science and Machine Learning series. In the last video, we covered the basics of Jupyter Notebook, and in this video, we're going to dive deep into pandas. We'll briefly talk about series and how to create them, and we'll talk about data frames in pandas and how to create them. We'll also see how to read and write CSV files and Excel files in pandas. In the next video, we'll cover all of the remaining topics in pandas. So, let's get started.

First, we'll talk about the installation. Pandas is an open-source library that is built on top of NumPy. It allows for fast analysis, cleaning, and transformation of our data. It's excellent in terms of built-in visualization, and it's also very good in terms of data modeling. The easiest way to install pandas is through the Anaconda distribution. If you remember, in the first video we downloaded Anaconda, and that automatically installs pandas. If you don't have Anaconda installed, you can simply open up your terminal and type `pip install pandas`, and it will be installed.

Next, we need to import pandas as pd. The convention is to always import pandas as pd. If we run that, we will not see any output. So, let's move on.

First, we'll talk about the series. A series is very similar to a NumPy array, and it's built on top of the NumPy array object. The main difference is that a NumPy array doesn't have labels associated with its indices; it only has integer-based indexing. In a series, we can define our own labels. We can create a series from a list, from a NumPy array, or from a dictionary. First, let's see how to create a series from a list.
So, let's say we have a list called `my_list`, and it's equal to `[10, 20, 30]`. If we create a pandas series using `pd.Series` and pass in `my_list`, it will create a series for us. You can see on the left, we have the default integer-based indexing, and on the right, we have the data. And the data type is `int64`. So, this is how we create a simple series from a list.

Now, let's see if we can define our own labels. Let's say we have a list of labels, `labels`, equal to `['a', 'b', 'c']`. If we call `pd.Series(data=my_list, index=labels)` and run that, you can see that now our labels are 'a', 'b', 'c'. So, this is how we can define our own labels. And again, the data type is `int64`.

Now, let's see how to create a series from a NumPy array. First, we need to import NumPy as np. Let's say we have an array called `arr`, and it's equal to `np.array([10, 20, 30])`. If we create a pandas series using `pd.Series(arr)`, it will create a series for us. Again, on the left we have the default integer-based indexing, and on the right we have the data, with data type `int64`. We can also define our own labels here. Let's say we have `labels` equal to `['d', 'e', 'f']`. If we call `pd.Series(data=arr, index=labels)`, it will create a series for us with custom labels, and you can see the labels are 'd', 'e', 'f'. So, this is how we create a series from a NumPy array.

Now, let's see how to create a series from a dictionary. Let's say we have a dictionary called `d`, and it's equal to `{'a': 10, 'b': 20, 'c': 30}`. If we create a pandas series using `pd.Series(d)`, it will create a series for us. You can see the keys of the dictionary become the labels, and the values become the data. So, this is how we create a series from a dictionary.
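The three ways of creating a series described above can be collected into one short sketch. The variable names `s_default`, `s_labeled`, `s_from_array`, and `s_from_dict` are not from the video; they are introduced here only to keep the results apart.

```python
import numpy as np
import pandas as pd

# From a list: pandas assigns the default integer index 0, 1, 2.
my_list = [10, 20, 30]
s_default = pd.Series(my_list)

# From a list with custom labels.
labels = ['a', 'b', 'c']
s_labeled = pd.Series(data=my_list, index=labels)

# From a NumPy array with custom labels.
arr = np.array([10, 20, 30])
s_from_array = pd.Series(data=arr, index=['d', 'e', 'f'])

# From a dictionary: the keys become the labels, the values become the data.
d = {'a': 10, 'b': 20, 'c': 30}
s_from_dict = pd.Series(d)

print(s_labeled)
print(s_labeled['b'])  # label-based lookup, prints 20
```

Once a series has labels, values can be looked up by label, as in the last line, rather than only by integer position.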
We don't need to specify the labels explicitly because the dictionary keys become the labels. So, this is how we create a series.

Now, let's move on and talk about data frames. Data frames are basically multiple series sharing the same index. You can think of a data frame as a spreadsheet with rows and columns. Each column is a series, and all the series share the same index.

So, let's see how to create a data frame. We'll be using NumPy to generate some random data, so first, we need to import NumPy as np. Let's create a random data set. We'll use `np.random.randn` to generate random numbers. Let's say we want a 5x4 matrix, so we'll have `np.random.randn(5, 4)`, and let's store that in a variable called `data`. If we print `data`, you can see we have a 5x4 NumPy array with random numbers. Now, let's define our index. Let's say `index` is equal to `['A', 'B', 'C', 'D', 'E']`. And let's define our columns. Let's say `columns` is equal to `['W', 'X', 'Y', 'Z']`. Now, we can create a data frame using `pd.DataFrame(data=data, index=index, columns=columns)`, and let's store that in a variable called `df`. If we display `df`, you can see we have a data frame with our defined index and columns. So, this is how we create a data frame from a NumPy array.

Now, let's move on and talk about input and output of CSV files. A CSV file is a comma-separated values file. It's a very common file format for storing tabular data. So, let's see how to read a CSV file into a data frame. We'll be using a CSV file called `example.csv`. We'll use `pd.read_csv('example.csv')` to read the CSV file, and let's store that in a variable called `df_csv`. If we display `df_csv`, you can see we have read the CSV file into a data frame. So, this is how we read a CSV file.

Now, let's see how to write a data frame to a CSV file. We'll use the `to_csv` method. So, `df.to_csv('my_output.csv')`.
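The data frame and CSV steps described so far can be sketched as follows. Two things here are assumptions rather than part of the video: the file is written to a temporary directory instead of the current directory, and `index_col=0` is passed to `pd.read_csv` so that the saved row labels come back as the index instead of as an extra unnamed column. The file `example.csv` from the video is not available, so the sketch reads back the file it has just written.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# 5x4 array of samples drawn from a standard normal distribution.
data = np.random.randn(5, 4)
index = ['A', 'B', 'C', 'D', 'E']
columns = ['W', 'X', 'Y', 'Z']
df = pd.DataFrame(data=data, index=index, columns=columns)

# Each column of the data frame is a series that shares the same index.
col_w = df['W']

# Write the data frame to CSV, then read it back.
# The temporary directory is an assumption made to avoid leaving files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, 'my_output.csv')
    df.to_csv(csv_path)
    # index_col=0 restores the saved row labels as the index.
    df_csv = pd.read_csv(csv_path, index_col=0)

print(df_csv)
```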
And if we run that, it will create a new CSV file called `my_output.csv` in our current directory. So, this is how we write a data frame to a CSV file.

Now, let's move on and talk about input and output of Excel files. Excel files are also a common format for storing tabular data. So, let's see how to read an Excel file into a data frame. We'll be using an Excel file called `Excel_Sample.xlsx`. We'll use `pd.read_excel('Excel_Sample.xlsx')` to read the Excel file, and let's store that in a variable called `df_excel`. If we display `df_excel`, you can see we have read the Excel file into a data frame. So, this is how we read an Excel file.

Now, let's see how to write a data frame to an Excel file. We'll use the `to_excel` method. So, `df.to_excel('my_excel_output.xlsx', sheet_name='Sheet1')`. If we run that, it will create a new Excel file called `my_excel_output.xlsx` in our current directory. So, this is how we write a data frame to an Excel file.

So, this concludes this video. In this video, we talked about series in pandas, data frames in pandas, and how to read and write CSV and Excel files. In the next video, we're going to cover all of the remaining topics in pandas. So, stay tuned, and I'll see you in the next one.
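The Excel round trip described above might look like the sketch below. Several details are assumptions: the small data frame stands in for the `df` built in the video, the file goes to a temporary directory, and `index_col=0` is added so the row labels are restored on reading. pandas relies on an optional engine for `.xlsx` files; the sketch assumes `openpyxl` and skips the round trip if it is not installed.

```python
import importlib.util
import os
import tempfile

import pandas as pd

# A small stand-in data frame (hypothetical values, not from the video).
df = pd.DataFrame({'W': [1.5, 2.5], 'X': [3.5, 4.5]}, index=['A', 'B'])

# Check for the optional Excel engine before attempting the round trip.
has_engine = importlib.util.find_spec('openpyxl') is not None

df_excel = None
if has_engine:
    with tempfile.TemporaryDirectory() as tmp_dir:
        xlsx_path = os.path.join(tmp_dir, 'my_excel_output.xlsx')
        # Write the data frame to a named sheet, then read that sheet back.
        df.to_excel(xlsx_path, sheet_name='Sheet1')
        df_excel = pd.read_excel(xlsx_path, sheet_name='Sheet1', index_col=0)
    print(df_excel)
else:
    print('openpyxl is not installed; skipping the Excel round trip')
```

If the engine is missing, `pip install openpyxl` should be enough to make the round trip run.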
