Python Pandas
Pandas is an open-source Python library used to analyse and manipulate data through its own data structures.
Pandas offers high-performance merging and joining of data, and it is used in many fields such as finance, statistics, web development, artificial intelligence, and data preprocessing. It is built around the DataFrame object, which comes with a default index that can be customized.
Key Features of Pandas
- Pandas provides fast and efficient data structures, namely the Series and the DataFrame.
- Large datasets are confusing without labels. Pandas provides indexing and alignment options that make a dataset easier to understand.
- Real datasets often contain missing values, which make analysis harder and more confusing. Pandas helps you find and handle them.
- A dataset usually needs to be cleaned before it is processed, and Pandas provides tools for cleaning data.
- Pandas provides a wide range of built-in tools for reading and writing data.
- Data now comes in many formats, such as JSON, CSV, and HDF5. Pandas provides multiple ways to read them.
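The features listed above can be sketched in a few lines. This is a minimal illustration, not a complete workflow: the CSV content, the column names, and the variable names are made up for the example, and the data is read from an in-memory string so no file is needed.

```python
import io

import pandas as pd

# Hypothetical CSV content, held in memory so the example needs no file on disk.
csv_text = """name,score
Asha,81
Bilal,
Chen,67
"""

# read_csv accepts a file path or any file-like object.
scores = pd.read_csv(io.StringIO(csv_text))

# The empty field for Bilal is parsed as NaN, i.e. a missing value.
missing_mask = scores["score"].isna()
print(missing_mask.sum())  # number of missing scores

# Two common clean-up options: drop incomplete rows, or fill the gaps.
dropped = scores.dropna()
filled = scores.fillna({"score": 0})

# Alignment: arithmetic matches values by label, not by position.
a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20, 30], index=["z", "y", "x"])
total = a + b
print(total)
```

Whether to drop or fill missing values depends on the dataset; both are shown here only to illustrate the available options.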
Pandas provides the following data structures:
a. Series
b. DataFrame
Series: a one-dimensional, homogeneous, labeled array. Its values can be changed, but its size is fixed.
DataFrame: a two-dimensional, labeled, size-mutable, heterogeneous data structure.
The DataFrame is the more widely used of the two and the most important structure in Pandas.
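The difference between the two structures can be checked directly. The following is a small sketch with made-up values; the column names `item`, `price`, and `in_stock` are chosen only for illustration.

```python
import pandas as pd

# A Series holds a single dtype for all of its values.
s = pd.Series([1.5, 2.5, 3.5])
print(s.dtype)

# A DataFrame can hold a different dtype in each column.
df = pd.DataFrame({"item": ["pen", "book"], "price": [10, 250]})
print(df.dtypes)

# Size-mutable: a new column can be added to an existing DataFrame.
df["in_stock"] = [True, False]
print(df.shape)
```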
Series
A Series is the one-dimensional, homogeneous array described earlier. Here is an example for better understanding:
10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 |
DataFrame
A DataFrame is a two-dimensional, mutable data structure. Here is an example:
Name | Phone | Address | Roll No | Class |
Rohan | 8765435687 | India | 001 | 5 |
Sohan | 987654356 | India | 002 | 5 |
Reema | 6543256546 | India | 003 | 5 |
Archana | 453453455 | India | 004 | 5 |
This table holds the details of each student. The data type of each column is listed below.
Column | Data type |
Name | String |
Phone | Int |
Address | String |
Roll No | Int |
Class | Int |
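As a sketch, the student table can be built as a DataFrame from a dictionary of columns. The variable name `students` is chosen for this example. Roll numbers are stored as integers here to match the listed column type, so `001` becomes `1`.

```python
import pandas as pd

# Student records from the table above, one list per column.
students = pd.DataFrame(
    {
        "Name": ["Rohan", "Sohan", "Reema", "Archana"],
        "Phone": [8765435687, 987654356, 6543256546, 453453455],
        "Address": ["India", "India", "India", "India"],
        "Roll No": [1, 2, 3, 4],  # integers, so leading zeros are not kept
        "Class": [5, 5, 5, 5],
    }
)

# Each column has its own dtype.
print(students.dtypes)

# Look up the name of the student with roll number 3.
name = students.loc[students["Roll No"] == 3, "Name"].item()
print(name)
```

If the leading zeros in the roll numbers matter, storing that column as strings would be the safer choice.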
Pandas Example
import pandas as pd

data = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(data)
Output
0     1
1     2
2     3
3     4
4     5
5     6
6     7
7     8
8     9
9    10
dtype: int64
Question
import pandas as pd

data = {'Name': 'Ankit', 'Address': 'Lucknow', 'Gender': 'Male', 'Mobile': 234564445, 'Roll No': 54}
d = pd.Series(data)
print(d)
Output
Name           Ankit
Address      Lucknow
Gender          Male
Mobile     234564445
Roll No           54
dtype: object
Question 1
# Import pandas as pd
import pandas as pd

# Create the Series
data = pd.Series(['Football', 'Cricket', 'Hockey', 'Chess', 'Badminton'])

# Create the index labels
index = ['Sports1', 'Sports2', 'Sports3', 'Sports4', 'Sports5']

# Set the index
data.index = index

# Print the Series
print(data)
Output
Sports1     Football
Sports2      Cricket
Sports3       Hockey
Sports4        Chess
Sports5    Badminton
dtype: object
Program 2
# Import pandas as pd
import pandas as pd

# Create the Series
food_price = pd.Series([250, 100, 300, 400, 300, 333, 600])

# Create the index labels
recipe = ['Chow mein', 'Burger', 'HOT Dog', 'French Fry', 'Macaroni', 'Pasta', 'Pizza']

# Set the index
food_price.index = recipe

# Print the Series
print(food_price)
Output
Chow mein     250
Burger        100
HOT Dog       300
French Fry    400
Macaroni      300
Pasta         333
Pizza         600
dtype: int64
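Once a Series has labels, values can be looked up and filtered by those labels. The sketch below reuses the data from Program 2 but passes the labels through the `index` argument at construction time, which is an alternative to assigning `food_price.index` afterwards. The variable name `expensive` is chosen for this example.

```python
import pandas as pd

# Same data as Program 2, with the index passed at construction time.
food_price = pd.Series(
    [250, 100, 300, 400, 300, 333, 600],
    index=["Chow mein", "Burger", "HOT Dog", "French Fry", "Macaroni", "Pasta", "Pizza"],
)

# Look up a single value by its label.
print(food_price["Pizza"])

# Boolean filtering keeps the labels of the matching rows.
expensive = food_price[food_price > 300]
print(expensive)

# idxmin returns the label of the smallest value.
print(food_price.idxmin())
```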