Pandas is a Python library developed for data manipulation and analysis. It is free software released under the three-clause BSD license. Pandas provides two main data structures, Series and DataFrame, along with operations for manipulating numerical tables and time series.
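Pandas is well suited to many kinds of data, such as tabular data with mixed column types (as in an SQL table or Excel spreadsheet), ordered and unordered time series, arbitrary matrix data, and other observational or statistical data sets.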
To install pandas, open your command line or terminal and type "pip install pandas". If you have Anaconda installed on your system, type "conda install pandas" instead. Once the installation is complete, open your IDE (here I have used Jupyter Notebook) and import the pandas library by typing:
#import pandas
import pandas as pd
#To get the current version of pandas
print(pd.__version__)
Output: 1.1.3
#Create a series from a list
list1=[10,20,30,40,50]
s1=pd.Series(list1)
print(s1)
Output:
0 | 10 |
1 | 20 |
2 | 30 |
3 | 40 |
4 | 50 |
dtype: int64
#Customize the index
list2=['a','b','c','d','e']
s2=pd.Series(list1,index=list2)
print(s2)
Output:
a | 10 |
b | 20 |
c | 30 |
d | 40 |
e | 50 |
dtype: int64
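A custom index is mostly useful because it lets you look values up by label. Here is a minimal sketch that rebuilds s2 from the example above; the names by_label, by_position and subset are my own and only for illustration.

```python
import pandas as pd

# Same data as s2 above: values 10..50 labelled 'a'..'e'
s2 = pd.Series([10, 20, 30, 40, 50], index=['a', 'b', 'c', 'd', 'e'])

by_label = s2.loc['b']        # look up by index label
by_position = s2.iloc[1]      # look up by integer position (0-based)
subset = s2.loc[['a', 'e']]   # a list of labels returns a smaller Series

print(by_label, by_position)
print(subset)
```

Both .loc['b'] and .iloc[1] point at the same element here, because 'b' is the second label.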
#Create a series from random numbers
#To get random numbers by invoking randn()
import numpy as np
x=np.random.randn(10)
x
Output:
#Then create a Series by taking random numbers
s3=pd.Series(x)
s3
Output:
0 | 0.146392 |
1 | -0.748931 |
2 | 0.525632 |
3 | 0.296724 |
4 | -1.251444 |
5 | 0.070018 |
6 | -0.465796 |
7 | 0.784062 |
8 | 0.763505 |
9 | 1.062141 |
dtype: float64
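The numbers above will be different every time you run the cell. If you want the same values on each run, one option is to use a seeded generator. This is only a sketch, assuming NumPy 1.17 or later; the seed 42 and the name "again" are arbitrary choices of mine.

```python
import numpy as np
import pandas as pd

# A seeded generator returns the same sequence on every run
rng = np.random.default_rng(42)
x = rng.standard_normal(10)     # 10 draws from the standard normal distribution
s3 = pd.Series(x)

# Re-creating the generator with the same seed reproduces the Series
again = pd.Series(np.random.default_rng(42).standard_normal(10))
print(s3.equals(again))
```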
#Create a series from a dictionary
dict1={'a':100,'b':200,'c':300}
s4=pd.Series(dict1)
s4
Output:
a | 100 |
b | 200 |
c | 300 |
dtype: int64
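When a Series is built from a dictionary, the keys become the index. You can also pass an index of your own to pick and reorder keys. The short sketch below reuses dict1; the label 'z' is a made-up key that is not in the dictionary, included to show what happens to missing labels.

```python
import pandas as pd

dict1 = {'a': 100, 'b': 200, 'c': 300}

# Only the labels listed in index are kept, in that order.
# 'z' is not a key of dict1, so its value becomes NaN
# and the dtype changes from int64 to float64.
s4_subset = pd.Series(dict1, index=['c', 'a', 'z'])
print(s4_subset)
```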
print(s1)
Output:
0 | 10 |
1 | 20 |
2 | 30 |
3 | 40 |
4 | 50 |
dtype: int64
#To get the minimum value
s1.min()
Output
10
#To get the max value
s1.max()
Output
50
#To get mean value
s1.mean()
Output
30.0
#To get the median value
s1.median()
Output
30.0
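Instead of calling each method separately, you can ask for several statistics at once. A small sketch, assuming the same s1 as above; the variable name summary is mine.

```python
import pandas as pd

s1 = pd.Series([10, 20, 30, 40, 50])

# describe() returns count, mean, std, min, quartiles and max in one Series
summary = s1.describe()
print(summary)

# Individual statistics are still available as methods
print(s1.sum())   # total of all values
print(s1.std())   # sample standard deviation
```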
s5=pd.Series([10,20,30])
s5
Output
0 | 10 |
1 | 20 |
2 | 30 |
dtype: int64
#Add two series
s1.add(s5)
Output
0 | 20.0 |
1 | 40.0 |
2 | 60.0 |
3 | NaN |
4 | NaN |
dtype: float64
#subtract two series
s1.sub(s5)
Output
0 | 0.0 |
1 | 0.0 |
2 | 0.0 |
3 | NaN |
4 | NaN |
dtype: float64
#multiply two series
s1.mul(s5)
Output
0 | 100.0 |
1 | 400.0 |
2 | 900.0 |
3 | NaN |
4 | NaN |
dtype: float64
#Division of two series
s1.div(s5)
Output
0 | 1.0 |
1 | 1.0 |
2 | 1.0 |
3 | NaN |
4 | NaN |
dtype: float64
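The NaN values appear because pandas lines the two series up by index label, and s5 has no entries for labels 3 and 4. If you would rather treat the missing entries as a number, the arithmetic methods accept a fill_value argument. A minimal sketch with the same s1 and s5; the names filled and with_operator are my own.

```python
import pandas as pd

s1 = pd.Series([10, 20, 30, 40, 50])
s5 = pd.Series([10, 20, 30])

# Missing labels in s5 are treated as 0 instead of producing NaN
filled = s1.add(s5, fill_value=0)
print(filled.tolist())   # [20.0, 40.0, 60.0, 40.0, 50.0]

# The + operator behaves like add() without fill_value
with_operator = s1 + s5
print(with_operator.isna().sum())   # 2
```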
#Create a DataFrame from a List
list2=['java','python','php','ruby']
df1=pd.DataFrame(list2)
print(df1)
Output
0 | |
0 | java |
1 | python |
2 | php |
3 | ruby |
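The column in the output above is called 0 because no name was given. If you want a readable column name, you can pass one with the columns argument. In this sketch the column name 'language' and the variable name named_df are my own choices.

```python
import pandas as pd

list2 = ['java', 'python', 'php', 'ruby']

# columns takes one name per column; here there is a single column
named_df = pd.DataFrame(list2, columns=['language'])
print(named_df)
print(named_df.shape)   # (4, 1)
```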
#Create a DataFrame from a dictionary
employee_detail={'eid':[101,102,103],'ename':['Rahul','Sachin','Sourav'],
'esal':[10000,20000,30000]}
df2=pd.DataFrame(employee_detail)
print(df2)
Output
eid | ename | esal | |
0 | 101 | Rahul | 10000 |
1 | 102 | Sachin | 20000 |
2 | 103 | Sourav | 30000 |
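Once the DataFrame exists, the dictionary keys work as column names, so you can select a column or filter rows by a condition. A short sketch, assuming the salaries 10000, 20000 and 30000 from the output table; the threshold 15000 and the names names and high_paid are only for illustration.

```python
import pandas as pd

employee_detail = {
    'eid': [101, 102, 103],
    'ename': ['Rahul', 'Sachin', 'Sourav'],
    'esal': [10000, 20000, 30000],
}
df2 = pd.DataFrame(employee_detail)

# Selecting one column returns a Series
names = df2['ename']
print(names.tolist())

# A boolean condition keeps only the rows where it is True
high_paid = df2[df2['esal'] > 15000]
print(high_paid)
```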
#Create a DataFrame from a csv file
#Use a raw string so the backslashes in the Windows path are not treated as escapes
df3=pd.read_csv(r"E:\cipet\salary_info.csv")
df3.head()
Output
df3.tail()
Output
df3.shape
Output
(30, 2)
df3.info()
df3.describe()
Output
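If you do not have a CSV file at hand, you can try the same calls on text held in memory. This is only a sketch: the column names 'experience' and 'salary' and the three rows are made up by me and are not the contents of salary_info.csv.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for a real file on disk
csv_text = "experience,salary\n1,10000\n2,20000\n3,30000\n"

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.head(2))   # first two rows
print(df.tail(1))   # last row
print(df.shape)     # (rows, columns)
print(df.dtypes)    # data type of each column
```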
#Concatenating two data frames
dict1={'regdno':[101,102,103],'name':['Sachin','Sourav','Rahul'],
'branch':['CSE','IT','EEE']}
df4=pd.DataFrame(dict1,index=[0,1,2])
df4
Output
dict2={'regdno':[104,105,106],'name':['Akshay','Ajay','Abhay'],
'branch':['ETC','Civil','Mechanical']}
df5=pd.DataFrame(dict2,index=[3,4,5])
df5
Output
#concatenate two data frame
result=pd.concat([df4,df5])
print(result)
Output
regdno | name | branch | |
0 | 101 | Sachin | CSE |
1 | 102 | Sourav | IT |
2 | 103 | Rahul | EEE |
3 | 104 | Akshay | ETC |
4 | 105 | Ajay | Civil |
5 | 106 | Abhay | Mechanical |
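In the example above the index runs from 0 to 5 only because df4 and df5 were given the indexes [0,1,2] and [3,4,5] by hand. With default indexes the labels would repeat, and ignore_index=True is one way to renumber them. A minimal sketch; the frames first and second and the result names are my own, and I have left out the branch column to keep it short.

```python
import pandas as pd

first = pd.DataFrame({'regdno': [101, 102, 103],
                      'name': ['Sachin', 'Sourav', 'Rahul']})
second = pd.DataFrame({'regdno': [104, 105, 106],
                       'name': ['Akshay', 'Ajay', 'Abhay']})

# Default behaviour: each frame keeps its own index labels
kept = pd.concat([first, second])
print(list(kept.index))         # [0, 1, 2, 0, 1, 2]

# ignore_index=True discards the old labels and numbers rows from 0
renumbered = pd.concat([first, second], ignore_index=True)
print(list(renumbered.index))   # [0, 1, 2, 3, 4, 5]
```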
#merging two data frame
dict3={'regdno':[107,108,109],'name':['Lelin','Nirmal','Kalandi'],
'branch':['ETC','Civil','Mechanical']}
df6=pd.DataFrame(dict3)
df6
Output
dict4={'regdno':[107,108,109],'address':['BBSR','Cuttack','Paradeep'],
'age':[22,24,26]}
df7=pd.DataFrame(dict4)
df7
Output
result1=pd.merge(df6,df7,on='regdno')
print(result1)
Output
regdno | name | branch | address | age | |
0 | 107 | Lelin | ETC | BBSR | 22 |
1 | 108 | Nirmal | Civil | Cuttack | 24 |
2 | 109 | Kalandi | Mechanical | Paradeep | 26 |
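Here every regdno appears in both data frames, so all rows survive the merge. When the keys only partly overlap, the how argument decides which rows are kept. The sketch below uses two small frames of my own (students and addresses, with regdno 110 invented for the example) rather than df6 and df7.

```python
import pandas as pd

students = pd.DataFrame({'regdno': [107, 108, 109],
                         'name': ['Lelin', 'Nirmal', 'Kalandi']})
# Hypothetical data: 107 is missing here and 110 has no matching student
addresses = pd.DataFrame({'regdno': [108, 109, 110],
                          'address': ['Cuttack', 'Paradeep', 'BBSR']})

# inner (the default): only keys present in both frames
inner = pd.merge(students, addresses, on='regdno')

# left: every row of students, NaN where no address matches
left = pd.merge(students, addresses, on='regdno', how='left')

# outer: keys from either frame
outer = pd.merge(students, addresses, on='regdno', how='outer')

print(len(inner), len(left), len(outer))   # 2 3 4
```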