I have a CSV file (NullFilterExample.csv) with the records shown below.
Let's filter out all rows where the customer name contains “ika”.
import pandas as pd
import numpy as np
df1 = pd.read_csv("NullFilterExample.csv")
print('Original DF rows \n',df1 , '\n')
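# "not like" filter: keep rows whose customer_name does NOT contain 'ika'
# (na=False treats a missing name as "no match", so that row is kept)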
df2 = df1[~df1.customer_name.str.contains('ika', na=False)]
print('Rows where customer Name not contain ika \n',df2)
Output:
Original DF rows
   account_no  branch  city_code customer_name  amount
0        2112  3212.0      321.0       Sidhika   19000
1        2119     NaN      215.0      Prayansh   12000
2        2115  4321.0      212.0       Rishika   15000
3        2435  2312.0        NaN      Sagarika   13000
4        2356  7548.0      256.0           NaN   15000
Rows where customer Name not contain ika
   account_no  branch  city_code customer_name  amount
1        2119     NaN      215.0      Prayansh   12000
4        2356  7548.0      256.0           NaN   15000
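To see what the ~ operator and na=False each contribute to this filter, here is a small sketch. It assumes a hand-built DataFrame that mirrors the rows printed above instead of reading NullFilterExample.csv, and the names kept_with_nan and kept_without_nan are illustrative only.

```python
import numpy as np
import pandas as pd

# Assumption: stand-in for NullFilterExample.csv, built from the rows printed above.
df1 = pd.DataFrame({
    "account_no": [2112, 2119, 2115, 2435, 2356],
    "branch": [3212.0, np.nan, 4321.0, 2312.0, 7548.0],
    "city_code": [321.0, 215.0, 212.0, np.nan, 256.0],
    "customer_name": ["Sidhika", "Prayansh", "Rishika", "Sagarika", np.nan],
    "amount": [19000, 12000, 15000, 13000, 15000],
})

# na=False: a missing name counts as "does not contain 'ika'",
# so after negating with ~ the row with the missing name is kept.
kept_with_nan = df1[~df1["customer_name"].str.contains("ika", na=False)]

# na=True: a missing name counts as "contains 'ika'",
# so after negating with ~ the row with the missing name is dropped.
kept_without_nan = df1[~df1["customer_name"].str.contains("ika", na=True)]

print(kept_with_nan.index.tolist())     # [1, 4]
print(kept_without_nan.index.tolist())  # [1]
```

Which value of na to use depends on whether rows with a missing customer name should survive the "not like" filter; the post uses na=False, which keeps them.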
In the output of df2, the two remaining rows still carry their original
index labels, 1 and 4, so the index is no longer a continuous sequence
starting at 0. Let's add the logic to reset the index.
#reindexing DF2 dataframe
df3 = df2.reset_index(drop=True)
print('After reindexing of DF3 \n',df3)
Output:
After reindexing of DF3
   account_no  branch  city_code customer_name  amount
0        2119     NaN      215.0      Prayansh   12000
1        2356  7548.0      256.0           NaN   15000
In this output the index is now correct and sequential,
i.e. 0 and 1.
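The effect of the reset can also be checked programmatically. The sketch below assumes a hypothetical two-row frame that mimics df2 after filtering (index labels 1 and 4, only two columns kept for brevity); has_label_zero_before is an illustrative name.

```python
import pandas as pd

# Assumption: a small frame that mimics df2 after filtering (labels 1 and 4).
df2 = pd.DataFrame(
    {"customer_name": ["Prayansh", None], "amount": [12000, 15000]},
    index=[1, 4],
)

# Before the reset there is no row labelled 0, so df2.loc[0] would raise KeyError.
has_label_zero_before = 0 in df2.index

# reset_index returns a new DataFrame; df2 keeps its old labels.
df3 = df2.reset_index(drop=True)

print(has_label_zero_before)   # False
print(df2.index.tolist())      # [1, 4]
print(df3.index.tolist())      # [0, 1]
print(df3.loc[0, "amount"])    # 12000
```

Because the result is a new object, it has to be assigned to a variable (df3 here) for the new index to be used afterwards.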
Now let's look at the syntax:
df2.reset_index(drop=True)
The parameter “drop=True” discards the existing index of the rows
and creates a new one starting at 0.
But what if we want to preserve the original index of the rows? Simply
remove the optional parameter “drop=True”.
#what if we remove the "drop=True" parameter of reset_index
df4 = df2.reset_index()
print('After reindexing of DF2 \n',df4)
Output:
After reindexing of DF2
   index  account_no  branch  city_code customer_name  amount
0      1        2119     NaN      215.0      Prayansh   12000
1      4        2356  7548.0      256.0           NaN   15000
As you can see, this creates a new column called “index” that
preserves the original index values, while the rows get a fresh index starting at 0.
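Since a column named “index” can be confusing next to the real index, one option is to rename it right after the reset. This is only a sketch: the frame below is a hypothetical stand-in for df2, and the column name original_row is made up for illustration.

```python
import pandas as pd

# Assumption: a small frame that mimics df2 after filtering (labels 1 and 4).
df2 = pd.DataFrame(
    {"customer_name": ["Prayansh", None], "amount": [12000, 15000]},
    index=[1, 4],
)

# Keep the old labels as a regular column, then give that column a clearer name.
df4 = df2.reset_index().rename(columns={"index": "original_row"})

print(df4.columns.tolist())          # ['original_row', 'customer_name', 'amount']
print(df4["original_row"].tolist())  # [1, 4]
print(df4.index.tolist())            # [0, 1]
```

Keeping the old labels this way makes it possible to trace each filtered row back to its position in the original data.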
Full code:
import pandas as pd
import numpy as np
df1 = pd.read_csv("NullFilterExample.csv")
print('Original DF rows \n',df1 , '\n')
# "not like" filter: keep rows whose customer_name does NOT contain 'ika'
df2 = df1[~df1.customer_name.str.contains('ika', na=False)]
print('Rows where customer Name not contain ika \n',df2)
#reindexing DF2 dataframe
df3 = df2.reset_index(drop=True)
print('After reindexing of DF2 \n',df3)
#what if we remove the "drop=True" parameter of reset_index
df4 = df2.reset_index()
print('After reindexing of DF2 \n',df4)
Post Reference: Vikram Aristocratic Elfin Share