About Me

Mumbai, Maharashtra, India
He has more than 7.6 years of experience in software development and has spent most of that time on web and desktop application development. He has sound knowledge of various database concepts. You can reach him at viki.keshari@gmail.com https://www.linkedin.com/in/vikrammahapatra/ https://twitter.com/VikramMahapatra http://www.facebook.com/viki.keshari

Showing posts with label Data Science. Show all posts
Sunday, June 7, 2020

Python Trick: Alternative to if-else/Case statement

Let's look at the code snippet below:

def add_number(a, b):
    print(a + b)

def multiply_number(a, b):
    print(a * b)

def division_number(a, b):
    print(a / b)

result = '30'

if result == '10':
    add_number(10, 20)
elif result == '20':
    multiply_number(12, 2)
elif result == '30':
    division_number(36, 3)


Here we have three methods, and the one that gets called depends on the value of result: if result is '10', add_number is called; if result is '20', multiply_number is called; and so on.

As you can see above, we have written a multiline if-elif statement to call a method on the basis of the result value.

Now let's look at the code snippet below. Here we declare a dictionary with the result value as the key and the associated method as the value.

result_dict = {
    '10': add_number,
    '20': multiply_number,
    '30': division_number
}

result = '10'
result_dict[result](1, 2)

The result value is stored in the result variable, and that variable is used as the key to look up the associated method in the dictionary, which is then called with the arguments. So it is just a one-line statement instead of an if-else ladder.
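One caveat with the dictionary lookup is that an unknown key raises a KeyError, whereas the if-elif ladder simply does nothing. Here is a minimal sketch of one way to handle that. It assumes a hypothetical dispatch helper and a hypothetical unknown_operation fallback (neither is part of the snippet above), and the methods return their values instead of printing them so the results are easy to check:

```python
def add_number(a, b):
    return a + b

def multiply_number(a, b):
    return a * b

def division_number(a, b):
    return a / b

def unknown_operation(a, b):
    # Hypothetical fallback, used when the key is not in the dictionary
    return None

result_dict = {
    '10': add_number,
    '20': multiply_number,
    '30': division_number,
}

def dispatch(result, a, b):
    # dict.get returns the fallback instead of raising KeyError
    return result_dict.get(result, unknown_operation)(a, b)

print(dispatch('10', 1, 2))
print(dispatch('30', 36, 3))
print(dispatch('99', 1, 2))
```

Whether a silent fallback or a loud KeyError is better depends on the program; the fallback simply mirrors what the if-elif ladder does when no branch matches.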


Enjoy the Pythonic way :)

Post Reference: Vikram Aristocratic Elfin Share

Python Trick: Make your program look simple and clean with the help of a dictionary

When you have multiple objects in your program, it often looks messy, with a different object name being used for each method call. Let's see an example:

class Dog:
    def __init__(self, name):
        self.talents = []
        self.name = name

    def add_talent(self, talent):
        self.talents.append(talent)

dog_obj1 = Dog('Peddy')
dog_obj2 = Dog('Magnet')

dog_obj1.add_talent('Black Plet')
dog_obj1.add_talent('Fisher Dog')

dog_obj2.add_talent('Guard Dog')
dog_obj2.add_talent('Happy Eater')

print("Talent of Peddy")
print(dog_obj1.talents)

print("\nTalent of Magnet")
print(dog_obj2.talents)

#Output
Talent of Peddy
['Black Plet', 'Fisher Dog']

Talent of Magnet
['Guard Dog', 'Happy Eater']


In the code above, the Dog class has a constructor that initializes two instance variables, talents (a list) and name, and one instance method, add_talent, which adds a talent to the Dog object.

Later in the code, we create two objects, dog_obj1 ('Peddy') and dog_obj2 ('Magnet'), and then call the add_talent method to add talents for Peddy and Magnet.

The code is fine, but imagine you have plenty of objects in your program; it may then look quite messy with all the different object names.

So what could be the way?

Simple: create a dictionary that maps a key to each object, and access each object's instance variables and methods through its dictionary key. Look at the code snippet below:

# Creating a dictionary of Dog objects
obj_dict = {
    'Peddy': dog_obj1,
    'Magnet': dog_obj2
}

# Accessing an instance variable of the class
print(obj_dict['Peddy'].talents)
print(obj_dict['Magnet'].talents)

# Adding a new talent for Peddy is quite simple
obj_dict['Peddy'].add_talent('Wolf mound')

# Printing the talents of Peddy
print(obj_dict['Peddy'].talents)

#OUTPUT

['Black Plet', 'Fisher Dog']
['Guard Dog', 'Happy Eater']
['Black Plet', 'Fisher Dog', 'Wolf mound']


So here we created a dictionary object, obj_dict, that holds the Dog objects under the keys Peddy and Magnet. Thereafter, we access the talents of Peddy and Magnet through the dictionary key:

obj_dict['Peddy'].talents
obj_dict['Magnet'].talents

Now if we want to add a new talent for Peddy, it is quite simple:

obj_dict['Peddy'].add_talent('Wolf mound')

This way, your program looks much less messy and far more readable.
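Taking the idea one step further, the dictionary can be built directly from the names, so the separate dog_obj1 and dog_obj2 variables are not needed at all. This is only a sketch: the dogs and new_talents names are hypothetical and introduced here for illustration.

```python
class Dog:
    def __init__(self, name):
        self.talents = []
        self.name = name

    def add_talent(self, talent):
        self.talents.append(talent)

# Build the dictionary straight from the names with a dict comprehension
dogs = {name: Dog(name) for name in ['Peddy', 'Magnet']}

# Hypothetical input data: talents to add, keyed by dog name
new_talents = {
    'Peddy': ['Black Plet', 'Fisher Dog'],
    'Magnet': ['Guard Dog', 'Happy Eater'],
}

# One loop handles every dog, however many there are
for name, talents in new_talents.items():
    for talent in talents:
        dogs[name].add_talent(talent)

for name, dog in dogs.items():
    print(name, dog.talents)
```

Adding a third dog then means adding one name to the list and one entry to new_talents, with no new variable names.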


Enjoy the Pythonic way :)


Tuesday, April 14, 2020

COVID-19 India Data Analysis: Predicting Total Cases on 4th of May (End of Lockdown Version-02)


Here we focus on what the confirmed case count will be on the last day of lockdown version-02 in India. The entire analysis is based on the growth rate technique. Let's import the required modules:

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
import plotly.express as px
import plotly.offline as py
import plotly.graph_objs as go
py.init_notebook_mode(connected=True)
import folium
import os
import datetime


Let's try to find the growth rate, considering the data from 30th Jan:

confirmed_df = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/'
                           + 'COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/'
                           + 'time_series_covid19_confirmed_global.csv')

# Keep India's row and every column up to and including 13th April
india_sel = confirmed_df[confirmed_df['Country/Region'] == 'India'].loc[:, :'4/13/20']
india_confirmed_list = india_sel.values.tolist()[0]
india_confirmed_list[4]  # first daily count; columns 0-3 hold province, country, lat and long
growth_diff = []

for i in range(4, len(india_confirmed_list)):
    # No ratio can be computed for the first day or after a zero count,
    # so the raw count is stored instead
    if (i == 4) or india_confirmed_list[i-1] == 0:
        growth_diff.append(india_confirmed_list[i])
    else:
        growth_diff.append(india_confirmed_list[i] / india_confirmed_list[i-1])

growth_factor = sum(growth_diff) / len(growth_diff)
print('Average growth factor', growth_factor)

#OUTPUT: GROWTH RATE
Average growth factor 1.0637553331032963
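To make the growth factor idea concrete, here is a small sketch on made-up numbers. The average_growth_factor helper and the sample_counts list are hypothetical, and this version skips the days that follow a zero count rather than storing the raw count as the loop above does, so it will not reproduce the figure printed above.

```python
def average_growth_factor(counts):
    # Day-over-day ratios, skipping days whose previous count is zero
    ratios = [
        today / yesterday
        for yesterday, today in zip(counts, counts[1:])
        if yesterday != 0
    ]
    if not ratios:
        raise ValueError('need at least one non-zero day followed by another day')
    return sum(ratios) / len(ratios)

# Made-up cumulative counts that grow by 10% a day once cases appear
sample_counts = [0, 0, 100, 110, 121, 133.1]
print(average_growth_factor(sample_counts))
```

A factor above 1 means the cumulative count is still growing; the further above 1 it is, the faster the count grows.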


Let's now calculate the case count for the next 21 days and plot it in a chart:

x_axis_prediction_dt = []

# Dates covered by the selection above (the daily columns start at index 4)
dates = list(pd.to_datetime(india_sel.columns[4:], format='%m/%d/%y'))

# Add one day at a time, starting from the last day for which we have data
start_date = dates[-1]
for i in range(21):
    date = start_date + datetime.timedelta(days=1)
    x_axis_prediction_dt.append(date)
    start_date = date

# Last available total for India, taken as a single value from the selection
previous_day_cases = india_sel.iloc[0, -1]
y_axis_predicted_next21days_cases = []

# Each day's prediction is the previous day's value times the growth factor
for i in range(21):
    predicted_value = previous_day_cases * growth_factor
    y_axis_predicted_next21days_cases.append(predicted_value)
    previous_day_cases = predicted_value
# print(previous_day_cases)

# Plot the predicted values
fig1 = go.Figure()
fig1.add_trace(go.Scatter(
    x=x_axis_prediction_dt,
    y=y_axis_predicted_next21days_cases,
    name='India'
))

fig1.layout.update(
    title_text='COVID-19 next twenty-one days prediction',
    xaxis_showgrid=False, yaxis_showgrid=False, width=800, height=500,
    font=dict(
        # family="Courier New, monospace",
        size=12,
        color="white"
    ))
fig1.layout.plot_bgcolor = 'Black'
fig1.layout.paper_bgcolor = 'Black'
fig1.show()

The growth rate technique predicts that the case count will jump over 35k by the 3rd of May.
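Multiplying by the same growth factor every day is compound growth, so the value after n days can also be written directly as last_count * growth_factor ** n. The sketch below checks this on hypothetical numbers; project_cases, last_count and the 1.10 factor are made up for illustration and are not the values from the analysis.

```python
def project_cases(last_count, growth_factor, days):
    # Repeatedly multiply by the growth factor, one step per day
    projected = []
    current = last_count
    for _ in range(days):
        current *= growth_factor
        projected.append(current)
    return projected

last_count = 1000       # hypothetical last known total
growth_factor = 1.10    # hypothetical average growth factor
projection = project_cases(last_count, growth_factor, 21)

# The closed form gives the same final value without a loop
closed_form = last_count * growth_factor ** 21
print(round(projection[-1], 2), round(closed_form, 2))
```

The loop is still handy when every intermediate day is needed for the chart, while the closed form is enough when only the final day matters.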




Monday, April 13, 2020

Analysis of Covid19 confirmed cases in the top 5 Indian states till March - Part 1

I am using the Kaggle dataset "covid19-corona-virus-india-dataset/complete.csv" for my analysis.
We will first find the top five states with the most cases, and then plot the data on a day-by-day basis.
Let's first import the relevant modules:

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
import plotly.express as px
import plotly.offline as py
import plotly.graph_objs as go
py.init_notebook_mode(connected=True)
import folium
import os


In the second part, I load the dataset into dataframes, total the confirmed cases by date and state, and then pick the five states with the highest totals.

df_complete = pd.read_csv('../input/covid19-corona-virus-india-dataset/complete.csv')
df_patient_wise = pd.read_csv('../input/covid19-corona-virus-india-dataset/patients_data.csv')


# Date and state wise total
df = pd.DataFrame(df_complete.groupby(['Date', 'Name of State / UT'])['Total Confirmed cases (Indian National)'].sum()).reset_index()
df[df['Name of State / UT'] == 'Maharashtra']

# State wise total till 29th March
df_stateWiseTot = pd.DataFrame(df.groupby(['Name of State / UT'])['Total Confirmed cases (Indian National)'].sum()).reset_index()
df_stateWiseTot.sort_values('Total Confirmed cases (Indian National)', axis=0, ascending=False, inplace=True, na_position='last')
df_stateWiseTot.nlargest(5, 'Total Confirmed cases (Indian National)')

#OUTPUT
Name of State / UT    Total Confirmed cases (Indian National)
Maharashtra           1294
Kerala                1264
Uttar Pradesh         512
Karnataka             480
Delhi                 390
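One thing worth checking with this kind of ranking is whether the column holds daily new cases or a running total. If it is a running total (an assumption about the dataset that I have not verified here), summing it across dates counts earlier cases again and again, and taking the latest value per state gives the actual count instead. Here is a small sketch on synthetic data, with hypothetical state names A and B and shortened column names:

```python
import pandas as pd

# Synthetic data, NOT the Kaggle dataset: running totals for two states
df = pd.DataFrame({
    'Date': ['2020-03-01', '2020-03-02', '2020-03-03'] * 2,
    'State': ['A'] * 3 + ['B'] * 3,
    'Total': [1, 2, 5, 3, 3, 4],
})

# Summing running totals across dates, as in the ranking above
summed = df.groupby('State')['Total'].sum()

# Taking the most recent running total for each state
latest = df.sort_values('Date').groupby('State')['Total'].last()

print(summed.nlargest(1))
print(latest.nlargest(1))
```

In this toy example the two approaches even disagree about which state comes first, so it is worth confirming what the column means before reading the totals as case counts.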


Let's plot the confirmed cases for each state on a day-by-day basis:

top_states = ['Maharashtra', 'Kerala', 'Uttar Pradesh', 'Karnataka', 'Delhi']

fig1 = go.Figure()
for state in top_states:
    # Apply the same state and date filter to both axes so x and y stay aligned
    state_df = df[(df['Name of State / UT'] == state) & (df['Date'] < '2020-03-29')]
    fig1.add_trace(go.Scatter(
        x=state_df['Date'],
        y=state_df['Total Confirmed cases (Indian National)'],
        name=state
    ))

fig1.layout.update(
    title_text='COVID-19 Top 5 State Wise Data in India',
    xaxis_showgrid=False, yaxis_showgrid=False, width=1100, height=500,
    font=dict(
        # family="Courier New, monospace",
        size=12,
        color="white"
    ))
fig1.layout.plot_bgcolor = 'Black'
fig1.layout.paper_bgcolor = 'Black'
fig1.show()




Data Science with…Python :)
