About Me

My photo
Mumbai, Maharastra, India
He has more than 7.6 years of experience in the software development. He has spent most of the times in web/desktop application development. He has sound knowledge in various database concepts. You can reach him at viki.keshari@gmail.com https://www.linkedin.com/in/vikrammahapatra/ https://twitter.com/VikramMahapatra http://www.facebook.com/viki.keshari

Search This Blog

Friday, December 31, 2021

SQL Query: Biggest Window Between Visits

QUESTION: Table: UserVisits

+-------------+------+
| Column Name | Type |
+-------------+------+
| user_id     | int  |
| visit_date  | date |
+-------------+------+
This table does not have a primary key.
This table contains logs of the dates that users vistied a certain retailer.


Assume today's date is '2021-1-1'.

Write an SQL query that will, for each user_id, find out the largest window of days between each visit and the one right after it (or today if you are considering the last visit).

Return the result table ordered by user_id.

The query result format is in the following example:



UserVisits table:
+---------+------------+
| user_id | visit_date |
+---------+------------+
| 1       | 2020-11-28 |
| 1       | 2020-10-20 |
| 1       | 2020-12-3  |
| 2       | 2020-10-5  |
| 2       | 2020-12-9  |
| 3       | 2020-11-11 |
+---------+------------+
Result table:
+---------+---------------+
| user_id | biggest_window|
+---------+---------------+
| 1       | 39            |
| 2       | 65            |
| 3       | 51            |
+---------+---------------+
For the first user, the windows in question are between dates:
    - 2020-10-20 and 2020-11-28 with a total of 39 days. 
    - 2020-11-28 and 2020-12-3 with a total of 5 days. 
    - 2020-12-3 and 2021-1-1 with a total of 29 days.
Making the biggest window the one with 39 days.
For the second user, the windows in question are between dates:
    - 2020-10-5 and 2020-12-9 with a total of 65 days.
    - 2020-12-9 and 2021-1-1 with a total of 23 days.
Making the biggest window the one with 65 days.
For the third user, the only window in question is between dates 2020-11-11 and 2021-1-1 with a total of 51 days.
Solution: 
  • Step1) First Generate the row number over partition by user_id
  • Step2) Then implement lag lead to get the previous date visit
  • Step3) fin difference of latest date with prev date
  • step 4) get the max days diff on the basis of user_id

with UserVisitCTE as (select row_number() over(partition by user_id order by user_id, visit_date ) as rno, * from UserVisits order by user_id, visit_date ), UserVisitLeadCTE as ( select f.user_id, COALESCE(s.visit_date,'2021-01-01'), f.visit_date ,(COALESCE(s.visit_date,'2021-01-01') - f.visit_date) as days_window from UserVisitCTE f left join UserVisitCTE s on f.rno + 1 = s.rno and f.user_id = s.user_id ) --select * from UserVisitCTE select user_id, max(days_window) as biggest_window from UserVisitLeadCTE group by user_id order by user_id
Post Reference: Vikram Aristocratic Elfin Share

SQL Query: Second Degree Follower

QUESTION: In facebook, there is a follow table with two columns: followee, follower.

Please write a sql query to get the amount of each follower’s follower if he/she has one.

For example:

+-------------+------------+
| followee    | follower   |
+-------------+------------+
|     A       |     B      |
|     B       |     C      |
|     B       |     D      |
|     D       |     E      |
+-------------+------------+
Should output:
+-------------+------------+
| follower    | num        |
+-------------+------------+
|     B       |  2         |
|     D       |  1         |
+-------------+------------+
Explanation:
Both B and D exist in the follower list, when as a followee, B's follower is C and D, and D's follower is E. A does not exist in follower list.
Note:
Followee would not follow himself/herself in all cases.
Please display the result in follower's alphabet order.
SOLUTION: 
  • Step1) implement self join on follow table
  • Step2) join follower of left table to the followee of right table using inner join
  • Step3) group by left table follower id and take the count
select f.follower , count(fr.followee) as num from follow f
inner join follow fr 
on f.follower = fr.followee
group by f.follower

Post Reference: Vikram Aristocratic Elfin Share

SQL Query: Exchange Seats Problem

Mary is a teacher in a middle school and she has a table seat storing students' names and their corresponding seat ids.

The column id is continuous increment.

Mary wants to change seats for the adjacent students.

Can you write a SQL query to output the result for Mary?



+---------+---------+
|    id   | student |
+---------+---------+
|    1    | Abbot   |
|    2    | Doris   |
|    3    | Emerson |
|    4    | Green   |
|    5    | Jeames  |
+---------+---------+
For the sample input, the output is:

+---------+---------+
|    id   | student |
+---------+---------+
|    1    | Doris   |
|    2    | Abbot   |
|    3    | Green   |
|    4    | Emerson |
|    5    | Jeames  |
+---------+---------+
Note:

If the number of students is odd, there is no need to change the last one's seat.
Solution:

  • Step1) We get the even id set of student
  • Step2) we get the odd id set of student
  • Step3) get the even id with odd student name
  • step4) get the odd id with even student name
  • step5) union and sort step3 and step4

;with even as 
(select * from seat s1 where s1.id%2 = 0)
,odd as 
(select * from seat s2 where s2.id%2 = 1)
,comb as 
(
select e.id, COALESCE(o.student,e.student) as student 
 from even e left join odd o
	on e.id = o.id +1
union all
select o.id, COALESCE(e.student,o.student) as student 
    from odd o left join even e
		on o.id + 1 = e.id
)

select * from comb order by id asc

Post Reference: Vikram Aristocratic Elfin Share

Tuesday, December 28, 2021

SQL Query: Count Student Number in Departments

A university uses 2 data tables, student and department, to store data about its students and the departments associated with each major.

Write a query to print the respective department name and number of students majoring in each department for all departments in the department table (even ones with no current students).

Sort your results by descending number of students; if two or more departments have the same number of students, then sort those departments alphabetically by department name.

The student is described as follow:

| Column Name  | Type      |
|--------------|-----------|
| student_id   | Integer   |
| student_name | String    |
| gender       | Character |
| dept_id      | Integer   |
 where student_id is the student's ID number, student_name is the student's name, gender is their gender, and dept_id is the department ID associated with their declared major.

And the department table is described as below:

| Column Name | Type    |
|-------------|---------|
| dept_id     | Integer |
| dept_name   | String  |
 where dept_id is the department's ID number and dept_name is the department name.

Here is an example input:
student table:

| student_id | student_name | gender | dept_id |
|------------|--------------|--------|---------|
| 1          | Jack         | M      | 1       |
| 2          | Jane         | F      | 1       |
| 3          | Mark         | M      | 2       |
department table:

| dept_id | dept_name   |
|---------|-------------|
| 1       | Engineering |
| 2       | Science     |
| 3       | Law         |
The Output should be:

| dept_name   | student_number |
|-------------|----------------|
| Engineering | 2              |
| Science     | 1              |
| Law         | 0              |
Solution:
select d.dept_name, count(s.dept_id) as dept_name from department d 
left join student s
on d.dept_id = s.dept_id
group by d.dept_name
order by count(s.dept_id) desc
Post Reference: Vikram Aristocratic Elfin Share

SQL Query: Winning Candidate

Table: Candidate
 +-----+---------+
 | id  | Name    |
 +-----+---------+
 | 1   | A       |
 | 2   | B       |
 | 3   | C       |
 | 4   | D       |
 | 5   | E       |
 +-----+---------+  
Table: Vote
 +-----+--------------+
 | id  | CandidateId  |
 +-----+--------------+
 | 1   |     2        |
 | 2   |     4        |
 | 3   |     3        |
 | 4   |     2        |
 | 5   |     5        |
 +-----+--------------+
id is the auto-increment primary key,
CandidateId is the id appeared in Candidate table.
Write a sql to find the name of the winning candidate, the above example will return the winner B.

 +------+
 |  Name |
 +------+
 | B    |
 +------+
Notes:

You may assume there is no tie, in other words there will be only one winning candidate.
Solution:


PL/SQL:

select c.Name from Candidate c 
inner join vote v
on c.id = v.CandidateId
group by c.Name
order by count(*) desc
limit 1

TSQL:

select top 1 c.Name  from Candidate c 
inner join vote v
on c.id = v.CandidateId
group by c.Name
order by count(*) desc


Post Reference: Vikram Aristocratic Elfin Share

SQL Query: Rising Temperature Days

Table: Weather

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| id            | int     |
| recordDate    | date    |
| temperature   | int     |
+---------------+---------+
id is the primary key for this table.
This table contains information about the temperature in a certain day.


Write an SQL query to find all dates' id with higher temperature compared to its previous dates (yesterday).

Return the result table in any order.

The query result format is in the following example:

Weather
+----+------------+-------------+
| id | recordDate | Temperature |
+----+------------+-------------+
| 1  | 2015-01-01 | 10          |
| 2  | 2015-01-02 | 25          |
| 3  | 2015-01-03 | 20          |
| 4  | 2015-01-04 | 30          |
+----+------------+-------------+

Result table:
+----+
| id |
+----+
| 2  |
| 4  |
Solution:
select distinct w2.id from Weather w1 
inner join Weather w2
on w1.id + 1 = w2.id 
where w1.Temperature < w2.Temperature
Post Reference: Vikram Aristocratic Elfin Share

Sunday, December 26, 2021

LeetcodeSQL: 180. Consecutive Numbers [implementing Lag and Lead]

Table: Logs

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| id          | int     |
| num         | varchar |
+-------------+---------+
id is the primary key for this table.

 

Write an SQL query to find all numbers that appear at least three times consecutively.

Return the result table in any order.

The query result format is in the following example.

 

Example 1:

Input: 
Logs table:
+----+-----+
| id | num |
+----+-----+
| 1  | 1   |
| 2  | 1   |
| 3  | 1   |
| 4  | 2   |
| 5  | 1   |
| 6  | 2   |
| 7  | 2   |
+----+-----+
Output: 
+-----------------+
| ConsecutiveNums |
+-----------------+
| 1               |
+-----------------+
Explanation: 1 is the only number that appears consecutively for at least three times.
Solution:
select distinct l0.num as ConsecutiveNums
from Logs l0 
inner join Logs l1 on l1.id - 1 = l0.id 
inner join Logs l2 on l1.id + 1 = l2.id
where l0.num = l1.num and l1.num = l2.num
Post Reference: Vikram Aristocratic Elfin Share

Misc SQL: Create monthly Allowance report and Fill Null against those month where emp has not received any allowances

Question: Generate a month wise allowance received report for each employee, keep Null in allowance for those month where employee has not received any allowance.

emp_id emp_name month_number allowance
E001 Aayansh         1         1000
E001 Aayansh         2         3000
E002 Rishika         3         2000
E002 Rishika         5         4000

Output should be :

month_number emp_id emp_name allowance
1 E001 Aayansh 1000
2 E001 Aayansh 3000
3 E001 Aayansh null
4 E001 Aayansh null
5 E001 Aayansh null
6 E001 Aayansh null
7 E001 Aayansh null
8 E001 Aayansh null
9 E001 Aayansh null
10 E001 Aayansh null
11 E001 Aayansh null
12 E001 Aayansh null

Solution: 

Steps:
1) generate a month number derived CTE
2) Cross join MonthCTE with EmpAllowance Table and take the distinct record
3) Left join the CorssJoin result from 2nd step to EmpAllowance table and take allowance from EmpAllowance

Query:

;with MonthCTE AS
(select 1 as month_number 
 union all 
 select month_number + 1 as month_number from MonthCTE 
 where month_number < 12
 ),
ReportTemplateCTE as 
(select DISTINCT m.month_number, e.emp_id, e.emp_name from MonthCTE m cross join EmpAllowance e)
 
select r.*, e.allowance from ReportTemplateCTE r left join EmpAllowance e
on r.month_number = e.month_number and r.emp_id = e.emp_id
order by r.emp_id
 

Post Reference: Vikram Aristocratic Elfin Share

Misc: Generate duplicate rows based on quantity - CTE

Ques: You have below records:

item_name     item_qty
Keyboard          3
Mouse             5

You need to duplicate each records depending upon the value of item_qty values.

Sol:

This can be done using CTE


;With DupCreationCTE(item_name, item_qty, item_counter) as
(
  select item_name, item_qty, 1 as item_counter from DupCreation
  union All
  select item_name, item_qty, item_counter + 1 as item_counter
  from DupCreationCTE where item_counter < item_qty
)
SELECT * from DupCreationCTE order by item_name, item_counter asc


Output:

item_name item_qty item_counter
"Keyboard" 3                 1
"Keyboard" 3                 2
"Keyboard" 3                 3
"Mouse"         5                 1
"Mouse"         5                 2
"Mouse"         5                 3
"Mouse"         5                 4
"Mouse"         5                 5


Post Reference: Vikram Aristocratic Elfin Share

Thursday, December 23, 2021

176. Second Highest Salary

Table: Employee

+-------------+------+
| Column Name | Type |
+-------------+------+
| id          | int  |
| salary      | int  |
+-------------+------+
id is the primary key column for this table.
Each row of this table contains information about the salary of an employee.

 

Write an SQL query to report the second highest salary from the Employee table. If there is no second highest salary, the query should report null.

The query result format is in the following example.

 

Example 1:

Input: 
Employee table:
+----+--------+
| id | salary |
+----+--------+
| 1  | 100    |
| 2  | 200    |
| 3  | 300    |
+----+--------+
Output: 
+---------------------+
| SecondHighestSalary |
+---------------------+
| 200                 |
+---------------------+

Example 2:

Input: 
Employee table:
+----+--------+
| id | salary |
+----+--------+
| 1  | 100    |
+----+--------+
Output: 
+---------------------+
| SecondHighestSalary |
+---------------------+
| null                |
+---------------------+
Solution:
select salary as SecondHighestSalary from (
select salary, row_number() over (order by salary desc) as rno from Employee
    ) a
    where a.rno = 2
Post Reference: Vikram Aristocratic Elfin Share

175. Combine Two Tables

Table: Person

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| personId    | int     |
| lastName    | varchar |
| firstName   | varchar |
+-------------+---------+
personId is the primary key column for this table.
This table contains information about the ID of some persons and their first and last names.

 

Table: Address

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| addressId   | int     |
| personId    | int     |
| city        | varchar |
| state       | varchar |
+-------------+---------+
addressId is the primary key column for this table.
Each row of this table contains information about the city and state of one person with ID = PersonId.

 

Write an SQL query to report the first name, last name, city, and state of each person in the Person table. If the address of a personId is not present in the Address table, report null instead.

Return the result table in any order.

The query result format is in the following example.

 

Example 1:

Input: 
Person table:
+----------+----------+-----------+
| personId | lastName | firstName |
+----------+----------+-----------+
| 1        | Wang     | Allen     |
| 2        | Alice    | Bob       |
+----------+----------+-----------+
Address table:
+-----------+----------+---------------+------------+
| addressId | personId | city          | state      |
+-----------+----------+---------------+------------+
| 1         | 2        | New York City | New York   |
| 2         | 3        | Leetcode      | California |
+-----------+----------+---------------+------------+
Output: 
+-----------+----------+---------------+----------+
| firstName | lastName | city          | state    |
+-----------+----------+---------------+----------+
| Allen     | Wang     | Null          | Null     |
| Bob       | Alice    | New York City | New York |
+-----------+----------+---------------+----------+
Explanation: 
There is no address in the address table for the personId = 1 so we return null in their city and state.
addressId = 1 contains information about the address of personId = 2.
Solution:
select p.firstname, p.lastname, a.city, a.state 
from Person p left join Address a 
on p.personid = a.personId
Post Reference: Vikram Aristocratic Elfin Share

Sunday, June 7, 2020

Python Trick: Alternative to if-else/Case statement

Lets look the below code snippet

def add_number(a,b):
   
print(a+b);

def multiply_number(a,b):
   
print(a*b)

def division_number(a,b):
   
print(a/b)

result =
'30'

if result == '10':
    add_number(
10,20)
elif result == '20':
    multiply_number(
12,2)
elif result== '30':
    division_number(
36,3)


here we have three method, and methods are called depending upon the value of result, if the result value is 10 then add_number is called, if result is 20 then multiply_number is called and so on

So you can see above we have written multiline if-elseif statement to call method on the bases of result value.

Now lets see the below code snippet, here we have declare a dictionary object with result value as a key of dictionary object and the associated method as a dictionary key value.

result_dict={
   
'10':add_number,
   
'20':multiply_number,
   
'30':division_number
}

result=
'10'
result_dict[result](1,2)

The result value is store in a result variable and that variable is passed as an index to dictionary key which internally calls the associated method. So it just one line statement instead of if-else ladder.


Enjoy pythonic way J

Post Reference: Vikram Aristocratic Elfin Share

Python Trick: Make your program look simple and clean, with the help of dictionary:

When you have multiple object in you program, it often looks messy with different object name being used to call corresponding methods. Let’s see with an example:

class Dog:
   
def __init__(self,name):
       
self.talents=[]
       
self.name=name
   
def add_talent(self,talent):
       
self.talents.append(talent)

dog_obj1 = Dog(
'Peddy')
dog_obj2 = Dog(
'Magnet')

dog_obj1.add_talent(
'Black Plet')
dog_obj1.add_talent(
'Fisher Dog')

dog_obj2.add_talent(
'Guard Dog')
dog_obj2.add_talent(
'Happy Eater')

print("Talent of Peddy")
print(dog_obj1.talents)

print("\nTalent of Magnet")
print(dog_obj1.talents)

#Output
Talent of Peddy
['Black Plet', 'Fisher Dog']

Talent of Magnet
['Black Plet', 'Fisher Dog']


here above if you see, we have Dog class which has a constructor which initialize two instance object talents (list obj) and  name and we have one instance method add_talent which add the talent of Dog object.

In later part of code, we have declared two object dog_obj1(‘Peddy’) and dog_obj2(‘Magnet’), and then we are calling add_talent method to add talent of Peddy and Magnet.

The code all ok, but just imagine when you have plenty of object in you program, then your program may look quite messy with different objects name.

So what could be the way?

Simple, simple create a dictionary with list of object key value pair and access each object instance variable and method with dictionary object key value. Look at below code snippet

# creating dictionary of Dog
obj_dict={
   
'Peddy':dog_obj1,
   
'Magnet':dog_obj2
}

# Accessing instance valriable of class
print(obj_dict['Peddy'].talents)
print(obj_dict['Magnet'].talents)

# To add new talent of Paddy, it would be qutie simple"
obj_dict['Peddy'].add_talent('Wolf mound')

#Printing talent of Peddy
print(obj_dict['Peddy'].talents)

#OUTPUT

['Black Plet', 'Fisher Dog']
['Guard Dog', 'Happy Eater']
['Black Plet', 'Fisher Dog', 'Wolf mound']


So here we created a dictionary object obj_dict with list of Dog class object with key value pair Peddy and Magnet. There after we are accessing the Peddy and Magnet talent with dictionary key

obj_dict['Peddy'].talents
obj_dict['Magnet'].talents

Now if we want to add new talent of Peddy then it quite simple:

obj_dict['Peddy'].add_talent('Wolf mound')

This way, your program looks very less messy and much readable.


Enjoy pythonic way J

Post Reference: Vikram Aristocratic Elfin Share

Tuesday, April 14, 2020

COVID19 India Data Analysis, Predicting Total Case on 4th of May (by end of lockdown Version-02)


Here we trying to focus on what will be the confirmed case count on the last day of lockdown version-02 in India, the entire analysis is based on growth rate technique.  Let’s import required modules

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
import plotly.express as px
import plotly.offline as py
import plotly.graph_objs as go
py.init_notebook_mode(
connected=True)
import folium
import seaborn as sns
import os
import datetime


Let try to find out growth rate, considering the data from 30th Jan

confirmed_df = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/'+
                          
'COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/'
                          
+'time_series_covid19_confirmed_global.csv')

india_sel  = confirmed_df[confirmed_df[
'Country/Region']=='India'].loc[:'4/13/20']
india_confirmed_list = india_sel.values.tolist()[
0]
india_confirmed_list[
4]
growth_diff = []

for i in range(4,len(india_confirmed_list)):
   
if (i == 4) or india_confirmed_list[i-1] == 0 :
        growth_diff.append(india_confirmed_list[i])
   
else:
        growth_diff.append(india_confirmed_list[i] / india_confirmed_list[i-
1])

growth_factor =
sum(growth_diff)/len(growth_diff)
print('Average growth factor',growth_factor)

#OUTPUT: GROWTH RATE
Average growth factor 1.0637553331032963


Lets now calculate the next twenty 21 days case count and plot it in chart

x_axis_prediction_dt = []

dates =
list(confirmed_df.columns[4:])
dates =
list(pd.to_datetime(dates))

#we will add one day to the last day till which we have data
start_date = dates[len(dates) - 1]
for i in range(21):
    date = start_date + datetime.timedelta(
days=1)
    x_axis_prediction_dt.append(date)
    start_date = date

# Get the last available day total number   
previous_day_cases = confirmed_df[confirmed_df['Country/Region']=='India'].iloc[:,-1]
# Converting series to float value
previous_day_cases = previous_day_cases.iloc[0]
y_axis_predicted_next21days_cases = []

for i in range(21):
    predicted_value = previous_day_cases *  growth_factor
    y_axis_predicted_next21days_cases.append(predicted_value)
    previous_day_cases = predicted_value
# print(previous_day_cases)

#add Graph
fig1=go.Figure()
fig1.add_trace(go.Scatter(
x=x_axis_prediction_dt,
                          
y=y_axis_predicted_next21days_cases,
                         
name='India'
                              
))

fig1.layout.update(
title_text='COVID-19 next twenty one prediction',xaxis_showgrid=False, yaxis_showgrid=False, width=800,
       
height=500,font=dict(
#         family="Courier New, monospace",
       
size=12,
       
color="white"
   
))
fig1.layout.plot_bgcolor =
'Black'
fig1.layout.paper_bgcolor = 'Black'
fig1.show()

Growth rate predict cases will jump over 35k by 3rd of May



Post Reference: Vikram Aristocratic Elfin Share