Monday, October 21, 2019

Using a Generator Expression to Read Multiple Files Dynamically and Store the Data in a Single Pandas DataFrame

In the previous two posts we used the traditional way to read multiple files and store them dynamically in a data frame, with the file name as an additional column. In this post we will do the same with the help of a generator expression.

Let's look at how we can read multiple files. Here we call the pandas concat function and pass it a generator expression as its argument: the expression calls read_csv once for each file name produced by its for loop.


import glob
import pandas as pd

# Collect the names of all files matching emp*.csv (sorted for a stable order)
read_file = sorted(glob.glob('emp*.csv'))
read_file
type(read_file)

Output: 
list

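# pd.concat accepts any iterable of DataFrames, so the generator expression
# feeds it one read_csv result per file
df_all = pd.concat(pd.read_csv(file) for file in read_file)
df_all.reset_index(drop=True)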

Output: 
  emp_no  emp_name  emp_sal
0  E1001   Aayansh     1000
1  E1002  Prayansh     2000
2  E1003   Rishika     1500
3  E1004    Mishty      900
4  E2001   Sidhika     1000
5  E2002    Kavita     2000
6  E2003     Happy     1500
7  E2004   Sandeep      900
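
The same read can be sketched as a self-contained script. This is only a minimal illustration under a few assumptions: the helper name read_all is hypothetical, the two sample files are written to a temporary directory so the script runs anywhere, and ignore_index=True is used as an alternative to calling reset_index afterwards. Note that pd.concat raises a ValueError when the pattern matches no files.

```python
import glob
import os
import tempfile

import pandas as pd


def read_all(pattern):
    """Hypothetical helper: concatenate every CSV matching `pattern`."""
    files = sorted(glob.glob(pattern))  # sorted so the row order is stable
    # The generator expression yields one DataFrame per file; pd.concat
    # consumes it and, with ignore_index=True, renumbers the rows from 0.
    return pd.concat((pd.read_csv(f) for f in files), ignore_index=True)


with tempfile.TemporaryDirectory() as tmp:
    # Two small sample files (a subset of the rows shown in the post).
    pd.DataFrame(
        {"emp_no": ["E1001", "E1002"],
         "emp_name": ["Aayansh", "Prayansh"],
         "emp_sal": [1000, 2000]}
    ).to_csv(os.path.join(tmp, "emp1.csv"), index=False)
    pd.DataFrame(
        {"emp_no": ["E2001", "E2002"],
         "emp_name": ["Sidhika", "Kavita"],
         "emp_sal": [1000, 2000]}
    ).to_csv(os.path.join(tmp, "emp2.csv"), index=False)

    df_all = read_all(os.path.join(tmp, "emp*.csv"))

print(df_all)
```

Passing ignore_index=True produces the same 0..n-1 index as reset_index(drop=True), but in a single step.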
        
Let’s add the file name to the dataframe. This can be done by calling the assign function on each dataframe returned by read_csv and passing it the new column assignment. Below is the code:

# assign() adds a filename column to each file's dataframe before concatenation
df_all = pd.concat(pd.read_csv(file).assign(filename=file) for file in read_file)
df_all.reset_index(drop=True)


Output:
  emp_no  emp_name  emp_sal  filename
0  E1001   Aayansh     1000  emp1.csv
1  E1002  Prayansh     2000  emp1.csv
2  E1003   Rishika     1500  emp1.csv
3  E1004    Mishty      900  emp1.csv
4  E2001   Sidhika     1000  emp2.csv
5  E2002    Kavita     2000  emp2.csv
6  E2003     Happy     1500  emp2.csv
7  E2004   Sandeep      900  emp2.csv
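
As a rough, self-contained sketch of the assign step: the helper name read_all_with_source is hypothetical, the sample files are created in a temporary directory, and os.path.basename is an added assumption so that only the file name, not the full path, ends up in the filename column when the pattern includes a directory.

```python
import glob
import os
import tempfile

import pandas as pd


def read_all_with_source(pattern):
    """Hypothetical helper: concatenate CSVs and tag each row with its file."""
    files = sorted(glob.glob(pattern))
    frames = (
        # assign() returns a copy of the frame with the extra column added
        pd.read_csv(path).assign(filename=os.path.basename(path))
        for path in files
    )
    return pd.concat(frames, ignore_index=True)


with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "emp1.csv": [("E1001", "Aayansh", 1000), ("E1002", "Prayansh", 2000)],
        "emp2.csv": [("E2001", "Sidhika", 1000), ("E2002", "Kavita", 2000)],
    }
    for name, rows in samples.items():
        pd.DataFrame(rows, columns=["emp_no", "emp_name", "emp_sal"]).to_csv(
            os.path.join(tmp, name), index=False
        )

    df_all = read_all_with_source(os.path.join(tmp, "emp*.csv"))

# With the source file recorded, rows can be counted per file.
rows_per_file = df_all.groupby("filename").size()

print(df_all)
print(rows_per_file)
```

Because each row carries its source file, the combined dataframe can still be split or summarised per file later.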

Data Science with…Python :)
Post Reference: Vikram Aristocratic Elfin Share
