Sunday, October 20, 2019

Reading multiple files dynamically and storing the data in a single Pandas dataframe

Here we use Python's glob module, which returns the names of all files matching a pattern as a list.

We have two files, emp1.csv and emp2.csv, in our Python working directory. Let's read their names with the glob module.

import glob
import pandas as pd

# glob returns matches in arbitrary order, so sort them for a stable result
read_file = sorted(glob.glob('emp*.csv'))
read_file

Output: ['emp1.csv', 'emp2.csv']

type(read_file)

Output: list
        
As you can see, read_file is a plain Python list holding the names of the files in the current working directory that match the pattern emp*.csv.
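
To make the pattern matching concrete, here is a minimal sketch that runs anywhere. It assumes a throwaway temporary directory and a hypothetical helper named list_matching, neither of which is part of the original example, and it adds a third file, dept.csv, only to show that non-matching names are left out.

```python
import glob
import os
import tempfile


def list_matching(directory, pattern):
    """Return the sorted base names of files in `directory` that match `pattern`."""
    paths = glob.glob(os.path.join(directory, pattern))
    return sorted(os.path.basename(p) for p in paths)


# Build a temporary directory with empty files (illustrative names only)
with tempfile.TemporaryDirectory() as tmp:
    for name in ['emp1.csv', 'emp2.csv', 'dept.csv']:
        open(os.path.join(tmp, name), 'w').close()
    matches = list_matching(tmp, 'emp*.csv')

print(matches)  # ['emp1.csv', 'emp2.csv']
```

The result is sorted explicitly because glob does not promise any particular order.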

Next we read the contents of every matched file and keep all the data in a single dataframe.
First we create an empty dataframe with the column names. Then, for each file, we use the pandas concat function to append the newly read content to what has been collected so far.

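# Start with an empty dataframe that has the expected columns
df = pd.DataFrame(columns=['emp_no', 'emp_name', 'emp_sal'])

# Read each file and append its rows to the running dataframe
for file in read_file:
    df_file = pd.read_csv(file)
    df = pd.concat([df, df_file], axis=0)

df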

Output:
   emp_no  emp_name  emp_sal
0  E1001   Aayansh   1000
1  E1002   Prayansh  2000
2  E1003   Rishika   1500
3  E1004   Mishty    900
0  E2001   Sidhika   1000
1  E2002   Kavita    2000
2  E2003   Happy     1500
3  E2004   Sandeep   900
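
Notice that the index restarts at 0 for the second file, because each file keeps its own row labels. If a continuous index is preferred, one possible variation (a sketch, not the approach used above) is to collect the dataframes in a list and concatenate them once with ignore_index=True. The in-memory CSV text below is a shortened stand-in for emp1.csv and emp2.csv, and combine is a hypothetical helper name.

```python
import io

import pandas as pd

# Shortened in-memory stand-ins for emp1.csv and emp2.csv (assumed content)
emp1 = io.StringIO("emp_no,emp_name,emp_sal\nE1001,Aayansh,1000\nE1002,Prayansh,2000\n")
emp2 = io.StringIO("emp_no,emp_name,emp_sal\nE2001,Sidhika,1000\nE2002,Kavita,2000\n")


def combine(sources):
    """Read every CSV source and stack the rows under one continuous index."""
    frames = [pd.read_csv(src) for src in sources]
    # ignore_index=True discards the per-file row labels and renumbers from 0
    return pd.concat(frames, axis=0, ignore_index=True)


combined = combine([emp1, emp2])
print(combined)
```

With real files, the same helper could be called as combine(sorted(glob.glob('emp*.csv'))), since read_csv accepts file paths as well as file-like objects.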




Data Science with…Python
Post Reference: Vikram Aristocratic Elfin Share
