How to Save Pandas Dataframe as gzip/zip File?
Last Updated: 26 Nov, 2020
Pandas is an open-source Python library built on top of NumPy. It offers data structures and operations for manipulating numerical data and time series, and it is popular mainly because it makes importing and analyzing data much easier. Pandas is fast and offers high performance and productivity for users.
Converting to zip/gzip file
The to_pickle() method in Pandas pickles (serializes) a DataFrame to a file and can compress that file as it is written. Its syntax is given below:
Syntax:
DataFrame.to_pickle(path,
compression='infer',
protocol=pickle.HIGHEST_PROTOCOL)
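This method supports compressions like zip, gzip, bz2, and xz. With the default compression='infer', Pandas picks the compression from the file extension of the path: '.zip', '.gz', '.bz2', or '.xz'. Any other extension is written uncompressed unless the compression argument is passed explicitly. Note that the resulting file is a compressed pickle, not a compressed CSV. The examples below show how to save a DataFrame as a zip file and as a gzip file.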
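The difference between inferred and explicit compression can be checked with a minimal sketch like the one below. The file names frame.gz and frame.pickle and the temporary directory are assumptions made for illustration; the check relies on the fact that every gzip stream starts with the two bytes 0x1f 0x8b.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"C1": range(5), "C2": range(5, 10)})

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names, chosen only for this illustration.
    inferred = os.path.join(tmp, "frame.gz")      # '.gz' lets pandas infer gzip
    explicit = os.path.join(tmp, "frame.pickle")  # extension gives no hint

    df.to_pickle(inferred)                        # compression='infer' (default)
    df.to_pickle(explicit, compression="gzip")    # compression named explicitly

    # Read the first two bytes of each file to look for the gzip signature.
    with open(inferred, "rb") as f:
        head_inferred = f.read(2)
    with open(explicit, "rb") as f:
        head_explicit = f.read(2)

    # The reader needs the same hint, because '.pickle' cannot be inferred.
    restored = pd.read_pickle(explicit, compression="gzip")

print(head_inferred == b"\x1f\x8b")
print(head_explicit == b"\x1f\x8b")
print(restored.equals(df))
```

Passing compression explicitly is the safer choice whenever the file name does not end in one of the recognized extensions.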
Example 1: Save Pandas Dataframe as zip File
Python3
import pandas as pd

# Column-oriented data: each column maps a row index to a value
dct = {
    'ID': {0: 23, 1: 43, 2: 12, 3: 13, 4: 67},
    'Name': {0: 'Ajay', 1: 'Deep', 2: 'Deepanshi', 3: 'Mira', 4: 'Yash'},
    'Marks': {0: 89, 1: 97, 2: 45, 3: 78, 4: 56},
    'Grade': {0: 'B', 1: 'A', 2: 'F', 3: 'C', 4: 'E'},
}

data = pd.DataFrame(dct)
print(data)

# The '.zip' extension makes pandas write a zip-compressed pickle
data.to_pickle('file.zip')
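To confirm that the saved file really is a zip archive, it can be opened with Python's standard zipfile module. The sketch below is an illustration under assumptions: it uses a smaller DataFrame and a temporary directory rather than the file.zip created above, and the member name stored inside the archive depends on the Pandas version, so it is printed rather than assumed.

```python
import os
import tempfile
import zipfile

import pandas as pd

data = pd.DataFrame({"ID": [23, 43], "Name": ["Ajay", "Deep"]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file.zip")
    data.to_pickle(path)  # compression inferred from '.zip'

    # Check the file signature and list what the archive contains.
    is_zip = zipfile.is_zipfile(path)
    with zipfile.ZipFile(path) as archive:
        members = archive.namelist()

print(is_zip)
print(members)
```

The archive member holds pickled bytes, so extracting it with an ordinary unzip tool yields a pickle file that still has to be loaded with read_pickle() or the pickle module.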
Example 2: Save Pandas Dataframe as gzip File.
Python3
import pandas as pd

dct = {"C1": range(5), "C2": range(5, 10)}
data = pd.DataFrame(dct)
print(data)
# Pandas infers gzip from '.gz'; a '.gzip' extension is not recognized
# and would produce an uncompressed pickle.
data.to_pickle('file.gz')
Reading zip/gzip file
To read the files created above, use the read_pickle() method. Its syntax is given below:
Syntax:
pandas.read_pickle(filepath_or_buffer,
compression='infer')
Example 1: Reading zip file
Python3
import pandas as pd

print(pd.read_pickle('file.zip'))
Example 2: Reading gzip File.
Python3
import pandas as pd

print(pd.read_pickle('file.gz'))
From the two examples above, we can see that both compressed files are read by the read_pickle() method in the same way. Only the file name changes, because the default compression='infer' selects the decompression method from the file extension.
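A round trip through both formats can be verified with a short sketch such as the following. It is an illustration under assumptions: the files are written to a temporary directory, and pandas.testing.assert_frame_equal is used as the equality check, which raises an AssertionError if the restored DataFrame differs from the original.

```python
import os
import tempfile

import pandas as pd
from pandas.testing import assert_frame_equal

original = pd.DataFrame({"C1": range(5), "C2": range(5, 10)})

sizes = {}
with tempfile.TemporaryDirectory() as tmp:
    for name in ("file.zip", "file.gz"):
        path = os.path.join(tmp, name)

        original.to_pickle(path)         # compression inferred from extension
        restored = pd.read_pickle(path)  # decompression inferred the same way

        # Raises AssertionError if values, dtypes, or labels differ.
        assert_frame_equal(original, restored)
        sizes[name] = os.path.getsize(path)

for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

The printed sizes depend on the Pandas and Python versions in use, so they are best treated as a rough comparison between the two formats.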