Saving Our Data in Pandas

Overview

In this article, we will learn how to save a pandas DataFrame to files in different formats such as CSV, JSON, and Excel. There are several ways to do this, and a DataFrame can also be written to an in-memory buffer instead of a file.

Introduction

A DataFrame can be saved in different file formats: .to_csv writes a CSV file, .to_excel writes an Excel file, and similar methods exist for other formats. A DataFrame can also be written to a buffer by using Python's io module.

How to Save a DataFrame to a File?

We can save our DataFrame to files in different formats such as CSV and Excel. Let's go through them one by one, with the help of code examples.

First, we create a dataset as a dictionary and convert it to a DataFrame:

Code:

Output:

Explanation:

pandas is imported as pd, and the data is created as a dictionary. pd.DataFrame then converts the dictionary into a pandas DataFrame, with the column headings taken from the provided list of columns.
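A minimal sketch of this setup step follows. The names, ages, and cities are hypothetical sample values, not the article's original dataset.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
data = {
    "Name": ["Asha", "Ben", "Chen", "Dara"],
    "Age": [28, 34, 25, 41],
    "City": ["Pune", "Leeds", "Wuhan", "Cork"],
}

# The columns argument selects and orders the column headings.
df = pd.DataFrame(data, columns=["Name", "Age", "City"])
print(df)
```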

Save DataFrame to an Excel File

Now let's see how to save our above DataFrame into an Excel file.

Code:

Output:

Explanation: The .to_excel function stores the data in Excel file format. Here index=False leaves out the row index values (0, 1, 2, 3). To check that the data was saved correctly, we read the Excel file back with pd.read_excel.
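The round trip described above might look like the sketch below. The data and the file name people.xlsx are hypothetical, and the example assumes an Excel engine such as openpyxl is installed; if it is not, pandas raises an ImportError, which the sketch catches.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"Name": ["Asha", "Ben"], "Age": [28, 34]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "people.xlsx")  # hypothetical file name
    try:
        # index=False leaves out the 0, 1, ... row labels.
        df.to_excel(path, index=False)
        # Read the file back to confirm the save.
        restored = pd.read_excel(path)
    except ImportError:
        # No Excel engine (for example openpyxl) is installed.
        restored = None

print(restored)
```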

Save DataFrame to a CSV File

Here, we save our DataFrame to a CSV file by using the .to_csv function. Let's see how it works.

Code:

Output:

Explanation: Here, we save the above DataFrame to a CSV file by using the .to_csv function. We pass index=False to leave out the index values because we don't need them. After saving, we check that the data was written correctly by reading the file back with pd.read_csv.
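As a sketch of this step, the example below writes a small DataFrame to a temporary CSV file and reads it back. The data and the file name people.csv are hypothetical. It also saves once with the index, to show the extra column that index=False avoids.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"Name": ["Asha", "Ben"], "Age": [28, 34]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "people.csv")  # hypothetical file name

    # Save without the row index, then read the file back to check it.
    df.to_csv(path, index=False)
    restored = pd.read_csv(path)

    # Saving with the index adds an unnamed first column to the file.
    df.to_csv(path)
    with_index = pd.read_csv(path)

print(restored)
print(list(with_index.columns))
```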

Save DataFrame to a JSON File

Code:

Output:

Explanation: In the above code example, the DataFrame is saved to a JSON file by using the .to_json function. The orient parameter sets the layout of the JSON string. Its default is columns for a DataFrame and index for a Series; here it is split, which produces a dictionary with the keys columns, index, and data. The other parameter used here is compression, a string naming the compression for the output file. It applies only when a file path is passed as an argument. Its default value is infer, which detects the compression from the file extension (.gz, .bz2, .zip, .xz, or .zst) and applies no compression for other extensions.
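The sketch below compares the split layout with the default columns layout, then writes a gzip-compressed file whose compression is inferred from the .gz extension. The data and the file name people.json.gz are hypothetical.

```python
import gzip
import json
import os
import tempfile

import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"Name": ["Asha", "Ben"], "Age": [28, 34]})

# With no path, to_json returns the JSON text, which makes layouts easy to compare.
split_json = df.to_json(orient="split")
columns_json = df.to_json()  # default orient for a DataFrame is "columns"
print(split_json)
print(columns_json)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "people.json.gz")  # hypothetical file name
    # compression="infer" picks gzip because the path ends in .gz.
    df.to_json(path, orient="split", compression="infer")
    # Open the file with gzip to confirm it really was compressed.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        from_file = json.load(handle)

print(from_file)
```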

Save CSV to a File/Buffer

Let's see how to save CSV into a file or buffer.

Code:

Output:

Explanation: In the above code example, Python's io (input-output) module provides in-memory file-like objects, such as StringIO, that can be passed to most functions that expect a standard file object. The buffer is created this way and stored in the Buff_Mem variable. seek() moves the position of the file handle. It takes an offset and an optional reference point, which can take one of three values: 0 for the beginning of the file (the default), 1 for the current position, and 2 for the end of the file.
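A minimal sketch of the buffer workflow, assuming the buffer is an io.StringIO and using hypothetical sample data. It shows why seek(0) is needed before reading the buffer back, and how a reference point of 2 jumps to the end.

```python
import io

import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"Name": ["Asha", "Ben"], "Age": [28, 34]})

# Create the in-memory buffer and write the DataFrame into it as CSV.
Buff_Mem = io.StringIO()
df.to_csv(Buff_Mem, index=False)

# After writing, the position sits at the end of the buffer.
position_after_write = Buff_Mem.tell()

# Move back to the beginning (reference point 0) so the data can be read.
Buff_Mem.seek(0)
restored = pd.read_csv(Buff_Mem)

# Reference point 2 (io.SEEK_END) moves to the end and returns that position.
end_position = Buff_Mem.seek(0, io.SEEK_END)

print(restored)
print(position_after_write, end_position)
```

Text buffers such as StringIO only accept an offset of 0 with reference points 1 and 2; binary buffers such as io.BytesIO accept arbitrary offsets.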

Save DataFrame as a CSV String

Call the DataFrame's .to_csv() method without a filename or path argument to convert a pandas DataFrame into a CSV string rather than a CSV file. The method then returns the CSV data as a string.

Code:

Output:

Explanation: In this code example, the DataFrame is converted into a CSV string by calling .to_csv() without any file path or filename.

Code:

Output:

Explanation: When we use index=False, the row index values are left out of the CSV string.
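The two calls can be compared side by side in a short sketch, again with hypothetical sample data:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"Name": ["Asha", "Ben"], "Age": [28, 34]})

# No path argument: to_csv returns the CSV text instead of writing a file.
csv_with_index = df.to_csv()

# index=False drops the leading column of row labels.
csv_without_index = df.to_csv(index=False)

print(csv_with_index)
print(csv_without_index)
```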

Because the result is an ordinary string, you can now apply further string/text operations to it, such as replacing the commas with tab characters ('\t') to produce tab-delimited data:

Code:

Output:

Explanation: In this case, we replaced each ',' with a tab character by using the string .replace function. The first argument is the text we want to replace, and the second argument is the text to put in its place.
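A small sketch of the replacement, using hypothetical sample data. For comparison it also passes sep="\t" to .to_csv, which produces tab-delimited output directly.

```python
import pandas as pd

# Hypothetical sample data; no value contains a comma.
df = pd.DataFrame({"Name": ["Asha", "Ben"], "Age": [28, 34]})

# Build the CSV string, then swap every comma for a tab.
csv_text = df.to_csv(index=False)
tab_text = csv_text.replace(",", "\t")

# Alternative: ask to_csv for a tab separator in the first place.
tab_direct = df.to_csv(index=False, sep="\t")

print(tab_text)
print(tab_text == tab_direct)
```

The two results match here only because no value contains a comma. .replace also changes commas inside the data itself, so sep="\t" is the safer choice for real data.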

Save DataFrame into a Text Buffer

To write the DataFrame as CSV into an in-memory text buffer, this Python example passes a text stream instance, such as StringIO(), to the .to_csv function.

Code:

Output:

Explanation: In the above code example, we created an in-memory text_stream using StringIO and then wrote the content of the DataFrame into the text stream's buffer as CSV by passing text_stream as an argument to the .to_csv function. .getvalue() then returns the full CSV text held in the buffer.
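The steps above can be sketched as follows, with hypothetical sample data. Unlike read(), getvalue() returns the whole buffer regardless of the current position, so no seek(0) is needed first.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"Name": ["Asha", "Ben"], "Age": [28, 34]})

# Create the in-memory text stream and write the DataFrame to it as CSV.
text_stream = StringIO()
df.to_csv(text_stream, index=False)

# getvalue() returns everything written to the buffer so far.
csv_text = text_stream.getvalue()
print(csv_text)
```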

Conclusion

  • We can save our DataFrame to files in different formats, such as CSV, JSON, and Excel, by using different methods.
  • .to_csv is used to save the DataFrame to a CSV file.
  • .to_excel is used to save the DataFrame to an Excel file.
  • .to_json is used to save the DataFrame to a JSON file.
  • We can also write the DataFrame to an in-memory buffer by using Python's io module, with StringIO creating the buffer.
  • A DataFrame can also be converted to a CSV string by calling .to_csv() without a file path or filename.