Handling different file formats in Pandas

Pandas is a powerful data manipulation library for Python. For each supported file format it provides a read_*() function that loads data into a DataFrame and a matching to_*() method that writes a DataFrame back out. This article walks through the most common formats.

1. CSV (Comma-Separated Values)

CSV is a widely used file format for storing tabular data. Pandas provides the read_csv() function to read data from a CSV file into a DataFrame. For example:

import pandas as pd

df = pd.read_csv('data.csv')

To write a DataFrame to a CSV file, call its to_csv() method. Passing index=False leaves the row index out of the file. Here is an example:

df.to_csv('output.csv', index=False)
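Real CSV files often use a different delimiter or contain dates. The sketch below shows two commonly used read_csv() options, sep and parse_dates, on a small in-memory sample. The sample data and column names are made up for illustration, and io.StringIO stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the contents of a CSV file.
csv_text = "id;name;signup_date\n1;Ada;2023-01-15\n2;Linus;2023-02-20\n"

# sep sets the delimiter; parse_dates converts the named column to datetime.
df = pd.read_csv(io.StringIO(csv_text), sep=";", parse_dates=["signup_date"])

# With no path argument, to_csv() returns the CSV text as a string.
# The output uses the default comma delimiter and omits the index.
csv_out = df.to_csv(index=False)
```

Reading from and writing to strings like this is also a convenient way to experiment with options before pointing the same calls at real files.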

2. Excel

Excel files (XLS and XLSX) are another common format for storing tabular data. Pandas reads and writes them through optional engine packages, such as openpyxl for XLSX, which must be installed separately. To read one sheet of an Excel file into a DataFrame, use the read_excel() function. For example:

df = pd.read_excel('data.xlsx', sheet_name='Sheet1')

To write data from a DataFrame to an Excel file, you can use the to_excel() function. Here is an example:

df.to_excel('output.xlsx', sheet_name='Sheet1', index=False)
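A workbook can hold several sheets. The following is a minimal sketch of writing two DataFrames to one file with pd.ExcelWriter and reading every sheet back at once. It assumes the openpyxl package is installed; the sheet names, columns, and file name are invented for the example.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data for two sheets.
sales = pd.DataFrame({"region": ["north", "south"], "total": [120, 95]})
costs = pd.DataFrame({"region": ["north", "south"], "total": [80, 60]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "report.xlsx")

    # ExcelWriter keeps the workbook open so several sheets can be added.
    with pd.ExcelWriter(path) as writer:
        sales.to_excel(writer, sheet_name="sales", index=False)
        costs.to_excel(writer, sheet_name="costs", index=False)

    # sheet_name=None loads every sheet into a dict keyed by sheet name.
    sheets = pd.read_excel(path, sheet_name=None)
```

Each value in sheets is an ordinary DataFrame, so sheets["sales"] can be used like the result of a single-sheet read.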

3. JSON (JavaScript Object Notation)

JSON is a lightweight data-interchange format that is commonly used in web applications. Pandas provides the read_json() function to read JSON data into a DataFrame. For example:

df = pd.read_json('data.json')

To write data from a DataFrame to a JSON file, you can use the to_json() function. Here is an example:

df.to_json('output.json', orient='records')
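The orient argument controls the shape of the JSON that to_json() produces. As a small illustration, the sketch below serializes the same made-up DataFrame with two orientations and reads one of them back. The data is hypothetical, and io.StringIO stands in for a file.

```python
import io

import pandas as pd

df = pd.DataFrame({"id": [1, 2], "name": ["Ada", "Linus"]})

# orient="records" produces a JSON array with one object per row.
records_json = df.to_json(orient="records")

# orient="columns" produces one object per column, keyed by index label.
columns_json = df.to_json(orient="columns")

# Read the records-style JSON back into a DataFrame.
restored = pd.read_json(io.StringIO(records_json), orient="records")
```

The records layout is usually the easier one to exchange with web APIs, since each row is a standalone object.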

4. SQL (Structured Query Language)

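Pandas also supports reading from and writing to SQL databases. The read_sql() function takes a query and a database connection and returns the result as a DataFrame. The connection can be a sqlite3 connection from the Python standard library, as used here, or an SQLAlchemy connectable for other databases. For example: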

import sqlite3

conn = sqlite3.connect('database.db')
# 'my_table' is a placeholder name; 'table' itself is a reserved word in SQL
df = pd.read_sql('SELECT * FROM my_table', conn)

To write a DataFrame to an SQL database, call its to_sql() method. Setting if_exists='replace' drops an existing table of the same name before writing. Here is an example:

df.to_sql('new_table', conn, index=False, if_exists='replace')
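The two calls can be combined into a full round trip. The sketch below uses an in-memory SQLite database so that it runs without any existing file, and passes the filter value through params instead of formatting it into the query string. The table name, columns, and values are made up for the example.

```python
import sqlite3

import pandas as pd

# ":memory:" creates a temporary database that disappears on close.
conn = sqlite3.connect(":memory:")

# Hypothetical data written to a new table called "orders".
orders = pd.DataFrame(
    {"customer": ["ada", "linus", "ada"], "amount": [30.0, 12.5, 7.5]}
)
orders.to_sql("orders", conn, index=False, if_exists="replace")

# The "?" placeholder is filled from params by the sqlite3 driver.
query = "SELECT customer, amount FROM orders WHERE customer = ?"
ada_orders = pd.read_sql(query, conn, params=("ada",))

conn.close()
```

Placeholder syntax depends on the database driver; "?" is the style used by sqlite3.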

5. Other File Formats

Pandas also supports other file formats, including HTML (read_html()), HDF5 (read_hdf()), and Parquet (read_parquet()). Some of these depend on optional packages that must be installed separately. See the official Pandas documentation for details on these formats.

In conclusion, Pandas covers CSV, Excel, JSON, SQL, and other formats with a consistent pair of read and write functions, so the same DataFrame workflow applies regardless of where the data is stored.

