Excel Files in Python using openpyxl

Excel files are widely used for storing and exchanging data, so being able to read and write them from Python is a valuable skill for automating repetitive tasks and manipulating data efficiently. The openpyxl library lets you read, write, and modify Excel files with just a few lines of code.

In this blog, we’ll explore how to use openpyxl for interacting with Excel files. Through practical examples, you'll see how to create and update Excel sheets. We’ll also demonstrate real-world scenarios, such as updating product prices in a spreadsheet. These examples will help you understand how to apply openpyxl in everyday tasks.


Getting Started with openpyxl

To begin working with Excel files in Python, you'll first need the openpyxl library. If you haven’t already installed it, you can do so using pip:

pip install openpyxl

Once the library is installed, you can begin reading and writing Excel files in Python. Let’s walk through the basics.
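Before going further, it can help to confirm that the installation worked. The following is a minimal check, assuming openpyxl was installed into the same Python environment you are running:

```python
import openpyxl

# If this import succeeds, the library is available in this environment.
# The printed version string depends on what pip installed.
print(openpyxl.__version__)
```

If the import fails with ModuleNotFoundError, pip most likely installed the package into a different Python environment.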

Writing to a New Excel File

Creating a new Excel file from scratch takes only a few lines. Here’s how you can create a new workbook and write some data to it:

from openpyxl import Workbook

# Create a new workbook and select the active worksheet
new_workbook = Workbook()
new_sheet = new_workbook.active

# Add column headers
new_sheet['A1'] = 'Product'
new_sheet['B1'] = 'Price'
new_sheet['C1'] = 'Stock'

# Add some data
new_sheet.append(['Laptop', 899.99, 10])
new_sheet.append(['Mouse', 19.99, 150])
new_sheet.append(['Keyboard', 49.99, 85])

# Save the new workbook (the target folder must already exist; save() will not create it)
new_workbook.save('c:/excel_sheets/products.xlsx')

In this example, we create a new Excel file products.xlsx and add some product data with headers for the product name, price, and stock.
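Since save() does not create missing folders, one option is to create the folder yourself before saving. The sketch below is a variation on the example above, not a replacement for it: the temporary directory, the sheet title "Products", and the variable names are assumptions chosen so the code runs on any machine.

```python
import tempfile
from pathlib import Path

from openpyxl import Workbook

# Hypothetical output location; a temporary directory keeps the sketch portable.
output_dir = Path(tempfile.mkdtemp()) / "excel_sheets"
output_dir.mkdir(parents=True, exist_ok=True)  # create the folder before saving
output_path = output_dir / "products.xlsx"

workbook = Workbook()
sheet = workbook.active
sheet.title = "Products"  # give the default sheet a descriptive name

# append() writes each list into the next empty row,
# so the header row can be added the same way as the data rows.
rows = [
    ["Product", "Price", "Stock"],
    ["Laptop", 899.99, 10],
    ["Mouse", 19.99, 150],
    ["Keyboard", 49.99, 85],
]
for row in rows:
    sheet.append(row)

workbook.save(str(output_path))
print(f"Saved {sheet.max_row} rows to {output_path}")
```

Keeping the rows in a plain list makes it easy to swap in data from another source later.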

Reading an Excel File

The first step in working with Excel files is to read the file into Python. This can be done using the load_workbook method from openpyxl. Let’s assume you have a file called products.xlsx with a list of products, their prices, and stock levels.

from openpyxl import load_workbook

# Load the workbook
workbook = load_workbook('c:/excel_sheets/products.xlsx')

# Select the active worksheet
worksheet = workbook.active

# Print the value of the first cell (A1)
print(worksheet['A1'].value)

# Loop through the rows and print all data
for row in worksheet.iter_rows(values_only=True):
    print(row)

In this example, we load the Excel file products.xlsx, print the value of cell A1, and then print every row in the active worksheet. With values_only=True, each row comes back as a tuple of plain values rather than cell objects.
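Tuples of values are often easier to work with once they are paired with the column headers. Here is a small sketch of that idea, assuming the first row of the sheet holds the headers. It builds a workbook in memory so it runs without a file on disk; the names products and headers are illustrative only.

```python
from openpyxl import Workbook

# Build a small in-memory sheet that mirrors the layout of products.xlsx.
workbook = Workbook()
worksheet = workbook.active
worksheet.append(["Product", "Price", "Stock"])
worksheet.append(["Laptop", 899.99, 10])
worksheet.append(["Mouse", 19.99, 150])

rows = worksheet.iter_rows(values_only=True)
headers = next(rows)  # the first row holds the column names

# Pair each remaining row with the headers to get one dictionary per product.
products = [dict(zip(headers, row)) for row in rows]

for product in products:
    print(product["Product"], product["Price"], product["Stock"])
```

With dictionaries, the rest of your code can refer to columns by name instead of by position.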

Editing a Cell in Excel

Now that we can read data from an Excel file, let’s see how to modify a specific cell. Imagine you want to update the price of a product in the Excel file. Here’s how to do that:

# Update the price of a product
worksheet['B2'].value = 19.99  # Change the value of cell B2 to 19.99

# Save the changes
workbook.save('products_updated.xlsx')

In this code, we set cell B2 (the price in the first data row) to 19.99 and save the result to a new file called products_updated.xlsx, which leaves the original file untouched.
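When the row and column are computed at runtime, addressing a cell by number can be more convenient than building a string like 'B2'. The sketch below uses the worksheet's cell() method for this. The sample data is made up for illustration and lives only in memory.

```python
from openpyxl import Workbook

workbook = Workbook()
worksheet = workbook.active
worksheet.append(["Product", "Price", "Stock"])
worksheet.append(["Laptop", 899.99, 10])

# Row and column numbers are 1-based, so (row=2, column=2) is the same cell as 'B2'.
price_cell = worksheet.cell(row=2, column=2)
price_cell.value = 949.99

# Both ways of addressing the cell see the same updated value.
print(price_cell.coordinate, worksheet["B2"].value)
```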

Updating Product Prices in Excel

Let’s take this a step further with a practical example. Suppose you are managing product prices in an Excel file and want to automatically update the prices for certain products based on new data.

Here’s a script that loads an existing Excel file, finds specific products, updates their prices, and saves the file with the changes:

from openpyxl import load_workbook

# Load the Excel file created earlier
workbook = load_workbook('c:/excel_sheets/products.xlsx')
worksheet = workbook.active

# Dictionary with updated product prices
updated_prices = {
    'Laptop': 949.99,
    'Mouse': 21.99,
    'Keyboard': 54.99
}

# Update the prices in the Excel file
for row in worksheet.iter_rows(min_row=2, values_only=False):
    product_name = row[0].value  # Product name is in the first column (A)
    if product_name in updated_prices:
        row[1].value = updated_prices[product_name]  # Update price in second column (B)

# Save the changes
workbook.save('products_updated.xlsx')

print("Prices updated successfully.")

Explanation:

  • We load the products.xlsx file and iterate over the rows to find specific products.
  • If the product exists in the updated_prices dictionary, we update its price in the second column.
  • Finally, we save the updated file as products_updated.xlsx.
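One thing the script does not report is whether every product in the dictionary was actually found in the sheet. A possible refinement, sketched here with a hypothetical helper named update_prices, is to return the names that had no matching row. It assumes the same layout as above: names in column A, prices in column B, headers in row 1.

```python
from openpyxl import Workbook


def update_prices(worksheet, new_prices):
    """Write new prices into column B and return the names that were not found."""
    remaining = set(new_prices)
    # max_col=2 limits each row to the name cell (A) and the price cell (B).
    for name_cell, price_cell in worksheet.iter_rows(min_row=2, max_col=2):
        if name_cell.value in new_prices:
            price_cell.value = new_prices[name_cell.value]
            remaining.discard(name_cell.value)
    return remaining


# In-memory sample data so the sketch runs without a file on disk.
workbook = Workbook()
worksheet = workbook.active
worksheet.append(["Product", "Price", "Stock"])
worksheet.append(["Laptop", 899.99, 10])
worksheet.append(["Mouse", 19.99, 150])

# "Monitor" is deliberately absent from the sheet.
not_found = update_prices(worksheet, {"Laptop": 949.99, "Monitor": 199.99})
print("Not found:", not_found)
```

Checking the returned set before saving is a cheap way to catch typos in product names.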

Handling Stock Updates in Excel

You can also apply the same concept to stock levels. If you need to update stock quantities for multiple products, reuse the workbook and worksheet loaded above and change the target column:

# Dictionary with updated stock levels
updated_stock = {
    'Laptop': 8,
    'Mouse': 145,
    'Keyboard': 90
}

# Update the stock levels in the Excel file
for row in worksheet.iter_rows(min_row=2, values_only=False):
    product_name = row[0].value  # Product name is in the first column (A)
    if product_name in updated_stock:
        row[2].value = updated_stock[product_name]  # Update stock in third column (C)

# Save the changes
workbook.save('products_updated_stock.xlsx')

print("Stock levels updated successfully.")
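The price and stock loops differ only in which column they write to, so they could be folded into one function that finds the column by its header. This is a sketch under the same layout assumptions (headers in row 1, product names in column A). The helper name update_column is hypothetical and not part of openpyxl.

```python
from openpyxl import Workbook


def update_column(worksheet, header, new_values):
    """Update the column titled `header`, matching rows by the name in column A."""
    headers = [cell.value for cell in worksheet[1]]  # worksheet[1] is the first row
    column_index = headers.index(header)  # raises ValueError if the header is missing
    for row in worksheet.iter_rows(min_row=2):
        name = row[0].value
        if name in new_values:
            row[column_index].value = new_values[name]


# In-memory sample data so the sketch runs without a file on disk.
workbook = Workbook()
worksheet = workbook.active
worksheet.append(["Product", "Price", "Stock"])
worksheet.append(["Laptop", 899.99, 10])
worksheet.append(["Mouse", 19.99, 150])

update_column(worksheet, "Stock", {"Laptop": 8})
update_column(worksheet, "Price", {"Mouse": 21.99})
print(list(worksheet.iter_rows(min_row=2, values_only=True)))
```

Looking the column up by header also means the script keeps working if the columns are reordered.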

Conclusion

Using the openpyxl library, you can easily read, edit, and write Excel files in Python, making it a valuable tool for automating tasks like updating product prices, managing stock, or performing data analysis. Whether you’re working with large datasets or just need to perform quick edits, openpyxl provides the flexibility and power to work with Excel files programmatically.

By integrating such scripts into your workflows, you can reduce manual data entry, minimize errors, and streamline your processes.

© 2024 Solution Toolkit. All rights reserved.