In this tutorial, we will learn to (Python) copy data from one Excel file to another to the target excel workbook. We will use the openpyxl module in Python.
So, openpyxl is a Python library used to read, write and edit excel files (with the following extensions: xlsx/xlsm/xltx/xltm).You can install it via the this command:
Sudo pip3 install openpyxl
To copy one excel file to another using Python, we first open both the source and target Excel files. Then we calculate the total number of rows and columns in the source Excel file, read the value of an individual cell and store it in a variable, and then write it into the destination excel file in a similar position to the cell in the source file. The destination file will be saved.
- 1) Import openpyxl library as xl.
- 2) Open the source excel file using the path in which it is located. Note: The path should be a string and have double backslashes () instead of single backslash (). Path should be C:\\Users\\Desktop\\sourcefile.xlsx Instead of C:\Users\Admin\Desktop\sourcefile.xlsx
- 3) Open the required worksheet to copy using the index of it. The index of worksheet ’n’ is ’n-1’. For example, the index of worksheet 1 is 0.
- 4) Open the destination excel file and the active worksheet in it.
- 5) Calculate the total number of rows and columns in source excel file.
- 6) Use two for loops (one for iterating through rows and another for iterating through columns of the excel file) to read the cell value in source file to a variable and then write it to a cell in destination file from that variable.
- 7) Save the destination file.
# importing openpyxl module import openpyxl as xl; # opening the source excel file filename ="C:\\Users\\Admin\\Desktop\\trading.xlsx" wb1 = xl.load_workbook(filename) ws1 = wb1.worksheets # opening the destination excel file filename1 ="C:\\Users\\Admin\\Desktop\\test.xlsx" wb2 = xl.load_workbook(filename1) ws2 = wb2.active # calculate total number of rows and # columns in source excel file mr = ws1.max_row mc = ws1.max_column # copying the cell values from source # excel file to destination excel file for i in range (1, mr + 1): for j in range (1, mc + 1): # reading cell value from source excel file c = ws1.cell(row = i, column = j) # writing the read value to destination excel file ws2.cell(row = i, column = j).value = c.value # saving the destination excel file wb2.save(str(filename1))
Working with Excel files in Python
An Excel spreadsheet document is called a workbook. Each workbook can store a number of worksheets. The worksheet that the user is currently viewing is called the active worksheet. A worksheet consists of columns (addressed with letters, starting with A) and rows (addressed with numbers, starting with 1).
Reading Excel files
>>> import openpyxl >>> wb = openpyxl.load_workbook('example.xlsx') >>> type(wb) <class 'openpyxl.workbook.workbook.Workbook'> >>> wb.sheetnames ['Sheet1', 'Sheet2', 'Sheet3'] >>> sheet = wb.active >>> sheet <Worksheet "Sheet1"> >>> sheet['A1'] <Cell Лист1.A1>
And now a little script:
import openpyxl # read the excel file wb = openpyxl.load_workbook('example.xlsx') # print list of worksheets sheets = wb.sheetnames for sheet in sheets: print(sheet) # get the active sheet sheet = wb.active # print the value of cell A1 print(sheet['A1'].value) # print the value in B1 print(sheet['B1'].value)
How to get another sheet of the book:
# get another sheet sheet2 = wb['Sheet2'] # print value in A1 print(sheet2['A2'].value)
How to make a book sheet active:
# make the third sheet active wb.active = 2
How to name a sheet:
sheet.title = 'Third sheet'
The Cell object has a value attribute that contains the value stored in the cell. The Cell object also has row, column, and coordinate attributes that provide information about the location of a given cell in the table.
# get the cell of sheet B2 cell = sheet['B2'] print('Row: ' + str(cell.row)) print('column: ' + cell.column) print('Cell: ' + cell.coordinate) print('Value: ' + cell.value)
Row: 2 Column: B Cell: B2 Value: Cherries
An individual cell can also be accessed using the cell() method of the Worksheet object by passing the named row and column arguments to it. The first column or first row is 1, not 0:
# get the cell of sheet B2 cell = sheet.cell(row = 2, column = 2) print(cell.value)
You can get the sheet size by using the max_row and max_column attributes of the Worksheet object:
rows = sheet.max_row cols = sheet.max_column for i in range(1, rows + 1): string = '' for j in range(1, cols + 1): cell = sheet.cell(row = i, column = j) string = string + str(cell.value) + ' ' print(string)
2015-04-05 13:34:02 apples 73 2015-04-05 03:41:23 Cherries 85 2015-04-06 12:46:51 Pears 14 2015-04-08 08:59:43 Oranges 52 2015-04-10 02:07:00 apples 152 2015-04-10 18:10:37 Bananas 23 2015-04-10 02:40:46 Strawberries 98
Saving Excel files
>>> import openpyxl >>> wb = openpyxl.Workbook() >>> wb.sheetnames ['Sheet'] >>> wb.create_sheet(title = 'First Sheet', index = 0) <Worksheet 'First Sheet'> >>> wb.sheetnames ['First Sheet', 'Sheet'] >>> wb.remove(wb['First Sheet']) >>> wb.sheetnames ['Sheet'] >>> wb.save('example.xlsx')
The create_sheet() method returns a new Worksheet object, which by default becomes the last worksheet in the book. You can use the named arguments title and index to set the name and index of the new sheet.
The remove() method takes a Worksheet object as an argument, not a string with the sheet name. If you only know the name of the sheet you want to remove, use wb[sheetname]. Another way to remove a sheet is to use the del wb[sheetname] instruction.
Remember to call the save() method to save changes after adding or deleting a worksheet.
Writing values to cells is similar to writing values to dictionary keys:
>>> import openpyxl >>> wb = openpyxl.Workbook() >>> wb.create_sheet(title = 'First Sheet', index = 0) >>> sheet = wb['First Sheet'] >>> sheet['A1'] = 'Hello, world! >>> sheet['A1'].value >>> sheet['A1'] = 'Hello, world!
Filling a 3x3 table:
import openpyxl # create a new excel file wb = openpyxl.Workbook() # add a new worksheet wb.create_sheet(title = 'First Sheet', index = 0) # get the worksheet we want to work with sheet = wb['First sheet'] for row in range(1, 4): for col in range(1, 4): value = str(row) + str(col) cell = sheet.cell(row = row, column = col) cell.value = value wb.save('example.xlsx')
Formulas that begin with an equal sign allow you to set values for cells based on values in other cells.
sheet['B9'] = '=SUM(B1:B8)'
This instruction will store =SUM(B1:B8) as the value in cell B9. This sets a formula for cell B9 that sums the values stored in cells B1 through B8.
The formula stored in the cell can be read like any other value. However, if you want to get the result of formula calculation instead of the formula itself, you should pass the named argument data_only with value True to the load_workbook() function when calling it.
The OpenPyXL module supports the creation of histograms, charts, as well as point and pie charts using data stored in a spreadsheet. To create a diagram, you must do the following steps:
- create a Reference object based on the cells within the selected rectangular area;
- create a Series object by passing the Reference object to the Series() function;
- create a Chart object;
- additionally, you can set the values of variables drawing.top, drawing.left, drawing.width, drawing.height of the Chart object, which determine the position and size of the chart;
- add the Chart object to the Worksheet object.
Reference objects are created by calling the openpyxl.charts.Reference() function, which accepts five arguments:
- The Worksheet object containing the chart data.
- Two integers representing the top left cell of the selected rectangular area that contains the diagram data: the first number specifies a row, the second a column; the first row corresponds to 1, not 0.
- Two integers representing the bottom right cell of the selected rectangular area that contain the diagram data: the first number specifies a row, the second a column.
from openpyxl import Workbook from openpyxl.chart import BarChart, Reference # create a new excel file wb = Workbook() # add a new sheet wb.create_sheet(title = 'First Sheet', index = 0) # get the worksheet that we want to work with sheet = wb['First sheet'] sheet['A1'] = 'Series 1' # this is the column with the data for i in range(1, 11): cell = sheet.cell(row = i + 1, column = 1) cell.value = i * i # create a chart chart = BarChart() chart.title = 'First series of data' data = Reference(sheet, min_col = 1, min_row = 1, max_col = 1, max_row = 11) chart.add_data(data, titles_from_data = True) # add chart to the sheet sheet.add_chart(chart, 'C2') # save the file wb.save('example.xlsx')