Apache POI is an open-source Java library for creating and manipulating file formats based on Microsoft Office. With POI you can create, modify, and read the following file formats:
- DOC – Word 97-2003 document
- DOCX – Word 2007/2010/2013 document
- PPT – PowerPoint 97-2003 presentation
- PPTX – PowerPoint 2007/2010/2013 presentation
- XLS – Excel 97-2003 spreadsheet
- XLSX – Excel 2007/2010/2013 spreadsheet
In addition to these common formats, POI has components that offer limited read support for a few other Office file formats, such as MS Visio and MS Publisher.
WHY APACHE POI FOR EXCEL?
Apache POI is mostly used for reading and writing Excel documents. Other libraries can also work with Excel files, including Aspose.Cells for Java and JXL (now JExcel), which supports only the XLS format. However, Apache POI is usually the preferable choice for the following reasons:
- It is free. The Aspose API is not.
- JXL does not support the Excel 2007+ "XLSX" format; it only supports the old BIFF (binary) "XLS" format. Apache POI supports both XLS and XLSX.
- Apache POI is actively developed by open-source enthusiasts. JExcel development does not appear to be active: the last check-in dates back to 2009, six years ago at the time of writing. (Source)
- If you can afford a commercial library for manipulating Excel files and need more advanced Excel features, Aspose may be the better choice. You can find a feature-by-feature comparison here.
SETUP APACHE POI
Download the latest Apache POI libraries from the publisher’s site: Apache POI Downloads.
Head to the Apache POI download page and download the binary release. Binary releases have -bin- in their file names and contain the POI jars along with their dependencies.
You will also find a download with -src- in its name. This is the source package, which contains everything you need to build Apache POI yourself, but if you just want to get started you are much better off with the pre-compiled binary package.
The latest Apache POI version at the time of writing is 3.12. You can download the Apache POI jar files here. Clicking the link downloads a zip file. Extract all the files and place them on the CLASSPATH of your Java project.
In Eclipse, open your project's Properties, go to Java Build Path, open the Libraries tab, click Add External JARs, and select all the JARs extracted from the downloaded zip file.
If you are using Maven, add the following dependency to your pom.xml file. You can get the dependency from MVN Repository.
<dependency>
  <groupId>org.apache.poi</groupId>
  <artifactId>poi</artifactId>
  <version>3.12</version>
</dependency>
Voila! You should now be ready to use the POI APIs to read and write Excel files. Note that the poi artifact covers only the XLS format; for XLSX files, also add the poi-ooxml artifact with the same groupId and version.
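A quick way to confirm the setup is to check that the POI classes can actually be loaded. The sketch below is only an illustration and not part of POI: the class name PoiClasspathCheck and the helper isOnClasspath are made up for this example. It assumes the two entry-point classes you care about are org.apache.poi.hssf.usermodel.HSSFWorkbook (XLS, shipped in the poi jar) and org.apache.poi.xssf.usermodel.XSSFWorkbook (XLSX, shipped in the poi-ooxml jar).

```java
public class PoiClasspathCheck {

    // Hypothetical helper: returns true if the named class can be located
    // on the classpath of the running program.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, PoiClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The class itself, or something it depends on, is missing
            return false;
        }
    }

    public static void main(String[] args) {
        String[] required = {
            "org.apache.poi.hssf.usermodel.HSSFWorkbook", // XLS support (poi jar)
            "org.apache.poi.xssf.usermodel.XSSFWorkbook"  // XLSX support (poi-ooxml jar)
        };
        for (String name : required) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

If either line reports MISSING, the corresponding jar is probably not on the build path yet.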
EXCEL BASIC TERMINOLOGY
To read or write an Excel file, you first need to understand the terms used for its parts.
Every Excel file is known as a workbook. Each workbook is divided into sheets, and each sheet consists of cells arranged in rows and columns. A cell can contain text, a number, a date, or other data, or it can be empty. A range is a rectangular block of cells; the used range of a sheet runs from its first cell to the last cell that contains data.
So the structure is something like: Workbook -> Sheets -> Cells
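To make the hierarchy concrete, here is a toy model in plain Java. It is not POI's API: the classes WorkbookModel, Workbook, and Sheet and all their methods are invented for this sketch, and it assumes zero-based row and column indices. POI's own model has the same overall shape, with Workbook, Sheet, Row, and Cell types in org.apache.poi.ss.usermodel.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class WorkbookModel {

    // A sheet maps row index -> (column index -> cell value).
    // A missing entry stands for an empty cell.
    static class Sheet {
        final TreeMap<Integer, TreeMap<Integer, Object>> rows = new TreeMap<>();

        void setCell(int row, int col, Object value) {
            rows.computeIfAbsent(row, r -> new TreeMap<>()).put(col, value);
        }

        // Returns the cell value, or null if the cell is empty.
        Object getCell(int row, int col) {
            TreeMap<Integer, Object> cells = rows.get(row);
            return cells == null ? null : cells.get(col);
        }

        // Index of the last row that holds data, or -1 for an empty sheet.
        int lastRow() {
            return rows.isEmpty() ? -1 : rows.lastKey();
        }

        // Index of the last column that holds data in any row, or -1 for an empty sheet.
        int lastColumn() {
            int max = -1;
            for (TreeMap<Integer, Object> cells : rows.values()) {
                max = Math.max(max, cells.lastKey());
            }
            return max;
        }
    }

    // A workbook is an ordered collection of named sheets.
    static class Workbook {
        final Map<String, Sheet> sheets = new LinkedHashMap<>();

        Sheet createSheet(String name) {
            Sheet sheet = new Sheet();
            sheets.put(name, sheet);
            return sheet;
        }
    }

    public static void main(String[] args) {
        Workbook workbook = new Workbook();
        Sheet sheet = workbook.createSheet("Employees");
        sheet.setCell(0, 0, "Name");    // text cell
        sheet.setCell(0, 1, "Salary");
        sheet.setCell(1, 0, "Asha");
        sheet.setCell(1, 1, 52000.0);   // numeric cell
        System.out.println("Sheets: " + workbook.sheets.keySet());
        System.out.println("Used range ends at row " + sheet.lastRow()
                + ", column " + sheet.lastColumn());
    }
}
```

The lastRow and lastColumn methods together mark the far corner of the used range described above: everything from cell (0, 0) up to that corner.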
The following tutorials look at reading from and writing to Excel files using Apache POI.