Java uses POI to extract the text content and the author (from the file attributes) of Word, Excel, PPT, and txt files
The first task of my internship at a new company brought me into contact with POI, which I found after checking some blogs on the Internet. It provides APIs that let Java read and write Microsoft Office files.

You can download the jar packages from the Apache official website: http://poi.apache.org/download.html
View the API documentation: http://poi.apache.org/components/index.html

1. Create a new ordinary Maven project

POI has many jar packages, so I chose to import them from the Maven repository, which means building an ordinary Maven project first. Then click Next and simply name the project.

2. Add POI dependencies in pom.xml

Add the following inside the <dependencies> tag group:

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.17</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-scratchpad</artifactId>
    <version>3.17</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>3.17</version>
</dependency>

3. Extract Word text and author

At the beginning, I only knew how to search other people's blogs for code, but much of it differed from what I needed, was incomplete, or assumed a different package environment, so I was never satisfied. Searching took time and the results were not very good, so I try to refer directly…
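
As a rough illustration of what step 3 is aiming at, here is a minimal sketch that reads the text and the author of a Word file. It assumes POI 3.17 with the three dependencies listed in step 2. The class name WordTextAndAuthor and the helper methods readDocx and readDoc are hypothetical names chosen for this example, and this is only one possible way to do it, not necessarily the approach the rest of this post takes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.poi.hpsf.SummaryInformation;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

// Hypothetical helper class for this sketch (not part of POI).
public class WordTextAndAuthor {

    /** Returns {text, author} for a .docx file (uses poi-ooxml). */
    static String[] readDocx(InputStream in) throws IOException {
        // Closing the extractor also releases the underlying document.
        try (XWPFWordExtractor extractor = new XWPFWordExtractor(new XWPFDocument(in))) {
            String text = extractor.getText();
            // The author is stored as "creator" in the core properties; may be null.
            String author = extractor.getCoreProperties().getCreator();
            return new String[] { text, author };
        }
    }

    /** Returns {text, author} for a legacy .doc file (uses poi-scratchpad). */
    static String[] readDoc(InputStream in) throws IOException {
        try (WordExtractor extractor = new WordExtractor(in)) {
            String text = extractor.getText();
            // The summary information stream can be missing in some files.
            SummaryInformation info = extractor.getSummaryInformation();
            String author = (info == null) ? null : info.getAuthor();
            return new String[] { text, author };
        }
    }

    public static void main(String[] args) throws IOException {
        // The file path is supplied by the caller, e.g. "report.docx".
        String path = args[0];
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            String[] result = path.toLowerCase().endsWith(".docx") ? readDocx(in) : readDoc(in);
            System.out.println("Author: " + result[1]);
            System.out.println(result[0]);
        }
    }
}
```

The two formats need different extractor classes because .doc is the older binary format while .docx is the XML-based one, which is why both poi-scratchpad and poi-ooxml appear in the dependencies.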