Fixing garbled Chinese text when converting PPT to PDF in Java

Author：Eve Cole Update Time：2025-03-19 11:48:02

The approach is to render each slide of the presentation as an image and then assemble the images into a PDF. The process has one problem: whether the file is .ppt or .pptx, Chinese text comes out garbled or as empty boxes. For the .ppt format a fix is easy to find online: set every text run to one unified font. If a slide uses a single Chinese font there is no problem, but if a slide mixes, for example, Microsoft YaHei and SimSun (宋体), some Chinese characters are rendered as boxes. My suspicion is that POI reads only the first font, so text set in the other Chinese fonts is garbled.

I searched Baidu and Google for a long time. Some people said it had been reported to Apache as a bug, but the reply was that it is a font problem. I think POI could probably handle this itself by reading the original font and applying it when rendering, although that would likely cost a good deal of performance. I suspect many people have spent as much time as I did looking for a solution, and there is almost nothing ready-made online. I worked through it step by step and eventually found a fix. I won't go into the .ppt format, since that solution can be found online; it is the .pptx case that I could not find anywhere.

The pptx rendered as an image before the fix:

The pptx rendered as an image after the fix:

Solution:
Iterate over every shape and set its text runs to one unified font. The code I found online did not work, so I adapted it as follows:

    for (XSLFShape shape : slide[i].getShapes()) {
        if (shape instanceof XSLFTextShape) {
            XSLFTextShape txshape = (XSLFTextShape) shape;
            System.out.println("txshape" + (i + 1) + ":" + txshape.getShapeName());
            System.out.println("text:" + txshape.getText());
            // Force every text run in the shape to the same Chinese font
            for (XSLFTextParagraph textPara : txshape.getTextParagraphs()) {
                List<XSLFTextRun> textRunList = textPara.getTextRuns();
                for (XSLFTextRun textRun : textRunList) {
                    textRun.setFontFamily("宋体");
                }
            }
        }
    }
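Setting the family to 宋体 can only help if the JVM doing the rendering can actually resolve that font; when it cannot, Java substitutes another font and boxes may still appear. As a rough diagnostic sketch (not part of the original fix), the standard-library check below looks for a candidate font among the families the JVM reports and asks whether it has glyphs for a sample string. The class name `FontCheck`, the helper `firstInstalled`, and the candidate list are illustrative assumptions.

```java
import java.awt.Font;
import java.awt.GraphicsEnvironment;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class FontCheck {

    /** Returns the first candidate family found in {@code installed}, or null if none is. */
    static String firstInstalled(List<String> candidates, Set<String> installed) {
        for (String name : candidates) {
            if (installed.contains(name)) {
                return name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Font families the current JVM can see (names are localized to the default locale).
        Set<String> installed = new HashSet<>(Arrays.asList(
                GraphicsEnvironment.getLocalGraphicsEnvironment().getAvailableFontFamilyNames()));

        // Assumed candidate list; adjust it to the fonts your slides actually use.
        String family = firstInstalled(Arrays.asList("宋体", "SimSun", "Microsoft YaHei"), installed);
        if (family == null) {
            System.out.println("No candidate Chinese font is installed for this JVM");
            return;
        }

        // canDisplayUpTo returns -1 when the font has a glyph for every character.
        int firstMissing = new Font(family, Font.PLAIN, 12).canDisplayUpTo("中文测试");
        System.out.println(family + (firstMissing == -1 ? " can" : " cannot")
                + " display the sample text");
    }
}
```

If a suitable family is found, its name could be passed to `setFontFamily` in place of the hard-coded "宋体".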

The complete code is as follows (apart from my own fix above, most of it comes from Stack Overflow):

    import java.awt.Color;
    import java.awt.Dimension;
    import java.awt.Graphics2D;
    import java.awt.geom.AffineTransform;
    import java.awt.geom.Rectangle2D;
    import java.awt.image.BufferedImage;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.util.List;

    import com.itextpdf.text.Document;
    import com.itextpdf.text.Image;
    import com.itextpdf.text.Rectangle;
    import com.itextpdf.text.pdf.PdfPCell;
    import com.itextpdf.text.pdf.PdfPTable;
    import com.itextpdf.text.pdf.PdfWriter;

    import org.apache.poi.hslf.model.Slide;
    import org.apache.poi.hslf.model.TextRun;
    import org.apache.poi.hslf.usermodel.RichTextRun;
    import org.apache.poi.hslf.usermodel.SlideShow;
    import org.apache.poi.xslf.usermodel.XMLSlideShow;
    import org.apache.poi.xslf.usermodel.XSLFShape;
    import org.apache.poi.xslf.usermodel.XSLFSlide;
    import org.apache.poi.xslf.usermodel.XSLFTextParagraph;
    import org.apache.poi.xslf.usermodel.XSLFTextRun;
    import org.apache.poi.xslf.usermodel.XSLFTextShape;

    public static void convertPPTToPDF(String sourcepath, String destinationPath, String fileType) throws Exception {
        FileInputStream inputStream = new FileInputStream(sourcepath);
        double zoom = 2; // render each slide at 2x for a sharper image
        AffineTransform at = new AffineTransform();
        at.setToScale(zoom, zoom);
        Document pdfDocument = new Document();
        PdfWriter pdfWriter = PdfWriter.getInstance(pdfDocument, new FileOutputStream(destinationPath));
        PdfPTable table = new PdfPTable(1);
        Dimension pgsize = null;
        Image slideImage = null;
        BufferedImage img = null;

        if (fileType.equalsIgnoreCase(".ppt")) {
            SlideShow ppt = new SlideShow(inputStream);
            inputStream.close();
            pgsize = ppt.getPageSize();
            Slide slide[] = ppt.getSlides();
            // Set the page size before opening the document so it applies from the first page
            pdfDocument.setPageSize(new Rectangle((float) pgsize.getWidth(), (float) pgsize.getHeight()));
            pdfWriter.open();
            pdfDocument.open();
            for (int i = 0; i < slide.length; i++) {
                // Force every rich text run on the slide to the same Chinese font
                TextRun[] truns = slide[i].getTextRuns();
                for (int k = 0; k < truns.length; k++) {
                    RichTextRun[] rtruns = truns[k].getRichTextRuns();
                    for (int l = 0; l < rtruns.length; l++) {
                        // int index = rtruns[l].getFontIndex();
                        // String name = rtruns[l].getFontName();
                        rtruns[l].setFontIndex(1);
                        rtruns[l].setFontName("宋体");
                    }
                }
                img = new BufferedImage((int) Math.ceil(pgsize.width * zoom),
                        (int) Math.ceil(pgsize.height * zoom), BufferedImage.TYPE_INT_RGB);
                Graphics2D graphics = img.createGraphics();
                graphics.setTransform(at);
                graphics.setPaint(Color.white);
                graphics.fill(new Rectangle2D.Float(0, 0, pgsize.width, pgsize.height));
                slide[i].draw(graphics);
                slideImage = Image.getInstance(img, null);
                table.addCell(new PdfPCell(slideImage, true));
            }
        }

        if (fileType.equalsIgnoreCase(".pptx")) {
            XMLSlideShow ppt = new XMLSlideShow(inputStream);
            inputStream.close();
            pgsize = ppt.getPageSize();
            XSLFSlide slide[] = ppt.getSlides();
            // Set the page size before opening the document so it applies from the first page
            pdfDocument.setPageSize(new Rectangle((float) pgsize.getWidth(), (float) pgsize.getHeight()));
            pdfWriter.open();
            pdfDocument.open();
            for (int i = 0; i < slide.length; i++) {
                // Force every text run in every text shape to the same Chinese font
                for (XSLFShape shape : slide[i].getShapes()) {
                    if (shape instanceof XSLFTextShape) {
                        XSLFTextShape txshape = (XSLFTextShape) shape;
                        // System.out.println("txshape" + (i + 1) + ":" + txshape.getShapeName());
                        // System.out.println("text:" + txshape.getText());
                        for (XSLFTextParagraph textPara : txshape.getTextParagraphs()) {
                            List<XSLFTextRun> textRunList = textPara.getTextRuns();
                            for (XSLFTextRun textRun : textRunList) {
                                textRun.setFontFamily("宋体");
                            }
                        }
                    }
                }
                img = new BufferedImage((int) Math.ceil(pgsize.width * zoom),
                        (int) Math.ceil(pgsize.height * zoom), BufferedImage.TYPE_INT_RGB);
                Graphics2D graphics = img.createGraphics();
                graphics.setTransform(at);
                graphics.setPaint(Color.white);
                graphics.fill(new Rectangle2D.Float(0, 0, pgsize.width, pgsize.height));
                slide[i].draw(graphics);
                // FileOutputStream out = new FileOutputStream("src/main/resources/test" + i + ".jpg");
                // javax.imageio.ImageIO.write(img, "jpg", out);
                slideImage = Image.getInstance(img, null);
                table.addCell(new PdfPCell(slideImage, true));
            }
        }

        pdfDocument.add(table);
        pdfDocument.close();
        pdfWriter.close();
        System.out.println("Powerpoint file converted to PDF successfully");
    }
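The method compares its `fileType` argument against ".ppt" and ".pptx", leading dot included, so the caller has to supply the extension in exactly that form. One way to derive it from the source path is sketched below; `FileTypes` and `extensionOf` are hypothetical names introduced here for illustration, not part of the original code.

```java
import java.util.Locale;

final class FileTypes {

    /**
     * Returns the lower-cased extension of the file name, including the
     * leading dot, or an empty string when the file name has no dot.
     */
    static String extensionOf(String path) {
        // Position of the last path separator, whichever style is used.
        int slash = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        int dot = path.lastIndexOf('.');
        // A dot that belongs to a directory name does not count.
        if (dot <= slash) {
            return "";
        }
        return path.substring(dot).toLowerCase(Locale.ROOT);
    }
}
```

A call could then look like `convertPPTToPDF(src, dst, FileTypes.extensionOf(src))`, where `src` and `dst` are the input and output paths.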

Maven configuration:

    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi</artifactId>
        <!-- <version>3.13</version> -->
        <version>3.9</version>
    </dependency>
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-ooxml</artifactId>
        <!-- <version>3.10-FINAL</version> -->
        <version>3.9</version>
    </dependency>
    <dependency>
        <groupId>com.itextpdf</groupId>
        <artifactId>itextpdf</artifactId>
        <version>5.5.7</version>
    </dependency>
    <dependency>
        <groupId>com.itextpdf.tool</groupId>
        <artifactId>xmlworker</artifactId>
        <version>5.5.7</version>
    </dependency>
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-scratchpad</artifactId>
        <!-- <version>3.12</version> -->
        <version>3.9</version>
    </dependency>

That is my solution to the garbled Chinese text that appears when converting PPT to PDF in Java. I hope it is helpful.