java - convert PDF file to "in order" images with PDFBox - Stack Overflow


Is it possible to convert each page of a PDF file to a JPG, in page order, using Apache PDFBox?

I'm using version 2.0.27 and implemented it as in the code below, but the images are intermittently saved in a different order.

The first parameter of renderImageWithDPI is pageIndex. Is pageIndex the same as the page order in the actual PDF file?

PDFRenderer internally holds the page tree as a PDPageTree, and PDPageTree has a COSDictionary and a Set. Given those data structures (a tree and a set), I suspect the order may not be guaranteed. But all the other references I found describe saving each page in order as an image, just like my code does.

I don't know whether the pages are really guaranteed to be saved in order, or whether there is another way to save them in order.
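To make the tree-and-set concern concrete, here is a minimal toy model of a page tree. It is not PDFBox's code: the class PageTreeSketch, its Node type and the pagesInOrder method are all hypothetical names made up for this illustration. The one fact it relies on is that in the PDF format the children of a page tree node are stored in an array (Kids), so a depth-first, left-to-right walk gives a fixed page order. The set here only records which nodes were already visited, which is an assumption about how such a set would typically be used.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of a page tree; NOT PDFBox's implementation.
class PageTreeSketch {

    // A node is either a leaf page (label != null) or an intermediate node with ordered kids.
    static final class Node {
        final String label;                         // non-null only for leaf pages
        final List<Node> kids = new ArrayList<>();  // ordered, like a Kids array

        Node(String label) {
            this.label = label;
        }

        static Node pages(Node... children) {
            Node n = new Node(null);
            n.kids.addAll(List.of(children));
            return n;
        }
    }

    // Collects leaf pages depth-first, left to right. The HashSet only guards
    // against reaching the same node twice; it never decides the order.
    static List<String> pagesInOrder(Node root) {
        List<String> out = new ArrayList<>();
        walk(root, new HashSet<>(), out);
        return out;
    }

    private static void walk(Node node, Set<Node> seen, List<String> out) {
        if (!seen.add(node)) {
            throw new IllegalStateException("node reached twice in page tree");
        }
        if (node.label != null) {
            out.add(node.label);
            return;
        }
        for (Node kid : node.kids) {
            walk(kid, seen, out);
        }
    }

    public static void main(String[] args) {
        Node root = Node.pages(
                Node.pages(new Node("page 1"), new Node("page 2")),
                new Node("page 3"),
                Node.pages(new Node("page 4")));
        System.out.println(pagesInOrder(root)); // prints [page 1, page 2, page 3, page 4]
    }
}
```

In this model the result is the same on every run, because the order comes from the lists and the set is only consulted for membership. Whether PDFBox is structured the same way internally is something to confirm against its source.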

Is it possible to save each page "in order" as an image?

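    import java.awt.image.BufferedImage;
    import java.io.ByteArrayInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.ArrayList;
    import java.util.List;
    import javax.imageio.ImageIO;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.rendering.PDFRenderer;

    public List<String> savePdfToImageFiles(String directoryPath, byte[] decodedData) {
        final float dpi = 150;
        List<String> savedFiles = new ArrayList<>();

        // Both resources are closed automatically, the document first.
        try (InputStream streamData = new ByteArrayInputStream(decodedData);
             PDDocument document = PDDocument.load(streamData)) {
            PDFRenderer pdfRenderer = new PDFRenderer(document);
            Path path = Path.of(directoryPath);
            Files.createDirectories(path);

            // page is the 0-based index passed to the renderer; file names are 1-based.
            for (int page = 0; page < document.getNumberOfPages(); page++) {
                BufferedImage image = pdfRenderer.renderImageWithDPI(page, dpi);

                Path result = path.resolve((page + 1) + ".jpg");
                ImageIO.write(image, "JPEG", result.toFile());
                image.flush();
                savedFiles.add(result.toString());
            }
        } catch (IOException e) {
            // On failure, the files saved so far are still returned.
            log.error("Failed to save PDF to image files: {}", directoryPath, e);
        }

        return savedFiles;
    }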
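
One thing that may be worth ruling out, and this is only an assumption about how the saved files are read back later: names like 1.jpg, 2.jpg, ... 10.jpg do not sort in page order when compared as plain strings, because 10.jpg comes before 2.jpg. The sketch below uses a hypothetical helper, paddedName, that zero-pads the page number so that string order and page order agree. It is plain Java and makes no PDFBox calls.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not part of PDFBox.
class PageFileNames {

    // Builds a name such as "03.jpg"; the padding width follows the page count.
    static String paddedName(int pageIndex, int pageCount) {
        int width = String.valueOf(pageCount).length();
        return String.format(Locale.ROOT, "%0" + width + "d.jpg", pageIndex + 1);
    }

    public static void main(String[] args) {
        int pageCount = 12;
        List<String> plain = new ArrayList<>();
        List<String> padded = new ArrayList<>();
        for (int page = 0; page < pageCount; page++) {
            plain.add((page + 1) + ".jpg");
            padded.add(paddedName(page, pageCount));
        }
        // Sort both lists as strings, the way a directory listing might be sorted.
        Collections.sort(plain);
        Collections.sort(padded);
        System.out.println(plain);   // 10.jpg, 11.jpg and 12.jpg land before 2.jpg
        System.out.println(padded);  // 01.jpg through 12.jpg, in page order
    }
}
```

In the method above, this would amount to replacing (page + 1) + ".jpg" with paddedName(page, document.getNumberOfPages()). It would not explain an order that changes from run to run for the same input, so it is a check to make rather than a diagnosis.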