Reading encrypted PDF files

PDFxStream can decrypt PDF files encrypted with 40-bit, 128-bit, 256-bit, and variable-bit-length RC4 and AES ciphers. Using PDFxStream with such files is as easy as using it with unencrypted PDF files.

Many PDF documents are encrypted without a password; PDFxStream decrypts such documents automatically, with no intervention on your part.

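Reading a PDF document that has been encrypted using a password only requires passing the file's password (as a byte array) to a PDF.open() variant that accepts one, e.g. com.snowtide.PDF.open(File,byte[]). The example below uses the variant that takes a file path instead of a File: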

import com.snowtide.PDF;
import com.snowtide.pdf.Document;
import com.snowtide.pdf.OutputTarget;

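public class DecryptWithPassword {
    public static void main (String[] args) throws java.io.IOException {
        String pdfFilePath = args[0];
        // the password is passed to PDF.open() as a byte array
        Document pdf = PDF.open(pdfFilePath, args[1].getBytes());
        StringBuilder text = new StringBuilder(1024);
        try {
            // write all of the document's text into the StringBuilder
            pdf.pipe(new OutputTarget(text));
        } finally {
            // close the document even if text extraction fails
            pdf.close();
        }
        System.out.println(text);
    }
}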
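
The examples on this page call String.getBytes() with no argument, which encodes the password using the JVM's default charset; for passwords containing non-ASCII characters, the resulting bytes can differ between machines. Which encoding a given PDF expects is not covered here, so the following is only a minimal sketch of making the charset choice explicit. The class name PasswordBytes and the method toBytes are hypothetical helpers, not part of PDFxStream:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of PDFxStream): converts a password
// to bytes using a charset chosen by the caller rather than the JVM default.
final class PasswordBytes {

    static byte[] toBytes(String password, Charset charset) {
        return password.getBytes(charset);
    }

    public static void main(String[] args) {
        // "p\u00e4ssword" is "password" with an a-umlaut as its second letter
        String password = "p\u00e4ssword";
        // one byte per character in ISO-8859-1: prints 8
        System.out.println(toBytes(password, StandardCharsets.ISO_8859_1).length);
        // the a-umlaut takes two bytes in UTF-8: prints 9
        System.out.println(toBytes(password, StandardCharsets.UTF_8).length);
    }
}
```

The byte array returned by such a helper could then be passed as the second argument of PDF.open() in place of args[1].getBytes().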

Once a com.snowtide.pdf.Document has been successfully opened using a given password, it can be used normally, without regard to the fact that the file being read is encrypted.

If an error occurs while decrypting data contained in an encrypted PDF file, PDF.open() will throw a com.snowtide.pdf.EncryptedPDFException. The most common cause is an incorrect password being provided to PDF.open(), or no password being provided when one is required to decrypt the document. In this case, the EncryptedPDFException has an error type of com.snowtide.pdf.EncryptedPDFException.ErrorType.BadPassword.

This error type matters most in an interactive environment, where the application doesn't necessarily know in advance that a PDF is encrypted and relies on a user to enter the password for any encrypted PDF files it encounters. In this case, the application should attempt to open each PDF file assuming it is unencrypted, watch for an EncryptedPDFException with an error type of ErrorType.BadPassword, and then prompt the user for the password. This code shows an example of this technique:

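// imports needed: java.io.File, java.io.IOException, com.snowtide.PDF,
// com.snowtide.pdf.Document, com.snowtide.pdf.EncryptedPDFException,
// com.snowtide.pdf.OutputTarget
public String readPdfText (File pdfFile, String password) throws IOException {
    try {
        Document pdf;
        if (password == null) {
            // no password, assume the file is unencrypted
            pdf = PDF.open(pdfFile);
        } else {
            pdf = PDF.open(pdfFile, password.getBytes());
        }
        try {
            // extract all of the document's text, as in the previous example
            StringBuilder text = new StringBuilder(1024);
            pdf.pipe(new OutputTarget(text));
            return text.toString();
        } finally {
            // close the document even if text extraction fails
            pdf.close();
        }
    } catch (EncryptedPDFException e) {
        if (e.getErrorType() == EncryptedPDFException.ErrorType.BadPassword) {
            // return null to indicate that a different password is needed
            return null;
        } else {
            // some error in the decryption process
            // treat just like a regular IOException
            throw e;
        }
    }
}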

Notice that if an EncryptedPDFException with an error type of ErrorType.BadPassword is thrown, the method returns null. The code calling this method can then prompt the user for a different password and call the method again with the new password.
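
The calling side of this null-return convention can be sketched as a bounded retry loop. This is only an illustration under stated assumptions: the names PasswordRetry, readWithRetries, reader, prompt, and maxAttempts are hypothetical, and the reader function stands in for a call to readPdfText on a fixed file. Because readPdfText throws the checked IOException, a real caller would need to wrap it (for example in java.io.UncheckedIOException) or use its own functional interface.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical caller-side helper (not part of PDFxStream).
final class PasswordRetry {

    /**
     * Tries reader with a null password first (assuming an unencrypted file),
     * then with passwords obtained from prompt, until reader returns non-null
     * text, the prompt returns null (user cancelled), or maxAttempts
     * passwords have been tried.
     */
    static Optional<String> readWithRetries(Function<String, String> reader,
                                            Supplier<String> prompt,
                                            int maxAttempts) {
        String text = reader.apply(null);
        for (int i = 0; text == null && i < maxAttempts; i++) {
            String password = prompt.get();
            if (password == null) {
                break; // user gave up
            }
            text = reader.apply(password);
        }
        return Optional.ofNullable(text);
    }

    public static void main(String[] args) {
        // Stub standing in for readPdfText: only "secret" opens the file.
        Function<String, String> stubReader =
            pw -> "secret".equals(pw) ? "document text" : null;
        // Simulated user input: one wrong guess, then the right password.
        Iterator<String> typed = Arrays.asList("guess", "secret").iterator();
        Optional<String> result = readWithRetries(stubReader, typed::next, 3);
        System.out.println(result.orElse("gave up")); // prints "document text"
    }
}
```

Capping the number of attempts keeps an interactive application from prompting forever when the user does not know the password; an empty Optional tells the caller that the file could not be opened with any password tried.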

For other types of EncryptedPDFException, the method just rethrows the exception. Those other error types indicate an unrecoverable encryption problem, such as file corruption, the use of an invalid encryption method, or the failure of one of the security mechanisms in the JRE or CLR environment that PDFxStream depends upon in its decryption process.