Reading encrypted PDF files
PDFxStream includes support for decrypting PDF files encrypted with 40-bit, 128-bit, 256-bit, and variable-bit-length RC4 and AES ciphers. Using PDFxStream with such files is as easy as using it with unencrypted PDF files.
Many PDF documents are encrypted without a password; PDFxStream decrypts such documents automatically, with no intervention on your part.
Reading a PDF document that has been encrypted using a password only requires providing the file's password (as a byte array) to e.g. com.snowtide.PDF.open(File,byte[]):
    import com.snowtide.PDF;
    import com.snowtide.pdf.Document;
    import com.snowtide.pdf.OutputTarget;

    public class DecryptWithPassword {
        public static void main (String[] args) throws java.io.IOException {
            String pdfFilePath = args[0];
            // args[1] is the password; PDF.open() takes it as a byte array
            Document pdfts = PDF.open(pdfFilePath, args[1].getBytes());
            // collect the document's text into a StringBuilder
            StringBuilder text = new StringBuilder(1024);
            pdfts.pipe(new OutputTarget(text));
            pdfts.close();
            System.out.println(text);
        }
    }
Once a com.snowtide.pdf.Document has been successfully opened using a given password, it can be used normally, without regard to the fact that the file being read is encrypted.
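Because the password is handed over as raw bytes, the result of String.getBytes() matters: called without arguments it uses the JVM's default charset, so the same password can produce different bytes on different machines. The sketch below uses a hypothetical PasswordBytes helper (not part of PDFxStream) that only makes the charset explicit. Which encoding a particular file expects is not stated here, so the charsets shown are assumptions to experiment with rather than a recommendation.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of PDFxStream: converts a password to bytes
// using an explicitly chosen charset instead of the platform default.
final class PasswordBytes {

    private PasswordBytes() {}

    static byte[] toBytes(String password, Charset charset) {
        return password.getBytes(charset);
    }

    // Convenience variants for two charsets that are always available in Java.
    // Whether either matches what a given PDF expects is an assumption.
    static byte[] latin1(String password) {
        return toBytes(password, StandardCharsets.ISO_8859_1);
    }

    static byte[] utf8(String password) {
        return toBytes(password, StandardCharsets.UTF_8);
    }
}
```

The resulting array can be passed wherever the examples on this page call getBytes(), for instance PDF.open(pdfFile, PasswordBytes.utf8(password)). For passwords made up only of ASCII characters, both variants yield identical bytes.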
If an error occurs while decrypting data contained in an encrypted PDF file, PDF.open() throws a com.snowtide.pdf.EncryptedPDFException. The most common cause is an incorrect password being provided to PDF.open(), or no password being provided when one is required to decrypt the document. In this case, the EncryptedPDFException has an error type of com.snowtide.pdf.EncryptedPDFException.ErrorType.BadPassword.
This matters most in an interactive environment, where the application doesn't necessarily know that a PDF is encrypted and relies on a user to enter the password for any encrypted PDF files it encounters. In this case, the application should attempt to open each PDF file assuming it is unencrypted, watch for an EncryptedPDFException with an error type of ErrorType.BadPassword, and then prompt the user for the password. This code shows an example of this technique:
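    import java.io.File;
    import java.io.IOException;
    import com.snowtide.PDF;
    import com.snowtide.pdf.Document;
    import com.snowtide.pdf.EncryptedPDFException;
    import com.snowtide.pdf.OutputTarget;

    public String readPdfText (File pdfFile, String password) throws IOException {
        try {
            Document pdf;
            if (password == null) {
                // no password, assume the file is unencrypted
                pdf = PDF.open(pdfFile);
            } else {
                pdf = PDF.open(pdfFile, password.getBytes());
            }
            // read PDF text (same approach as the DecryptWithPassword
            // example) and return the resulting string
            StringBuilder text = new StringBuilder(1024);
            pdf.pipe(new OutputTarget(text));
            pdf.close();
            return text.toString();
        } catch (EncryptedPDFException e) {
            if (e.getErrorType() == EncryptedPDFException.ErrorType.BadPassword) {
                // return null to indicate that a different password is needed
                return null;
            } else {
                // some error in the decryption process
                // treat just like a regular IOException
                throw e;
            }
        }
    }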
Notice that if an EncryptedPDFException with an error type of ErrorType.BadPassword is thrown, the method returns null. The code calling this method can then prompt the user for a different password and call the method again with the new password.
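The calling side of this pattern can be sketched as a small retry loop. Everything in the block below is hypothetical and not part of PDFxStream: the PasswordRetry class, the TextReader interface (a stand-in for a method such as readPdfText that returns null when a different password is needed), and the maxAttempts limit. It assumes the prompt returns null when the user cancels.

```java
import java.io.IOException;
import java.util.function.Supplier;

// Hypothetical caller-side helper, not part of PDFxStream.
final class PasswordRetry {

    // Stand-in for a method like readPdfText(pdfFile, password):
    // returns the text, or null if a different password is needed.
    interface TextReader {
        String read(String password) throws IOException;
    }

    private PasswordRetry() {}

    /**
     * Tries to read with no password first, then asks the prompt for a
     * password up to maxAttempts times. Returns the text, or null if the
     * user cancelled (prompt returned null) or every attempt failed.
     */
    static String readWithRetry(TextReader reader,
                                Supplier<String> prompt,
                                int maxAttempts) throws IOException {
        // First attempt: assume the file is unencrypted.
        String text = reader.read(null);
        for (int attempt = 0; text == null && attempt < maxAttempts; attempt++) {
            String password = prompt.get();
            if (password == null) {
                return null; // user cancelled
            }
            text = reader.read(password);
        }
        return text;
    }
}
```

A caller might wire it up as PasswordRetry.readWithRetry(pw -> readPdfText(pdfFile, pw), this::askUserForPassword, 3), where askUserForPassword is an application-specific, hypothetical prompt. Any other exception thrown by the reader propagates unchanged.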
For other types of EncryptedPDFException, the method just rethrows the exception. Those other error types indicate an unrecoverable encryption problem, such as file corruption, the use of an invalid encryption method, or the failure of one of the security mechanisms in the JRE or CLR environment that PDFxStream depends on in its decryption process.