Reading encrypted PDF files
Using PDFxStream to access encrypted PDF documents is as easy as accessing unencrypted PDF files.
Many PDF documents' contents are encrypted without a password; PDFxStream will decrypt such documents automatically and without any intervention on your part.
Reading a PDF document that has been encrypted using a password only requires providing the file's password (as a byte array) to e.g. com.snowtide.PDF.open(File, byte[]):
    import com.snowtide.PDF;
    import com.snowtide.pdf.Document;
    import com.snowtide.pdf.OutputTarget;

    public class DecryptWithPassword {
        public static void main (String[] args) throws java.io.IOException {
            String pdfFilePath = args[0];
            // args[1] is the document's password, supplied as a byte array
            Document pdfts = PDF.open(pdfFilePath, args[1].getBytes());
            StringBuilder text = new StringBuilder(1024);
            // extract the document's text into the buffer
            pdfts.pipe(new OutputTarget(text));
            pdfts.close();
            System.out.println(text);
        }
    }
Once a com.snowtide.pdf.Document has been successfully opened using a given password, it can be used normally, without regard to the fact that the file being read is encrypted.
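The example above calls getBytes() with no arguments, which encodes the password using the JVM's default charset, so a password containing non-ASCII characters may produce different bytes on machines with different defaults. The following is a minimal sketch, not part of PDFxStream, of making the charset explicit. The class and method names (PasswordBytes, toBytes) are hypothetical, and which charset a particular document's password requires is an assumption you would need to verify against your own files.

```java
import java.nio.charset.Charset;

final class PasswordBytes {
    private PasswordBytes() {}

    // Encodes the password with an explicitly chosen charset
    // instead of relying on the platform default.
    static byte[] toBytes(String password, Charset charset) {
        return password.getBytes(charset);
    }
}
```

The resulting array could then be passed as the password argument, for example PDF.open(pdfFile, PasswordBytes.toBytes(password, java.nio.charset.StandardCharsets.ISO_8859_1)), where the choice of ISO-8859-1 is only an illustration.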
If an error occurs while decrypting data contained in an encrypted PDF file, PDF.open() will throw a com.snowtide.pdf.EncryptedPDFException. The most common underlying cause is that an incorrect password was provided (or that no password was provided when one is required to decrypt the document). In this case, an EncryptedPDFException with an error type of com.snowtide.pdf.EncryptedPDFException.ErrorType.BadPassword is thrown.
This is very important in an interactive environment, where the application doesn't necessarily know that a PDF is encrypted, and is relying upon a user to enter the password for any encrypted PDF files it does encounter. In this case, the application should attempt to open each PDF file assuming it is unencrypted, watch for an EncryptedPDFException with an error type of ErrorType.BadPassword, and then prompt the user in an appropriate manner for the password. This code shows an example of this technique:
    public String readPdfText (File pdfFile, String password) throws IOException {
        try {
            Document pdf;
            if (password == null) {
                // no password, assume the file is unencrypted
                pdf = PDF.open(pdfFile);
            } else {
                pdf = PDF.open(pdfFile, password.getBytes());
            }
            // read PDF text, return resulting string
            // (same approach as the DecryptWithPassword example above)
            StringBuilder text = new StringBuilder(1024);
            pdf.pipe(new OutputTarget(text));
            pdf.close();
            return text.toString();
        } catch (EncryptedPDFException e) {
            if (e.getErrorType() == EncryptedPDFException.ErrorType.BadPassword) {
                // return null to indicate that a different password is needed
                return null;
            } else {
                // some error in the decryption process
                // treat just like a regular IOException
                throw e;
            }
        }
    }
Notice that if an EncryptedPDFException with an error type of ErrorType.BadPassword is thrown, then the method returns null. The module calling this method could then appropriately prompt the user for a different password, and then call the method again with the new password.
For other types of EncryptedPDFException, the method just rethrows the exception. Those other error types indicate an unrecoverable encryption problem, such as file corruption, the use of an invalid encryption method, or the failure of one of the security mechanisms in the Java runtime that PDFxStream depends upon in its decryption process.
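The calling side of this technique, prompting until readPdfText stops returning null, can be sketched as a small retry loop. This is a minimal sketch under stated assumptions rather than part of PDFxStream: the class PasswordRetry, the method readWithRetry, and the maxAttempts limit are hypothetical, the TextReader interface stands in for a call such as readPdfText(pdfFile, password), and the prompt supplier stands in for whatever user interface the application provides.

```java
import java.io.IOException;
import java.util.function.Supplier;

final class PasswordRetry {
    private PasswordRetry() {}

    /** Hypothetical stand-in for a method like readPdfText(pdfFile, password). */
    @FunctionalInterface
    interface TextReader {
        // Returns the extracted text, or null if a different password is needed.
        String read(String password) throws IOException;
    }

    /**
     * Tries first with no password, then asks promptForPassword for a new
     * password after each null result, for at most maxAttempts prompts.
     * Returns the text, or null if the prompt is cancelled or attempts run out.
     */
    static String readWithRetry(TextReader reader,
                                Supplier<String> promptForPassword,
                                int maxAttempts) throws IOException {
        // First attempt: assume the file is unencrypted.
        String text = reader.read(null);
        for (int attempt = 0; text == null && attempt < maxAttempts; attempt++) {
            String password = promptForPassword.get();
            if (password == null) {
                return null; // the user cancelled the prompt
            }
            text = reader.read(password);
        }
        return text;
    }
}
```

With the method shown earlier, a call might look like PasswordRetry.readWithRetry(pw -> readPdfText(pdfFile, pw), this::askUserForPassword, 3), where askUserForPassword is a hypothetical method that displays a password dialog. Exceptions for the other, unrecoverable error types still propagate to the caller unchanged.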