PDFxStream configuration options
PDFxStream's configuration can be controlled in any of three ways:
- Globally, by changing the state of the default instance of com.snowtide.pdf.Configuration, available via com.snowtide.pdf.Configuration.getDefault()
- Globally, by setting environment or system properties that Configuration uses to initialize its default instance.
- Locally, on a per-document / per-PDFxStream-instance basis, by providing a separate instance of Configuration, modified as desired, with each invocation of com.snowtide.PDF.open(), e.g. com.snowtide.PDF.open(String,byte[],Configuration)
All PDFxStream options are programmatically accessible via Configuration. The rest of this document explains how to set environment or system properties so that Configuration picks them up when PDFxStream is first initialized, and then enumerates the available options.
Each of the following environment or system properties must be set before referencing PDFxStream in any way, as the properties are checked and their values (if any) are captured when PDFxStream is statically initialized. Therefore, the safest way to set these configuration options is to set the corresponding system properties when starting your application:
java -cp [classpath] -Dpdfxs.config.property=value your.main.classname
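To see why the ordering matters, here is a minimal sketch of the general JVM behavior described above. It does not use PDFxStream: `LibraryConfig` and the property name `demo.config.property` are hypothetical stand-ins for any class that reads a system property once, during static initialization.

```java
// Hypothetical demo: LibraryConfig stands in for any class that captures a
// system property during static initialization. It is not PDFxStream code.
class StaticCaptureDemo {
    public static void main(String[] args) {
        // Set the property before the first reference to LibraryConfig.
        System.setProperty("demo.config.property", "early");
        // The first use of LibraryConfig runs its static initializer.
        System.out.println(LibraryConfig.captured());

        // A later change is not seen, because the value was already captured.
        System.setProperty("demo.config.property", "late");
        System.out.println(LibraryConfig.captured());
    }
}

final class LibraryConfig {
    // Read once, when the class is initialized.
    private static final String VALUE =
            System.getProperty("demo.config.property", "unset");

    static String captured() {
        return VALUE;
    }
}
```

Both calls print `early`: the second `setProperty` call comes too late to affect the captured value, which is the situation to avoid when configuring a library this way.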
You can also set system properties in your code as long as you do so before your first usage of PDFxStream. Using Java on the JVM:
import com.snowtide.PDF;
import com.snowtide.pdf.Document;

System.setProperty("pdfxs.config.property", "config_value");
Document stream = PDF.open("c:\\some\\path.pdf");
Using C# on .NET:
using com.snowtide;
using com.snowtide.pdf;

java.lang.System.setProperty("pdfxs.config.property", "config_value");
Document stream = PDF.open(@"c:\some\path.pdf");
PDFxStream.NET users can also set these properties in the app.config file, which is equivalent to the Java convention of specifying system properties on the command line using -D options (note the ikvm: prefix, which exposes the property to the Java namespaces):
<?xml version="1.0"?>
<configuration>
  <appSettings>
    <add key="ikvm:pdfxs.config.property" value="config_value" />
  </appSettings>
</configuration>
Alternatively, you can set configuration options through environment variables, using whatever facilities your operating system or shell provides.
Available configuration options
line.separator
Set this environment or system property to the string you want PDFxStream to use to separate lines in text extracts. This defaults to your platform's default line separator ("\n" on Linux/Unix/Mac OS X, and "\r\n" on Windows platforms).
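To check which separator your platform currently reports, a small sketch like the following can help. It only inspects the standard JVM property and does not call PDFxStream; the class name `LineSeparatorDemo` and the helper `visible` are illustrative.

```java
class LineSeparatorDemo {
    // Make carriage returns and line feeds visible when printing.
    static String visible(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // Standard JVM property; holds the platform default unless overridden.
        String separator = System.getProperty("line.separator");
        System.out.println("line.separator = \"" + visible(separator) + "\"");
    }
}
```

Keep in mind that line.separator is a standard JVM property, so overriding it with -D is likely to affect other code running in the same JVM as well.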
pdfxs.cjk.enable
Setting this environment or system property to "N" will disable PDFxStream’s ability to extract Chinese, Japanese, or Korean (CJK) text.
This may be desirable if memory utilization is a concern – CJK character
maps are very large, and can consume significant amounts of memory. As
always, application profiling is recommended to determine the actual
source(s) of memory consumption.
pdfxs.logfactory
PDFxStream defaults to using java.util.logging or Log4J for logging informational and error messages. However, many environments demand customized logging frameworks. Therefore, PDFxStream provides a pluggable logging architecture that enables you to hook your custom logging framework into PDFxStream. To do so, simply implement the com.snowtide.util.logging.LogFactory interface, and set the pdfxs.logfactory environment or system property to the full classname of your implementation.
More details about PDFxStream's logging support are available in the PDFxStream documentation.
pdfxs.loggingtype
PDFxStream normally defaults to using the java.util.logging logging framework. To force PDFxStream to default to using Log4J, set the pdfxs.loggingtype environment or system property to "log4j".
pdfxs.layout.detectTables
By default, PDFxStream will attempt to detect tabular data on each extracted page, and infer the structure of each table. This structure is then materialized as rows of com.snowtide.pdf.layout.Blocks within higher-level com.snowtide.pdf.layout.Table blocks.
This detection and inference can be disabled globally by setting the pdfxs.layout.detectTables environment or system property to "N".
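As a quick diagnostic before starting an application, the following sketch reports which of the options listed above are currently visible to the JVM as system properties or as environment variables. It does not call PDFxStream. It assumes the environment variable carries the same name as the system property, and it makes no claim about which source Configuration prefers when both are set. The class name `OptionReport` is illustrative.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class OptionReport {
    // Option names taken from the list above.
    static final List<String> OPTIONS = List.of(
            "line.separator",
            "pdfxs.cjk.enable",
            "pdfxs.logfactory",
            "pdfxs.loggingtype",
            "pdfxs.layout.detectTables");

    // Render a value for display, escaping line-break characters.
    static String show(String value) {
        if (value == null) {
            return "<unset>";
        }
        return "\"" + value.replace("\r", "\\r").replace("\n", "\\n") + "\"";
    }

    // Look up each option as a system property and as an environment variable.
    // Assumption: the environment variable uses the same name as the property.
    static Map<String, String> describe() {
        Map<String, String> report = new LinkedHashMap<>();
        for (String name : OPTIONS) {
            report.put(name,
                    "system property=" + show(System.getProperty(name))
                            + ", environment=" + show(System.getenv(name)));
        }
        return report;
    }

    public static void main(String[] args) {
        describe().forEach((name, state) -> System.out.println(name + ": " + state));
    }
}
```

Running it with the same -D flags and environment as the real application shows what PDFxStream would find at startup, under the naming assumption stated above.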