Packages OASIS XML Catalogs, primarily for entity resolution by parsers.
That specification defines an XML syntax for mappings between
identifiers declared in DTDs (particularly PUBLIC identifiers) and
locations. SAX has always supported such mappings, but conventions for
an XML file syntax to maintain them have previously been lacking.
This class has three main operational modes. The primary intended mode
is to create a resolver, preload it with one or more site-standard
catalogs, and then use it with one or more SAX parsers:
XCat catalog = new XCat ();
catalog.setErrorHandler (diagnosticErrorHandler);
catalog.loadCatalog ("file:/local/catalogs/catalog.cat");
catalog.loadCatalog ("http://shared/catalog.cat");
...
catalog.disableLoading ();
parser1.setEntityResolver (catalog);
parser2.setEntityResolver (catalog);
...
A second mode is to arrange that your application uses instances of
this class as its entity resolver, and automatically loads catalogs
referenced by <?oasis-xml-catalog...?> processing instructions found
before the DTD in documents it parses.
It would then discard the resolver after each parse.
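The per-parse lifetime described for the second mode can be sketched as follows. This is an illustration only: the PerParseResolver class and its factory parameter are hypothetical names that are not part of this API, and the sketch does not itself load catalogs named in processing instructions. With this class, the factory would presumably be something like XCat::new.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.function.Supplier;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: gives every parse its own short-lived resolver. */
final class PerParseResolver {
    static void parseEach(Supplier<? extends EntityResolver> factory, String... documents)
            throws SAXException, IOException, ParserConfigurationException {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setContentHandler(new DefaultHandler());
        for (String doc : documents) {
            // A fresh resolver per document; the previous one becomes garbage,
            // so catalogs loaded for one document never leak into the next.
            reader.setEntityResolver(factory.get());
            reader.parse(new InputSource(new StringReader(doc)));
        }
    }
}
```

Replacing the resolver before each parse keeps document-specific catalogs from affecting unrelated documents, at the cost of reloading shared catalogs each time.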
A third mode applies catalogs in contexts other than entity
resolution for parsers.
The resolveURI() method supports resolving URIs
stored in XML application data, rather than inside DTDs.
Catalogs would be loaded as shown above, and the catalog could
be used concurrently for parser entity resolution and for
application URI resolution.
Errors in catalogs implicitly loaded (during resolution) are ignored
beyond being reported through any ErrorHandler assigned using
setErrorHandler(). SAX exceptions thrown from such a handler won't
abort resolution, although throwing a RuntimeException or Error will
normally abort both resolution and parsing. Useful diagnostic
information is available to any ErrorHandler used to report problems,
or from any exception thrown from an explicit loadCatalog() invocation.
Applications can use that information as troubleshooting aids.
While this class requires SAX2 Extensions 1.1 classes in its class
path, basic functionality does not require using a SAX2 parser that
supports the extended entity resolution functionality. See the
original SAX1 resolveEntity() method for a list of restrictions which
apply when it is used with older SAX parsers.
disableLoading
public void disableLoading()
Records that catalog loading is no longer permitted.
Loading is automatically disabled when lookups are performed,
and should be manually disabled when startDTD() (or
any other DTD declaration callback) is invoked, or at the latest
when the document root element is seen.
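One way to disable loading at the points named above is a small handler that fires once, at the first DTD event or at the root element, whichever comes first. This is a minimal sketch: LoadingGuard is a hypothetical class, and the Runnable it receives stands in for a call such as catalog::disableLoading.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ext.DefaultHandler2;

/** Hypothetical helper: runs an action once, at startDTD() or the root element. */
final class LoadingGuard extends DefaultHandler2 {
    private final Runnable disableLoading; // assumed hook, e.g. catalog::disableLoading
    private boolean fired;

    LoadingGuard(Runnable disableLoading) {
        this.disableLoading = disableLoading;
    }

    private void fireOnce() {
        if (!fired) {
            fired = true;
            disableLoading.run();
        }
    }

    @Override
    public void startDTD(String name, String publicId, String systemId) {
        fireOnce(); // a DOCTYPE declaration has been seen
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        fireOnce(); // at the latest, the document root element
    }
}
```

For startDTD() to be delivered, the handler has to be registered both as the content handler and through the http://xml.org/sax/properties/lexical-handler property of the XMLReader.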
getErrorHandler
public ErrorHandler getErrorHandler()
Returns the error handler used to report catalog errors.
Null is returned if the parser's default error handling
will be used.
See Also:
setErrorHandler(ErrorHandler)
getExternalSubset
public InputSource getExternalSubset(String name,
String baseURI)
throws SAXException,
IOException
"New Style" parser callback to add an external subset.
For documents that don't include an external subset, this may
return one according to doctype catalog entries.
(This functionality is not a core part of the OASIS XML Catalog
specification, though it's presented in an appendix.)
If no such entry is defined, this returns null to indicate that
this document will not be modified to include such a subset.
Calls to this method prevent explicit loading of additional catalogs
using loadCatalog().
Warning: That catalog functionality can be dangerous.
It can provide definitions of general entities, and thereby mask
certain well-formedness errors.
Specified by:
getExternalSubset in interface EntityResolver2
Parameters:
name - Name of the document element, either as declared in
a DOCTYPE declaration or as observed in the text.
baseURI - Document's base URI (absolute).
Returns:
Input source for accessing the external subset, or null
if no mapping was found. The input source may have opened
the stream, and will have a fully resolved URI.
getParserClass
public String getParserClass()
Returns the name of the SAX2 parser class used to parse catalogs.
Null is returned if the system default is used.
See Also:
setParserClass(String)
isUnified
public boolean isUnified()
Returns true (the default) if all methods resolve
a given URI in the same way.
Returns false if calls resolving URIs as entities (such as
resolveEntity()) use different catalog entries than those resolving
them as URIs (resolveURI()), which will generally produce different
results.
The OASIS XML Catalog specification defines two related schemes
to map URIs "as URIs" or "as system IDs".
URIs use uri, rewriteURI, and delegateURI elements. System IDs do
the same things with system, rewriteSystem, and delegateSystem.
It's confusing and error prone to maintain two parallel copies of
such data. Accordingly, this class makes that behavior optional.
The unified interpretation of URI mappings is preferred,
since it prevents surprises where one URI gets mapped to different
contents depending on whether the reference happens to have come
from a DTD (or not).
See Also:
setUnified(boolean)
isUsingPublic
public boolean isUsingPublic()
Returns true (the default) if a catalog's public identifier
mappings will be used.
When false is returned, such mappings are ignored except when
system IDs are discarded, such as for
entities using the urn:publicid: URI scheme in their
system identifiers. (See RFC 3151 for information about that
URI scheme. Using it in system identifiers may not work well
with many SAX parsers unless the resolve-dtd-uris
feature flag is set to false.)
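To make the urn:publicid: scheme concrete, the sketch below unwraps such a URN back into the public identifier it encodes, following the transcription table in RFC 3151. PublicIdUrn is a hypothetical helper, not part of this class, and it simplifies by decoding any two-digit hex escape rather than only the ones the RFC lists.

```java
/** Hypothetical helper: recovers a public identifier from a urn:publicid: URN. */
final class PublicIdUrn {
    private static final String PREFIX = "urn:publicid:";

    static String toPublicId(String urn) {
        // The "urn" scheme and namespace identifier are case-insensitive.
        if (!urn.regionMatches(true, 0, PREFIX, 0, PREFIX.length())) {
            throw new IllegalArgumentException("not a urn:publicid: URN: " + urn);
        }
        String body = urn.substring(PREFIX.length());
        StringBuilder out = new StringBuilder(body.length() + 8);
        // Single pass, so decoded characters are never decoded a second time.
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            switch (c) {
                case '+': out.append(' ');  break; // "+" encodes a space
                case ':': out.append("//"); break; // ":" encodes "//"
                case ';': out.append("::"); break; // ";" encodes "::"
                case '%': {
                    boolean room = i + 2 < body.length();
                    int hi = room ? Character.digit(body.charAt(i + 1), 16) : -1;
                    int lo = room ? Character.digit(body.charAt(i + 2), 16) : -1;
                    if (hi < 0 || lo < 0) {
                        throw new IllegalArgumentException("bad escape at offset " + i);
                    }
                    out.append((char) (hi * 16 + lo)); // e.g. %2F encodes "/"
                    i += 2;
                    break;
                }
                default: out.append(c);
            }
        }
        return out.toString();
    }
}
```

For example, urn:publicid:-:OASIS:DTD+DocBook+XML+V4.1.2:EN unwraps to the public identifier -//OASIS//DTD DocBook XML V4.1.2//EN, which is then matched against public identifier mappings.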
See Also:
setUsingPublic(boolean)
loadCatalog
public void loadCatalog(String uri)
throws SAXException,
IOException
Loads an OASIS XML Catalog.
It is appended to the list of currently active catalogs, or
reloaded if a catalog with the same URI was already loaded.
Callers have control over what parser is used, how catalog parsing
errors are reported, and whether URIs will be resolved consistently.
The OASIS specification says that errors detected when loading
catalogs "must recover by ignoring the catalog entry file that
failed, and proceeding." In this API, that action can be the
responsibility of applications, when they explicitly load any
catalog using this method.
Note that catalogs referenced by this one will not be loaded
at this time. Catalogs referenced through nextCatalog or
delegate* elements are normally loaded only if needed.
Parameters:
uri - absolute URI for the catalog file.
Throws:
SAXException - As thrown by the parser, typically to
indicate problems parsing data from that URI. It may also
be thrown if the parser doesn't support necessary handlers.
See Also:
setErrorHandler(ErrorHandler), setParserClass(String), setUnified(boolean)
resolveEntity
public final InputSource resolveEntity(String publicId,
String systemId)
throws SAXException,
IOException
"Old Style" external entity resolution for parsers.
This API provides only core functionality.
Calls to this method prevent explicit loading of additional catalogs
using loadCatalog().
The functional limitations of this interface include:
- Since system IDs will be absolutized before the resolver
sees them, matching against relative URIs won't work.
This may affect system, rewriteSystem,
and delegateSystem catalog entries.
- Because of that absolutization, documents declaring entities
with system IDs using URI schemes that the JVM does not recognize
may be unparsable. URI schemes such as file:/,
http://, https://, and ftp://
will usually work reliably.
- Because missing external subsets can't be provided, the
doctype catalog entries will be ignored.
(The getExternalSubset() method is a "New Style" resolution option.)
Applications can tell whether this limited functionality will be
used: if the feature flag associated with the EntityResolver2
interface is not true, the limitations apply. Applications
can't usually know whether a given document and catalog will trigger
those limitations. The issue can only be bypassed by operational
procedures such as not using catalogs or documents which involve
those features.
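The feature-flag check mentioned above might look like the following sketch. It assumes the flag in question is the standard SAX2 feature http://xml.org/sax/features/use-entity-resolver2; ResolverSupport is a hypothetical helper name. A parser that does not recognize the feature is treated as "Old Style" only.

```java
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

/** Hypothetical helper: reports whether a parser will use EntityResolver2 callbacks. */
final class ResolverSupport {
    static final String USE_ENTITY_RESOLVER2 =
            "http://xml.org/sax/features/use-entity-resolver2";

    static boolean usesEntityResolver2(XMLReader reader) {
        try {
            return reader.getFeature(USE_ENTITY_RESOLVER2);
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            // An unknown or unreadable flag means only resolveEntity(publicId,
            // systemId) will be called, so the limitations listed above apply.
            return false;
        }
    }
}
```

An application could log a warning when this returns false, since relative system IDs and doctype entries in its catalogs would then have no effect.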
Specified by:
resolveEntity in interface EntityResolver
Parameters:
publicId - Either a normalized public ID, or null
systemId - Always an absolute URI.
Returns:
Input source for accessing the external entity, or null
if no mapping was found. The input source may have opened
the stream, and will have a fully resolved URI.
resolveEntity
public InputSource resolveEntity(String name,
String publicId,
String baseURI,
String systemId)
throws SAXException,
IOException
"New Style" external entity resolution for parsers.
Calls to this method prevent explicit loading of additional catalogs
using loadCatalog().
This supports the full core catalog functionality for locating
(and relocating) parsed entities that have been declared in a
document's DTD.
Specified by:
resolveEntity in interface EntityResolver2
Parameters:
name - Entity name, such as "dudley", "%nell", or "[dtd]".
publicId - Either a normalized public ID, or null.
baseURI - Absolute base URI associated with systemId.
systemId - URI found in entity declaration (may be
relative to baseURI).
Returns:
Input source for accessing the external entity, or null
if no mapping was found. The input source may have opened
the stream, and will have a fully resolved URI.
See Also:
getExternalSubset(String,String)
resolveURI
public InputSource resolveURI(String baseURI,
String uri)
throws SAXException,
IOException
Resolves a URI reference that is not declared in a DTD.
This is intended for use with URIs found in document text, such as
those in xml-stylesheet processing instructions and in attribute
values, where they are not recognized as URIs by XML parsers.
Calls to this method prevent explicit loading of additional catalogs
using loadCatalog().
This functionality is supported by the OASIS XML Catalog
specification, but will never be invoked by an XML parser.
It corresponds closely to functionality for mapping system
identifiers for entities declared in DTDs; closely enough that
this implementation's default behavior is that they be
identical, to minimize potential confusion.
This method could be useful when implementing the URIResolver
interface, wrapping the input source in a SAXSource.
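A URIResolver built this way might look like the sketch below. To keep it self-contained, the nested UriLookup interface is a hypothetical stand-in with the same shape as resolveURI(baseURI, uri); with this class, a method reference such as catalog::resolveURI would presumably fit. CatalogUriResolver is likewise an illustrative name.

```java
import java.io.IOException;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical adapter from a catalog URI lookup to a JAXP URIResolver. */
final class CatalogUriResolver implements URIResolver {
    /** Assumed stand-in for the resolveURI(baseURI, uri) method. */
    interface UriLookup {
        InputSource resolveURI(String baseURI, String uri) throws SAXException, IOException;
    }

    private final UriLookup lookup;

    CatalogUriResolver(UriLookup lookup) {
        this.lookup = lookup;
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        try {
            InputSource mapped = lookup.resolveURI(base, href);
            // Returning null tells the XSLT processor to resolve the URI itself.
            return mapped == null ? null : new SAXSource(mapped);
        } catch (SAXException | IOException e) {
            throw new TransformerException(e);
        }
    }
}
```

Note the argument order: URIResolver.resolve() takes the reference first and the base second, the reverse of resolveURI().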
Parameters:
baseURI - The relevant base URI as specified by the XML Base
specification. This recognizes xml:base attributes
as overriding the actual (physical) base URI.
uri - Either an absolute URI, or one relative to baseURI
Returns:
Input source for accessing the mapped URI, or null
if no mapping was found. The input source may have opened
the stream, and will have a fully resolved URI.
See Also:
isUnified(), setUnified(boolean)
setErrorHandler
public void setErrorHandler(ErrorHandler handler)
Assigns the error handler used to report catalog errors.
These errors may come either from the SAX2 parser or
from the catalog parsing code driven by the parser.
If you're sharing the resolver between parsers, don't
change this once lookups have begun.
See Also:
getErrorHandler()
setParserClass
public void setParserClass(String parser)
Names the SAX2 parser class used to parse catalogs.
If you're sharing the resolver between parsers, don't change
this once lookups have begun.
Note that in order to properly support the xml:base attribute and
relative URI resolution, the SAX parser used to parse the catalog
must provide a Locator and support the optional declaration and
lexical handlers.
Parameters:
parser - The parser class name, or null to use the
system default SAX2 parser.
See Also:
getParserClass()
setUnified
public void setUnified(boolean value)
Assigns the value of the flag returned by isUnified().
Set it to false to be strictly conformant with the OASIS XML Catalog
specification. Set it to true to make all mappings for a given URI
give the same result, regardless of the reason for the mapping.
Don't change this once you've loaded the first catalog.
Parameters:
value - new flag setting
setUsingPublic
public void setUsingPublic(boolean value)
Specifies which catalog search mode is used.
By default, public identifier mappings are able to override system
identifiers when both are available.
Applications may choose to ignore public
identifier mappings in such cases, so that system identifiers
declared in DTDs will only be overridden by an explicit catalog
match for that system ID.
If you're sharing the resolver between parsers, don't
change this once lookups have begun.
Parameters:
value - true to always use public identifier mappings,
false to only use them for system ids using the urn:publicid:
URI scheme.
See Also:
isUsingPublic()