
Recently there have been many reports of deployment failures caused by SAX exceptions, in applications that hadn't been changed at all. As it turns out, the problem was an unintentional reliance on internet resources that were down or broken, specifically DTDs.

Here's an example exception that was occurring for a while in applications that use jBPM.

Caused by: org.jbpm.jpdl.JpdlException: [[FATAL] line 31: The declaration for the entity "HTML.Version" must end with '>'.,
[ERROR] couldn't parse process definition]
        at org.jbpm.jpdl.xml.JpdlXmlReader.readProcessDefinition(JpdlXmlReader.java:172)
        at org.jboss.seam.bpm.Jbpm.parseInputSource(Jbpm.java:317)

The problem is that the W3C published a bad loose.dtd file. jBPM (via the SAX parser) reaches out to the internet and attempts to parse this file. This results in a failed deployment when Seam attempts to load the page flow descriptor. JBoss Embedded has a similar reliance, so whenever jboss.org is down, so is SeamTest.

The underlying problem is that developers of libraries and frameworks are often careless when they parse XML with validation. Validation requires every schema or DTD reference to be resolved. Because most DTD and schema references are expressed as HTTP URLs, the XML parser quietly downloads them, and the developer never notices that this happens on every run.

You are supposed to resolve the DTD or schema of your XML file within your library. For DTDs this means a custom EntityResolver. For schemas it's more complicated but perfectly doable. So the blame is on:

  • The XML parser for being stupid enough to try to resolve a URN that is a URL just because it looks like it can be resolved that way, instead of throwing a missing resource exception
  • The IT guys for actually putting these resources online under these URLs, so everybody just keeps on downloading them on every run
  • The framework/library developers for not noticing and fixing any of these issues
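For the schema case mentioned above, one possible approach with the standard JAXP validation API is an LSResourceResolver registered on the SchemaFactory. The following is only a sketch under assumptions: the class name LocalSchemaResolver, the in-memory map (a stand-in for a classpath lookup), and the example.invalid URLs are all hypothetical and do not come from Seam or jBPM.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;

// Hypothetical resolver: serves schema documents from local storage
// instead of letting the schema compiler fetch them over HTTP.
public class LocalSchemaResolver implements LSResourceResolver {
    // Assumption: maps a schemaLocation URL to schema text. In a real
    // library this would be a classpath lookup instead of a map.
    private final Map<String, String> schemasBySystemId;
    private final DOMImplementationLS ls;

    public LocalSchemaResolver(Map<String, String> schemasBySystemId) throws Exception {
        this.schemasBySystemId = schemasBySystemId;
        // The DOM Load/Save implementation provides a ready-made LSInput.
        this.ls = (DOMImplementationLS)
                DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
    }

    @Override
    public LSInput resolveResource(String type, String namespaceURI,
                                   String publicId, String systemId, String baseURI) {
        String text = schemasBySystemId.get(systemId);
        if (text == null) {
            return null; // unknown reference: leave it to the default behaviour
        }
        LSInput input = ls.createLSInput();
        input.setCharacterStream(new StringReader(text));
        input.setSystemId(systemId);
        input.setPublicId(publicId);
        return input;
    }

    // Compiles a schema whose xs:include / xs:import references go through the resolver.
    public static Schema compile(String mainXsd, LSResourceResolver resolver) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        factory.setResourceResolver(resolver);
        return factory.newSchema(new StreamSource(new StringReader(mainXsd)));
    }
}
```

Returning null for unknown references keeps the default lookup in place. A stricter variant could throw instead, so that an unexpected download attempt fails loudly.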

Seam and Hibernate configure the XML parser correctly with a custom resolver that searches for the DTD/schema inside the classpath.

if (systemId.startsWith("http://jboss.com/products/seam/")) {
    log.debug("recognized Seam namespace; attempting to resolve on classpath under org/jboss/seam/");
    String path = "org/jboss/seam/" + systemId.substring(SEAM_NAMESPACE.length());
                
    InputStream dtdStream = resolveInSeamNamespace(path);
    ...
}

That's how it should be done. So, if you are consuming XML, take a look at Seam's DTDEntityResolver for an example of how to do it properly.
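For readers who want something self-contained to experiment with, here is a minimal sketch of the same idea outside Seam. It is not Seam's DTDEntityResolver. The class name LocalDtdResolver, the namespace http://example.invalid/dtd/ and the classpath prefix com/example/dtd/ are hypothetical placeholders, and the injectable loader exists only so the lookup can be exercised without real classpath resources.

```java
import java.io.InputStream;
import java.io.StringReader;
import java.util.function.Function;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical resolver: maps DTD URLs in one namespace to classpath resources.
public class LocalDtdResolver implements EntityResolver {
    // Assumed values, chosen for illustration only.
    static final String NAMESPACE = "http://example.invalid/dtd/";
    static final String CLASSPATH_PREFIX = "com/example/dtd/";

    private final Function<String, InputStream> loader;

    // Default: look the DTD up from the classpath root.
    public LocalDtdResolver() {
        this(path -> LocalDtdResolver.class.getClassLoader().getResourceAsStream(path));
    }

    // Alternative loader, e.g. for tests.
    public LocalDtdResolver(Function<String, InputStream> loader) {
        this.loader = loader;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        if (systemId == null || !systemId.startsWith(NAMESPACE)) {
            return null; // not our namespace: parser falls back to its default
        }
        String path = CLASSPATH_PREFIX + systemId.substring(NAMESPACE.length());
        InputStream in = loader.apply(path);
        if (in == null) {
            return null; // resource missing: parser falls back to its default
        }
        InputSource source = new InputSource(in);
        source.setSystemId(systemId); // keep the original id for error messages
        return source;
    }

    // Parses with validation, routing every external entity through the resolver.
    public static Document parse(String xml, EntityResolver resolver) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver(resolver);
        return builder.parse(new InputSource(new StringReader(xml)));
    }
}
```

Returning null tells the parser to use its default resolution, which for an HTTP system identifier means a network request. If that is not acceptable, the resolver could throw an exception for unknown identifiers instead.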

Here's a workaround for jBPM:

  1. Download and copy the src/jpdl/org/jbpm/jpdl/xml/JpdlParser.java from the jbpm-jpdl 3.2.2 distribution to your project under the same package name (org.jbpm.jpdl.xml).
  2. Alter the nested JpdlEntityResolver class as follows:
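static class JpdlEntityResolver implements EntityResolver, Serializable {
    // Declared at class level: a "private static final" field is not allowed inside a method body.
    private static final String SEAM_NAMESPACE = "http://jboss.com/products/seam/";
    ...
    public InputSource resolveEntity(String publicId, String systemId) throws SAXException, IOException {
      ...
      if ("http://jbpm.org/jpdl-3.2.xsd".equals(systemId)) {
        ... <- else if's
      } else if (systemId.startsWith(SEAM_NAMESPACE)) {
        log.debug("recognized Seam namespace; attempting to resolve on classpath under org/jboss/seam/");
        String path = "org/jboss/seam/" + systemId.substring(SEAM_NAMESPACE.length());
        // Go through the class loader so the path is resolved from the classpath root;
        // Class.getResourceAsStream would treat it as relative to the org.jboss.seam package.
        java.io.InputStream stream = org.jboss.seam.Seam.class.getClassLoader().getResourceAsStream(path);
        // getResourceAsStream returns null when the resource is missing; only wrap a real
        // stream, otherwise the fallback below would never run.
        if (stream != null) {
          inputSource = new InputSource(stream);
        }
      }

      if (inputSource == null) {
        log.debug("original systemId as input source");
        inputSource = new InputSource(systemId);
      }
      return inputSource;
    }
}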

This way it tries to load a resource from the Seam package, but in case it fails, it tries the original systemId.
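One detail to watch in a fallback like this: getResourceAsStream returns null for a missing resource rather than throwing, and an InputSource built around a null stream is still a non-null object. The sketch below isolates that behavior. The class name ResolverFallback and its helper method are hypothetical and are not part of jBPM or Seam.

```java
import java.io.InputStream;
import org.xml.sax.InputSource;

// Hypothetical helper isolating the "classpath first, then original systemId" logic.
public class ResolverFallback {
    public static InputSource resolve(String classpathPath, String systemId) {
        InputStream stream =
                ResolverFallback.class.getClassLoader().getResourceAsStream(classpathPath);
        if (stream != null) {
            // Found locally: hand the parser the stream, keeping the id for diagnostics.
            InputSource local = new InputSource(stream);
            local.setSystemId(systemId);
            return local;
        }
        // Not found: let the parser fetch the original systemId itself.
        return new InputSource(systemId);
    }
}
```

Without the explicit null check, the parser would receive an InputSource that has neither a stream nor a system identifier, and the fallback to the original systemId would never be reached.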

See this thread for a longer discussion about this topic.