XStream / XSTR-622

XStream generates XML that is not well-formed (according to the XML specification) by writing illegal characters in CHARACTER sequences

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Not A Bug
    • Affects Version/s: 1.3.1
    • Fix Version/s: None
    • Component/s: IO
    • Labels:
      None
    • JDK version and platform:
      Sun 1.6.0_18 Windows XP 32bit

      Description

      Several parts of XStream generate XML that is not well-formed as defined by the W3C XML recommendation [1].

      The problem arises because not all characters are legal in an XML document. A Java String, char, or char array may contain any character, including NUL (0x0) and BEL (0x7), but these characters are illegal in XML [2].
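
To make the restriction concrete, here is a minimal sketch of a check against the Char production of XML 1.0 referenced in [2]. The class and method names (XmlChars, isLegalXmlChar, firstIllegalIndex) are hypothetical helpers for illustration only and are not part of XStream.

```java
// Hypothetical helper, not part of XStream: checks code points against
// the XML 1.0 Char production:
//   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
final class XmlChars {

    /** Returns true if the code point may appear in an XML 1.0 document. */
    static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the char index of the first illegal code point, or -1 if there is none. */
    static int firstIllegalIndex(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isLegalXmlChar(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }
}
```

With this sketch, XmlChars.firstIllegalIndex("\4\5\6") returns 0, because the very first character (0x4) falls outside the allowed ranges.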

      Serializing an instance of the following class
      class TestClass
      {
          char[] chars = new char[] { 1, 2, 3 };
          String s = "\4\5\6";
      }
      with the PrettyPrintWriter or the Dom4JXmlWriter produces malformed XML such as:

      <xstreamEncodingTest.TestClass>
      <chars>&#x1;&#x2;&#x3;</chars>
      <s>&#x4;&#x5;&#x6;</s>
      </xstreamEncodingTest.TestClass>

      Even though character references are used, this XML is not well-formed. Every non-XStream XML reader or writer I tried fails on it with an exception.

      • The StaxWriter fails with: Character reference "&#1" is an invalid XML character. Nested exception: Character reference "&#1" is an invalid XML character.
      • Error messages when reading it with:
        • DomDriver: Character reference "&#x1" is an invalid XML character.
        • Dom4J: Error on line 2 of document : Character reference "&#1" is an invalid XML character. Nested exception: Character reference "&#1" is an invalid XML character.
        • JDom: Error on line 2: Character reference "&#x1" is an invalid XML character.
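
The failures listed above can be reproduced without XStream. The following is a minimal sketch that uses only the JDK's built-in JAXP parser; the class and method names (WellFormedCheck, isWellFormed) are hypothetical and chosen for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports whether the JDK's default XML parser
// accepts a document as well-formed.
final class WellFormedCheck {

    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and keeps the parser
            // from printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Raised for any well-formedness violation, including a
            // character reference to an illegal character.
            return false;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Under these assumptions, WellFormedCheck.isWellFormed("<s>&#x4;</s>") returns false, while a reference to a legal character such as "<s>&#x9;ok</s>" is accepted.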

      (I have not tested Xpp)

      This means that wherever XStream writes a Java String to the XML, it needs to perform some kind of escaping. (For example, it could write the Java literal form of the String: "\4\5\6" or "\u0004\u0005\u0006".)
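
As an illustration of the suggested escaping, here is a minimal sketch that rewrites every character that is illegal in XML 1.0 as a Java-style \uXXXX literal. The class JavaLiteralEscaper is hypothetical and does not reflect any existing XStream API. Doubling the backslash is an added assumption so that a reader could reverse the escaping unambiguously.

```java
// Hypothetical sketch, not XStream code: escapes characters that are
// illegal in XML 1.0 as Java-style unicode literals.
final class JavaLiteralEscaper {

    /** XML 1.0 Char production. */
    private static boolean isLegal(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp == '\\') {
                // Assumption: double the escape character itself so the
                // output can be decoded without ambiguity.
                out.append("\\\\");
            } else if (isLegal(cp)) {
                out.appendCodePoint(cp);
            } else {
                // Every illegal code point is at most 0xFFFF (control
                // characters, unpaired surrogates, 0xFFFE, 0xFFFF),
                // so four hex digits always suffice.
                out.append(String.format("\\u%04x", cp));
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

For the String from the test class, JavaLiteralEscaper.escape("\4\5\6") yields the text \u0004\u0005\u0006, which contains only legal XML characters. Markup characters such as < and & are left untouched here and would still need the writer's usual XML escaping.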

      This especially affects the StringConverter and the CharArrayConverter.

      [1] http://www.w3.org/TR/REC-xml/#dt-wellformed
      [2] http://www.w3.org/TR/REC-xml/#NT-Char

        People

        • Assignee:
          Jörg Schaible
          Reporter:
          Michael Schnell
        • Votes:
          0
          Watchers:
          1