Details
- Type: Bug
- Status: Closed
- Priority: Blocker
- Resolution: Not A Bug
- Affects Version/s: 1.3.1
- Fix Version/s: None
- Component/s: IO
- Labels: None
- JDK version and platform: Sun 1.6.0_18 Windows XP 32bit
Description
Several parts of XStream generate XML that is not well-formed according to the definition in the W3C recommendation [1].
The problem arises because not all characters are legal in an XML document. A Java String or char (or char array) may contain any character, including NUL (0x0) and BEL (0x7), but in XML these characters are illegal [2].
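To make the rule in [2] concrete, here is a minimal sketch of the XML 1.0 Char production as a Java check. The class and method names (XmlCharCheck, isLegalXmlChar, firstIllegalIndex) are hypothetical and not part of XStream; the ranges are taken from the W3C recommendation.

```java
public class XmlCharCheck {

    /** Returns true if the code point matches the Char production of XML 1.0. */
    static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the index of the first char that is illegal in XML 1.0, or -1 if there is none. */
    static int firstIllegalIndex(String s) {
        for (int i = 0; i < s.length(); ) {
            // codePointAt joins valid surrogate pairs; a lone surrogate
            // is returned as-is and falls outside the legal ranges.
            int cp = s.codePointAt(i);
            if (!isLegalXmlChar(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstIllegalIndex("\4\5\6"));     // the String from this report
        System.out.println(firstIllegalIndex("plain text")); // contains only legal characters
    }
}
```

With this check, the String "\4\5\6" is rejected at index 0, while tab, line feed and carriage return would pass.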
Serializing an instance of the following class
class TestClass
{
    char[] chars = new char[] { '\1', '\2', '\3' };
    String s = "\4\5\6";
}
with the PrettyPrintWriter or the Dom4JXmlWriter creates invalid XML like
<xstreamEncodingTest.TestClass>
<chars></chars>
<s></s>
</xstreamEncodingTest.TestClass>
Even though character references are used, this XML is not well-formed. Every non-XStream XML reader or writer I tried fails on it with an exception.
- The StaxWriter fails with: Character reference "" is an invalid XML character. Nested exception: Character reference "" is an invalid XML character.
- Error messages when reading with:
- DomDriver: Character reference "" is an invalid XML character.
- Dom4J: Error on line 2 of document : Character reference "" is an invalid XML character. Nested exception: Character reference "" is an invalid XML character.
- JDom: Error on line 2: Character reference "" is an invalid XML character.
(I have not tested Xpp)
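The reader failures listed above can be reproduced without XStream. The following is a small sketch that uses only the JDK's JAXP DOM parser; the class name IllegalCharRefDemo and the helper isWellFormed are hypothetical, and the input is the <s> element from this report with its character references written out.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class IllegalCharRefDemo {

    /** Returns true if the JDK's default DOM parser accepts the given XML text. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser reports well-formedness violations as SAXParseException,
            // a subclass of SAXException.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            // Not expected when parsing an in-memory String with default settings.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Character references to U+0004..U+0006 are not allowed in XML 1.0.
        System.out.println(isWellFormed("<s>&#x4;&#x5;&#x6;</s>"));
        // A reference to TAB (U+0009) is allowed.
        System.out.println(isWellFormed("<s>&#x9;ok</s>"));
    }
}
```

The first document is rejected and the second is accepted, which matches the DomDriver behaviour described above. The JDK parser may also print the fatal error to standard error through its default error handler.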
This means that wherever XStream writes a Java String to the XML, it needs to perform some kind of escaping. (It could, for example, write out the Java literal form of the String: "\4\5\6" or "\u0004\u0005\u0006".)
This especially affects the StringConverter and the CharArrayConverter.
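As an illustration of the Java-literal idea suggested above, here is a minimal sketch of such an escaping step. It is not XStream code: the class JavaLiteralEscaper and its method are hypothetical, and doubling existing backslashes is an added assumption so that a reader could undo the mapping unambiguously.

```java
public class JavaLiteralEscaper {

    /**
     * Replaces every char that is illegal in XML 1.0 with a six-character
     * Java-style escape (backslash, 'u', four hex digits). Existing
     * backslashes are doubled so the result can be decoded unambiguously.
     */
    static String escapeIllegalXmlChars(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\') {
                out.append("\\\\");
            } else if (Character.isHighSurrogate(c) && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                // A valid surrogate pair encodes a legal supplementary character.
                out.append(c).append(s.charAt(++i));
            } else if (c == 0x9 || c == 0xA || c == 0xD
                    || (c >= 0x20 && c <= 0xD7FF)
                    || (c >= 0xE000 && c <= 0xFFFD)) {
                out.append(c);
            } else {
                // Control characters, lone surrogates, U+FFFE and U+FFFF end up here.
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeIllegalXmlChars("\4\5\6"));
    }
}
```

A matching unescape step would be needed on the reading side, and any such scheme changes the text content that other XML tools see for these Strings.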
[1] http://www.w3.org/TR/REC-xml/#dt-wellformed
[2] http://www.w3.org/TR/REC-xml/#NT-Char
If you look at the XML above, you will probably not see the character references (& #x1; and so on) but the raw control characters, because your browser replaces the references. The XML actually looks like this:
<xstreamEncodingTest.TestClass>
<chars>& #x1;& #x2;& #x3;</chars>
<s>& #x4;& #x5;& #x6;</s>
</xstreamEncodingTest.TestClass>
(I've inserted spaces to prevent the browser from replacing the character references!)