
Encoding error after importing XML content

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: 3.0.0
    • None
    • None

      After importing contents from XML, I get the following error when opening one of the imported contents:

      org.apache.xerces.impl.io.MalformedByteSequenceException : Invalid byte 2 of 3-byte UTF-8 sequence.
      

      The complete stack trace is:

      	at org.ametys.cms.contenttype.MetadataManager._saxRichTextMetadata(MetadataManager.java:940)
      	at org.ametys.cms.contenttype.MetadataManager._saxMetadata(MetadataManager.java:300)
      	at org.ametys.cms.contenttype.MetadataManager._saxMetadataSetElementOrAll(MetadataManager.java:199)
      	at org.ametys.cms.contenttype.MetadataManager._saxMetadataSetElementOrAll(MetadataManager.java:206)
      	at org.ametys.cms.contenttype.MetadataManager.saxMetadata(MetadataManager.java:156)
      	at org.ametys.cms.content.ContentGenerator._saxMetadata(ContentGenerator.java:348)
      	at org.ametys.cms.content.ContentGenerator._saxContent(ContentGenerator.java:161)
      	at org.ametys.cms.content.ContentGenerator._generateContent(ContentGenerator.java:113)
      	at org.ametys.cms.content.ContentGenerator.generate(ContentGenerator.java:97)
      	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.processXMLPipeline(AbstractProcessingPipeline.java:581)
      	... 50 more
      Caused by: org.apache.xerces.impl.io.MalformedByteSequenceException: Invalid byte 2 of 3-byte UTF-8 sequence.
      	at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
      	... 79 more
      Caused by: org.apache.xerces.impl.io.MalformedByteSequenceException: Invalid byte 2 of 3-byte UTF-8 sequence.
      	at org.apache.xerces.impl.io.UTF8Reader.invalidByte(Unknown Source)
      	at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
      	at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
      	at org.apache.xerces.impl.XMLEntityScanner.scanContent(Unknown Source)
      	at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown Source)
      	... 76 more
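
      A note on the message itself: a UTF-8 decoder reports "Invalid byte 2 of 3-byte UTF-8 sequence" when it reads a lead byte of the form 1110xxxx that is not followed by a continuation byte (10xxxxxx). In ISO-8859-1, "è" is the single byte 0xE8 (11101000), which looks like such a lead byte. The sketch below assumes the stored rich text was written in a single-byte encoding and then read back as UTF-8; the ticket does not confirm this root cause, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Ametys: checks whether bytes are well-formed UTF-8.
final class Utf8Check {

    /** Returns true if the bytes decode as UTF-8 without any malformed sequence. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)       // fail instead of substituting U+FFFD
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String para = "Deuxi\u00e8me ligne.";  // same text as in the imported rich text

        // Encoded as UTF-8, "è" becomes the two bytes 0xC3 0xA8: well-formed.
        System.out.println(isValidUtf8(para.getBytes(StandardCharsets.UTF_8)));

        // Encoded as ISO-8859-1, "è" is 0xE8 followed by 'm' (0x6D): byte 2 of the
        // supposed 3-byte sequence is not a continuation byte, so decoding fails.
        System.out.println(isValidUtf8(para.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

      Running a check like this on the binary stored for the rich-text metadata would show whether the import wrote it in an encoding other than UTF-8.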
      

      The error occurred on a rich-text metadata. The XML of the imported content is simply:

      <content>
        <metadata>
          <title><value>Mon premier contenu XML</value></title>
          <abstract><value>Résumé de mon premier contenu XML.
      Deuxième ligne.</value></abstract>
          <illustration>
            <image><value>http://www.anyware-services.com/skins/Anyware-Services/resources/js/mobileswitch.js</value></image>
            <alt-text><value>Texte alternatif de l'illustration</value></alt-text>
          </illustration>
          <content>
            <docbook:article version="5.0">
              <docbook:section>
                <docbook:title>Titre du contenu</docbook:title>
                <docbook:para>Texte riche.</docbook:para>
                <docbook:para>Deuxième ligne.</docbook:para>
              </docbook:section>
            </docbook:article>
          </content>
          <attachments>
            <entry>
              <attachment><value>BINARY</value></attachment>
              <attachment-text><value>Mon premier fichier joint.</value></attachment-text>
              <attachment-desc>
                <docbook:article version="5.0">
                  <docbook:para>Texte riche.</docbook:para>
                  <docbook:para>Deuxième ligne.</docbook:para>
                </docbook:article>
              </attachment-desc>
            </entry>
            <entry>
              <attachment><value>BINARY</value></attachment>
              <attachment-text><value>Mon second fichier joint.</value></attachment-text>
              <attachment-desc>
                <docbook:article version="5.0">
                  <docbook:para>Texte riche.</docbook:para>
                  <docbook:para>Deuxième ligne.</docbook:para>
                </docbook:article>
              </attachment-desc>
            </entry>
          </attachments>
        </metadata>
      </content>
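
      To illustrate how accented text such as "Deuxième ligne." can trigger the error at parse time, here is a minimal sketch using the JDK's built-in JAXP parser rather than the Ametys code path. It assumes the rich-text bytes carry no XML declaration, so the parser falls back to UTF-8 unless an encoding is supplied on the InputSource. The class and method names are hypothetical, and the element is simplified to an unprefixed `<para>`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical sketch, not the Ametys implementation.
final class RichTextParseSketch {

    /** Parses the bytes and returns the text of the root element; encoding may be null. */
    static String parseText(byte[] xml, String encoding)
            throws ParserConfigurationException, SAXException, IOException {
        InputSource source = new InputSource(new ByteArrayInputStream(xml));
        if (encoding != null) {
            source.setEncoding(encoding);  // tells the parser how to decode the byte stream
        }
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(source);
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Rich text written in a single-byte encoding, with no XML declaration.
        byte[] latin1 = "<para>Deuxi\u00e8me ligne.</para>".getBytes(StandardCharsets.ISO_8859_1);

        // With the real encoding supplied, parsing succeeds.
        System.out.println("parsed: " + parseText(latin1, "ISO-8859-1"));

        // Without it, the parser assumes UTF-8 and rejects the 0xE8 byte.
        try {
            parseText(latin1, null);
        } catch (SAXException | IOException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

      Supplying the encoding on read is only a workaround; writing the rich text as UTF-8 during import avoids the mismatch altogether.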
      

          [CONTENTIO-16] Encoding error after importing XML content

          Laurence Aumeunier made changes -
          Status Original: Resolved [ 5 ] New: Closed [ 6 ]
          Raphaël Franchet made changes -
          Fix Version/s New: 3.0.0 [ 13580 ]
          Fix Version/s Original: 2.0.0 [ 12385 ]
          Laurence Aumeunier made changes -
          Fix Version/s New: 2.0 [ 12385 ]
          Fix Version/s Original: 1.3.0 [ 12082 ]
          Nicolas Gavalda (Inactive) made changes -
          Resolution New: Fixed [ 1 ]
          Status Original: Open [ 1 ] New: Resolved [ 5 ]
          Laurence Aumeunier made changes -
          Comment [ In the JCR repository, the rich-text metadata was created but with no binary data (see screenshot) ]
          Laurence Aumeunier made changes -
          Description: updated (the new text matches the description above)
          Laurence Aumeunier created issue -

            Assignee: Nicolas Gavalda (Inactive)
            Reporter: Laurence Aumeunier