U4-2238 - 6.0.5 - Issue when saving/publishing special characters - '' hexadecimal value 0x03, is an invalid character

Created by Andy Dyton 15 May 2013, 09:53:15 Updated by Sebastiaan Janssen 14 Aug 2014, 08:26:58

Is duplicated by: U4-5300

When saving/publishing text that contains the "End-of-text character" (ETX) the following error is thrown:

'', hexadecimal value 0x03, is an invalid character.

Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.

Exception Details: System.ArgumentException: '', hexadecimal value 0x03, is an invalid character.

Source Error:

An unhandled exception was generated during the execution of the current web request. Information regarding the origin and location of the exception can be identified using the exception stack trace below.

Stack Trace:

[ArgumentException: '', hexadecimal value 0x03, is an invalid character.]
   System.Xml.XmlEncodedRawTextWriter.InvalidXmlChar(Int32 ch, Char* pDst, Boolean entitize) +3795464
   System.Xml.XmlEncodedRawTextWriter.WriteCDataSection(String text) +770
   System.Xml.XmlEncodedRawTextWriter.WriteCData(String text) +411
   System.Xml.XmlWellFormedWriter.WriteCData(String text) +3015958
   System.Xml.Linq.ElementWriter.WriteElement(XElement e) +224
   System.Xml.Linq.XElement.WriteTo(XmlWriter writer) +149
   System.Xml.Linq.XNode.GetXmlString(SaveOptions o) +274
   Umbraco.Core.Services.ContentService.Save(IContent content, Boolean changeState, Int32 userId, Boolean raiseEvents) +655
   Umbraco.Core.Services.ContentService.Save(IContent content, Int32 userId, Boolean raiseEvents) +20
   umbraco.cms.businesslogic.web.Document.Save() +424
   umbraco.cms.presentation.editContent.Save(Object sender, EventArgs e) +1971
   umbraco.controls.ContentControl.SaveClick(Object sender, ImageClickEventArgs e) +682
   umbraco.controls.ContentControl.DoSaveAndPublish(Object sender, ImageClickEventArgs e) +24
   System.Web.UI.WebControls.ImageButton.OnClick(ImageClickEventArgs e) +134
   System.Web.UI.WebControls.ImageButton.RaisePostBackEvent(String eventArgument) +204
   System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) +3804

These kinds of characters can sneak in when people copy and paste from other documents. The characters are not visible, so non-technical content editors cannot tell what has gone wrong.
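
For illustration, one way to give editors a clearer message would be to scan the text for characters outside the XML 1.0 character range before saving. The sketch below is in Python rather than Umbraco's C#, and the function names are hypothetical. It is not the fix that was committed for this issue.

```python
def is_valid_xml_char(ch: str) -> bool:
    """Return True if ch is allowed by the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def find_invalid_xml_chars(text: str) -> list[tuple[int, str]]:
    """List (position, hex code) for every character XML 1.0 forbids."""
    return [
        (index, f"0x{ord(ch):02X}")
        for index, ch in enumerate(text)
        if not is_valid_xml_char(ch)
    ]
```

Calling `find_invalid_xml_chars("Hello\x03 world")` returns `[(5, '0x03')]`, which is enough to tell an editor where the hidden ETX character sits.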

1 Attachment

Text with hidden character.txt

Comments

Steven Lemmens 13 Aug 2013, 07:58:19

I upvoted this, as I just encountered the exact same error while simply copying a sentence from my browser into Umbraco.

My solution was to re-type the sentence and I don't see what kind of special character I could have copied, but there you go.

Might want to add that I saw this in 6.1.2


Maarten van der Donk 01 Nov 2013, 13:53:05

I've attached a text file with all kinds of characters that give errors.

In older Umbraco versions this was fixed under issue U4-367. In the newer (v6) versions of Umbraco it is less of a problem because the cmsContentXml table isn't updated, but the corrupt data still gets saved in the database (cmsPropertyData). This could become a problem if, for example, a package updates the cmsContentXml table with these characters. If that happens, it could potentially kill the entire website.
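
To illustrate why stored XML containing these characters is dangerous, here is a small sketch showing that a conforming XML parser rejects such a document outright. It uses Python's standard `xml.etree.ElementTree` as a stand-in. The `can_parse` helper is hypothetical, and this says nothing about how Umbraco itself reads the cmsContentXml table.

```python
import xml.etree.ElementTree as ET


def can_parse(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Raised for control characters such as 0x03, even inside CDATA.
        return False
```

With this helper, `can_parse("<node><![CDATA[ok]]></node>")` succeeds, while the same document with an ETX character inside the CDATA section fails to parse.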


Damiaan Peeters 27 Dec 2013, 13:20:52

Something similar happens in 7.0.1 (file attached). If you paste the contents of the attached file into an RTE, you get the error below:

Server error: Contact administrator, see log for full details. '□', hexadecimal value 0x0C, is an invalid character.


Tim Payne 31 Jul 2014, 14:16:58

Can confirm that this is still an issue in 7.1.4.


Sebastiaan Janssen 31 Jul 2014, 15:23:07

See: http://stackoverflow.com/questions/397250/unicode-regex-invalid-xml-characters/961504#961504
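
The linked question is about a regular expression for invalid XML characters. A minimal sketch of that idea follows, in Python, assuming the character ranges of the XML 1.0 `Char` production. The pattern and function names are hypothetical, and this is not the code from the commits on this issue.

```python
import re

# Matches anything outside the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Remove every character that XML 1.0 does not allow."""
    return INVALID_XML_CHARS.sub("", text)
```

Silently stripping is only one option. Rejecting the input with a readable message is another, and which is appropriate depends on where in the save pipeline the cleaning happens.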


Sebastiaan Janssen 13 Aug 2014, 15:57:32

Commit made by Sebastiaan Janssen on 2014-08-13T09:38:40+02:00 https://github.com/umbraco/Umbraco-CMS/commit/ba7a5a0e8a20b988cdebe94a3eca9c39fe4a2d4e

#U4-2238 Fixed

Moved some methods around, made them internal, and removed the cleaning of tags, as that's already done by cleaning each property.


Sebastiaan Janssen 13 Aug 2014, 17:56:23

I'll cherry-pick this for 6.2.2 tomorrow.


Sebastiaan Janssen 14 Aug 2014, 08:26:46

Commit made by Sebastiaan Janssen on 2014-08-14T10:26:38+02:00 https://github.com/umbraco/Umbraco-CMS/commit/908afbd26467a640a3a59e7b552db14d3317d602

#U4-2238 Fixed

Issue when saving/publishing special characters - '' hexadecimal value 0x03, is an invalid character


Priority: Normal

Type: Exception

State: Fixed

Assignee:

Difficulty: Normal

Category:

Backwards Compatible: True

Fix Submitted:

Affected versions: 6.0.5, 6.1.2, 6.1.5, 7.0.1, 7.1.4

Due in version: 7.1.5, 6.2.2

Sprint:

Story Points:

Cycle: