U4-2968 - UmbracoExamine.UmbracoContentIndexer will not always index HTML content correctly

Created by Nicklas Gavelin 27 Sep 2013, 11:03:25
Updated by Shannon Deminick 21 Jun 2017, 07:58:15

When using UmbracoExamine.UmbracoContentIndexer, the indexer calls the "StripHtml" method, which causes HTML content to be indexed incorrectly. For example, if the node contains HTML in which the texts

Start of sentence
End of sentence

sit in adjacent elements with no whitespace between the tags, the index will contain "Start of sentenceEnd of sentence" instead of "Start of sentence End of sentence", because every HTML tag is replaced by the empty string. The last word of one element and the first word of the next are merged into a single term, so searches for either word may not return the node.

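To illustrate the difference, here is a minimal sketch in Python (not the UmbracoExamine code, whose "StripHtml" implementation is not shown in this issue). The regular expression, the function names, and the `<p>` markup in the sample are assumptions made for the example. It compares replacing tags with the empty string against replacing them with a space and then collapsing whitespace.

```python
import re

# Hypothetical tag pattern for illustration; a real HTML parser is more robust.
TAG_RE = re.compile(r"<[^>]*>")


def strip_html_naive(html: str) -> str:
    """Mimic the reported behaviour: every tag becomes the empty string."""
    return TAG_RE.sub("", html)


def strip_html_spaced(html: str) -> str:
    """Replace each tag with a space, then collapse runs of whitespace."""
    text = TAG_RE.sub(" ", html)
    return " ".join(text.split())


# Assumed sample markup: two adjacent paragraphs with no whitespace between tags.
sample = "<p>Start of sentence</p><p>End of sentence</p>"

print(strip_html_naive(sample))   # words from the two paragraphs are merged
print(strip_html_spaced(sample))  # words stay separated
```

One trade-off of the spaced variant: a tag inside a word (for example a single bold letter) would split that word in two, so a fix along these lines might only insert spaces for block-level tags.
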
Comments

Shannon Deminick 21 Jun 2017, 07:58:15

Closing issue due to inactivity. See the blog post for details: https://umbraco.com/blog/issue-tracker-cleanup/


Priority: Normal

Type: Bug

State: Closed

Assignee:

Difficulty: Normal

Category:

Backwards Compatible: True

Fix Submitted:

Affected versions:

Due in version:

Sprint:

Story Points:

Cycle: