When a concept's units were set to cell/mm<sup>3</sup>, they were displayed correctly as cell/mm³ in 1.10.x. Since 1.11, however, the <sup> tags are displayed as plain text (cell/mm<sup>3</sup>) in the UI. I have attached images of the displayed units in 1.10.x and 1.11.x.
Looking at the code, we found that the <c:out> JSP tag escapes HTML characters, which is what causes this issue.
The <c:out> tag was added in 1.11 to avoid exposure to XSS attacks. ‘escapeXml=false’ can be used with <c:out> so that HTML characters are not escaped, but that is the same as not having the c:out tag at all, since the only purpose of <c:out> is to escape HTML.
Any suggestions on fixing this issue?
Although it is indeed not ideal to store HTML in the DB, we should make an exception for the sup tag in the units field. Something along these lines could work:
Inserting the Unicode glyph for superscript three (³) directly into the unit text could be used as a workaround.
In general, we should be escaping HTML/JavaScript using tools like <c:out></c:out> to prevent XSS attacks in user-entered data; however, I don’t think it’s necessary for admin-entered metadata. A ubiquitous HTML clean (something like a <c:out safeHTML="true"></c:out>) would be ideal for cases like this, but I think it would be reasonable until we have that to treat this specific case (concept units) as a regression and remove the <c:out></c:out> when rendering concept units.
Inserting the Unicode character for superscript 3 (³) doesn’t work either, since <c:out> will also escape the ‘&’ character. As Burke mentioned, I feel that <c:out> is not necessary for admin-entered metadata like concept units. If that is okay, a pull request has already been raised to remove the c:out tag. Can this be merged?
As Lluis says, I’m not suggesting that you put the html entity value in the
units field, but rather that you put the actual Unicode character in the
database field.
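
To illustrate the difference, here is a minimal Java sketch that mimics the five character replacements JSTL's <c:out> applies by default. The class and method names (`EscapeDemo`, `escapeXml`) are hypothetical and only for illustration; this is not the actual OpenMRS or JSTL code. It shows that both the sup tag and the HTML entity get mangled, while the literal Unicode character contains nothing to escape and passes through unchanged.

```java
public class EscapeDemo {

    // Mimics the default behaviour of <c:out> (escapeXml="true"):
    // only these five characters are replaced, everything else is kept as-is.
    public static String escapeXml(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '&':  sb.append("&amp;");  break;
                case '\'': sb.append("&#039;"); break;
                case '"':  sb.append("&#034;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // HTML tag stored in the units field: shown as plain text in the browser
        System.out.println(escapeXml("cell/mm<sup>3</sup>"));  // cell/mm&lt;sup&gt;3&lt;/sup&gt;
        // HTML entity stored in the units field: the '&' is escaped, so it breaks too
        System.out.println(escapeXml("cell/mm&sup3;"));        // cell/mm&amp;sup3;
        // Literal Unicode character U+00B3 stored in the units field: untouched
        System.out.println(escapeXml("cell/mm\u00B3"));        // cell/mm³
    }
}
```

Note that the literal character only displays correctly if the database column and the page encoding both handle Unicode (for example UTF-8).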
Generally speaking, we still should escape admin-entered text. For example
you could easily imagine an implementation where some users can manage
concepts, but aren’t supposed to be able to manage users, and this gives
them an attack surface.
My suggestion is to implement a tag that properly allows whitelisted html
tags for this.
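
A rough sketch of what the logic behind such a tag could look like in Java: escape everything first, then restore only the attribute-free sup tags. The names here (`UnitSanitizer`, `sanitize`) are hypothetical, the allowed set is assumed to be just sup, and this is not an existing OpenMRS API; a real implementation would probably be better off delegating to a vetted HTML sanitizing library.

```java
import java.util.regex.Pattern;

public class UnitSanitizer {

    // Matches an already-escaped <sup> or </sup> with no attributes.
    // Assumption: sup is the only tag we want to allow in concept units.
    private static final Pattern ALLOWED_TAG = Pattern.compile("&lt;(/?)sup&gt;");

    // Same five replacements that <c:out> applies by default.
    public static String escapeXml(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '&':  sb.append("&amp;");  break;
                case '\'': sb.append("&#039;"); break;
                case '"':  sb.append("&#034;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Escape everything, then turn the escaped whitelisted tags back into real tags.
    // Tags with attributes (e.g. <sup onclick=...>) never match and stay escaped.
    public static String sanitize(String s) {
        return ALLOWED_TAG.matcher(escapeXml(s)).replaceAll("<$1sup>");
    }

    public static void main(String[] args) {
        System.out.println(sanitize("cell/mm<sup>3</sup>"));
        System.out.println(sanitize("<script>alert(1)</script>"));
    }
}
```

Escaping first and restoring afterwards means anything not explicitly allowed stays escaped by default, which is the safer failure mode for a field that is rendered in many places.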
As Burke says, if we don’t have time to do this now, and if the
Unicode-superscript-3 trick doesn’t work (and because I think the CIEL
dictionary uses sup sometimes) I am okay with unprotecting this one field.
(But do create a ticket for fixing it in the longer term!)
Also, remember that concept units are displayed in quite a few places in
the UI, including in modules.
Copying the actual character ( ³ ) into the text field worked fine. For now, we will go with this approach and update our documentation for the implementers.