Does anyone need HTML characters in relationship types?

We’ve identified several potential security issues that arise from allowing the use of HTML characters (i.e. <, >, ") in relationship type names.

One proposed fix prevents relationship types with these characters from ever being created in the first place: RA-1865: Patch bugs discovered by NCSU team by Parth59 · Pull Request #3708 · openmrs/openmrs-core · GitHub

I just wanted to see if anyone can think of a situation in which HTML characters are required when defining relationship types? If so we’ll have to figure out another way to patch this issue.

@grace is this something we could devote a few minutes to discussing at the TAC call this Friday?


Partners In Health would not need these characters in relation types. Appreciate the security on this and other areas. Thanks @isears

1 Like

@isears One thing I’d want to test is what happens with a character like “é”. I can’t see much harm in disallowing <, >, etc., but the spec that HtmlUtils references covers a lot more character types and not being able to use, e.g. “Frère” as a relationship type would be a real limitation.


Sure @isears, I’ve added this to the agenda :slight_smile:

This is a good point. I think there’s an OWASP filtering function that will work for that. We’ll have to write a test to make sure those types of characters pass through unmodified.

1 Like

While we certainly don’t need to support HTML in relationship type names, I would prefer not to have characters that are disallowed. While nobody may have a use case today, there could always be an Informal “spouse” or Household >5 years around the corner.

Ideally, we would escape any troublesome characters so people could use them safely. If we can’t do that, then maybe we could view disallowing these characters as a quick fix and create ticket(s) to escape them so they could be allowed in the future.

1 Like

Hi @isears . Instead of using HtmlLibraries, I have changed it to utilize regex pattern matching to disallow any user from entering HTML tags. Link for the PR :- Empt61 62 by Parth59 · Pull Request #3718 · openmrs/openmrs-core · GitHub

This has a couple of advantages over the previous method :- i) Allows names like Frère or anyother characters. ii) Allows Informal “spouse” as well as Household >5 years iii) Will not give the security error as I am not including spring openmrs-web in pom.xml. iii) Only blocks explicit Html tags. An exhaustive list of test cases that I tried can be found here NB9J14 - Online Java Compiler & Debugging Tool - (You might wanna fork and try with own custom examples)

Kindly do let me know if this approach works.

Thanks for looking into this @parth59

In general, I think we have to be very careful about where and when we use regular expressions, especially for security.

My primary concern is that they aren’t maintainable. Even if your regex is 100% secure today, it’s not guaranteed that it will be secure from novel XSS injection techniques in the future. That’s why I try to rely on standard XSS prevention functions maintained by the Apache foundation or OWASP because I know there’s a team of devs who focus on making sure they are secure, and I don’t have to worry about them silently breaking later on.

Although your solution is definitely more efficient than HTML-encoding in the server-side templates, and even allows for some of the specific usability concerns cited by @burke, I think the maintainability tradeoff is too great.

P.S. For general thread continuity, I should mention that at the last TAC call (3/12) we discussed the problem of HTML characters in-depth and decided to err on the side of usability: i.e. not disallowing specific characters if there’s another way to keep the string safe.

1 Like