Why is it important to always consider how you escape or encode newlines, carriage returns, apostrophes, quotes and other special characters in text entered by users? Because we do not necessarily know what they will enter. Sure, you can validate or restrict what a user types, but you may want to allow your users some freedom in what they enter.

Malformed JSON and database changes

I had a problem today that wreaked havoc on our systems. This problem resulted in thousands of items backing up one of our queues. What was the cause? It was a malformed JSON message in the queue.

If we had been adding this string to a database, we might have hit the same issue, although the parameterized queries typically used in ASP.NET Core applications mitigate some of these problems.

We recently introduced a feature to our customer portal where a customer could send us private messages.

The customer types a message into a text box and submits it via an online form in the portal. The message is added to a queue (Amazon SQS) and processed by an internal service, which saves the message as a note in the system and raises a task for our team to action (respond).

This process was straightforward, but our big mistake was not escaping the content of the message, which is text we have little control over. Users were entering newlines, carriage returns, apostrophes, quotes and other special characters.

Big problems from something so simple

How did we miss such a critical point that allowed us to break an entire service? The problem was not that the message stored in a string was invalid; it was that the JSON became malformed when we added the message to a JSON object.

It was at the point we were adding the message to the JSON object that we needed to escape or encode the special characters and quotes.
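
To make the failure concrete, here is a minimal sketch of how a perfectly valid string can break a hand-built JSON payload. The payload shape, the message property name and the sample input are hypothetical, and System.Text.Json is used here only as a convenient parser for checking validity; treat it as an illustration rather than the code from our service.

```csharp
using System;
using System.Text.Json;
using System.Web;

// Hypothetical user input: a valid C# string containing quotes, an apostrophe and a newline.
string input = "She said \"it's urgent\"\nPlease call me.";

// Naive approach: drop the raw text straight into a JSON template.
string naive = "{\"message\":\"" + input + "\"}";

// Escaped approach: encode the text before it goes into the template.
string escaped = "{\"message\":\"" + HttpUtility.JavaScriptStringEncode(input) + "\"}";

// Returns true when the text parses as JSON, false otherwise.
static bool IsValidJson(string text)
{
    try
    {
        using JsonDocument doc = JsonDocument.Parse(text);
        return true;
    }
    catch (JsonException)
    {
        return false;
    }
}

Console.WriteLine($"Naive payload valid:   {IsValidJson(naive)}");   // False
Console.WriteLine($"Escaped payload valid: {IsValidJson(escaped)}"); // True
```

The string itself is fine in both cases; only the payload built without escaping fails to parse.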

Escaping and encoding strategies

So, how do we escape or encode newlines, carriage returns, apostrophes, quotes and other special characters? There are a number of tools you can use to escape and encode strings, but which tool you use will depend on what you are trying to escape or encode.

HttpUtility

The System.Web namespace contains a utility class called HttpUtility. Within this class, there are a number of encoders for encoding a string.

JavaScriptStringEncode

This was perfect for my scenario above as we were using it to ensure our JSON was valid.

using System.Web;
...
string json = HttpUtility.JavaScriptStringEncode(input);

If you want to include quotation marks around the encoded string you can use the overloaded method with the addDoubleQuotes flag.

using System.Web;
...
string json = HttpUtility.JavaScriptStringEncode(input, true);
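
As a rough illustration of what the encoder does with the kinds of characters our users typed, the sketch below encodes a hypothetical sample string with and without the addDoubleQuotes flag. The sample text and variable names are my own.

```csharp
using System;
using System.Web;

// Hypothetical input with a newline, double quotes and backslashes.
string input = "Line one\nHe said \"hello\" from C:\\temp";

// Without the flag you get only the escaped contents.
string bare = HttpUtility.JavaScriptStringEncode(input);

// With addDoubleQuotes set to true the result is wrapped in double quotes,
// so it can be used directly as a JSON string value.
string quoted = HttpUtility.JavaScriptStringEncode(input, true);

Console.WriteLine(bare);   // Line one\nHe said \"hello\" from C:\\temp
Console.WriteLine(quoted); // "Line one\nHe said \"hello\" from C:\\temp"
```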

HtmlEncode

If the text is going to be rendered as HTML, use the HTML encoder, because characters such as <, > and & have a special meaning to HTML parsers and would otherwise be treated as markup.

using System.Web;
...
string html = HttpUtility.HtmlEncode(input); 
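
A small sketch of the HTML encoder in action, using a hypothetical message that would otherwise be read as markup:

```csharp
using System;
using System.Web;

// Hypothetical message containing a tag, an ampersand and double quotes.
string input = "<b>Tom & Jerry</b> said \"hi\"";

string html = HttpUtility.HtmlEncode(input);

Console.WriteLine(html); // &lt;b&gt;Tom &amp; Jerry&lt;/b&gt; said &quot;hi&quot;

// HtmlDecode reverses the encoding when you need the original text back.
Console.WriteLine(HttpUtility.HtmlDecode(html) == input); // True
```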

UrlEncode

If you need to place text in a URL, such as a query string value, then use the URL encoder, as any special characters sent in a URL without encoding could be misinterpreted at the receiving end.

using System.Web;
...
string url = HttpUtility.UrlEncode(input); 
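
The sketch below encodes a hypothetical search term before appending it to a query string. The example.com address and the q parameter are placeholders of my own.

```csharp
using System;
using System.Web;

// Hypothetical search text containing characters with special meaning in a URL.
string input = "cats & dogs?";

// Encode only the value, not the whole URL, so the URL's own separators stay intact.
string encoded = HttpUtility.UrlEncode(input);
string url = "https://example.com/search?q=" + encoded;

// Spaces become '+', while '&' and '?' become percent-encoded sequences.
Console.WriteLine(url);

// UrlDecode recovers the original text on the receiving side.
Console.WriteLine(HttpUtility.UrlDecode(encoded) == input); // True
```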

Newtonsoft

We use Newtonsoft to convert our objects to and from JSON, so if you are using Newtonsoft, or would like to, there is a built-in method for making sure a string is escaped and encoded accordingly.

using Newtonsoft.Json;
...
message = JsonConvert.SerializeObject(message);

This method is perfect for JSON, but not suitable for HTML or URL encoding. Note that serializing a string returns it wrapped in double quotes. Also, be careful if you are using the result in JavaScript, as by default it leaves certain characters, such as apostrophes, unescaped. I would only use this method if you are working specifically with JSON.
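
Rather than escaping a single string and splicing it into a template, you can let Newtonsoft build the whole payload. The following is a minimal sketch that assumes the Newtonsoft.Json NuGet package is installed; the anonymous object and its message property are hypothetical stand-ins for a real message type.

```csharp
using System;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// Hypothetical user input with a newline, double quotes and an apostrophe.
string input = "Line one\nHe said \"it's fine\"";

// Serialize an object instead of concatenating strings; the value is escaped for us.
string json = JsonConvert.SerializeObject(new { message = input });

// The newline and double quotes are escaped; the apostrophe is left as-is.
Console.WriteLine(json); // {"message":"Line one\nHe said \"it's fine\""}

// Parsing the payload back returns the original text unchanged.
string roundTripped = JObject.Parse(json).Value<string>("message");
Console.WriteLine(roundTripped == input); // True
```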

Regex

You can also use some of the built-in regex methods provided by .NET. The Regex.Escape method replaces a small set of characters (regex metacharacters such as \, *, +, ? and |, plus white space) with their escape codes.

using System.Text.RegularExpressions;
...
string message = Regex.Escape(input);

This method is suitable if you are looking to store or display regex, or if you just want to handle escaping white space. It does not handle quotes or apostrophes, and it throws an ArgumentNullException if you pass it a null value.
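
Here is a short sketch of that behaviour, using hypothetical sample strings of my own:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical user text containing regex metacharacters.
string input = "1+1=2?";

string pattern = Regex.Escape(input);
Console.WriteLine(pattern); // 1\+1=2\?

// The escaped pattern matches the literal text; the raw text is read as a pattern and does not.
Console.WriteLine(Regex.IsMatch("Is 1+1=2? Yes.", pattern)); // True
Console.WriteLine(Regex.IsMatch("Is 1+1=2? Yes.", input));   // False

// Quotes and apostrophes pass through untouched.
string quotes = Regex.Escape("\"it's\"");
Console.WriteLine(quotes); // "it's"
```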

References

Photo credit: Mimi Garcia (Unsplash)

Adam Stacey

I am Adam Stacey, the guy behind AdNav! I setup AdNav as a way to write up any technical challenges, how I overcame them, opinions on tech and much rambling. I try and cut through any technical jargon to make it friendly and easy to understand.
