Removing invalid characters in JavaScript

Can someone provide a regular expression to search for and replace illegal characters found in a string?

For example, removing �

I am not sure how many types of ‘illegal’ characters exist but I think this will be a good start.

Many thanks

Edit – I have no control over the data; we’re trying to create a catch for the potentially bad data we’re receiving.

Invalid characters are converted to U+FFFD (the Unicode replacement character, �) on parsing, so you can remove them with:

myString = myString.replace(/\uFFFD/g, '')
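To see where the replacement character comes from, here is a minimal sketch that decodes a byte array containing an invalid UTF-8 byte and then strips the resulting U+FFFD. The byte values and the helper name `stripReplacementChars` are made up for illustration, and it assumes an environment where `TextDecoder` is available globally (modern browsers and recent Node.js).

```javascript
// Bytes for "Hi", then 0xFF (never valid in UTF-8), then "!".
const bytes = new Uint8Array([0x48, 0x69, 0xff, 0x21]);

// The decoder substitutes U+FFFD for the invalid byte instead of throwing.
const decoded = new TextDecoder("utf-8").decode(bytes);

// Hypothetical helper: remove every U+FFFD replacement character.
function stripReplacementChars(text) {
  return text.replace(/\uFFFD/g, "");
}

const cleaned = stripReplacementChars(decoded);
console.log(JSON.stringify(decoded)); // contains the replacement character
console.log(cleaned);                 // "Hi!"
```

Note that this only removes the marker left behind by decoding; the original bytes are already lost by that point.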

You can find all the different kinds of invalid characters here.

Instead of having a blacklist, you could use a whitelist. For example, if you want to accept only letters, numbers, spaces, and a few punctuation characters, you could do:

myString = myString.replace(/[^a-z0-9 ,.?!]/ig, '')
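As a rough sketch of the whitelist approach, the same pattern can be wrapped in a small function. The name `keepAllowedChars` and the sample string are hypothetical; the allowed set is the one from the answer above and would need extending for real data.

```javascript
// Hypothetical helper: keep only ASCII letters, digits, spaces, and , . ? !
function keepAllowedChars(text) {
  // The negated class matches everything NOT on the whitelist.
  return text.replace(/[^a-z0-9 ,.?!]/gi, "");
}

// Sample input with a replacement character, a music note, and a percent sign.
const sample = "Hello, w\uFFFDorld! \u266B 100%";
const filtered = keepAllowedChars(sample);
console.log(JSON.stringify(filtered));
```

Because strings are immutable in JavaScript, `replace` returns a new string, so the result has to be assigned or returned. The trade-off is that anything not listed is dropped, including accented letters and symbols such as `%`.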

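Try this; it removes every character in the range U+0080 to U+FFFF, which covers unexpected characters like ♫ and ◘. Note that it also removes legitimate non-ASCII letters such as é, and it does not match characters above U+FFFF.

dataStr = dataStr.replace(/[\u{0080}-\u{FFFF}]/gu, "");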
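To make the reach of this range concrete, here is a short sketch with a hypothetical helper, `stripBmpNonAscii`, applied to a few made-up inputs. It shows that the pattern removes symbols and accented letters alike, and that code points above U+FFFF (such as most emoji) are left in place because the `u` flag makes the regex match whole code points.

```javascript
// Hypothetical helper: remove every code point from U+0080 through U+FFFF.
function stripBmpNonAscii(text) {
  return text.replace(/[\u{0080}-\u{FFFF}]/gu, "");
}

// Symbols in the range are removed, but so is the accented letter.
const symbols = stripBmpNonAscii("caf\u00E9 \u266B \u25D8 ok");

// U+1F600 lies above U+FFFF, so it is not matched and survives.
const emoji = stripBmpNonAscii("a\u{1F600}b");

console.log(JSON.stringify(symbols));
console.log(emoji === "a\u{1F600}b");
```

If the data may legitimately contain non-English text, this range is probably too aggressive, and removing only U+FFFD or using a whitelist is a safer starting point.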


The answers/resolutions are collected from Stack Overflow and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.