Removing invalid characters in JavaScript
Can someone provide a regular expression to search for and replace illegal characters?
For example, removing �
I am not sure how many types of ‘illegal’ characters exist but I think this will be a good start.
Many thanks
Edit: I have no control over the data; we’re trying to create a catch for the potentially bad data we’re receiving.
Invalid characters are converted to U+FFFD (the replacement character, �) when the text is parsed, so you can remove them with:
myString = myString.replace(/\uFFFD/g, '')
You can find all sorts of invalid characters listed here
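As a rough sketch of where U+FFFD comes from and how the replace above removes it, the snippet below decodes a byte array containing one invalid UTF-8 byte. The sample bytes and the helper name `stripReplacementChars` are made up for illustration; `TextDecoder` is assumed to be available (modern browsers and Node.js).

```javascript
// Hypothetical sample: the bytes for "Hi", one invalid UTF-8 byte (0xFF), then "!".
const bytes = new Uint8Array([0x48, 0x69, 0xff, 0x21]);

// In its default (non-fatal) mode, TextDecoder substitutes U+FFFD
// for byte sequences it cannot decode.
const decoded = new TextDecoder("utf-8").decode(bytes);

// Hypothetical helper: strip every replacement character left behind by decoding.
function stripReplacementChars(text) {
  return text.replace(/\uFFFD/g, "");
}

const cleaned = stripReplacementChars(decoded);
console.log(JSON.stringify(decoded), JSON.stringify(cleaned));
```

Note that the original bytes cannot be recovered once they have been turned into U+FFFD, so this only hides the damage rather than repairing it.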
Instead of having a blacklist, you could use a whitelist. For example, if you want to accept only letters, numbers, spaces, and a few punctuation characters, you could do:
myString = myString.replace(/[^a-z0-9 ,.?!]/gi, '')
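One thing to keep in mind with the whitelist above is that `a-z` covers ASCII letters only, so accented letters are removed as well. The sketch below compares it with a possible variant that uses Unicode property escapes (`\p{L}`, `\p{N}`, which need the `u` flag). The function names and the sample input are hypothetical, and the variant is an assumption about what you might want, not part of the original answer.

```javascript
// Keep only ASCII letters, digits, spaces, and , . ? ! (the whitelist from the answer).
function keepAsciiWhitelist(text) {
  return text.replace(/[^a-z0-9 ,.?!]/gi, "");
}

// Assumed variant: also keep letters and digits from any script.
function keepUnicodeWhitelist(text) {
  return text.replace(/[^\p{L}\p{N} ,.?!]/gu, "");
}

// Hypothetical input: "Hello, w\u00F6rld! " followed by a music note (U+266B).
const input = "Hello, w\u00F6rld! \u266B";
const asciiOnly = keepAsciiWhitelist(input);     // drops U+00F6 and U+266B
const unicodeAware = keepUnicodeWhitelist(input); // drops only U+266B

console.log(JSON.stringify(asciiOnly), JSON.stringify(unicodeAware));
```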
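Try this; it removes every character in the range U+0080 to U+FFFF, which covers unexpected characters like ♫ ◘ etc., but also legitimate non-ASCII letters such as é:
dataStr = dataStr.replace(/[\u{0080}-\u{FFFF}]/gu, "");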
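To show what the range above does and does not cover, here is a small sketch. With the `u` flag the regex works on whole code points, so characters above U+FFFF (most emoji, for example) fall outside the range and are left in place. The sample string and the `stripNonAscii` helper are hypothetical; the helper is one assumed alternative if the goal is to keep ASCII only.

```javascript
// Hypothetical input: ASCII text, a BMP symbol (U+266B), an accented letter (U+00E9),
// and an emoji outside the BMP (U+1F600).
const sample = "abc\u266B\u00E9\u{1F600}";

// Range from the answer: removes code points U+0080..U+FFFF only.
const bmpStripped = sample.replace(/[\u{0080}-\u{FFFF}]/gu, "");

// Assumed alternative: remove every code point outside ASCII (U+0000..U+007F).
function stripNonAscii(text) {
  return text.replace(/[^\x00-\x7F]/gu, "");
}
const nonAsciiStripped = stripNonAscii(sample);

console.log(JSON.stringify(bmpStripped), JSON.stringify(nonAsciiStripped));
```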
The answers/resolutions are collected from Stack Overflow and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.