JavaScript: how to check if character is RTL?

How can I programmatically check if the browser treats some character as RTL in JavaScript?

Maybe creating some transparent DIV and looking at where text is placed?

A bit of context. Unicode 5.2 added Avestan alphabet support. So, if the browser has Unicode 5.2 support, it treats characters like U+10B00 as RTL (currently only Firefox does). Otherwise, it treats these characters as LTR, because this is the default.

How do I programmatically check this? I’m writing an Avestan input script and I want to override the bidi direction if the browser is too dumb. But if the browser does support Unicode 5.2, the bidi settings shouldn’t be overridden (since this will allow mixing Avestan and Cyrillic).

I currently do this:

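var ua = navigator.userAgent.toLowerCase();

if (ua.match('webkit') || ua.match('presto') || ua.match('trident')) {
    var input = document.getElementById('orig');
    if (input) {
        // Force right-to-left rendering regardless of the characters' bidi class
        input.style.direction = 'rtl';
        input.style.unicodeBidi = 'bidi-override';
    }
}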

But, obviously, this would make the script less usable once Chrome and Opera start supporting Unicode 5.2.

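function isRTL(s){
    var ltrChars = "A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02B8\u0300-\u0590\u0800-\u1FFF"+'\u2C00-\uFB1C\uFDFE-\uFE6F\uFEFD-\uFFFF',
        rtlChars = '\u0591-\u07FF\uFB1D-\uFDFD\uFE70-\uFEFC',
        // Skip any non-LTR characters from the start, then require an RTL one
        rtlDirCheck = new RegExp('^[^'+ltrChars+']*['+rtlChars+']');

    return rtlDirCheck.test(s);
}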
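
One caveat for the Avestan case in the question: the ranges above only cover the Basic Multilingual Plane, while Avestan lives at U+10B00 to U+10B3F. Here is a minimal sketch of the same idea with the `u` flag, assuming it is acceptable to treat the whole U+10800 to U+10FFF range (home to Avestan and other historic right-to-left scripts) as RTL. The name `isRTLAstral` is made up for illustration.

```javascript
// Sketch only: same ranges as above, plus an ASSUMED supplementary-plane RTL range.
const LTR_CHARS = 'A-Za-z\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u02B8\\u0300-\\u0590\\u0800-\\u1FFF' +
                  '\\u2C00-\\uFB1C\\uFDFE-\\uFE6F\\uFEFD-\\uFFFF';
const RTL_CHARS = '\\u0591-\\u07FF\\uFB1D-\\uFDFD\\uFE70-\\uFEFC' +
                  '\\u{10800}-\\u{10FFF}'; // assumption: treat this whole range as RTL

// The 'u' flag makes the regex work on code points instead of UTF-16 code units.
const RTL_FIRST = new RegExp('^[^' + LTR_CHARS + ']*?[' + RTL_CHARS + ']', 'u');

function isRTLAstral(s) {
  return RTL_FIRST.test(s);
}

console.log(isRTLAstral('\u{10B28}\u{10B00}')); // Avestan letters
console.log(isRTLAstral('English'));
```

Note that this reports what the character ranges say, not how a particular browser actually lays the text out.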


I realize this is quite a while after the original question was asked and answered, but I found vsync’s update rather useful and just wanted to add some observations. I would have added this as a comment on his answer, but my reputation is not high enough yet.

Instead of a regular expression that matches, from the start of the line, zero or more non-LTR characters and then one RTL character, wouldn’t it make more sense to match zero or more weak/neutral characters and then one RTL character? Otherwise you have the potential for matching many RTL characters unnecessarily. I would welcome a more thorough examination of my weak/neutral character group, as I merely used the negation of the combined LTR and RTL character groups.


Additionally, shouldn’t characters such as LTR/RTL marks, embeds, overrides be included in the appropriate character groupings?

I would think then that the final code should look something like:

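function isRTL(s){
    var weakChars="\u0000-\u0040\u005B-\u0060\u007B-\u00BF\u00D7\u00F7\u02B9-\u02FF\u2000-\u2BFF\u2010-\u2029\u202C\u202F-\u2BFF",
        rtlChars        = '\u0591-\u07FF\u200F\u202B\u202E\uFB1D-\uFDFD\uFE70-\uFEFC',
        // Skip leading weak/neutral characters, then require an RTL one.
        // Note: \u2000-\u2BFF already covers the later \u20xx ranges, including the direction marks.
        rtlDirCheck     = new RegExp('^['+weakChars+']*['+rtlChars+']');

    return rtlDirCheck.test(s);
}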


There may be ways to speed up the regular expression above. Using a negated character class with a lazy quantifier seems to improve speed (tested on a site that requires Silverlight 5).

Additionally, if the directionality of the string is unknown, my guess is that in most cases it will be LTR rather than RTL, so an isLTR function would return results faster. But since the OP asked for isRTL, here is an isRTL function:

function isRTL(s){
    var rtlChars="\u0591-\u07FF\u200F\u202B\u202E\uFB1D-\uFDFD\uFE70-\uFEFC",
        // Lazily skip non-RTL characters until the first RTL one is found
        rtlDirCheck     = new RegExp('^[^'+rtlChars+']*?['+rtlChars+']');

    return rtlDirCheck.test(s);
}
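
For completeness, a rough sketch of what the isLTR counterpart mentioned above might look like. It assumes "LTR" means the first strong character is LTR, and it adds the LTR mark, embedding and override (U+200E, U+202A, U+202D) to the LTR group. Both choices are my assumptions, not something tested against real-world text.

```javascript
// Hypothetical counterpart to isRTL: true when the first strong character is LTR.
function isLTR(s) {
  // LTR ranges from the earlier answer, plus LRM/LRE/LRO (an assumption).
  const ltrChars = 'A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02B8\u0300-\u0590\u0800-\u1FFF' +
                   '\u200E\u202A\u202D\u2C00-\uFB1C\uFDFE-\uFE6F\uFEFD-\uFFFF';
  const rtlChars = '\u0591-\u07FF\u200F\u202B\u202E\uFB1D-\uFDFD\uFE70-\uFEFC';

  // Lazily skip characters that are neither LTR nor RTL, then require an LTR one.
  const ltrDirCheck = new RegExp('^[^' + ltrChars + rtlChars + ']*?[' + ltrChars + ']');

  return ltrDirCheck.test(s);
}

console.log(isLTR('123 abc'));
console.log(isLTR('\u05E9\u05DC\u05D5\u05DD abc'));
```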

Testing for both Hebrew and Arabic (the only modern right-to-left scripts I know of, apart from anything Persian-related, which I’ve not researched):
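
A minimal sketch of such a test, assuming the Hebrew block (U+0590 to U+05FF) and the Arabic block (U+0600 to U+06FF) are the ranges of interest. The function name is illustrative only.

```javascript
// True if the string contains at least one character from the Hebrew or Arabic block.
function hasHebrewOrArabic(s) {
  return /[\u0590-\u05FF\u0600-\u06FF]/.test(s);
}

console.log(hasHebrewOrArabic('\u05E9\u05DC\u05D5\u05DD')); // Hebrew word
console.log(hasHebrewOrArabic('English'));
```

This only says whether such characters are present anywhere in the string, not which direction dominates.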


More research suggests something along the lines of:


First addressing the question in the heading:

There are no tools in JavaScript as such for accessing Unicode properties of characters. You would need to find a library or service for the purpose (I’m afraid that might be difficult, if you need something reliable) or to extract the relevant information from the Unicode character “database” (a collection of text files in specific formats) and to write your own code to use it.
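
For what it’s worth, newer JavaScript engines (ECMAScript 2018 and later) support Unicode property escapes in regular expressions. They expose `Script` and `General_Category` but not `Bidi_Class`, so the sketch below only approximates directionality by script membership. It also reflects the engine’s Unicode tables rather than what the browser’s layout engine actually does. The function name and the choice of scripts are hypothetical.

```javascript
// Approximation: treat characters from a few known right-to-left scripts as RTL.
// The list of scripts is an assumption and is far from complete.
const RTL_SCRIPTS = /[\p{Script=Hebrew}\p{Script=Arabic}\p{Script=Avestan}]/u;

function containsRtlScript(s) {
  return RTL_SCRIPTS.test(s);
}

console.log(containsRtlScript('\u{10B00}')); // AVESTAN LETTER A
console.log(containsRtlScript('English'));
```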

Then the question in message body:


This seems even more desperate. But as this would probably be something for a limited number of users who are knowledgeable and know Avestan, maybe it would not be too bad to display a string of Avestan characters along with an image of them in the proper directionality, and ask the user to click a button if the order is wrong. You could save this selection in a cookie, so that the user needs to do this only once per browser (though it should be a relatively short-lived cookie, as the browser may get updated).
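
A rough sketch of the cookie part, assuming a made-up cookie name and a lifetime given in days. The helpers only build and parse cookie strings, so they can be tried outside a browser too.

```javascript
// Hypothetical cookie name; any unused name would do.
const COOKIE_NAME = 'avestan_needs_override';

// Build the cookie string for the user's choice, expiring after `days` days.
function buildOverrideCookie(needsOverride, days) {
  const maxAge = Math.round(days * 24 * 60 * 60); // lifetime in seconds
  return COOKIE_NAME + '=' + (needsOverride ? '1' : '0') +
         '; max-age=' + maxAge + '; path=/';
}

// Return true/false for a saved choice, or null if nothing was saved.
function readOverrideCookie(cookieString) {
  const m = cookieString.match(new RegExp('(?:^|;\\s*)' + COOKIE_NAME + '=([01])'));
  return m ? m[1] === '1' : null;
}

// In a browser:
//   document.cookie = buildOverrideCookie(true, 30);
//   const saved = readOverrideCookie(document.cookie);
```

A `null` result would mean the user has not been asked yet (or the cookie has expired), which is the cue to show the sample string again.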

Thanks for your comments, but it seems I’ve done this myself:

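function is_script_rtl(t) {
    var d, s1, s2, bodies;

    //If the browser doesn’t support this, it probably doesn’t support Unicode 5.2
    if (!("getBoundingClientRect" in document.documentElement))
        return false;

    //Set up a hidden testing DIV that does not affect the page layout
    d = document.createElement('div');
    d.style.position = 'absolute';
    d.style.visibility = 'hidden';
    d.style.width = 'auto';
    d.style.height = 'auto';
    d.style.fontSize = '10px';
    d.style.fontFamily = "'Ahuramzda'";

    //Two adjacent spans holding the same text
    s1 = document.createElement("span");
    s1.appendChild(document.createTextNode(t));
    d.appendChild(s1);

    s2 = document.createElement("span");
    s2.appendChild(document.createTextNode(t));
    d.appendChild(s2);

    bodies = document.getElementsByTagName('body');
    if (bodies.length) {
        var body, r1, r2;

        body = bodies[0];
        body.appendChild(d);
        r1 = s1.getBoundingClientRect();
        r2 = s2.getBoundingClientRect();
        body.removeChild(d);

        //In an RTL run the first span is laid out to the right of the second
        return r1.left > r2.left;
    }

    return false;
}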

Example usage:

Avestan is <script>document.write(is_script_rtl('𐬨𐬀𐬰𐬛𐬂') ? "RTL" : "LTR")</script>,
Arabic is <script>document.write(is_script_rtl('العربية') ? "RTL" : "LTR")</script>,
English is <script>document.write(is_script_rtl('English') ? "RTL" : "LTR")</script>.

It seems to work. 🙂
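
To tie this back to the original goal (override the direction only when the browser gets Avestan wrong), here is a small sketch. It assumes the detection function is passed in (for example `is_script_rtl` from above) and that `input` is the text field. `applyBidiFallback` is a made-up name.

```javascript
// Apply the RTL override only if the browser does NOT already treat Avestan as RTL.
// Returns true when the override was applied.
function applyBidiFallback(input, detectRtl) {
  const sample = '\u{10B28}\u{10B00}'; // two Avestan letters

  if (detectRtl(sample)) {
    return false; // browser handles Avestan itself; leave bidi settings alone
  }

  input.style.direction = 'rtl';
  input.style.unicodeBidi = 'bidi-override';
  return true;
}

// In a browser:
//   applyBidiFallback(document.getElementById('orig'), is_script_rtl);
```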

Here’s another solution that is robust against minor amounts of RTL text in a primarily LTR string, or minor amounts of LTR text in an RTL string.

It works by counting the LTR and RTL characters, then classifying the string based on whether there are more RTL than LTR characters.

function isRTL(text) {
  // Count characters from the RTL ranges and from the LTR ranges separately
  let rtl_count = (text.match(/[\u0591-\u07FF\uFB1D-\uFDFD\uFE70-\uFEFC]/g) || []).length;
  let ltr_count = (text.match(/[A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02B8\u0300-\u0590\u0800-\u1FFF\u2C00-\uFB1C\uFDFE-\uFE6F\uFEFD-\uFFFF]/g) || []).length;

  return (rtl_count > ltr_count);
}
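
A quick illustration of how the counting approach behaves on mixed strings. This is a self-contained sketch using the same character ranges as the earlier answers, under the hypothetical name `isMostlyRTL`.

```javascript
// Classify a string as RTL when it has more RTL characters than LTR ones.
function isMostlyRTL(text) {
  const rtlCount = (text.match(/[\u0591-\u07FF\uFB1D-\uFDFD\uFE70-\uFEFC]/g) || []).length;
  const ltrCount = (text.match(/[A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02B8\u0300-\u0590\u0800-\u1FFF\u2C00-\uFB1C\uFDFE-\uFE6F\uFEFD-\uFFFF]/g) || []).length;
  return rtlCount > ltrCount;
}

const hebrewWord = '\u05E9\u05DC\u05D5\u05DD';      // 4 Hebrew letters
const hebrewPhrase = hebrewWord + ' \u05E2\u05D5\u05DC\u05DD'; // 8 Hebrew letters

console.log(isMostlyRTL('Hello ' + hebrewWord + ' world')); // false: 10 LTR vs 4 RTL
console.log(isMostlyRTL(hebrewPhrase + ' ok'));             // true: 8 RTL vs 2 LTR
```

A tie, including the empty string, counts as LTR with this comparison.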

The answers/resolutions are collected from stackoverflow, are licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0 .
