HTMLHelp Forums _ Client-side Scripting _ string.replace against html symbol code (&divide)?

Posted by: CharliePrince Jan 9 2021, 03:38 AM

I'm trying to replace an html symbol code '&divide' with another string.

I can't get this to work. See example below.

CODE
     <body>

         <p id='target'>A &divide B</p>
         <script>
             console.log(`${document.getElementById('target').innerText}`);

             console.log(`${document.getElementById('target').innerText.replaceAll('÷', '/')}  (want this to = 'A / B') `);

             console.log(`${document.getElementById('target').innerText.replaceAll('&divide', '/')}  (or, want this to = 'A / B') `);

         </script>

     </body>



Can anyone help? Pandy?

Posted by: pandy Jan 9 2021, 04:28 AM

I don't follow your console.log examples. But you have one fatal error. The entity in question should be written like so

CODE
&divide;

with a trailing semicolon. Seems some browsers mend that mistake. But JS won't.

Then I think you need to split things up because you don't only need to replace but also print what you've replaced back. I'm not used to console.log and can't get anything to work, but if we do it like this...

HTML
<p id='target'>A &divide; B</p>

<button onclick="doIt()">Click!</button>


CODE
function doIt()
{
   var currentstring = document.getElementById('target').innerHTML;
   var replacestring = currentstring.replace('B', 'C');
   document.getElementById('target').innerHTML = replacestring;
}


That works, right? The reason I replaced B with C rather than what you wanted is I can't get it to work either! It seems innerHTML returns the actual character, the interpreted entity.


CODE
function doIt()
{
   var currentstring = document.getElementById('target').innerHTML;
   alert(currentstring);
   var replacestring = currentstring.replace('&divide;', '/');
   document.getElementById('target').innerHTML = replacestring;
}


So that explains why the above doesn't work. Problem is, using the character, ÷, in the replace doesn't work either! Neither does it work to type the character ÷ in the HTML, but this could be a character encoding problem, since I used a non-Unicode editor and I don't have time to test with Unicode right now. But you can. Let me know how it goes.

I just had time to google a little and found nothing, but there must be some trick to this. I think entities need to be 'decoded' somehow, but I didn't find any straightforward way to do this.
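
The encoding suspicion above can be checked without a browser. This is a minimal sketch, assuming the script was saved in a single-byte encoding (where ÷ is the byte 0xF7) but read as UTF-8; whether that is what happened here is a guess, not something the thread confirms. TextDecoder is a standard API; the variable names are only illustrative.

```javascript
// Assumption: the editor stored the character as the single byte 0xF7
// (its value in ISO-8859-1 / Windows-1252).
const bytesFromLegacyEditor = new Uint8Array([0xF7]);

// Read as UTF-8, a lone 0xF7 byte is invalid, so the decoder
// substitutes U+FFFD (the replacement character).
const seenByBrowser = new TextDecoder('utf-8').decode(bytesFromLegacyEditor);

// What the parsed entity &divide; actually is in the DOM: U+00F7.
const divisionSign = '\u00F7';

// The mangled search string no longer matches, so nothing is replaced.
const unchanged = ('A ' + divisionSign + ' B').replace(seenByBrowser, '/');

// Written as an escape, the search string does not depend on the file encoding.
const replaced = ('A ' + divisionSign + ' B').replace('\u00F7', '/');

console.log(seenByBrowser === divisionSign, unchanged, replaced);
```

If this is the cause, saving the file as UTF-8 and declaring that encoding, or writing the character as '\u00F7', should both make the plain replace work.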

I think I leave this to Christian. Or maybe Darin. I'm off for a refreshing walk in the snow. tongue.gif

Posted by: Christian J Jan 9 2021, 09:18 AM

QUOTE(pandy @ Jan 9 2021, 10:28 AM) *

It seems innerHTML returns the actual character, the interpreted entity.

Korpela says "client-side JavaScript operates on the DOM, where entities do not exist" in this thread: https://stackoverflow.com/questions/18749591/encode-html-entities-in-javascript

The thread also suggests workarounds, but none very simple. It might be easier with server-side scripting (e.g. PHP). unsure.gif
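
For text that has not been through the HTML parser, a small decoder can be written by hand. The following is only a sketch: the function name and the tiny entity table are made up for illustration, the real list of named entities is far longer, and unlike browsers it insists on the trailing semicolon.

```javascript
// Hypothetical, deliberately tiny table of named entities.
const NAMED_ENTITIES = {
  amp: '&', lt: '<', gt: '>', quot: '"', nbsp: '\u00A0', divide: '\u00F7'
};

// Decode &name;, &#123; and &#x7B; references; leave anything unknown untouched.
function decodeEntities(text) {
  return text.replace(/&(#x[0-9a-f]+|#[0-9]+|[a-z][a-z0-9]*);/gi, (match, body) => {
    if (body[0] === '#') {
      const isHex = body[1] === 'x' || body[1] === 'X';
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Code points above U+10FFFF do not exist; keep the original text.
      return code <= 0x10FFFF ? String.fromCodePoint(code) : match;
    }
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
      ? NAMED_ENTITIES[body]
      : match;
  });
}

console.log(decodeEntities('A &divide; B').replace('\u00F7', '/'));
```

In a browser the parser has already done this work by the time a script reads the DOM, which is the point of the Korpela quote above.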







Posted by: pandy Jan 9 2021, 09:24 AM

I saw that thread, but not that Korpela was there.

DOM or not, the character is returned, so why can't it be replaced? Don't get it. I take it my failure wasn't down to the lack of Unicode...

Posted by: pandy Jan 9 2021, 09:29 AM

It's worse. Look at this.

HTML
<p id='target'>A &amp; B</p>
<button onclick="doIt()">Click!</button>


CODE

function doIt()
{
   var currentstring = document.getElementById('target').innerHTML;
   alert(currentstring);
   var replacestring = currentstring.replace('&amp;', '/');
   document.getElementById('target').innerHTML = replacestring;
}


Now the entity is returned! But the replacement still doesn't work. WTF? wacko.gif

Needless to say, replacing with & doesn't work either.

Posted by: Christian J Jan 9 2021, 02:34 PM

Even worse:

CODE
<p onclick="alert(this.innerHTML);">&divide; &amp;</p>

What's the difference between the two entities? (The first entity is rendered, but not the second.)

Posted by: pandy Jan 9 2021, 02:37 PM

You tell me! wacko.gif

Posted by: pandy Jan 9 2021, 02:42 PM

QUOTE(pandy @ Jan 9 2021, 08:37 PM) *

You tell me! wacko.gif



And look at this! Just some random ones from a library I have in my editor.
CODE

<p onclick="alert(this.innerHTML);">&larr; &#8212; &#160; &#8216; &#8230; &#177;</p>


&#160; turns into &nbsp;! laugh.gif


Attached image(s)
Attached Image

Posted by: pandy Jan 9 2021, 02:52 PM

That was k-mel. Edge, IE and FF do the same thing. That's something, at least. tongue.gif

FYI &nbsp; is treated the same as the numeric entity. It's not interpreted. So ampersand and non-breaking space entities are different. And probably a few more. Jesus christ.

Posted by: Christian J Jan 9 2021, 02:56 PM

Could it have something to do with ASCII or Unicode ranges? I don't know anything about that.

Posted by: pandy Jan 9 2021, 02:58 PM

But for example &larr; has been around as long as I have. Must be ascii?

Posted by: pandy Jan 9 2021, 03:06 PM

The whole first column from here https://htmlhelp.com/reference/html40/entities/latin1.html . Only &nbsp; stands out.

Attached Image


https://htmlhelp.com/reference/html40/entities/symbols.html are all interpreted.

Attached Image

Posted by: pandy Jan 9 2021, 03:23 PM

Among these I found two more that aren't interpreted.
https://htmlhelp.com/reference/html40/entities/special.html

Attached Image

&, <, and > are used in HTML itself (if URLs are included). But &nbsp;? I thought I was on to something for a while, but it doesn't add up.
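
The four exceptions found above match the text-escaping step that the HTML Standard describes for serializing a DOM tree, which is what reading innerHTML does. Inside ordinary text, only &, the no-break space (U+00A0), < and > are written back as &amp;, &nbsp;, &lt; and &gt;; every other character comes back as itself. A rough sketch of that step, with a made-up function name:

```javascript
// Mimics how text content is escaped when the DOM is serialized back to HTML.
// The ampersand must be handled first so the other replacements are not re-escaped.
function escapeTextLikeInnerHTML(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/\u00A0/g, '&nbsp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// U+00F7 (division sign) and U+2190 (left arrow) pass through unchanged.
console.log(escapeTextLikeInnerHTML('\u00F7 & \u00A0 < > \u2190'));
```

That would explain why &#160; comes back as &nbsp; and why &divide; comes back as the character: the original spelling in the source is not kept, only the parsed character.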

Posted by: CharliePrince Jan 9 2021, 04:07 PM

Pandy, your example seems to work here ha.

Also, I was able to replace the "&divide;" using String.fromCharCode (see below)

CODE
currentString.replace(String.fromCharCode(247), '/');


These threads were actually very helpful for me. Thank you!
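
One thing to watch with the line above: replace() with a string as its first argument only changes the first match, while the original question used replaceAll(). A small sketch of the difference, using a made-up sample string:

```javascript
// 247 is the code of the division sign, as in the post above.
const divisionSign = String.fromCharCode(247);
const sample = 'A ' + divisionSign + ' B ' + divisionSign + ' C';

// A string pattern replaces the first occurrence only.
const firstOnly = sample.replace(divisionSign, '/');

// A regular expression with the g flag replaces every occurrence.
const everyOne = sample.replace(/\u00F7/g, '/');

console.log(firstOnly);
console.log(everyOne);
```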

Posted by: CharliePrince Jan 9 2021, 04:26 PM

I didn't notice your most recent posts here until after my last reply above.

Yes - it's odd to me why some of these are interpreted or rendered literally or as their encoded char or whatever. Crap! I don't know the words to describe but I think you and Christian know what I mean blush.gif

Sorry about the console.logs but . . Look at this (below) . . . if it's of any interest where currentString is broken down to each character

CODE
    console.log(currentString);
    for (let i = 0; i < currentString.length; i++) {
        console.log(currentString[i] + ' = ' + currentString.charCodeAt(i));
    }



Posted by: Christian J Jan 9 2021, 08:47 PM

QUOTE(CharliePrince @ Jan 9 2021, 10:26 PM) *

I don't know the words to describe but I think you and Christian know what I mean blush.gif

We sure do. I usually reserve this smilie for situations like this: wacko.gif

QUOTE
Sorry about the console.logs but . . Look at this (below) . . . if it's of any interest where currentString is broken down to each character

CODE
    console.log(currentString);
    for (let i = 0; i < currentString.length; i++) {
        console.log(currentString[i] + ' = ' + currentString.charCodeAt(i));
    }

I don't understand, will it not just return the charCode for each character in currentString? Except if you feed it

CODE
&amp;

of course, then it will return

CODE
&amp;
& = 38
a = 97
m = 109
p = 112
; = 59

wacko.gif
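
The same breakdown can be written as a small function, which makes the two cases easy to compare side by side. A sketch with a hypothetical helper name; the input strings are what innerHTML would be expected to return for the two paragraphs discussed above:

```javascript
// Return one "character = code" entry per character of the string.
function describeCharacters(text) {
  return Array.from(text, (ch) => ch + ' = ' + ch.codePointAt(0));
}

// Serialized ampersand: five separate characters.
console.log(describeCharacters('&amp;'));

// Parsed &divide;: one character with code 247.
console.log(describeCharacters('A \u00F7 B'));
```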


Posted by: pandy Jan 10 2021, 04:25 AM

cool.gif cool.gif cool.gif cool.gif cool.gif cool.gif cool.gif

HTML
<p id='target'>A &divide; B</p>

<button onclick="doIt()">Click!</button>



CODE
function doIt()
{
   var currentstring = document.getElementById('target').innerHTML;
   var replacestring = currentstring.replace('\367', '/');
   document.getElementById('target').innerHTML = replacestring;
}



If the entity is one of the not 'interpreted' ones, it doesn't work right off, because there JavaScript seems to see the HTML entity as a string of characters. For ampersand I had to do this, or only the literal ampersand would be replaced, not the whole entity.

I find this very confusing, because if it is a string, why can't it be replaced the normal way?

HTML
<p id='target'>A &amp; B</p>

<button onclick="doIt()">Click!</button>


CODE
function doIt()
{
   var currentstring = document.getElementById('target').innerHTML;
   var entity = '\46' + 'amp;';
   var replacestring = currentstring.replace(entity, 'and');
   document.getElementById('target').innerHTML = replacestring;
}


So it gets hairy anyway. Not for a unique case like with divide, but if you don't know what entities there may be. There could be more 'not interpreted' ones than the ones we've found. There must be some documentation for this!

Maybe one could check for & and ; and use different methods accordingly, but those characters could occur otherwise in the text too, so I don't really see how this could work unless a list of not interpreted entities is used...
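
One way around keeping a list of 'not interpreted' entities may be to skip innerHTML and use the element's textContent, which holds the parsed characters with no entities at all (so &amp; in the source comes back as a plain &). A minimal sketch; the helper name is made up, and a plain object stands in for the paragraph element so the snippet runs outside a browser:

```javascript
// Replace every occurrence of `search` in the element's text.
function replaceInElementText(element, search, replacement) {
  element.textContent = element.textContent.split(search).join(replacement);
  return element.textContent;
}

// Stand-in for document.getElementById('target') on <p>A &divide; B &amp; C</p>.
const fakeParagraph = { textContent: 'A \u00F7 B & C' };

replaceInElementText(fakeParagraph, '\u00F7', '/');
replaceInElementText(fakeParagraph, '&', 'and');

console.log(fakeParagraph.textContent);
```

In a page you would pass the real element instead of the stand-in. Assigning textContent replaces all child nodes with a single text node, so this only suits elements that contain plain text.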

Posted by: pandy Jan 10 2021, 05:04 AM

I demand to get a gold star after my name! cool.gif

Posted by: pandy Jan 10 2021, 05:11 AM

QUOTE(pandy @ Jan 10 2021, 10:25 AM) *

If the entity is one of the not 'interpreted' ones, it doesn't work right off, because there JavaScript seems to see the HTML entity as a string of characters. For ampersand I had to do this, or only the literal ampersand would be replaced, not the whole entity.

I find this very confusing, because if it is a string, why can't it be replaced the normal way?


But it can! Jesus, I'm stupid sometimes - and you too, for not seeing my mistake. tongue.gif

& needs to be escaped of course! Then it works without splitting the entity string.

CODE
function doIt()
{
   var currentstring = document.getElementById('target').innerHTML;
   var replacestring = currentstring.replace('\&amp;', 'and');
   document.getElementById('target').innerHTML = replacestring;
}
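
A note on the backslash: in a JavaScript string literal, a backslash in front of a character that has no special escape meaning is simply dropped, so '\&amp;' and '&amp;' are the same five characters. The sketch below checks that on a sample string; if the same holds in the browser used above, the earlier failure probably had some other cause, which the thread does not pin down.

```javascript
const withBackslash = '\&amp;';
const withoutBackslash = '&amp;';

// What innerHTML would be expected to return for <p id='target'>A &amp; B</p>.
const serialized = 'A &amp; B';

const viaEscaped = serialized.replace(withBackslash, 'and');
const viaPlain = serialized.replace(withoutBackslash, 'and');

console.log(withBackslash === withoutBackslash, viaEscaped, viaPlain);
```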


Posted by: pandy Jan 10 2021, 05:12 AM

Still want my gold star. angry.gif

Posted by: Christian J Jan 10 2021, 03:18 PM

QUOTE(pandy @ Jan 10 2021, 10:25 AM) *

cool.gif cool.gif cool.gif cool.gif cool.gif cool.gif cool.gif

HTML
<p id='target'>A &divide; B</p>

<button onclick="doIt()">Click!</button>



CODE
function doIt()
{
   var currentstring = document.getElementById('target').innerHTML;
   var replacestring = currentstring.replace('\367', '/');
   document.getElementById('target').innerHTML = replacestring;
}


Strange, this does not work:

CODE
<p id='target'>A &divide; B</p>
<button onclick="doIt()">Click!</button>

<script type="text/javascript">
function doIt()
{
   var currentstring = document.getElementById('target').innerHTML;
   var replacestring = currentstring.replace('÷', '/'); // Note that I don't use an entity here.
   document.getElementById('target').innerHTML = replacestring;
}
</script>

--why isn't the entity "interpreted" now? Is the replace() method different? unsure.gif



Posted by: Christian J Jan 10 2021, 03:19 PM

QUOTE(pandy @ Jan 10 2021, 11:12 AM) *

Still want my gold star. angry.gif


You can have as many as you like, pandy. biggrin.gif

🌟 🌟 🌟

Posted by: pandy Jan 10 2021, 03:30 PM

QUOTE(Christian J @ Jan 10 2021, 09:18 PM) *


Strange, this does not work:

CODE
<p id='target'>A &divide; B</p>
<button onclick="doIt()">Click!</button>

<script type="text/javascript">
function doIt()
{
   var currentstring = document.getElementById('target').innerHTML;
   var replacestring = currentstring.replace('÷', '/'); // Note that I don't use an entity here.
   document.getElementById('target').innerHTML = replacestring;
}
</script>



We already knew that.

QUOTE
--why isn't the entity "interpreted" now? Is the replace() method different? unsure.gif


And that too. It is interpreted, but it doesn't work to use the actual character for the replace.

Those two things are what was so darn odd!

I was googling html entities and javascript and stumbled on this page. Then I got the idea that maybe if one used the character code JS prefers... And it worked. cool.gif
https://brajeshwar.github.io/entities/



Posted by: Christian J Jan 10 2021, 04:16 PM

QUOTE(pandy @ Jan 10 2021, 09:30 PM) *

I was googling html entities and javascript and stumbled on this page. Then I got the idea that maybe if one used the character code JS prefers... And it worked. cool.gif
https://brajeshwar.github.io/entities/

Seems I already had that page bookmarked, we must have been discussing this before.

Posted by: pandy Jan 10 2021, 04:46 PM

Nope. I didn't know about this and I hadn't seen that page. Well, I knew that this kind of octal code could be used in JS, but I didn't know it needed to be used in replace strings.

I may add that I still don't understand why they need to be used... blush.gif

In a case like this the ampersand doesn't need to be escaped. What's the difference?

CODE
  var test = 'this & that';
  alert(test);


Are there other cases when these characters need escaping?
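
On why the octal code worked: as far as the language is concerned, '\367' is just another spelling of the character with code 247, and replace() treats it no differently from any other one-character string. What an escape does buy is independence from how the source file is encoded. Octal escapes are legacy syntax and are rejected in strict mode, so this sketch derives the value with parseInt instead of writing one; the variable names are illustrative.

```javascript
// Octal 367 is decimal 247, which is hex F7.
const fromOctalDigits = String.fromCharCode(parseInt('367', 8));
const fromDecimal = String.fromCharCode(247);
const fromHexEscape = '\xF7';
const fromUnicodeEscape = '\u00F7';

// All four are the same single character, the division sign.
const allSame = [fromDecimal, fromHexEscape, fromUnicodeEscape]
  .every((s) => s === fromOctalDigits);

console.log(allSame, 'A \u00F7 B'.replace(fromOctalDigits, '/'));
```

An ampersand never needs this treatment in a plain string such as 'this & that', because & has no special meaning inside a JavaScript string literal.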

Posted by: Christian J Jan 10 2021, 07:56 PM

CSS generated content? Maybe I saw that page when reading this: https://mathiasbynens.be/notes/css-escapes unsure.gif

Posted by: pandy Jan 10 2021, 08:10 PM

Haven't seen that page either. The only things I've read about CSS escape characters are in the spec and maybe some book, I think. unsure.gif
