To escape or not to escape an ampersand |
pandy |
Apr 8 2008, 12:25 PM
Post
#1
|
🌟Computer says no🌟 Group: WDG Moderators Posts: 20,730 Joined: 9-August 06 Member No.: 6 |
It's my understanding that ampersands (in the text) don't have to be escaped in HTML but should be escaped in XHTML. Now a situation arose when someone wrote "Q&A" in her HTML document. The verdict is the same from both validators: "Error: unknown entity A".
I understand what's happening, but I'd like to know if this is a glitch with the validators or if it says somewhere that an ampersand can't occur unescaped in a string unless it's followed by a space. |
pandy |
Apr 8 2008, 12:43 PM
Post
#2
|
🌟Computer says no🌟 Group: WDG Moderators Posts: 20,730 Joined: 9-August 06 Member No.: 6 |
Hmm, in the HTML 4.01 spec it actually says:
"Authors should use "&amp;" (ASCII decimal 38) instead of "&" to avoid confusion with the beginning of a character reference (entity reference open delimiter). Authors should also use "&amp;" in attribute values since character references are allowed within CDATA attribute values." http://www.w3.org/TR/html401/charset.html#h-5.3.2
So was I wrong? Should ampersands be escaped in HTML too? Always or only in the situation I describe above? |
Darin McGrew |
Apr 8 2008, 12:46 PM
Post
#3
|
WDG Member Group: Root Admin Posts: 8,365 Joined: 4-August 06 From: Mountain View, CA Member No.: 3 |
I couldn't find where it was documented, but the basic rule (in HTML) is that the ampersand needs to be escaped unless what follows is a character that is not allowed in character entity names. That's why an ampersand followed by a space doesn't need to be escaped: a space cannot be part of a character entity name. But I try to escape them all. You never know when someone will come along and change "A T & T" to "AT&T"...
IIRC, XHTML is more strict, but I don't recall its rules. |
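The rule of thumb above can be sketched in Python. This is only an approximation of the rule as described in this post, not the exact SGML grammar: it assumes an ampersand is ambiguous when the next character is an ASCII letter (possible entity name) or "#" (possible numeric reference). The function names are made up, and the input is assumed to be raw text that has not been escaped yet.

```python
import re

# Assumption: "&" is risky when followed by a letter or "#",
# because a parser may read it as the start of a character reference.
RISKY_AMP = re.compile(r"&(?=[A-Za-z#])")

def risky_ampersands(text: str) -> list[int]:
    """Return the offsets of ampersands that may be read as a reference start."""
    return [m.start() for m in RISKY_AMP.finditer(text)]

def escape_all(text: str) -> str:
    """The 'escape them all' policy: replace every & in raw text with &amp;."""
    return text.replace("&", "&amp;")

for sample in ("A T & T", "AT&T", "Q&A"):
    print(repr(sample), risky_ampersands(sample), escape_all(sample))
```

Under that assumption "A T & T" has no risky ampersand, while "AT&T" and "Q&A" each have one. Escaping every ampersand sidesteps the question.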
pandy |
Apr 8 2008, 01:10 PM
Post
#4
|
🌟Computer says no🌟 Group: WDG Moderators Posts: 20,730 Joined: 9-August 06 Member No.: 6 |
QUOTE I couldn't find where it was documented, but the basic rule (in HTML) is that the ampersand needs to be escaped unless what follows is a character that is not allowed in character entity names. That's why an ampersand followed by a space doesn't need to be escaped: a space cannot be part of a character entity name. But I try to escape them all. You never know when someone will come along and change "A T & T" to "AT&T"...
Thanks. If you happen to remember where it's documented, please let me know.
QUOTE IIRC, XHTML is more strict, but I don't recall its rules.
Totally strict. http://www.w3.org/TR/xhtml1/#C_12 |
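To see the XML side of this, here is a small sketch using Python's standard `xml.etree.ElementTree` parser as a stand-in for an XHTML parser. It checks well-formedness only, not validity against the XHTML DTD, and the helper name `is_well_formed` is made up.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML, False on a parse error."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

for fragment in ("<p>Q&A</p>", "<p>A T & T</p>", "<p>Q&amp;A</p>"):
    print(fragment, is_well_formed(fragment))
```

Unlike the HTML case, the ampersand followed by a space is rejected too. Only the escaped form parses, and the parser hands back the text "Q&A".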