Full Version: character set/encoding and its specification
Peter Evans
In a sandbox message, I posted some Japanese. I'll try it again:

いま日本語で書いております。

Chances are that you, dear reader, can't read Japanese. All the same, you're probably more or less aware of what Japanese looks like: a mixture of complex, angular characters (like Chinese, and indeed from Japanese), and simpler, cursive forms. I wonder if you see that mixture in the line above. If you don't, there could be any of several explanations, e.g. that your computer doesn't have a Japanese font installed. But if you do see it (as I do), that too is odd, as the source of this page reads:

<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />

and of course those characters are definitely not within ISO 8859-1.
Guest
In IE6 I just get blank space. Opera at least displays a line of "||||||"-like characters, while Firefox displays "????".
Christian J
That was me replying to the above, don't know why I became logged out. But it's happened before when I have both IE6 and Opera open at the same time.
John Pozadzides
Here is what I see using Avant Browser.
[Attachment]
pandy
While I see the Japanese fine, both in IE and Opera. Always do. Browsers set to use Western European (ISO). Never understood why it works this way.
jimlongo
it's Japanese in all my OS X browsers
[Image]

Noticed the BB software converts it to html entities. Which you can type in and have appear, but once you preview they get converted to screen fonts in my browser . . . then if you edit or re-preview the screen fonts get converted to ????????

Strange.

語語語語
pandy
You are so smart! Never thought of looking at the source.
Liam Quinn
QUOTE(jimlongo @ Aug 29 2006, 12:54 PM) *

Noticed the BB software converts it to html entities.


Your browser does that conversion when it submits non-ISO-8859-1 characters from an ISO-8859-1-encoded page.
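
A minimal sketch of that kind of conversion, written in Python rather than taken from any browser's code: characters that ISO-8859-1 cannot represent are replaced with decimal numeric character references, while characters that do fit are sent as ordinary bytes. The helper name `encode_for_latin1_form` is hypothetical, and Python's standard `xmlcharrefreplace` error handler is used only because it produces the same style of output.

```python
def encode_for_latin1_form(text: str) -> bytes:
    """Encode text as ISO-8859-1; unencodable characters become &#N; references.

    Illustrative only: this mimics the visible result of a browser submitting
    a form from an ISO-8859-1 page, not any browser's real implementation.
    """
    return text.encode("iso-8859-1", errors="xmlcharrefreplace")


# "é" exists in ISO-8859-1 and stays a single byte; the kanji does not
# and is replaced by a numeric character reference.
print(encode_for_latin1_form("café 語"))
```

This would explain why the page source contains only ASCII references even though the page declares ISO-8859-1.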

The problem is that the BB software isn't converting "&" to "&amp;" for numeric character references. If I type "&" followed by "#169;", I would expect to see those six characters displayed literally, but the BB software fails to escape the "&", so I see the copyright sign.

Since we need to be able to give examples of numeric character references here, I think we should fix the software to convert "&" to "&amp;" as it does for entities like &copy;.
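
As a rough illustration of the fix being asked for (not Invision's actual code, which isn't shown in this thread), the sketch below escapes every "&" before output so that a typed reference is displayed literally. The function name `escape_post` is hypothetical; `html.escape` is from the Python standard library.

```python
import html


def escape_post(raw: str) -> str:
    """Escape &, < and > so the post is displayed exactly as it was typed."""
    # quote=False leaves quotation marks alone; they are harmless in element content.
    return html.escape(raw, quote=False)


print(escape_post("&#169;"))  # the "&" is escaped, so a browser shows the six characters
print(escape_post("&copy;"))
print(escape_post("<b>"))
```

One trade-off of escaping every "&" this way: references generated by the browser itself for non-ISO-8859-1 characters would presumably also be shown literally rather than as the characters they stand for.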
John Pozadzides
QUOTE(Liam Quinn @ Aug 30 2006, 01:07 AM) *

Your browser does that conversion when it submits non-ISO-8859-1 characters from an ISO-8859-1-encoded page.

O! That explains it. So John and Christian simply don't have a Japanese font?
John Pozadzides
QUOTE(John Pozadzides @ Aug 29 2006, 06:39 PM) *

QUOTE(Liam Quinn @ Aug 30 2006, 01:07 AM) *

Your browser does that conversion when it submits non-ISO-8859-1 characters from an ISO-8859-1-encoded page.

O! That explains it. So John and Christian simply don't have a Japanese font?


Uh. Something is wrong with this picture... I didn't write that last comment.

pandy did you accidentally edit mine instead of replying? I had said something to the effect that I was having to check on this because it was a tough issue...
John Pozadzides
QUOTE(Liam Quinn @ Aug 29 2006, 06:07 PM) *

Since we need to be able to give examples of numeric character references here, I think we should fix the software to convert "&" to "&" as it does for entities like ©.

I just got an answer back from Invision on this. They say that this WILL work if we enable HTML posting on the board. Currently it is disabled because of all the mischief that people could cause with it.

I'm not entirely following along with exactly what the problem is though. Can someone break it down for me very explicitly? Perhaps even give me a little sample of how it might be used/broken in a real thread?

John
pandy
QUOTE(John Pozadzides @ Aug 30 2006, 05:46 AM) *

Uh. Something is wrong with this picture... I didn't write that last comment.

pandy did you accidentally edit mine instead of replying? I had said something to the effect that I was having to check on this because it was a tough issue...


I guess I must have... I'm the one who said it anyway.
Peter Evans
I should have looked at the source, sorry. (I plead exhaustion: a losing battle with a combination of installation of an unfamiliar distro, and gradual conviction that my hardware's on the blink. A new motherboard awaits installation.)

What's more alarming is that people can edit others' messages. As long as this superpower is limited to moderators, it's fine. I hasten to reassure y'all that I don't see edit buttons on your messages. (Or indeed on my own, which is a bit irritating as I'd like to correct my sleepy "like Chinese, and indeed from Japanese" to "like Chinese, and indeed from Chinese".)
jimlongo
QUOTE(Peter Evans @ Aug 30 2006, 12:25 AM) *

What's more alarming is that people can edit others' messages. As long as this superpower is limited to moderators, it's fine. I hasten to reassure y'all that I don't see edit buttons on your messages. (Or indeed on my own, which is a bit irritating as I'd like to correct my sleepy "like Chinese, and indeed from Japanese" to "like Chinese, and indeed from Chinese".)


Once you start editing BBcode QUOTES it's pretty easy to improperly nest quotes and have them attributed to others. Just like any BB.
jimlongo
QUOTE(John Pozadzides @ Aug 29 2006, 11:48 PM) *

I'm not entirely following along with exactly what the problem is though. Can someone break it down for me very explicitly? Perhaps even give me a little sample of how it might be used/broken in a real thread?


One problem you will find is that if you were to type an html entity -> then Preview, then look what happens to what you typed in the Reply Box, then Preview again. Eventually it will become undecipherable to the software. The same thing will happen with Quotes . . . for instance look at the quote in your reply to Liam it's quite different than what he wrote.*


I seem to remember a similar problem on my board with PHP and ampersands
. . . is it anything in there?
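
One way the preview behaviour described above could arise, sketched in Python under stated assumptions: the first pass decodes character references into real characters, and a later pass forces the text back into ISO-8859-1, replacing whatever doesn't fit with "?". Both function names are hypothetical and this is only a guess at the mechanism, not a description of what the board software really does.

```python
import html


def decode_references(text: str) -> str:
    """Turn character references such as &#35486; into the characters they name."""
    return html.unescape(text)


def force_into_latin1(text: str) -> str:
    """Round-trip through ISO-8859-1, replacing anything unencodable with '?'."""
    return text.encode("iso-8859-1", errors="replace").decode("iso-8859-1")


typed = "&#35486;&#35486;"
first_preview = decode_references(typed)            # references become real characters
second_preview = force_into_latin1(first_preview)   # those characters are then lost
print(second_preview)
```

Once the characters have been replaced with "?", nothing in the text records what they originally were, so the loss can't be undone by another preview.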
John Pozadzides
QUOTE(Peter Evans @ Aug 29 2006, 11:25 PM) *

I should have looked at the source, sorry. (I plead exhaustion: a losing battle with a combination of installation of an unfamiliar distro, and gradual conviction that my hardware's on the blink. A new motherboard awaits installation.)

What's more alarming is that people can edit others' messages. As long as this superpower is limited to moderators, it's fine. I hasten to reassure y'all that I don't see edit buttons on your messages. (Or indeed on my own, which is a bit irritating as I'd like to correct my sleepy "like Chinese, and indeed from Japanese" to "like Chinese, and indeed from Chinese".)

Peter,

Only moderators can edit everyone's messages. You however can edit your own for up to 1 hour after posting. We set it up that way so that revisionism wouldn't take place. If you feel that time needs to be a little longer we can change it, but we'd need to have a little discussion on the matter.

John
John Pozadzides
QUOTE(jimlongo @ Aug 30 2006, 08:48 AM) *

One problem you will find is that if you were to type an html entity -> then Preview, then look what happens to what you typed in the Reply Box, then Preview again. Eventually it will become undecipherable to the software. The same thing will happen with Quotes . . . for instance look at the quote in your reply to Liam it's quite different than what he wrote.*

Ok. I do see that. But the flip side is that Liam was actually able to put it in there originally, which is the most important part if we are trying to give someone a little instruction... isn't it?

John
Liam Quinn
QUOTE(John Pozadzides @ Aug 29 2006, 11:48 PM) *

I just got an answer back from Invision on this. They say that this WILL work if we enable HTML posting on the board. Currently it is disabled because of all the mischief that people could cause with it.

I'm not entirely following along with exactly what the problem is though. Can someone break it down for me very explicitly? Perhaps even give me a little sample of how it might be used/broken in a real thread?


If I type the characters & # 1 6 9 ; without intervening spaces, I should not see the copyright sign. I should see the characters that I typed, like on the old BBS, which always escapes "&" as "&amp;".

Invision appears to be inconsistent regarding when it escapes "&". I can type & c o p y ; and have it show literally as &copy;, but even that gets mangled with quoting. That clearly indicates an Invision bug, whereas the & # 1 6 9 ; issue could be argued to be a feature (just not on an HTML discussion board).

Enabling HTML posting isn't the right answer. It would make discussing HTML quite difficult if we had to escape all tags ourselves.
Peter Evans
QUOTE(John Pozadzides @ Aug 30 2006, 11:28 PM) *

QUOTE(Peter Evans @ Aug 29 2006, 11:25 PM) *

. . . I hasten to reassure y'all that I don't see edit buttons on your messages. (Or indeed on my own, which is a bit irritating as I'd like to correct my sleepy "like Chinese, and indeed from Japanese" to "like Chinese, and indeed from Chinese".)


Only moderators can edit everyone's messages. You however can edit your own for up to 1 hour after posting. We set it up that way so that revisionism wouldn't take place. If you feel that time needs to be a little longer we can change it, but we'd need to have a little discussion on the matter. . . .


Point taken. One hour seems reasonable.