The Web Design Group

... Making the Web accessible to all.

> Garbled form script output after changing charset to UTF-8
Christian J
post Sep 3 2018, 03:42 PM
Post #1

This old guestbook script converts Swedish åäö characters to the HTML entities &aring;, &auml; and &ouml;. But when I changed the charset from ISO 8859-1 to UTF-8 in the guestbook's HTML files, åäö characters in new guestbook entries were garbled by the Perl script (old entries in the guestbook still displayed correctly). I eventually gave up and deleted the whole guestbook, but I'm still curious about what might have caused the bug. Could form data posted from a UTF-8 web page be to blame?
pandy
post Sep 3 2018, 08:20 PM
Post #2

Yeah, I think so. I don't know the ins and outs of it, but Perl at least used to have a problem with Unicode. Maybe if you had made the script also convert Unicode åäö to entities?
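
One way to read that suggestion, as a rough sketch in Python rather than the guestbook's Perl: decode the submitted bytes as UTF-8 first, then replace each non-ASCII character with an entity. The function name `to_entities` is hypothetical, and the real script's conversion logic is not shown in this thread.

```python
from html.entities import codepoint2name


def to_entities(raw: bytes, charset: str = "utf-8") -> str:
    """Decode submitted bytes, then turn non-ASCII characters into entities.

    Hypothetical helper for illustration; it does not escape &, < or >.
    """
    text = raw.decode(charset)
    parts = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            # Plain ASCII passes through unchanged.
            parts.append(ch)
        elif code in codepoint2name:
            # Named entity where HTML defines one, e.g. 229 -> &aring;
            parts.append("&" + codepoint2name[code] + ";")
        else:
            # Fall back to a numeric character reference.
            parts.append("&#" + str(code) + ";")
    return "".join(parts)


# Bytes as a UTF-8 page would submit them for "Hej åäö".
sample = b"Hej \xc3\xa5\xc3\xa4\xc3\xb6"
result = to_entities(sample)
```

The important step is the decode: without it, each character arrives as two separate bytes and a lookup keyed on single ISO 8859-1 bytes no longer matches.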
Christian J
post Sep 4 2018, 04:45 AM
Post #3

QUOTE(pandy @ Sep 4 2018, 03:20 AM) *

Yeah, I think so. I don't know the ins and outs of it, but Perl at least used to have a problem with Unicode.

IIRC, the åäö characters looked like this: Ã¥Ã¤Ã¶, which I think usually happens when a document saved as UTF-8 is still served with the iso-8859-1 charset. I tried replacing every occurrence of the iso-8859-1 META charset tags in the Perl script with UTF-8, and even tried saving the Perl script itself as UTF-8, to no avail.
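
The garbling described here matches what happens when UTF-8 bytes are read as ISO 8859-1. A minimal sketch in Python (not the guestbook's Perl; the variable names are made up for illustration) that reproduces the effect:

```python
# Illustration only: the three Swedish letters, written with escapes.
text = "\u00e5\u00e4\u00f6"  # åäö

# A UTF-8 page submits two bytes for each of these characters.
utf8_bytes = text.encode("utf-8")

# A script that assumes ISO 8859-1 treats every byte as one character,
# so six characters come out instead of three: Ã¥Ã¤Ã¶
misread = utf8_bytes.decode("iso-8859-1")

# Decoding with the charset that was actually used recovers the text.
recovered = utf8_bytes.decode("utf-8")
```

Changing the declared charset of the output does not undo this, because the bytes have already been misread on input.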

QUOTE
Maybe if you had made the script also convert Unicode åäö to entities?

Alas I don't know Perl or Unicode well enough.

I was thinking of making the form on the UTF-8 page submit its form data as iso-8859-1, so that the Perl script could then handle it. Could a form's ACCEPT-CHARSET attribute be used for that?

"The ACCEPT-CHARSET attribute specifies a list of character encodings that are accepted by the form handler. The value consists of a list of "charsets" separated by commas and/or spaces. The default value is UNKNOWN and is usually considered to be the character encoding used to transmit the document containing the FORM."
http://www.htmlhelp.com/reference/html40/forms/form.html
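
To see why the submission charset matters, here is a small Python sketch of URL-encoded form data. It assumes the browser honors the form's ACCEPT-CHARSET value when encoding the submission; the helper names `submitted_value` and `script_reads` are hypothetical.

```python
from urllib.parse import quote_plus, unquote_plus


def submitted_value(text: str, form_charset: str) -> str:
    """Percent-encode a field value the way a form using form_charset would."""
    return quote_plus(text, encoding=form_charset)


def script_reads(encoded: str, script_charset: str) -> str:
    """Decode the field value the way a script assuming script_charset would."""
    return unquote_plus(encoded, encoding=script_charset)


text = "\u00e5\u00e4\u00f6"  # åäö

as_latin1 = submitted_value(text, "iso-8859-1")  # one byte per character
as_utf8 = submitted_value(text, "utf-8")         # two bytes per character

# A script that expects ISO 8859-1 only gets the right text in the first case.
ok = script_reads(as_latin1, "iso-8859-1")
garbled = script_reads(as_utf8, "iso-8859-1")
```

If the form really does submit ISO 8859-1, an unchanged script would see the same bytes it saw before the site was switched to UTF-8.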


pandy
post Sep 4 2018, 09:34 AM
Post #4

QUOTE(Christian J @ Sep 4 2018, 11:45 AM) *

QUOTE(pandy @ Sep 4 2018, 03:20 AM) *

Yeah, I think so. I don't know the ins and outs of it, but Perl at least used to have a problem with Unicode.

IIRC, the åäö characters looked like this: Ã¥Ã¤Ã¶, which I think usually happens when a document saved as UTF-8 is still served with the iso-8859-1 charset. I tried replacing every occurrence of the iso-8859-1 META charset tags in the Perl script with UTF-8,

Meta tags in the script? How would that work?

QUOTE
and even tried saving the perl script itself as UTF-8 to no avail.

There's more to it. Have you read this?
https://perldoc.perl.org/perlunicode.html
I haven't more than glanced at it. But I think you may find something there.

QUOTE
QUOTE
Maybe if you had made the script also convert Unicode åäö to entities?

Alas I don't know Perl or Unicode well enough.


Neither do I.

QUOTE

I was thinking of making the form on the UTF-8 page submit its form data as iso-8859-1, so that the Perl script could then handle it. Could a form's ACCEPT-CHARSET attribute be used for that?

But why is it important that the page is UTF-8? Can't you just go back to what you had?
Christian J
post Sep 4 2018, 11:55 AM
Post #5

QUOTE(pandy @ Sep 4 2018, 04:34 PM) *

Meta tags in the script? How would that work?

Sorry, it was headers, not META tags. Like this one:

CODE
print "Content-Type: text/html; charset=iso-8859-1\n\n";

It's done for each confirmation/error page.
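
For comparison, a Python sketch of the same kind of response with the declared charset and the actual output bytes kept in step. The function name `build_response` is hypothetical; the point is only that the header describes the output, so changing it alone does not change how the script reads the submitted input.

```python
def build_response(body_text: str, charset: str = "utf-8") -> bytes:
    """Build a CGI-style response whose bytes match the declared charset."""
    # Same header shape as the Perl print statement, with the charset filled in.
    header = "Content-Type: text/html; charset=" + charset + "\n\n"
    # Encode the body with the same charset the header announces.
    return header.encode("ascii") + body_text.encode(charset)


page = build_response("\u00e5\u00e4\u00f6")  # åäö as UTF-8 output
legacy = build_response("\u00e5\u00e4\u00f6", charset="iso-8859-1")
```

If the header says UTF-8 but the body bytes are still ISO 8859-1 (or the other way round), the browser shows garbled or replacement characters.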

QUOTE
There's more to it. Have you read this?
https://perldoc.perl.org/perlunicode.html
I haven't more than glanced at it. But I think you may find something there.

[blink emoticon]

QUOTE
But why is it important that the page is UTF-8? Can't you just go back to what you had?

I could have made an ISO 8859-1 exception for the guestbook form page, but the site uses include files such as nav menus that need the same encoding on all pages, so I'd have to change the whole site back to ISO 8859-1, which felt like even more work.
pandy
post Sep 4 2018, 04:27 PM
Post #6

Ack. Then you have to read the Perl doc page I linked to. :D