Help with UTF8 encoding
Posted by: Yogi
Date: December 19, 2008 11:52AM

When opening locally saved pages, K-M doesn't display German characters (umlauts like ä, ü, ö).
It does if I set the encoding to UTF8, but I can't find how to make the setting permanent.
After closing K-M my setting is lost. I also tried setting network.standard-url.encode-utf8 to true in the browser config, but that doesn't help either.

PS
It doesn't occur on all sites.
Here it does (if I save the page and open it locally):
http://www.heise.de/tp/blogs/2/120670

While here it doesn't:
http://www.heise.de/tp/r4/artikel/29/29404/1.html

Opened with Opera, both pages are displayed correctly.



Edited 4 time(s). Last edit at 12/19/2008 12:38PM by Yogi.

Re: Help with UTF8 encoding
Posted by: guenter
Date: December 19, 2008 12:48PM

I opened them in K-Meleon and everything is displayed with umlauts.

One caveat: I updated the exe & kplugins with 1.5.2 RC and the GRE with files from Fred's update to GRE 1.8.1.19 from 20081204. Also, only NT/XP/Vista-based systems can use UTF8.



Edited 1 time(s). Last edit at 12/19/2008 12:50PM by guenter.

Re: Help with UTF8 encoding
Posted by: Yogi
Date: December 19, 2008 02:10PM

Thanks, Günter, for testing.

I installed K-Meleon 1.5.2RC2 in a new folder.
I get the same results I had with 1.5.1.
Viewing the sites online is no problem. When they are saved and opened from HD, I still have the problem described above.
No matter which version I try, I can view the locally opened page correctly only after manually selecting UTF8, which reloads the page.
Since nobody except me has complained so far, it must be something on my system.
I have no clue what it could be. However, Opera opens all pages saved by K-M and displays them correctly, while K-M itself doesn't.
BTW, my OS is W2K/SP4.

Re: Help with UTF8 encoding
Posted by: misterp
Date: December 19, 2008 03:04PM

I see the same results as Yogi.

Neither page sets a charset. I don't know enough about HTML to know what is *supposed* to be assumed when no charset is specified.

The first page just has its characters in UTF-8.

The second page uses character entities: &uuml; for example.

This is why there is a difference.
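
To illustrate the difference, here is a minimal Python sketch (not from either page; the helper name decode_as is made up for illustration). It shows why raw UTF-8 bytes break when the browser assumes Western (ISO-8859-1), while an entity such as &uuml; survives under either charset.

```python
import html

# The same character stored two ways in a saved page.
raw_utf8 = "\u00fc".encode("utf-8")  # u-umlaut as UTF-8: bytes 0xC3 0xBC (first page)
entity = b"&uuml;"                   # pure ASCII entity (second page)


def decode_as(data: bytes, charset: str) -> str:
    """Decode bytes the way a browser would once it has settled on a charset."""
    return data.decode(charset)


# Wrong guess: each UTF-8 byte becomes its own Latin-1 character.
mojibake = decode_as(raw_utf8, "iso-8859-1")
# Right guess: the two bytes are read as one character.
correct = decode_as(raw_utf8, "utf-8")
# The entity is plain ASCII, so either charset yields the same text,
# which the HTML parser then resolves to the umlaut.
entity_text = html.unescape(decode_as(entity, "iso-8859-1"))

# ascii() keeps the output printable on any console encoding.
print(ascii(mojibake), ascii(correct), ascii(entity_text))
```

The first value comes out as two characters instead of one, which is the garbling seen in the locally opened page.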

Re: Help with UTF8 encoding
Posted by: Yogi
Date: December 19, 2008 05:29PM

Thanks, misterp, for looking into it.

At least it's a minor consolation that I'm not the only one ;-)

I might be wrong, but as far as I can see, when a locally stored page with no charset specified is opened, Western (ISO-8859-1) gets applied.
This might be the cause of my problem. However, I wonder how it works for other people.

Re: Help with UTF8 encoding
Posted by: misterp
Date: December 19, 2008 06:07PM

You need to set 'intl.charset.default' equal to 'UTF-8' to get what you want.

The first page sets the charset to UTF-8 in the server headers (View > cache information). The second page just uses ISO-8859-1.

The first page is badly constructed in the sense that the page itself does not contain the necessary information to display the page correctly.

Opera must be smart enough to incorporate the server information into the saved file. :-)

Re: Help with UTF8 encoding
Posted by: guenter
Date: December 19, 2008 06:50PM

It can be set that way. But maybe some Windows setting must be altered; I read something at Yahoo Clever that I only half remember.

But not in my case - I made a mistake during testing.
And sorry for my fault - stupid me first downloaded the wrong page to test :-(

I retested with K-Meleon, SeaMonkey, Firefox and Lunascape5's Gecko engine.
All tests confirm that the local page does NOT show UTF8 in autodetect mode.

Reason:

Offline: neither <?xml version="1.0" encoding="UTF-8"?> at the top of the code nor any alternative declaration inside <head> is present. So it does not show UTF8.

Online: the Apache server reports that the page is delivered with: Content-Type: text/html; charset=UTF-8 so it shows UTF8 without fault (info copied & pasted from the LiveHTTPHeader extension for K-Meleon).
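
As a rough illustration of what the browser reads from that header, here is a small Python sketch using only the standard library. The function name charset_from_content_type is made up for this example; it is not how Gecko does it internally.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(header_value: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type header, or None if absent."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lower-cases the value and returns None when missing.
    return msg.get_content_charset()


# Online: the server names the charset, so nothing has to be guessed.
print(charset_from_content_type("text/html; charset=UTF-8"))
# Offline: a saved file carries no header, so this information is simply gone.
print(charset_from_content_type("text/html"))
```

Once the page is saved to disk, only a declaration inside the file itself can take over this role.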

Sometimes servers give wrong info about the content type. In that case the wrong encoding can be shown online as well.

Opera seems to be able to sniff better. :-(

For local use, copy & paste: <?xml version="1.0" encoding="UTF-8"?> to the top of the document. More info about alternatives is at selfhtml (a German HTML documentation site; parts are available in more languages, e.g. in English).

But maybe there is a better way?



Edited 1 time(s). Last edit at 12/19/2008 06:52PM by guenter.

Re: Help with UTF8 encoding
Posted by: desga2
Date: December 19, 2008 07:15PM

When I load these web pages online, both are shown fine with umlauts (¨) in K-Meleon.

It is possible that, as guenter posted, when you save the page to the local disk it gets your OS default charset ISO-8859-1 instead of Unicode UTF-8, because the saved page doesn't specify a charset.

It is possible that Opera, when saving a web page, adds a line at the beginning of the code to specify the charset that was used when the page was loaded online (while browsing, before saving it). As guenter said, add this line between <head> and </head>:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
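
If many saved pages need this line, the edit could be scripted. Below is a minimal Python sketch under the assumption that the saved file really is UTF-8 and has an ordinary <head> tag; the function name add_charset_meta is made up, and a regex is only a rough stand-in for a real HTML parser.

```python
import re

META = '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />'

# Matches <head> or <head ...>, but not <header>.
HEAD_TAG = re.compile(r"<head(\s[^>]*)?>", flags=re.IGNORECASE)
# Any existing <meta ... charset ...> declaration.
HAS_CHARSET = re.compile(r"<meta[^>]+charset", flags=re.IGNORECASE)


def add_charset_meta(page: str) -> str:
    """Insert the charset line right after the opening <head> tag, once."""
    if HAS_CHARSET.search(page):
        return page  # already declared, leave the page alone
    return HEAD_TAG.sub(lambda m: m.group(0) + META, page, count=1)


saved = "<html><head><!-- Copyright --><title>Test</title></head><body></body></html>"
print(add_charset_meta(saved))
```

To apply it to a file, read and write the file with encoding="utf-8" so the bytes on disk stay unchanged apart from the added line.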

I have these preferences by default:

network.standard-url.encode-utf8 = false
network.standard-url.escape-utf8 = true
prefs.converted-to-utf8 = true

K-Meleon in Spanish



Edited 2 time(s). Last edit at 12/19/2008 07:58PM by desga2.

Re: Help with UTF8 encoding
Posted by: JohnHell
Date: December 19, 2008 07:58PM

Every time you come here with character problems I recommend the same thing. Install this font and forget about configuring anything anywhere:

http://www.megaupload.com/?d=W277L9TX

Re: Help with UTF8 encoding
Posted by: Yogi
Date: December 19, 2008 10:01PM

Quote
misterp
Opera must be smart enough to incorporate the server information into the saved file. :-)

That's not the case. Opera opens and correctly displays the file saved by K-M without connecting to the server.

Quote
disrupted
that's because my default fonts for kmeleon all support unicode internally.
open preferences> page display> fonts
for serif select: new times roman
for sans serif: select arial

I had the above settings by default and didn't change them.

Quote
desga2
It is possible that Opera, when saving a web page, adds a line at the beginning of the code to specify the charset that was used when the page was loaded online (while browsing, before saving it).

Opera opens and correctly displays the file saved by K-M, offline as well.
I checked the source; no code is added by Opera (<html><head><!-- Copyright (c) Heise Zeitschriften Verlag...). Therefore K-M can't correctly display the page saved by Opera either.

As expected, after adding Günter's or your line to the source, K-M displays the page fine.
As mentioned before, another way to get the page displayed correctly is to open it, then go to View > Encoding and check Unicode (UTF8). After a short reload the page is displayed correctly. The downside: you have to do this every time you open the page.

Quote
JohnHell
Every time you come here with character problems I recommend the same thing. Install this font and forget about configuring anything anywhere:

http://www.megaupload.com/?d=W277L9TX

No difference with Cyberbit.ttf installed.


Thanks all of you for your readiness to help!
Best regards,
Yogi



Edited 1 time(s). Last edit at 12/19/2008 10:03PM by Yogi.

Re: Help with UTF8 encoding
Posted by: misterp
Date: December 19, 2008 10:38PM

Quote
Yogi
As expected, after adding Günter's or your line to the source, K-M displays the page fine.
As mentioned before, another way to get the page displayed correctly is to open it, then go to View > Encoding and check Unicode (UTF8). After a short reload the page is displayed correctly. The downside: you have to do this every time you open the page.

Or (as mentioned above), set intl.charset.default = UTF-8.

Thanks for the information about Opera not adding charset info. So Opera is simply sniffing the encoding of every file that has no charset specified.
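
What such sniffing might look like, very roughly: the Python sketch below tries strict UTF-8 first and falls back to a Western default. This is only an illustration of the idea, not Opera's actual algorithm, and the name sniff_decode is made up.

```python
from typing import Tuple


def sniff_decode(data: bytes, fallback: str = "iso-8859-1") -> Tuple[str, str]:
    """Guess the charset of undeclared bytes; return (text, charset used)."""
    try:
        # Strict UTF-8 decoding fails on most Latin-1 text with umlauts,
        # so a successful decode is a strong hint that the file is UTF-8.
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return data.decode(fallback), fallback


word = "Gr\u00fc\u00dfe"  # German greeting containing u-umlaut and sharp s
for raw in (word.encode("utf-8"), word.encode("iso-8859-1")):
    text, used = sniff_decode(raw)
    print(used, text == word)
```

Both byte sequences decode back to the same word, each with a different detected charset.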

Re: Help with UTF8 encoding
Posted by: JohnHell
Date: December 19, 2008 10:46PM

Quote
Yogi
Quote
JohnHell
Every time you come here with character problems I recommend the same thing. Install this font and forget about configuring anything anywhere:

http://www.megaupload.com/?d=W277L9TX

No difference with Cyberbit.ttf installed.


Thanks all of you for your readiness to help!
Best regards,
Yogi

OK, I've seen the problem. The file is saved to the desktop as UTF-8, so the trick is to open it with Notepad and re-save it as ANSI. With Cyberbit installed, that got it to show correctly.

Maybe we should request a feature for K-Meleon that lets us choose the encoding of the file when saving (as we can when we save a file with Notepad).

I mean, the document (HTML) encoding and the file encoding are not the same thing. I think that if we always saved as ANSI, or were able to select how to save, that would fix this kind of problem ;-)

EDIT: it works when saving as Unicode too. The problem comes when the file is saved as UTF-8, as K-Meleon does.
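
For reference, a small Python sketch of what the two Notepad re-saves do to the bytes. It assumes that on a Western Windows system Notepad's "ANSI" means Windows-1252 and "Unicode" means UTF-16 little-endian with a byte order mark; the function name resave is made up.

```python
import codecs


def resave(data: bytes, target: str) -> bytes:
    """Re-encode UTF-8 bytes the way the assumed Notepad options would."""
    text = data.decode("utf-8")
    if target == "ansi":
        # One byte per umlaut, matching a Western default charset.
        return text.encode("cp1252")
    if target == "unicode":
        # The leading byte order mark identifies the encoding by itself.
        return codecs.BOM_UTF16_LE + text.encode("utf-16-le")
    raise ValueError(f"unknown target: {target}")


original = "\u00fc".encode("utf-8")  # u-umlaut as saved by K-Meleon
print(original.hex(), resave(original, "ansi").hex(), resave(original, "unicode").hex())
```

Under those assumptions, the ANSI file happens to match the Western default and the Unicode file announces its own encoding, while plain UTF-8 without any declaration does neither.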



Edited 1 time(s). Last edit at 12/19/2008 10:49PM by JohnHell.

Re: Help with UTF8 encoding
Posted by: Yogi
Date: December 19, 2008 11:49PM

It seems that I've found the solution!

> about:config <

intl.charset.default    user set    string    UTF-8

So far no negative side effects.
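
Gecko-based browsers generally also read preferences from a user.js file in the profile folder; whether this K-Meleon build does so is an assumption here, not something I have tested. Under that assumption, a minimal Python sketch for producing the matching line (the helper names pref_line and with_pref are made up):

```python
def pref_line(name: str, value: str) -> str:
    """Format one string preference in user.js syntax."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'user_pref("{name}", "{escaped}");'


def with_pref(user_js_text: str, name: str, value: str) -> str:
    """Return the user.js text with the preference line appended if missing."""
    line = pref_line(name, value)
    if line in user_js_text.splitlines():
        return user_js_text  # already present, nothing to do
    if user_js_text and not user_js_text.endswith("\n"):
        user_js_text += "\n"
    return user_js_text + line + "\n"


print(with_pref("", "intl.charset.default", "UTF-8"), end="")
```

Setting the value through about:config, as above, achieves the same thing without touching any file.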
