On Sun, 18 May 1997, Edward Dyer wrote:

> Hello Norman!
>
> Looks fine to me!
>
> But you are right, in a way, it doesn't look right to you :-(
> Here's the long explanation:
>
> The problem is this: There are two phases in retrieving a web document
> (i.e. anything that goes through http:)
>
> For a web document, the &#NNN; is supposed to be interpreted according to
> either ISO-8859-1 (ISO Latin-1) or in the proposed revision to use UniCode
> (which includes ISO 8859-1 as a subset in the first 256 characters) by the
> browser (Lynx if you dial in to CCN, your choice if you go by other
> routes.)
>
> But Lynx uses the setting "Display character set" to decide which 8-bit
> code to send to you to display the e-acute character. In order for that to
> work two things are required: your communications software must be able to
                                ============================================
> map that code to the appropriate glyph on your screen, which actually
  =====================================================================
> means the comm program must map the received code to a display font, and
  =========================================================================
> the display font must include the correct glyph at the correct code.
  ====================================================================

That doesn't apply to the samples I sent you as I used the "p" (Print)
command to print to my Chebucto directory and then snipped away the
irrelevant text. Although I may have seen a *copy* of it on my screen,
the actual text I sent you never passed through my communications
software. It went straight from the HTML file, through lynx (via
"Print") to a local text file, got snipped and sent to you.

> On the other hand, I hypothesise that if Lynx is not viewing web pages,
> i.e. using file: access, the first stage of the interpretation does not
> occur, in the same way: Lynx displays the file according to your default
> character set, which is by default USASCII, and the character that is sent
> to you has a different 8-bit value. This is probably a Lynx BUG, the way
> we use it, but one might get some discussion on that. Technically, it is
> probably an undefined behaviour.
>
> THE WORKAROUND: set the default character set to ISO 8859-1, and select
> fonts that will work for you.

I'll try that but am not too hopeful. In fact, I'll try a number of
settings to see if characters are shuffled around even more with others.
When I am on CCN, I have an ANSI font loaded into the VGA adapter. No
translation is selected at all -- presumably.

Note that at the library they have a 7-bit text terminal. High ANSI
characters are folded to the low characters:

 !"#$%&'()*+,-./01...89:;<=>?@AB...YZ[\]^_`ab...yz{|}~

With a local link, the folded "alphabetic" characters come out in the
order:

ABCDEFGHIJKLMNOPQRSTUVWXYZ ... abcdefghijklmnopqrstuvwxyz

when you view my htmlchars.html file. With an "http://www.chebucto.etc."
link, you see something like the sequence:
(I may have the wrong characters swapped but you get the idea)

ABCEDFGIHJKMLNOPQSRTUVWYXZ ... abcedfgihjklmnpoqsrtuvwyxz

> THE DOWNSIDE: people receiving mail from you may see a note that you are
> using a different character set, unless they too are using 8859-1.
>
> THE FIX: make Lynx interpret entities in local HTML files using ISO
> 8859-1/Unicode, since we use local references as a shortcut to what are
> effectively web documents. Does this have any downside? need to
> distinguish between HTML and other document types and views, especially
> binary, and source, but I think only HTML and equivalent (htm, html.fr)
> are interpreted anyway. (Aside: some browsers now support up to 5 digit
> numbers in &#NNNNN; to do Unicode - does Lynx?)
>
> Discussion on the merits to CSuite-Dev@chebucto.ns.ca, please.
>
> Ed Dyer aa146@chebucto.ns.ca (902) H 826-7496 CCN Assistant Postmaster
> http://www.chebucto.ns.ca/~aa146/ W 426-4894 CSuite Technical Workshop
> Religion Page Editor, Chebucto Community Network http://www.chebucto.ns.ca
>
> On Sat, 17 May 1997 af380@chebucto.ns.ca wrote:
>
> > Hello.
> >
> > Has anyone found the cause of the annoying bug that displays some accented
> > characters differently depending on the URL used?
> >
> > I can never be sure when I quote a web page with accented characters on it
> > if I am getting them correctly.

[lexographic defoliation]

Norman De Forest
af380@chebucto.ns.ca
http://www.chebucto.ns.ca/~af380/Profile.html
(A Speech Friendly Site)

.........................................................................
Q. Which is the greater problem in the world today, ignorance or apathy?
A. I don't know and I couldn't care less.
.........................................................................
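
For anyone following the thread, here is a rough Python sketch of the two stages Ed describes, plus the 7-bit folding seen on the library terminal. It is not how Lynx does it internally; the function names are made up for illustration, and CP437 is only an assumed stand-in for "some display character set other than ISO 8859-1". The point is just that the same &#NNN; reference can reach the screen as different 8-bit codes, and therefore fold to different low characters.

```python
import re


def decode_numeric_entities(html: str) -> str:
    """Stage 1 (hypothetical helper): turn decimal &#NNN; references into
    characters. Code points 0-255 coincide with ISO 8859-1."""
    return re.sub(r"&#(\d+);", lambda m: chr(int(m.group(1))), html)


def to_display_bytes(text: str, display_charset: str) -> bytes:
    """Stage 2 (hypothetical helper): choose the 8-bit code to send,
    according to the selected display character set."""
    return text.encode(display_charset, errors="replace")


def fold_to_7bit(data: bytes) -> bytes:
    """Roughly what a 7-bit terminal does: the high bit is dropped."""
    return bytes(b & 0x7F for b in data)


if __name__ == "__main__":
    text = decode_numeric_entities("caf&#233;")   # &#233; is e-acute
    for charset in ("iso-8859-1", "cp437"):       # cp437 is an assumption
        sent = to_display_bytes(text, charset)
        print(charset, hex(sent[-1]), hex(fold_to_7bit(sent)[-1]))
```

With ISO 8859-1 the e-acute goes out as 0xE9 and folds to 0x69 ("i"); with the assumed CP437 setting it goes out as 0x82 and folds to a control code instead, so which letter shows up on a 7-bit terminal depends entirely on which 8-bit code was chosen in stage 2.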