[VM] Cached data with international characters
Uday Reddy
2012-10-27 14:54:06 UTC
VM stores in a special header (called X-VM-v5-data) certain internal data
which it remembers between VM sessions. These are things like the author,
recipients, subject etc., which are primarily used for displaying summary
lines.

The trouble is that message headers can only have ASCII characters but the
data that needs to be remembered can be in other character sets. Rob F
introduced a way to encode the cached data headers in VM 8.0.10 (way back in
2008). The release notes say:

* Correctly store UTF-8 strings in the X-VM-v5-Data header to avoid
corruption of summary lines. (Thanks to Yuning Feng for reporting)

* Correctly encode multibyte subjects. (Thanks to Yuning Feng for the
patch)

However, I am unable to find any place where this data is being decoded.
So, it is surprising that things stayed this way for so long and nobody
reported any problems!
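
To make the encode/decode pairing concrete, here is a minimal sketch in Python rather than VM's Emacs Lisp. It assumes the cached value is stored as an RFC 2047 encoded-word, which is an assumption for illustration; the thread does not show the format VM actually writes. The helper names encode_cached_value and decode_cached_value are hypothetical.

```python
from email.header import Header, decode_header, make_header


def encode_cached_value(text: str) -> str:
    """Return an ASCII-only RFC 2047 form of text (hypothetical helper)."""
    return Header(text, charset="utf-8").encode()


def decode_cached_value(stored: str) -> str:
    """Reverse encode_cached_value and recover the original text."""
    return str(make_header(decode_header(stored)))


subject = "Grüße aus Köln"
stored = encode_cached_value(subject)

# The stored form is safe to write into a header line...
assert stored.isascii()
# ...but the summary line is only correct if something decodes it again.
assert decode_cached_value(stored) == subject
```

If only the first half exists, the encoded form rather than the original text is what would come back when the cache is read.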

Are there problems? Do people that receive messages with international
character sets in the headers find that VM is mishandling them in the
Summary lines?

Cheers,
Uday
Julian Bradfield
2012-10-27 16:09:28 UTC
Post by Uday Reddy
Are there problems? Do people that receive messages with international
character sets in the headers find that VM is mishandling them in the
Summary lines?
Yes. It doesn't work for me (with what I call 8.2.0b1-maybe, which is
something you pointed me at a while ago). International data in
cached headers gets lost when it's saved to disk - it's not being
mime-encoded. And since I throw warnings whenever this happens, it
annoys me a lot - but clearly not enough yet! I've made a half-hearted
attempt to work out how to fix this, but haven't succeeded.
John Stoffel
2012-10-29 19:48:15 UTC
Uday> VM stores in a special header (called X-VM-v5-data) certain internal data
Uday> which it remembers between VM sessions. These are things like the author,
Uday> recipients, subject etc., which are primarily used for displaying summary
Uday> lines.

Uday> The trouble is that message headers can only have ASCII characters but the
Uday> data that needs to be remembered can be in other character sets. Rob F
Uday> introduced a way to encode the cached data headers in VM 8.0.10 (way back in
Uday> 2008). The release notes say:

Uday> * Correctly store UTF-8 strings in the X-VM-v5-Data header to avoid
Uday> corruption of summary lines. (Thanks to Yuning Feng for reporting)

Uday> * Correctly encode multibyte subjects. (Thanks to Yuning Feng for the
Uday> patch)

Uday> However, I am unable to find any place where this data is being decoded.
Uday> So, it is surprising that things stayed this way for so long and nobody
Uday> reported any problems!

Uday> Are there problems? Do people that receive messages with
Uday> international character sets in the headers find that VM is
Uday> mishandling them in the Summary lines?

This might explain some of the weird issues I've been seeing with
emails, but I generally tend to run VM inside emacs in a gnu screen
session, so I'm very text based. And what happens is that sometimes
when I read a message, then move to the next message, parts of the
screen aren't redrawn properly, which is usually fixed with a C-l to
force a refresh.

This happens with both 8.2.0b (emacs 23.1.1 centos) and 8.1.2 (debian
squeeze emacs 23.2.1), though I suspect I run into this more often
with 8.1.2 just because I'm usually reading my home email that way.

So I've recently tried moving over to tmux instead of gnu screen, but
I still see the problem randomly. If you can come up with a test
case, I'd be happy to test things out and let you know how it looks.

John
Julian Bradfield
2012-10-29 20:01:53 UTC
Post by John Stoffel
This might explain some of the weird issues I've been seeing with
emails, but I generally tend to run VM inside emacs in a gnu screen
session, so I'm very text based. And what happens is that sometimes
when I read a message, then move to the next message, parts of the
screen aren't redrawn properly, which is usually fixed with a C-l to
force a refresh.
This happens with both 8.2.0b (emacs 23.1.1 centos) and 8.1.2 (debian
squeeze emacs 23.2.1), though I suspect I run into this more often
with 8.1.2 just because I'm usually reading my home email that way.
Interesting - I see this too, and I'd assumed it was a problem in my
Emacs. But since you're using fsfmacs, and I use (my own fork of)
XEmacs, perhaps it isn't.
But I'm not at all sure how it could be anything to do with VM - why
would VM be messing with anything as low level as screen re-drawing?
John Stoffel
2012-10-30 14:52:48 UTC
Post by John Stoffel
This might explain some of the weird issues I've been seeing with
emails, but I generally tend to run VM inside emacs in a gnu screen
session, so I'm very text based. And what happens is that sometimes
when I read a message, then move to the next message, parts of the
screen aren't redrawn properly, which is usually fixed with a C-l to
force a refresh.
This happens with both 8.2.0b (emacs 23.1.1 centos) and 8.1.2 (debian
squeeze emacs 23.2.1), though I suspect I run into this more often
with 8.1.2 just because I'm usually reading my home email that way.
Julian> Interesting - I see this too, and I'd assumed it was a problem
Julian> in my Emacs. But since you're using fsfmacs, and I use (my own
Julian> fork of) XEmacs, perhaps it isn't.

Julian> But I'm not at all sure how it could be anything to do with VM
Julian> - why would VM be messing with anything as low level as screen
Julian> re-drawing?

I suspect it's VM just pushing utf8 or other non-encoded characters
into the screen, when Emacs is expecting plain ASCII or some other
encoding system. I'm not sure and my elisp-fu is so weak that I'll
never be able to debug it myself.
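
As a rough illustration of the kind of mismatch suspected here (a sketch in Python, not something confirmed in this thread): if one layer in the chain emits UTF-8 while another assumes one byte per character, the two disagree about how many columns a string occupies, which is one plausible way a redraw could go wrong. The sample string is arbitrary.

```python
text = "Grüße"                     # 5 characters, 5 terminal columns
utf8_bytes = text.encode("utf-8")  # 7 bytes: ü and ß take two bytes each

# A layer that assumes a single-byte encoding sees one character per byte.
misread = utf8_bytes.decode("latin-1")

print(len(text), len(utf8_bytes), len(misread))  # prints: 5 7 7
```

The two-column difference is the sort of bookkeeping error that a forced refresh would paper over.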

So my setup is:

xterm -> ssh -> tcsh -> screen/tmux -> tcsh -> emacs -> vm

so there's *a lot* of potential to screw up. One reason I went to tmux
was to see if I could fix the problem, since I assumed it was more of
a screen issue not redrawing things properly.

Interestingly enough, my home system (the tcsh sessions) has
LANG=en_US.UTF-8 by default. Hmm... maybe if I change that down to
plain ASCII or something else things will be better? Unfortunately,
I don't have a test email at hand right this second. I'll keep
waiting for it to happen again and I'll experiment some more.

John
John Stoffel
2012-10-30 19:35:06 UTC
Post by John Stoffel
This might explain some of the weird issues I've been seeing with
emails, but I generally tend to run VM inside emacs in a gnu screen
session, so I'm very text based. And what happens is that sometimes
when I read a message, then move to the next message, parts of the
screen aren't redrawn properly, which is usually fixed with a C-l to
force a refresh.
This happens with both 8.2.0b (emacs 23.1.1 centos) and 8.1.2 (debian
squeeze emacs 23.2.1), though I suspect I run into this more often
with 8.1.2 just because I'm usually reading my home email that way.
Julian> Interesting - I see this too, and I'd assumed it was a problem
Julian> in my Emacs. But since you're using fsfmacs, and I use (my own
Julian> fork of) XEmacs, perhaps it isn't.

Julian> But I'm not at all sure how it could be anything to do with VM
Julian> - why would VM be messing with anything as low level as screen
Julian> re-drawing?

John> I suspect it's VM just pushing utf8 or other non-encoded characters
John> into the screen, when Emacs is expecting plain ASCII or some other
John> encoding system. I'm not sure and my elisp-fu is so weak that I'll
John> never be able to debug it myself.

John> So my setup is:

John> xterm -> ssh -> tcsh -> screen/tmux -> tcsh -> emacs -> vm

John> so there's *a lot* of potential to screw up. One reason I went to tmux
John> was to see if I could fix the problem, since I assumed it was more of
John> a screen issue not redrawing things properly.

John> Interestingly enough, my home system (the tcsh sessions) has
John> LANG=en_US.UTF-8 by default. Hmm... maybe if I change that down
John> to plain ASCII or something else things will be better?
John> Unfortunately, I don't have a test email at hand right this
John> second. I'll keep waiting for it to happen again and I'll
John> experiment some more.

As a small experiment, I did:

setenv LANG C

and restarted the emacs process running VM 8.1.2 and it *seems* to
have less screen display corruption now, but I haven't been using this
long enough to be sure.

John
Uday Reddy
2012-10-30 21:23:11 UTC
Post by John Stoffel
This might explain some of the weird issues I've been seeing with
emails, but I generally tend to run VM inside emacs in a gnu screen
session, so I'm very text based. And what happens is that sometimes
when I read a message, then move to the next message, parts of the
screen aren't redrawn properly, which is usually fixed with a C-l to
force a refresh.
Dear John, it would be useful if you could try this out with the good old
VM 7.19 (Kyle Jones's last release) and see if the problem was present
there. If it wasn't, then we need to track down which version of VM
introduced the problem.

I do think, however, that this problem is different from the cached data
issue that I was worrying about.

Cheers,
Uday
