[Israel.pm] multipart/alternative added
gaal at forum2.org
Sun May 30 10:46:10 PDT 2010
On Sun, May 30, 2010 at 6:51 PM, Amit Aronovitch <aronovitch at gmail.com> wrote:
> just some <technical bidi-related nitpicking> ...
> On 05/27/2010 08:46 PM, Gaal Yahas wrote:
> > Hi Mikhael,
> > It is my understanding that this change was made for two reasons: the
> > possibility of embracing Hebrew as an acceptable language for posts on
> > this list, and the related technical questions regarding the
> > difficulty of transferring Hebrew correctly over plain text email.
> > The first reason can afford social debate. As Gabor pointed out, there
> > are people on this list who don't know Hebrew and it's perhaps
> > discourteous to them to accept another language. But this is offset by
> > the fact that this is a local Mongers list, and there is no other
> > forum on the internet where people who are uncomfortable with English
> > can participate and have a conversation about Perl in Hebrew. The list
> > can debate this more but personally, I think this is a perfectly fine
> > goal, even if I'm quite likely to make my own posts in English due to
> > personal preferences.
> > The other is technical. It is simply impossible to get all email
> > clients to work correctly in bidi languages using only plain text.
> Not impossible. Just "not simple at the moment", as we can see even in
> Oron's message, which does not mix LTR and RTL text in the same line (in
> thunderbird, for example, the semicolons/colons display in the wrong
> side of the code/hebrew when you use the keyboard shortcut to switch to
> RTL/LTR mode, respectively. You can never see them both in the same
> window without any garbling).
I take this mostly back. I misremembered the spec's treatment of paragraphs:
they reset bidi context, which is fine (UAX #9); the problem lies in 5.8
which doesn't make the definition of a paragraph separator bulletproof. I
suppose you can start every line with either RLE or LRE and always emit a
PDF before linebreaks, to be safe. This is very cumbersome.
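
A minimal Perl sketch of the line-by-line workaround just described. The helper name wrap_lines is hypothetical, and choosing RLE or LRE from the first strong character of each line is an assumption made for illustration; nothing above says how the direction should be picked.

```perl
use strict;
use warnings;

# Unicode explicit embedding controls (UAX #9).
my $LRE = "\x{202A}";   # LEFT-TO-RIGHT EMBEDDING
my $RLE = "\x{202B}";   # RIGHT-TO-LEFT EMBEDDING
my $PDF = "\x{202C}";   # POP DIRECTIONAL FORMATTING

# Hypothetical helper: open an embedding at the start of every non-empty
# line and close it with PDF before the line break.
# Assumption: a line is treated as RTL when its first strong character has
# Bidi_Class R or AL; every other line is treated as LTR.
sub wrap_lines {
    my ($text) = @_;
    my @out;
    for my $line (split /\n/, $text, -1) {
        if ($line eq '') {
            push @out, $line;           # leave blank lines untouched
            next;
        }
        my $is_rtl = $line =~ /^[^\p{Bidi_Class=L}\p{Bidi_Class=R}\p{Bidi_Class=AL}]*[\p{Bidi_Class=R}\p{Bidi_Class=AL}]/;
        push @out, ($is_rtl ? $RLE : $LRE) . $line . $PDF;
    }
    return join "\n", @out;
}
```

Every line then carries its own direction, so the result does not depend on how a given client decides where a paragraph ends. The cost is two invisible characters per line.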
> To really solve the garbling, one has to use unicode control characters.
> הנה קטעי הקוד של גבור--->:
In my viewer at least, the comment opener here seems wrong (mirrored).
Either that, or the closing part of the tag is wrong. I didn't bother
inspecting your source (because, here's another problem, bidi marks are
invisible and difficult to debug). You're probably aware that in certain
cases bidi contexts do not fully reset after PDFs and need an RLM or LRM back
in the document directionality for things to work out.
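
Since the marks are invisible, one way to inspect a message is to replace each of them with a visible tag. The following is only a sketch: show_bidi_marks and the <NAME> tag style are made up for illustration, and it covers just the marks named in this thread plus the two override characters.

```perl
use strict;
use warnings;

# Short names for the bidi control characters handled by this sketch.
my %BIDI_NAME = (
    "\x{200E}" => 'LRM',   # LEFT-TO-RIGHT MARK
    "\x{200F}" => 'RLM',   # RIGHT-TO-LEFT MARK
    "\x{202A}" => 'LRE',   # LEFT-TO-RIGHT EMBEDDING
    "\x{202B}" => 'RLE',   # RIGHT-TO-LEFT EMBEDDING
    "\x{202C}" => 'PDF',   # POP DIRECTIONAL FORMATTING
    "\x{202D}" => 'LRO',   # LEFT-TO-RIGHT OVERRIDE
    "\x{202E}" => 'RLO',   # RIGHT-TO-LEFT OVERRIDE
);

# Hypothetical helper: make every bidi control visible as <NAME>.
# Expects an already decoded character string, not raw UTF-8 bytes.
sub show_bidi_marks {
    my ($text) = @_;
    $text =~ s/([\x{200E}\x{200F}\x{202A}-\x{202E}])/<$BIDI_NAME{$1}>/g;
    return $text;
}
```

When printing the result of a string that also contains Hebrew, set an encoding layer on the output handle first, for example binmode(STDOUT, ':encoding(UTF-8)').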
> הגדרת משתנה סקלרי:
> my $x = 42;
> הגדרת מערך:
> my @x = qw(4 2);
This part is fine.
> ושיהיה קצת יותר מעניין, בשורה אחת--->:
Comment trouble.
> נגדיר משתנה סקלרי ע"י my $x = 42; ואח"כ עוד משהו.
Looks good.
> Look mom, no HTML!
> Of course, I "cheated" by using characters which are not available in
> common keyboard layouts. The point is that one could write simple
> scripts to do that automatically in the MUA (e.g. as some plugin
> activated when submitting "rich text" as plaintext).
> Once such a solution is out there, it should be easier to spread it to
> other agents (maybe even to gmail).
My point, apart from the obvious fact that directionality marks are hard to
author correctly, was that some of their interpretation is underspecified, so
receiving MUAs may still behave differently.
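
For concreteness, the kind of script proposed in the quoted text might wrap each left-to-right code fragment before it is placed inside a Hebrew sentence. This is a sketch only: embed_ltr_code is a hypothetical name, and appending an RLM after the PDF is an assumption based on the remark earlier in this message that context does not always reset after a PDF. Whether a receiving MUA honors the result is exactly the open question.

```perl
use strict;
use warnings;

my $LRE = "\x{202A}";   # LEFT-TO-RIGHT EMBEDDING
my $PDF = "\x{202C}";   # POP DIRECTIONAL FORMATTING
my $RLM = "\x{200F}";   # RIGHT-TO-LEFT MARK

# Hypothetical helper: isolate a left-to-right code fragment for use inside
# a right-to-left sentence.
# Assumption: the trailing RLM pulls following punctuation back to RTL.
sub embed_ltr_code {
    my ($code) = @_;
    return $LRE . $code . $PDF . $RLM;
}

# The fragment from the one-line example above, ready to splice into RTL text.
my $fragment = embed_ltr_code('my $x = 42;');
```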
> > Alignment is the least of your problems.
> But alignment is the only part of the problem that *can not* be solved
> in plaintext.
> Simply due to the fact that plaintext does not provide a way to encode
> that information (so user agents use their own algorithms to decide, if
> at all, and you can not rely on having it displayed the same way
This insufficient determination is compounded by heuristic solutions.
HTML-capable viewers may try to do the right thing with completely unmarked
text, but that would be a guess and will occasionally be wrong (and wrong
differently among viewers). It also means that they have to scan the entire
document (or a reasonable portion of it at least) to establish that it
indeed contains RTL characters but no bidi marks.
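
The scan described above could look roughly like this. It is a sketch under assumptions, not how any particular viewer works: needs_direction_guess is a hypothetical name, "RTL character" is taken to mean Bidi_Class R or AL, and only the marks named in this thread plus the override characters count as bidi marks.

```perl
use strict;
use warnings;

# Hypothetical helper: true (1) when the text contains right-to-left
# characters but no explicit bidi marks, i.e. the case in which a viewer
# has to guess the direction. Expects a decoded character string.
sub needs_direction_guess {
    my ($text) = @_;
    my $has_marks = $text =~ /[\x{200E}\x{200F}\x{202A}-\x{202E}]/;
    my $has_rtl   = $text =~ /[\p{Bidi_Class=R}\p{Bidi_Class=AL}]/;
    return ($has_rtl && !$has_marks) ? 1 : 0;
}
```

Both checks run over the whole string, which is the cost noted above: nothing can be decided from the first few characters alone.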
Tightening the specs is the right technical solution, but doing that +
getting MUAs to comply is difficult.
> > If you mix Hebrew and English in the same paragraph, it is almost
> > certain that garbling will occur. In prose this is just very annoying.
> > In technical discussion it can render text completely unreadable.
> > Examples of garbling include reversed parentheses, misplaced
> > punctuation, reversed number segments. These have potential to do real
> > damage to coherence of the text. Unicode offers some technology to
> > help with this, but it is just not sufficient for email when used in
> > plain text. There are underspecified features that are interpreted
> > differently by clients, and regardless, these mechanisms are hard to
> > use, even for a technical user.
> Well, I still have to see if my examples above work or not
> (thunderbird/icedove is known to do some garbling of its own if you
> choose the wrong setup option).
> Unicode does have enough support to prevent all the garbling you mention
> (excluding alignment). The problem is that user agents do not insert the
> proper unicode. The community could help by writing plugins, but we are
> too lazy and prefer to revert to an "evil" but working solution such as
> HTML (at least until someone else writes the script).
The proper Unicode is not as straightforward to pick as you make it sound.
> </technical nitpicking>
Gaal Yahas <gaal at forum2.org>