[Xmltv-devel] Definition of badiso8859
h***@gmail.com
2014-04-30 15:29:52 UTC
ValidateFile.pm (line 315, %unusual_iso8859) rejects data containing \xA0 and \xAD as "badiso8859". These are, respectively, the non-breaking space and the soft hyphen.

Please can someone tell me why these are being actively rejected? They are perfectly valid iso-8859-1 characters:

http://www.htmlhelp.com/reference/charset/iso160-191.html

It seems daft to have to explicitly code *every* grabber to remove these 2 characters... unless there's a good reason which I can't see?

Thanks,
Geoff
Karl Dietz
2014-04-30 15:39:51 UTC
Post by h***@gmail.com
ValidateFile.pm (line 315, %unusual_iso8859) rejects data containing \xA0 and \xAD as "badiso8859". These are, respectively, the non-breaking space and the soft hyphen.
It seems daft to have to explicitly code *every* grabber to remove these 2 characters... unless there's a good reason which I can't see?
When I added that check the only instances of these two characters were
due to code/data/editorial errors. Basically no one was using them for
the intended purpose, so I just flagged them as unexpected.
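
To make the idea concrete, here is a minimal sketch of that kind of check. It is written in Python for illustration only and is not the ValidateFile.pm code; the names UNUSUAL_ISO8859 and find_unusual_iso8859 are hypothetical. It treats the two bytes as valid ISO-8859-1 but flags them as unexpected:

```python
# Illustrative sketch only; not the actual ValidateFile.pm logic.
# UNUSUAL_ISO8859 and find_unusual_iso8859 are hypothetical names.

# Bytes that are valid ISO-8859-1 but were, in practice, only seen as errors.
UNUSUAL_ISO8859 = {
    0xA0: "non-breaking space",
    0xAD: "soft hyphen",
}


def find_unusual_iso8859(data):
    """Return a list of (offset, name) for each flagged byte in the data.

    `data` is assumed to be a bytes object holding ISO-8859-1 encoded text.
    """
    return [
        (offset, UNUSUAL_ISO8859[byte])
        for offset, byte in enumerate(data)
        if byte in UNUSUAL_ISO8859
    ]


sample = "Title\xa0one, co\xadoperate".encode("iso-8859-1")
hits = find_unusual_iso8859(sample)
for offset, name in hits:
    print("unexpected %s at byte offset %d" % (name, offset))
```

Relaxing the test would then amount to removing entries from the table, assuming the real check is table-driven in a similar way.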

Feel free to change the test, if we now do have grabbers that properly
emit them.

Regards,
Karl
h***@gmail.com
2014-04-30 16:12:19 UTC
Post by Karl Dietz
When I added that check the only instances of these two characters were
due to code/data/editorial errors. Basically no one was using them for
the intended purpose, so I just flagged them as unexpected.
Feel free to change the test, if we now do have grabbers that properly
emit them.
Thanks, Karl. OK, I see. I guess the principal use of \xA0 is to space-align text, so you're right: it probably isn't sensible to include them all (e.g. when there are multiple consecutive nbsp characters).

Likewise, the only time I've seen a soft hyphen (on _fi), it immediately followed a hard hyphen, so that's a bit daft too!

So I think you are right - they are *not* being used correctly and the grabber should do something with them. Ok I've talked myself into it! ;-)
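
For what it's worth, a rough sketch of what "do something with them" might look like. This is Python for illustration rather than grabber code, and clean_text is a hypothetical helper; it assumes the text has already been decoded to Unicode, that nbsp should become an ordinary space, and that runs of spaces can safely be collapsed:

```python
import re


def clean_text(text):
    """Normalise the two characters that the validator flags.

    Hypothetical helper: drops soft hyphens, turns non-breaking spaces
    into ordinary spaces, then collapses runs of spaces into one.
    """
    text = text.replace("\u00ad", "")   # soft hyphen: remove outright
    text = text.replace("\u00a0", " ")  # nbsp: treat as a normal space
    return re.sub(r" {2,}", " ", text)  # collapse space-aligned padding


print(clean_text("News\u00a0\u00a0\u00a0at ten"))  # nbsp padding collapsed
print(clean_text("co-\u00adoperate"))              # soft hyphen after a hard hyphen
```

Whether collapsing the spaces is right depends on the source; a grabber might prefer to keep a single nbsp where it is genuinely meant.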

Thanks for your advice.

Rgds,
Geoff
