[Xmltv-devel] tv_grab_eu_epgdata: not well-formed (invalid token) - how to debug further?
Carsten Aulbert
2011-11-19 18:46:25 UTC
Hi all,

I've just discovered that I cannot get listings for days after tomorrow from
said grabber due to a parsing problem.

xmltv is the Debian Squeeze package 0.5.59-1

and the error message is

2011-11-19 19:34:38.445 Updating source #3 (SATDE) with grabber tv_grab_eu_epgdata
2011-11-19 19:34:38.446 Found 54 channels for source 3 which use grabber
2011-11-19 19:34:39.084 Grabber has capabilities: baseline manualconfig tkconfig apiconfig cache preferredmethod
2011-11-19 19:34:39.697 Grabber prefers method: allatonce
2011-11-19 19:34:39.698 XMLTV config file is: /home/mythtv/.mythtv/SATDE.xmltv
2011-11-19 19:34:39.699 Grabber Command: nice tv_grab_eu_epgdata --config-file '/home/mythtv/.mythtv/SATDE.xmltv' --output /tmp/myth5F56dR
2011-11-19 19:34:39.699 ----------------- Start of XMLTV output -----------------
Downloading zip file for day 1
Downloading zip file for day 2
Downloading zip file for day 3
Downloading zip file for day 4
Downloading zip file for day 5
Downloading include zip file
Your PIN will expire around Tue Apr 10 00:00:00 MEST 2012

not well-formed (invalid token) at line 112, column 161, byte 4468 at /usr/lib/perl5/XML/Parser.pm line 187
at /usr/bin/tv_grab_eu_epgdata line 355
at /usr/bin/tv_grab_eu_epgdata line 355
2011-11-19 19:35:22.303 ------------------ End of XMLTV output ------------------
2011-11-19 19:35:22.303 FillData, Error: xmltv returned error code 65280
2011-11-19 19:35:22.305 Error in 127:15: unexpected end of file
2011-11-19 19:35:22.305 IconData: Updating icons for sourceid: 3
2011-11-19 19:35:22.306 No programs found in data.
2011-11-19 19:35:22.307 Failed to fetch some program info
2011-11-19 19:35:22.307 Adjusting program database end times.
2011-11-19 19:35:22.758 0 replacements made
2011-11-19 19:35:22.758 Marking generic episodes.
2011-11-19 19:35:24.084 Found 247
2011-11-19 19:35:24.084 Fudging non-unique programids with multiple parts.
2011-11-19 19:35:24.178 Found 0
2011-11-19 19:35:24.178 Marking repeats.
2011-11-19 19:35:24.375 Found 0

Is this just a broken download (which would then have been consistent over the
past days), or is it possibly a problem within the XML file itself?

At first glance, the XML files look alright to me.

Anyone with a quick idea?

Cheers

Carsten
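
One way to debug this further is to feed the suspect file to a strict XML parser and print the bytes around the reported position. The following is a minimal sketch in Python (not the grabber's own Perl code), using the standard `xml.parsers.expat` module, which wraps the same expat library that Perl's XML::Parser uses. The function name `find_xml_error` and the sample data are made up for illustration.

```python
import xml.parsers.expat


def find_xml_error(data: bytes, context: int = 30):
    """Return None if data is well-formed XML.

    Otherwise return (line, column, byte_offset, snippet), where snippet
    is the text surrounding the position expat complained about.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True: this is the final chunk of input
    except xml.parsers.expat.ExpatError as err:
        offset = parser.ErrorByteIndex
        snippet = data[max(0, offset - context): offset + context]
        return err.lineno, err.offset, offset, snippet.decode("utf-8", "replace")
    return None


# Hypothetical sample modelled on the kind of listing data discussed here.
sample = b"<data>\n<d19>Geld & Leben - Das Wirtschaftsmagazin</d19>\n</data>"
print(find_xml_error(sample))
```

To check a real file, read it in binary mode and pass the bytes to `find_xml_error`; the snippet usually makes the offending character obvious.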
Carsten Aulbert
2011-11-20 13:03:40 UTC
Hi
Post by Carsten Aulbert
not well-formed (invalid token) at line 112, column 161, byte 4468 at
/usr/lib/perl5/XML/Parser.pm line 187
at /usr/bin/tv_grab_eu_epgdata line 355
at /usr/bin/tv_grab_eu_epgdata line 355
I've dug a bit deeper, and it seems that epgdata has some "unprotected" & in
the listings, e.g.

<d19>Geld & Leben - Das Wirtschaftsmagazin</d19>

Am I right that this needs to be "masked" to read &amp;?

And what would be the correct way to move forward?

Cheers

Carsten
Karl Dietz
2011-11-20 13:52:31 UTC
Hi
Post by Carsten Aulbert
not well-formed (invalid token) at line 112, column 161, byte 4468 at
/usr/lib/perl5/XML/Parser.pm line 187
at /usr/bin/tv_grab_eu_epgdata line 355
at /usr/bin/tv_grab_eu_epgdata line 355
I've dug a bit deeper, and it seems that epgdata has some "unprotected" & in
the listings, e.g.
<d19>Geld & Leben - Das Wirtschaftsmagazin</d19>
Am I right that this needs to be "masked" to read &amp;?
Yes, you cannot have an ampersand without some encoding in xml.
And what would be the correct way to move forward?
The fixup is easy. Just replace every "& " with "&amp; " and tell your
data provider that their xml is invalid.

Regards,
Karl
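
The fixup described above can be sketched as follows. This is an illustrative Python version, not the hotfix patch mentioned elsewhere in the thread. It is slightly more general than replacing "& " only: it escapes any ampersand that does not already start an entity or character reference. The names `BARE_AMP` and `escape_bare_ampersands` are made up, and the sketch assumes the data contains no CDATA sections, where a literal & would be legal.

```python
import re
import xml.etree.ElementTree as ET

# An ampersand that is NOT followed by "name;", "#123;" or "#x1F;".
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(text: str) -> str:
    """Escape unprotected ampersands, leaving existing references untouched."""
    return BARE_AMP.sub("&amp;", text)


broken = "<d19>Geld & Leben - Das Wirtschaftsmagazin</d19>"
fixed = escape_bare_ampersands(broken)
print(fixed)                       # the repaired markup
print(ET.fromstring(fixed).text)   # parses now; the text holds a plain "&" again
```

Because references that are already correct are skipped, running the function twice over the same data gives the same result as running it once.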
Jan Schneider
2011-11-20 14:08:36 UTC
Post by Karl Dietz
The fixup is easy. Just replace every "& " with "&amp; " and tell your
data provider that their xml is invalid.
I have informed epgdata.com already that they produce invalid markup
at the moment. They didn't get back to me yet though. For the time
being, here's some hotfix (it won't match the line numbers exactly).

Jan.
--
Do you need professional PHP or Horde consulting?
http://horde.org/consulting/
Carsten Aulbert
2011-11-20 15:00:21 UTC
Hi
Post by Jan Schneider
Post by Karl Dietz
The fixup is easy. Just replace every "& " with "&amp; " and tell your
data provider that their xml is invalid.
I have informed epgdata.com already that they produce invalid markup
at the moment. They didn't get back to me yet though. For the time
being, here's some hotfix (it won't match the line numbers exactly).
Thanks a bunch - my "fix" was a bit more brutal, simply

system("sed -i 's/&/&amp;/g' $file");

but your patch looks much better ;)

Do you think it helps if more people write to epgdata.com?

Cheers

Carsten
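
A note on why the blanket replacement is "brutal": it also rewrites ampersands that are already part of a valid entity. Whether the epgdata files contained any correctly escaped entities at the time is not stated in this thread, so the input below is a made-up example. The sketch reproduces the effect of the sed expression with Python's `str.replace`.

```python
import xml.etree.ElementTree as ET


def escape_every_ampersand(text: str) -> str:
    """Same effect as the sed one-liner: every '&' becomes '&amp;'."""
    return text.replace("&", "&amp;")


# Hypothetical input that is already well-formed.
already_valid = "<d19>Tom &amp; Jerry</d19>"
mangled = escape_every_ampersand(already_valid)

print(mangled)                            # the entity is now double-escaped
print(ET.fromstring(already_valid).text)  # title as intended
print(ET.fromstring(mangled).text)        # title now contains a literal "&amp;"
```

The mangled file still parses, so the damage only shows up later as "&amp;" appearing in programme titles.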
Jan Schneider
2011-11-20 15:35:00 UTC
Post by Carsten Aulbert
Do you think it helps if more people write to epgdata.com?
I doubt that. :) And I know of at least one more person who already
did. And usually they tend to fix things within a few days.
Jan.
Jan Schneider
2011-11-21 19:06:03 UTC
Post by Jan Schneider
I doubt that. :) And I know of at least one more person who already
did. And usually they tend to fix things within a few days.
Looks like they silently fixed it today.

Jan.