Discussion:
[Xmltv-devel] Download errors with tv_grab_eu_epgdata while direct wget works
Carsten Aulbert
2014-10-11 14:25:37 UTC
Hi

Since I usually get 4-5 days' worth of listings and only today's are
left, I guess this problem first appeared 3-4 days ago. The system is
Debian Jessie with xmltv 0.5.63, but the same happens with 0.5.65.

I used this simple patch to get the return codes from LWP::Simple (sorry
for line breaks if they arrive at your MUA):

diff -u /usr/local/bin/tv_grab_eu_epgdata /tmp/tv_grab_eu_epgdata
--- /usr/local/bin/tv_grab_eu_epgdata 2014-10-11 15:57:56.000000000 +0200
+++ /tmp/tv_grab_eu_epgdata 2014-10-11 16:12:38.240908281 +0200
@@ -341,7 +341,8 @@
 my $pin = $conf->{pin}->[0];
 my $includeurl = "http://www.epgdata.com/index.php?action=sendInclude&iOEM=&pin=$pin&dataType=xml";
 warn "Downloading include zip file\n" unless $opt->{quiet};
- getstore($includeurl, $tmp . 'includezip');
+ my $http_return = getstore($includeurl, $tmp . 'includezip');
+ print "DEBUG: HTTP return code: $http_return for '$includeurl'\n";
 my @zipfiles=($tmp . 'includezip');
 unzip(@zipfiles);
 }

Running the script yields (PIN redacted):

/usr/local/bin/tv_grab_eu_epgdata --debug --config-file SAT.xmltv
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE tv SYSTEM "xmltv.dtd">
tz=Europe/Berlin
Downloading include zip file
DEBUG: HTTP return code: 404 for 'http://www.epgdata.com/index.php?action=sendInclude&iOEM=&pin=myPIN&dataType=xml'
Extracting *.dtd and *.xml from /tmp/26cItG9MTV/includezip
IO error: opening /tmp/26cItG9MTV/includezip for read : No such file or
directory
at /usr/share/perl5/Archive/Zip/Archive.pm line 572.

Archive::Zip::Archive::read(Archive::Zip::Archive=HASH(0x3826178),
"/tmp/26cItG9MTV/includezip") called at
/usr/share/perl5/Archive/Zip/Archive.pm line 59
Archive::Zip::Archive::new("Archive::Zip::Archive",
"/tmp/26cItG9MTV/includezip") called at /usr/share/perl5/Archive/Zip.pm
line 284
Archive::Zip::new("Archive::Zip", "/tmp/26cItG9MTV/includezip")
called at /usr/local/bin/tv_grab_eu_epgdata line 355
main::unzip("/tmp/26cItG9MTV/includezip") called at
/usr/local/bin/tv_grab_eu_epgdata line 347
main::prepareinclude(HASH(0x3272a30), HASH(0x3272598)) called at
/usr/local/bin/tv_grab_eu_epgdata line 286
Can't call method "memberNames" on an undefined value at
/usr/local/bin/tv_grab_eu_epgdata line 356.

Obviously the script fails at the unzip stage as the zip file is not
there. Please note the 404 HTTP error.
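
The traceback above only happens because the grabber goes on to unzip a file that was never written. As a minimal sketch (not the grabber's actual code), the return value of getstore can be checked before anything tries to open the file. The helper name fetch_or_die and the commented-out URL and path are hypothetical; getstore and is_success are the regular LWP::Simple exports.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw(getstore is_success);

# Hypothetical helper (not part of tv_grab_eu_epgdata): download $url to
# $file and die with the HTTP status if the transfer did not succeed, so
# that a missing file never reaches the unzip step.
sub fetch_or_die {
    my ($url, $file) = @_;
    my $rc = getstore($url, $file);    # getstore returns the HTTP status code
    die "Download of '$url' failed with HTTP status $rc\n"
        unless is_success($rc);        # true only for 2xx codes
    return $rc;
}

# Example call; URL and target path are placeholders:
# fetch_or_die('http://www.example.com/include.zip', '/tmp/includezip');
```

With a check like this the run would stop at the 404 with a one-line message instead of the Archive::Zip traceback.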

However, if I simply copy&paste the URL and try with wget from the very
same machine I get:

$ wget -O /tmp/include.zip
'http://www.epgdata.com/index.php?action=sendInclude&iOEM=&pin=myPIN&dataType=xml'
--2014-10-11 16:09:50--
http://www.epgdata.com/index.php?action=sendInclude&iOEM=&pin=myPIN&dataType=xml
Resolving www.epgdata.com (www.epgdata.com)... 195.50.176.247
Connecting to www.epgdata.com (www.epgdata.com)|195.50.176.247|:80...
connected.
HTTP request sent, awaiting response... 200 OK
Length: 8668 (8.5K) [application/x-zip-compressed]
Saving to: ‘/tmp/include.zip’

100%[==================================================================================================================================>]
8,668 --.-K/s in 0.005s

2014-10-11 16:09:50 (1.63 MB/s) - ‘/tmp/include.zip’ saved [8668/8668]

$ unzip -t /tmp/include.zip
Archive: /tmp/include.zip
testing: category.xml OK
testing: channel_y.xml OK
testing: genre.xml OK
No errors detected in compressed data of /tmp/include.zip.

I tried to debug it further with tshark/Wireshark but to no avail (still
with the old xmltv version):

GET /index.php?action=sendInclude&iOEM=&pin=myPIN&dataType=xml HTTP/1.1
TE: deflate,gzip;q=0.3
Connection: TE, close
Host: www.epgdata.com
User-Agent: xmltv/0.5.63

HTTP/1.1 404 Not Found
Date: Sat, 11 Oct 2014 13:47:12 GMT
Server: Apache
Set-Cookie: session-1=3mySESSION; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0,
pre-check=0
Pragma: no-cache
Cache-Control: nocache, private
Vary: Accept-Encoding
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8

2e3f
<?xml version="1.0" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="de">
<head>
[...]

Any idea, what could cause something like this?

Cheers

Carsten
Carsten Aulbert
2014-10-11 17:53:37 UTC
Hi again
Post by Carsten Aulbert
Any idea, what could cause something like this?
I've tried various options, but I'm not making any progress.

With curl's "-H" option I tried to emulate the LWP-specific HTTP headers,
but while curl succeeds, LWP fails.

curl:
Connected to www.epgdata.com (195.50.176.247) port 80 (#0)
GET /index.php?action=sendInclude&iOEM=&pin=myPIN&dataType=xml HTTP/1.1
Host: www.epgdata.com
Accept: */*
TE: deflate,gzip;q=0.3
Connection: TE, close
User-Agent: xmltv/0.5.65

HTTP/1.1 200 OK
Date: Sat, 11 Oct 2014 17:45:03 GMT
Server Apache is not blacklisted
Server: Apache

LWP (Accept is missing here, if that makes any difference):

GET /index.php?action=sendInclude&iOEM=&pin=myPIN&dataType=xml HTTP/1.1
TE: deflate,gzip;q=0.3
Connection: TE, close
Host: www.epgdata.com
User-Agent: xmltv/0.5.65

HTTP/1.1 404 Not Found
Date: Sat, 11 Oct 2014 17:31:16 GMT
Server: Apache


*sigh*

Replacing the "getstore" call with a system call to curl succeeds
without issues (well, at least for the include.zip).

Maybe I'll find something else - any suggestions are welcome.

Cheers

Carsten
Carsten Aulbert
2014-10-12 08:08:24 UTC
(resending, as I originally sent with wrong from address)

Hi

After a good night's sleep and not pondering the issue, the solution
simply seems to be:

diff -u /tmp/tv_grab_eu_epgdata /usr/local/bin/tv_grab_eu_epgdata
--- /tmp/tv_grab_eu_epgdata 2014-10-12 08:53:48.381303336 +0200
+++ /usr/local/bin/tv_grab_eu_epgdata 2014-10-12 08:41:21.821281946 +0200
@@ -164,6 +164,7 @@

 # set user agent
 $ua->agent("xmltv/$XMLTV::VERSION");
+$ua->default_header('Accept' => '*/*');

 our(%genre, $channelgroup, $expiry_date, %chanid, $country);
 our $tmp = tempdir(CLEANUP => 1) . '/';


Somehow www.epgdata.com's Apache is making wrong assumptions when no
Accept header is sent (and I overlooked this yesterday because wget and
curl send Accept: */* on their own, which ought to be the assumed
default anyway).

In short, 0.5.65 is broken for me unless I insert this single line. As
it should not hurt anyone, it would be nice to see this in the next release.
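
For anyone who wants to confirm the effect of that line without touching the network, here is a small standalone sketch. It assumes LWP::Simple's exportable $ua object, as used in the grabber; the version string and the example.com URL are placeholders. prepare_request applies the user agent's default headers to a request in the same way LWP does before sending it.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw($ua);
use HTTP::Request;

# Same two settings as in the patched grabber; the version string is a
# placeholder for "xmltv/$XMLTV::VERSION".
$ua->agent('xmltv/0.5.65');
$ua->default_header('Accept' => '*/*');

# Apply the default headers to a request object without sending it.
# The URL is a placeholder; no network I/O takes place here.
my $req = $ua->prepare_request(
    HTTP::Request->new(GET => 'http://www.example.com/index.php')
);

# Shows the request line plus the User-Agent and Accept headers.
# Headers added later by the HTTP protocol layer (e.g. TE, Connection,
# Host) do not appear at this stage.
print $req->as_string;
```

Without the default_header call the printed request has no Accept header at all, which matches what the packet capture showed for LWP.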

Cheers

Carsten
Jan Schneider
2014-10-13 16:40:27 UTC
Post by Carsten Aulbert
In short, 0.5.65 is broken for me unless I insert this single line. As
it should not hurt anyone, it would be nice to see this in the next release.
This had already been added a few days ago.
--
Jan Schneider
The Horde Project
http://www.horde.org/
https://www.facebook.com/hordeproject