[Xmltv-devel] time for a release!
Robert Eden
2013-11-24 03:17:20 UTC
So... Geoff has called for a release.

Any objections?

How about a target of 12/3?

Robert
Karl Dietz
2013-12-03 07:37:56 UTC
Post by Robert Eden
So... Geoff has called for a release.
Any objections?
How about a target of 12/3?
All fine with me.

I have some things on "the list" that would be nice to get out into
another release in the not too distant future.

a) make tv_find_grabbers actually find any grabbers when called from
xmltv.exe. I'm not sure how to do this, maybe simply compile a
static list of embedded grabbers when packaging everything.

b) add a capability to report some "suggested-run-time"
Shepherd (Australia) wants this.
I'm thinking about something that specifies expectation and deviation
of the start time (http://en.wikipedia.org/wiki/Normal_distribution)
That would allow us to specify when the service has low load and move
access by automated grabbers that understand "suggested-run-time"
there.
Or move updates to "in time to get last minute updates to prime
time shows" while still spreading out the load a bit.

I'm open to suggestions if someone wants something more advanced to
allow separate times for bulk filling and last minute updates, e.g.
for scrapers. But as I don't believe in scrapers that would be up to
you. :)

c) extend configuration API so it's actually useful for programs to use
it. (wanted to do that for Christmas 2011, let's try 2013 :(

d) extend tv_imdb/others to use the new 'imdb.com' episode-num system.

e) split the request size for _pt_meo, it has a nice API upstream, but
requesting all channels for all days runs into some API limit.

f) extend NonameTV / "the Atlas grabber" / Schedules Direct to use the
new metadata site specific episode-num systems.

g) extend MythTV to handle the new metadata references.


Any help is greatly appreciated. For example, d), e) and maybe a) should
be isolated and straightforward for anyone who wants to start
contributing to the maintenance work.
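
As a rough illustration of item b), here is a minimal Python sketch of how
a client could turn an advertised expectation and deviation into a concrete
start time. No "suggested-run-time" capability exists in XMLTV yet, so the
function name and both parameters are hypothetical, and Python is used only
for brevity.

```python
import random
from datetime import datetime, timedelta


def pick_run_time(day, mean_minutes, stddev_minutes, rng=random):
    """Draw one start time on `day` from a normal distribution.

    mean_minutes   -- expected start, in minutes after midnight (hypothetical)
    stddev_minutes -- standard deviation of the start, in minutes (hypothetical)
    rng            -- the random module or a random.Random instance
    """
    offset = rng.gauss(mean_minutes, stddev_minutes)
    # Wrap into one 24-hour day so a draw before or after midnight
    # still lands on the requested date.
    offset %= 24 * 60
    midnight = datetime(day.year, day.month, day.day)
    return midnight + timedelta(minutes=offset)
```

A grabber advertising a mean of 180 and a deviation of 30 would then send
most clients to somewhere between 02:00 and 04:00, each at a different
minute.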

Regards,
Karl
Max Barry
2013-12-03 22:56:44 UTC
Post by Karl Dietz
b) add a capability to report some "suggested-run-time"
Shepherd (Australia) wants this.
I'm thinking about something that specifies expectation and deviation
of the start time (http://en.wikipedia.org/wiki/Normal_distribution)
That would allow to specify when the service has low load and move
access by automated grabbers that understand "suggested-run-time"
there.
Or move updates to "in time to get last minute updates to prime
time shows" while still spreading out the load a bit.
I'm open for suggestions if someone wants something more advanced to
allow separate times for bulk filling and last minute updates, e.g.
for scrapers. But as I don't believe in scrapers that would be up to
you. :)
Shepherd dev here! The benefits we'd seek from something like this are:

1. Be able to suggest how often the grabber should be run: daily,
hourly, whatever. This is something the grabber knows better than the
media center, since it's familiar with how often the data source
updates, and how to most efficiently get data from it. So it makes sense
for the grabber to be able to pass that information along.

In our particular case, we want to run hourly even if we might not
perform a data grab, because Shepherd is relatively fragile and runs
don't always succeed. (Shepherd can take hours to complete, and some
rural users have poor internet connectivity.) This way we can ensure the
user's EPG was successfully updated at least once in the last day, and
if not, try again. Without this, a few failed grabs in a row lead to
systems running low on data.

2. Avoid "piling", where a media center has a particular default time to
run its grabber and every user hits the data source at the same moment.
This isn't really anything to do with XMLTV, as far as I can tell, but
more an issue with how MythTV has traditionally operated. It's been a
showstopper for us because each Shepherd user has to scrape hundreds or
thousands of HTML pages, and if everyone does that at the same time, it
destroys servers, and we have no EPG.

We don't need anything more sophisticated than that. We don't want to
engage in active load-balancing, for example. We're happy for each user
to run at random times, or whenever they like, so long as they're not
all being defaulted to the same time.

I think MythTV still has a default window of 2am-5am which is the only
time grabbers are permitted to run, although it now randomizes the time
within that window. This is a big improvement from the early days, when
everyone's grabber would fire at precisely 2am, and is mostly adequate
in terms of avoiding piling. It means the grabber can't check for
schedule changes after 5am, though, which is a shame. I don't really
understand the point of that window.

I'm not sure what does or should happen when the grabber's suggested run
time falls outside of MythTV's window.

What happens if the user's system is powered off during
"suggested-run-time"? I think it should run the grabber as soon as
possible after boot, and this should be explicit.
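
A minimal Python sketch of the two behaviours described above (spreading
users out, and catching up after a missed slot). Everything here is an
assumption for illustration: the function name, its parameters, and the
idea that the scheduler remembers the last successful grab. It is not how
MythTV or Shepherd currently schedule runs.

```python
import random
from datetime import datetime, timedelta


def next_run(last_success, now, interval, jitter, rng=random):
    """Return the time at which the grabber should next be started.

    last_success -- datetime of the last successful grab
    now          -- current datetime
    interval     -- timedelta the grabber suggests between runs
    jitter       -- timedelta; maximum random delay added to avoid piling
    """
    due = last_success + interval
    if due <= now:
        # The slot was missed (for example the machine was powered off),
        # so run as soon as possible.
        return now
    # Otherwise delay by a random amount in [0, jitter] so that users
    # sharing the same defaults do not all hit the source at once.
    delay = rng.uniform(0, jitter.total_seconds())
    return due + timedelta(seconds=delay)
```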

Max.
Robert Eden
2013-12-05 02:40:43 UTC
Post by Robert Eden
So... Geoff has called for a release.
Any objections?
How about a target of 12/3?
I started updating the README file last night, but noticed a ton of
recent commits, plus I want to try to add more grabbers to the EXE, and
now there's the atlas config file key issue...

So, now let's shoot for a release target Sunday 12/8.

I saw mostly minor bug fixes. If your program has major changes, please
check/update the README in case I missed something major.

Robert
Robert Eden
2013-12-08 22:25:46 UTC
Post by Robert Eden
So, now let's shoot for a release target Sunday 12/8.
I saw mostly minor bug fixes, if your program has major changes,
please check/update the README in case I missed something major.
With Geoff's help, tv_grab_uk_atlas is now working on Windows and in
xmltv.exe.

I still see some commits being made, so we can push this back a little
further.

On other cleanup, I looked at the XMLTV Test report.
tv_grab_ar hasn't passed for a while. I tried it myself and got an
HTTP 404 error (page not found) on configure.

tv_grab_dk_dr seems to work, but doesn't give any program records.

tv_grab_za gives an HTTP 500 error on --configure.

Do these grabbers work for anyone? If not, I'll disable them in the
Makefile.

Robert
h***@gmail.com
2013-12-09 13:29:06 UTC
Post by Robert Eden
On other cleanup, I looked at the XMLTV Test report
tv_grab_ar hasn't passed for a while. I tried it myself and got a
http 404 error (page not found) on configure.
tv_grab_dk_dr seems to work, but doesn't give any program records.
tv_grab_za gives a http 500 error on --configure
Do these grabbers work for anyone? If not, I'll disable them in the
Makefile.
I'll take a look...

Geoff
h***@gmail.com
2013-12-09 18:04:23 UTC
Post by Robert Eden
On other cleanup, I looked at the XMLTV Test report
tv_grab_ar hasn't passed for a while. I tried it myself and got a
http 404 error (page not found) on configure.
tv_grab_dk_dr seems to work, but doesn't give any program records.
tv_grab_za gives a http 500 error on --configure
Do these grabbers work for anyone? If not, I'll disable them in the
Makefile.
_za should be ok now (nightly check will probably throw a "not additive" error)

I'll look at _ar next. (site has changed and now uses a lot of Ajax)

Geoff
Robert Eden
2013-12-20 00:55:01 UTC
Post by h***@gmail.com
_za should be ok now (nightly check will probably throw a "not additive" error)
I'll look at _ar next. (site has changed and now uses a lot of Ajax)
Looks like _ar is working too! Are we ready for a release?

Robert
h***@gmail.com
2013-12-20 06:18:32 UTC
Post by Robert Eden
Post by h***@gmail.com
_za should be ok now (nightly check will probably throw a "not additive" error)
I'll look at _ar next. (site has changed and now uses a lot of Ajax)
Looks like _ar is working too! Are we ready for a release?
I'd like to add eu_timefortv which I will do today. Do you want to wait for that one?
h***@gmail.com
2013-12-20 15:09:08 UTC
Post by h***@gmail.com
Post by Robert Eden
Looks like _ar is working too! Are we ready for a release?
I'd like to add eu_timefortv which I will do today. Do you want to wait for that one?
Ok I give up. I've had zero success in getting a response from them. They insist their XML is valid per the DTD despite me pointing out where it's not.

The only alternative I can see is to rewrite the grabber so it parses the xml (using XML::Tree directly and not using node_to_programme) and then rewrites it.

Shall I do that or does anyone have any other ideas?
Ben Bucksch
2013-12-20 15:19:42 UTC
The quality seems low anyway:

http://timefor.tv/xmltv/c81e728d9d4c2f636f067f89cc14862c

* No RFC 2838 channel IDs, but: <channel id="www.timefor.tv/tv/162">
* Channel display names with country and language, which don't
belong there: <display-name lang="de">ARD DE DE</display-name>
* category "documentary", "news" and "serie" are mutually exclusive (I
defined it several years ago), but used together at the same time.
Also, category is an internal ID, so it can't have a language
* In movies, <sub-title lang="de"> abused for original title (wrong
tag and wrong lang)


This is so bad at the data level that it is useless. It will cause more
trouble than it helps. You're better off with DVB-EIT data, I think.

Ben
Ben Bucksch
2013-12-20 15:21:05 UTC
Re-posting with proper subject.
Post by h***@gmail.com
Post by h***@gmail.com
I'd like to add eu_timefortv which I will do today. Do you want to wait for that one?
Ok I give up. I've had zero success in getting a response from them. They insist their XML is valid per the DTD despite me pointing out where it's not.
The only alternative I can see is to rewrite the grabber so it parses the xml (using XML::Tree directly and not using node_to_programme) and then rewrites it.
Shall I do that or does anyone have any other ideas?
Ben

h***@gmail.com
2013-12-10 09:41:35 UTC
Post by Robert Eden
tv_grab_dk_dr seems to work, but doesn't give any program records.
It seems the JSON service ( www.dr.dk/tjenester/programoversigt/DBService.ashx?test ) used by this grabber stopped working on 2012-11-02 (it still gives some old data which is why configure works, but there are no programme schedules available).

What looks like the subsequent live service ( www.dr.dk/tv/oversigt/json/guide/schedule?... ) seems to have stopped around about Feb this year.

So unless anyone has any better thoughts I think this grabber should be considered dead. :(
Karl Dietz
2013-12-10 18:24:14 UTC
Post by h***@gmail.com
Post by Robert Eden
tv_grab_dk_dr seems to work, but doesn't give any program records.
It seems the JSON service ( www.dr.dk/tjenester/programoversigt/DBService.ashx?test ) used by this grabber stopped working on 2012-11-02 (it still gives some old data which is why configure works, but there are no programme schedules available).
What looks like the subsequent live service ( www.dr.dk/tv/oversigt/json/guide/schedule?... ) seems to have stopped around about Feb this year.
So unless anyone has any better thoughts I think this grabber should be considered dead. :(
No better thought but there is a commercial service for many countries
(former ontv.dk) http://dk.timefor.tv/xmltv

It might be nice to supply a grabber for them. (I have tried to slap a
simple downloader together, but their demo data doesn't match our
tester's expectations.) Then I ran out of spare time. If anyone feels
like picking it up, that would be appreciated.

Regards,
Karl
h***@gmail.com
2013-12-14 13:18:14 UTC
Post by Karl Dietz
No better thought but there is a commercial service for many countries
(former ontv.dk) http://dk.timefor.tv/xmltv
It might be nice to supply a grabber for them (I have tried to slap a
simple downloader together, but their demo data doesn't match our
tester's expectations) Then I ran out of spare time. If anyone feels like
picking it up, that would be appreciated.
The problem appears to be simply that their data does not validate against xmltv.dtd. Specifically it has the <icon> element *after* the <episode-num>. The DTD says it must come before it.

*******
# xmllint --noout --dtdvalid tester/xmltv.dtd tester/epg.xml

tester/epg.xml:31: element programme: validity error : Element programme content does not follow the DTD, expecting (title+ , sub-title* , desc* , credits? , date? , category* , language? , orig-language? , length? , icon* , url* , country* , episode-num* , video? , audio? , previously-shown? , premiere? , last-chance? , new? , subtitles* , rating* , star-rating* , review*), got (title sub-title desc category category episode-num icon )

Document tester/epg.xml does not validate against tester/xmltv.dtd
*******


call_handlers_read() checks the elements are in the correct order (by comparing them against @Programme_Handlers array):

*******
doing subelement
tag name: icon
element icon not expected here at /usr/lib/perl5/site_perl/5.8.8/XMLTV.pm line 2016.
*******


I don't know whether we should attempt to work around this or simply get kazer to fix their XML output. I think the latter?
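
For reference, a workaround could look roughly like the following Python
sketch, which sorts the direct children of a <programme> element into the
order listed in the xmllint message above. This is only an illustration
under stated assumptions: it is not how XMLTV.pm works, the grabbers are
Perl rather than Python, and the names DTD_ORDER and reorder_programme are
made up here. It does not touch nested ordering such as the children of
<credits>.

```python
import xml.etree.ElementTree as ET

# Child order of <programme>, copied from the xmllint validity error.
DTD_ORDER = [
    "title", "sub-title", "desc", "credits", "date", "category",
    "language", "orig-language", "length", "icon", "url", "country",
    "episode-num", "video", "audio", "previously-shown", "premiere",
    "last-chance", "new", "subtitles", "rating", "star-rating", "review",
]
RANK = {tag: position for position, tag in enumerate(DTD_ORDER)}


def reorder_programme(programme):
    """Sort the direct children of `programme` into DTD order, in place."""
    children = list(programme)
    # list.sort() is stable, so repeated elements (two <category> tags,
    # say) keep their original relative order. Unknown tags go last.
    children.sort(key=lambda el: RANK.get(el.tag, len(DTD_ORDER)))
    programme[:] = children


if __name__ == "__main__":
    prog = ET.fromstring(
        "<programme><title>t</title><episode-num>1</episode-num>"
        "<icon src='x'/></programme>"
    )
    reorder_programme(prog)
    print([el.tag for el in prog])
```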

Geoff
Karl Dietz
2013-12-15 18:52:48 UTC
Post by Karl Dietz
No better thought but there is a commercial service for many countries
(former ontv.dk) http://dk.timefor.tv/xmltv
It might be nice to supply a grabber for them (I have tried to slap a
simple downloader together, but their demo data doesn't match our
tester's expectations) Then I ran out of spare time. If anyone feels like
picking it up, that would be appreciated.
The problem appears to be simply because their data is not well-formed against xmltv.dtd. Specifically it has the <icon> element *after* the <episode-num>. The dtd says it must come before it.
...

It's also the order of the credits.
I don't know whether we should attempt to workaround this or simply get kazer to fix their xml output. I think the latter?
I guess that's a typo. The people that run the site behind _fr_kazer
have been very responsive when I asked them to do "what it takes to make
the xmltv-tester green" :)

For timefor.tv I'd prefer if someone talks to them first. Maybe it's
itching a dev and it gets scratched by some rounds of reordering and
running tv_validate_file. (It's been so long that I forgot, but running
"xmltv.exe tv_validate_file filenamehere" should work, too)

Fixing the channel ids will likely have to be done in the grabber,
though, as it's a change that is not transparent to existing setups.

Regards,
Karl
h***@gmail.com
2013-12-15 19:37:48 UTC
Post by Karl Dietz
Post by h***@gmail.com
I don't know whether we should attempt to workaround this or simply get kazer to fix their xml output.
I guess that's a typo. The people that run the site behind _fr_kazer
have been very responsive when I asked them to do "what it takes to make
the xmltv-tester green" :)
For timefor.tv I'd prefer if someone talks to them first.
Ah, I thought kazer were responsible for creating the xml file
"Data provided via web service from kazer.org. Check their terms of usage!"

but it seems it's actually coming from timefor.tv
"SetSupplementRoot( 'http://timefor.tv/' );"

I should've read the script more closely! :)

Ok I will contact them tomorrow and see if I can get them to correct the files.

Regards,
Geoff
Karl Dietz
2013-12-15 20:38:30 UTC
Post by h***@gmail.com
Post by Karl Dietz
Post by h***@gmail.com
I don't know whether we should attempt to workaround this or simply get kazer to fix their xml output.
I guess that's a typo. The people that run the site behind _fr_kazer
have been very responsive when I asked them to do "what it takes to make
the xmltv-tester green" :)
For timefor.tv I'd prefer if someone talks to them first.
Ah, I thought kazer were responsible for creating the xml file
"Data provided via web service from kazer.org. Check their terms of usage!"
but it seems it's actually coming from timefor.tv
"SetSupplementRoot( 'http://timefor.tv/' );"
FYI, that's a hack because we have no generic "download a file, but be
nice to the site and perform proper caching and stuff" mechanism. But
it's working well :)
Post by h***@gmail.com
I should've read the script more closely! :)
Ok I will contact them tomorrow and see if I can get them to correct the files.
Mea culpa. That is because I was lazy and used _fr_kazer as template :D
http://wiki.xmltv.org/index.php/User:Dekarl/Static_File_Grabber_Template
(hm, should move that page into the main space from my user space)

Both sites provide premade xmltv files for download and _fr_kazer was my
test case for the concept of a grabber for sites that host one big xmltv
file.

Regards,
Karl
h***@gmail.com
2013-12-16 07:53:40 UTC
Post by Karl Dietz
Post by h***@gmail.com
but it seems it's actually coming from timefor.tv
"SetSupplementRoot( 'http://timefor.tv/' );"
FYI, that's a hack because we have no generic "download a file, but be
nice to the site and perform proper caching and stuff" mechanism. But
its working well :)
Yes I saw that and thought, "interesting lateral thinking" :) Nice idea.
Post by Karl Dietz
Mea culpa. That is because I was lazy and used _fr_kazer as template :D
No problem - we all do that :) No point in re-inventing the wheel each time.

Cheers,
Geoff