I'm going to have to move onto other things for a bit. I did have a play
with using DateTime within the existing perl code, but not in a nice
generic (library) fashion that might be usable throughout XMLTV.
Possibly someone with more perl experience would be able to find it easier.
I will try and have another go at some point, but I need to move onto other
problems.
I have finished playing with my python script though. I tried experimenting
with some output caching, but this wasn't very efficient and it
was quicker to just reprocess the files each time, especially given the
absolute times are quite good anyway.
I've run it on my laptop and it creates the 62 freesat channels in around
90s, 60s of which is data fetching. This compares to around 15mins with the
original script. So assuming a roughly equal fetch time, that is about 30s of
processing against roughly 14mins, or about a 30x speedup.
My script still doesn't include ALL the title processing in the original,
but it does have most and as previously noted this is not a bottleneck in
my opinion. Adding the rest isn't a big job, I just don't need it at the
moment. I added as much as I felt necessary to allow me to do a direct file
diff.
With regards to the DST handling, this is also handled manually within the
script and probably isn't as generic as it could be (in fact I've just
realised I think I've ignored explicit timezone info in the entries).
However my testing on some old data from the internet archive showed it did
a better job than the original script which still appears to get confused
at the DST switchover (DST -> non-DST).
The output is hand-generated XML, rather than using the python xmltv output
module, since that module doesn't format its output, which again makes
comparison difficult (I couldn't seem to get tv_sort working?).
I've put the script in my googlecode svn repo, so feel free to take a look:
http://adamsutton.googlecode.com/svn/xbmc-pvr/trunk/xmltv/tv_grab_uk_rt_aps.py
The script will use the existing tv_grab_uk_rt configuration file. However
so that it doesn't interfere with any of the cached data I've implemented
my own caching under ~/.xmltv/cache2.
You can run the script as:
python tv_grab_uk_rt_aps.py > listings.xml
--debug will enable some extra debug (though not as informative as the
original)
--help will show other options
I'm not proposing this as a replacement for the existing script in any way,
since the project is obviously written in perl not python. I just thought
it might serve as a useful demonstration of an alternative, more efficient,
approach.
Regards
Adam
P.S.
I have hacked it quite a bit since I ran the comparison tests against older
data, so I won't put my hand on heart and swear it still gets all the DST
issues right ;)
I wasn't tracking it in SVN at the time, doh!
P.P.S
There are definitely still other general faults in the script, as it's just
a toy example.
Post by Karl Dietz
Post by Adam Sutton
I have spoken briefly to Robb about this and he has pointed out that I'm
probably not handling the DST issues, and he was correct I've left this
out initially (especially given the fact that DST only really becomes an
issue for that 1 nasty hour a year when the time jumps backwards,
forwards is easy to deal with due to no ambiguities). But this is
a solvable problem.
I've just started looking into this using some data I managed to grab
from archive.org covering the two change dates (a
few years back mind). And interestingly the library in use by xmltv gets
the times wrong anyway! When the time goes forward I believe it's
correct, but when the time goes back it gets all muddled up.
We are using the DateTime modules on our NonameTV sites for time
handling. I try to fixup the guide around the DST switch but the
upstream data is ambiguous more often than not.
I usually tell the module "here comes Europe/Berlin local time" then do
math in UTC and convert back to local time with explicit offset for
output.
I've considered porting the whole of Xmltv over but currently lack the
time for such a big project.
Post by Adam Sutton
I'm going to try and spend some time making my _uk_rt mod DST aware, for
the time being I'm focusing on mods directly in that code, rather than
creating a replacement time processing lib that can be used by the other
tv_grab routines. But that shouldn't be too tricky.
May I suggest looking at DateTime? It's working well for advanced stuff
like creating sets of time spans to cut and merge time sharing channels.
(see
https://github.com/dekarl/nonametv/blob/master/lib/NonameTV/Importer/Combiner.pm#L439
)
Yeah I started using DateTime for a "rough" reworking. Unfortunately my
perl is VERY rusty and while my implementation was much faster than the
original it's nowhere near as quick as my from scratch python
implementation.
I didn't see any obvious auto timezone calculation in DateTime, however
the rules for determining DST are relatively simple if you know whether or
not they apply for a given timezone, which we do for the Radio Times since
all times are given in UK local time.
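
As a rough sketch of how simple that rule is (not the code from my script or from DateTime): UK summer time runs from 01:00 UTC on the last Sunday of March to 01:00 UTC on the last Sunday of October. The function names here are made up for illustration, and the 2012 dates in the usage note are just example values.

```python
from datetime import datetime, timedelta


def last_sunday(year, month):
    """Return midnight on the last Sunday of the given month (naive datetime)."""
    # Find the last day of the month by stepping back from the next month.
    if month == 12:
        first_of_next = datetime(year + 1, 1, 1)
    else:
        first_of_next = datetime(year, month + 1, 1)
    last_day = first_of_next - timedelta(days=1)
    # weekday(): Monday == 0 ... Sunday == 6, so Sunday needs no step back.
    return last_day - timedelta(days=(last_day.weekday() + 1) % 7)


def uk_is_dst(utc_dt):
    """True if BST (+0100) applies at the given naive UTC datetime."""
    start = last_sunday(utc_dt.year, 3).replace(hour=1)   # 01:00 UTC, March
    end = last_sunday(utc_dt.year, 10).replace(hour=1)    # 01:00 UTC, October
    return start <= utc_dt < end
```

For example, uk_is_dst(datetime(2012, 7, 1, 12, 0)) is True and uk_is_dst(datetime(2012, 12, 1)) is False. Note the rule is evaluated on the UTC time, so it only helps directly once the local time has been resolved to UTC.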
Having local time in the source data is always going to create
ambiguities, I've had this problem in the past with other things, without
comparison to other info (i.e. preceding and following programmes) it's
simply not possible to be 100% accurate in the determination of the correct
time when moving from DST to non-DST (due to the nasty repeating hour).
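
To illustrate the "compare with the preceding programme" idea, here is a minimal sketch. It assumes the start times arrive in schedule order and picks, for each local time, the earliest valid UTC instant that is not before the previous programme. That strategy and the names candidates() and resolve() are assumptions for illustration, not necessarily what my script or the perl grabber does.

```python
from datetime import datetime, timedelta


def last_sunday(year, month):
    """Midnight on the last Sunday of the given month (naive datetime)."""
    nxt = datetime(year + 1, 1, 1) if month == 12 else datetime(year, month + 1, 1)
    day = nxt - timedelta(days=1)
    return day - timedelta(days=(day.weekday() + 1) % 7)


def uk_is_dst(utc_dt):
    """True if BST applies at the given naive UTC datetime."""
    start = last_sunday(utc_dt.year, 3).replace(hour=1)
    end = last_sunday(utc_dt.year, 10).replace(hour=1)
    return start <= utc_dt < end


def candidates(local_dt):
    """All UTC instants that would show as local_dt on a UK clock."""
    found = []
    for offset in (timedelta(hours=1), timedelta(0)):  # try BST, then GMT
        utc = local_dt - offset
        # Keep the guess only if the offset agrees with the DST rule.
        if uk_is_dst(utc) == (offset != timedelta(0)):
            found.append(utc)
    return sorted(found)


def resolve(local_times):
    """Map schedule-ordered UK local times to UTC, never going backwards."""
    result, prev = [], None
    for local in local_times:
        options = candidates(local)
        if not options:
            # Local time falls in the skipped hour in spring.
            raise ValueError("no such local time: %s" % local)
        chosen = next((u for u in options if prev is None or u >= prev),
                      options[-1])
        result.append(chosen)
        prev = chosen
    return result
```

Given local starts of 01:00, 01:30, 01:00, 01:30 on a switchover night, the second 01:00 resolves to an hour later than the first because it cannot precede the 01:30 before it. A lone programme inside the repeated hour is still a guess, which is the ambiguity described above.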
I've noted that even the existing _uk_rt script does not actually handle
the DST changeovers properly, whereas my implementation does. It tends to
output the end time in the same timezone as the start, even if the end is
no longer subject to DST, however the actual time is at least correct. I.e.
it might output 0205 +0100, even though strictly it's 0105 +0000, but the
represented UTC time is at least correct.
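
A quick way to check that the two forms name the same instant, using XMLTV-style timestamps. The date is just an example value, and parse_xmltv() is a hypothetical helper, not part of any xmltv module.

```python
from datetime import datetime, timezone


def parse_xmltv(stamp):
    """Parse a 'YYYYMMDDhhmmss +hhmm' timestamp into an aware datetime."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S %z")


as_written = parse_xmltv("20121028020500 +0100")  # end time left in the start's offset
strict = parse_xmltv("20121028010500 +0000")      # same end time expressed in GMT

# Aware datetimes compare by the instant they represent, not by wall clock.
same_instant = as_written == strict

# Normalising to UTC gives one canonical form for a file diff.
normalised = as_written.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S %z")
```

Here same_instant is True, so a consumer that honours the offset sees the correct time either way; only a plain text diff of the two outputs would flag a difference.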
Post by Karl Dietz
Post by Adam Sutton
My intention will be to provide a patch once I actually have something
that works properly and includes all time checks etc... I'm happy to
also provide my own python implementation, but to be honest I'd
personally rather use the xmltv scripts (with mods) since these are well
used/tested etc... However I may do some more work on my python scripts
as it could form a useful testing ground (as I better understand the
code) and a useful comparison.
I'm not using the on-air guide since I only seem to be getting the
now/next info, plus I believe EIT is only 7 days anyway? I'd rather have
the full 2 weeks if possible.
We have up to 4 weeks of DVB-EIT on satellite/cable in Germany. I don't
know how far the Freesat/Freeview MHEG guide goes into the future.
I had been told there was a 7 day EPG on the air, but it wasn't
automatically detected by tvheadend. And in the grand scheme of things,
i.e. getting a working replacement for my existing
standalone satellite PVR, that won't get me in trouble with the wife, the
EPG performance is low down the list.
However I thought I could quickly contribute something useful in this area.
Post by Karl Dietz
PS: EIT is explicit about the time ;)
No idea :(