[Xmltv-devel] tv_sort fails to sort category within programme
Alistair Grant
2015-01-08 15:36:22 UTC
Hi,

I'm attempting to run tv_grab_huro through the tests on
http://wiki.xmltv.org/index.php/XmltvValidation however it is
currently failing the third test, testing the difference of two sorted
files.

The two files appear to have the same content, however because tv_sort
doesn't sort categories within programmes, the diff command produces
output. For example:

t_1.xml:

<programme start="20150106055000 +0100" stop="20150106090000 +0100"
channel="001.port.hu">
<title lang="hu">Ma Reggel</title>
<desc lang="hu">Ma Reggel Közéleti műsor Benne: Híradó,
sporthírek, időjárás-jelentés Feliratozva a Teletext 111.
oldalán.</desc>
<category lang="en">Sports</category>
<category lang="hu">Sport</category>
<category lang="en">News</category>
<category lang="hu">Hírek</category>
</programme>

t_1.sorted.xml:

<programme start="20150106055000 +0100" stop="20150106090000 +0100"
channel="001.port.hu">
<title lang="hu">Ma Reggel</title>
<desc lang="hu">Ma Reggel Közéleti műsor Benne: Híradó,
sporthírek, időjárás-jelentés Feliratozva a Teletext 111.
oldalán.</desc>
<category lang="en">Sports</category>
<category lang="hu">Sport</category>
<category lang="en">News</category>
<category lang="hu">Hírek</category>
</programme>


t_2_3.xml:

<programme start="20150106055000 +0100" stop="20150106090000 +0100"
channel="001.port.hu">
<title lang="hu">Ma Reggel</title>
<desc lang="hu">Ma Reggel Közéleti műsor Benne: Híradó,
sporthírek, időjárás-jelentés Feliratozva a Teletext 111.
oldalán.</desc>
<category lang="en">News</category>
<category lang="hu">Hírek</category>
<category lang="en">Sports</category>
<category lang="hu">Sport</category>
</programme>

t_2_3.sorted.xml:

<programme start="20150106055000 +0100" stop="20150106090000 +0100"
channel="001.port.hu">
<title lang="hu">Ma Reggel</title>
<desc lang="hu">Ma Reggel Közéleti műsor Benne: Híradó,
sporthírek, időjárás-jelentés Feliratozva a Teletext 111.
oldalán.</desc>
<category lang="en">News</category>
<category lang="hu">Hírek</category>
<category lang="en">Sports</category>
<category lang="hu">Sport</category>
</programme>

As can be seen above, the two files have the same programme entry
contents, but the categories are in a different order, which remains
after sorting.
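
To make the comparison concrete, here is a minimal sketch in Python (not the Perl used by tv_sort, and not the XmltvValidation test itself) that treats two programme entries as equal when they differ only in category order. The helper name normalised and the shortened sample entries are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copies of the two entries above (desc omitted for brevity).
PROGRAMME_A = """<programme start="20150106055000 +0100" stop="20150106090000 +0100" channel="001.port.hu">
<title lang="hu">Ma Reggel</title>
<category lang="en">Sports</category>
<category lang="hu">Sport</category>
<category lang="en">News</category>
<category lang="hu">Hírek</category>
</programme>"""

PROGRAMME_B = """<programme start="20150106055000 +0100" stop="20150106090000 +0100" channel="001.port.hu">
<title lang="hu">Ma Reggel</title>
<category lang="en">News</category>
<category lang="hu">Hírek</category>
<category lang="en">Sports</category>
<category lang="hu">Sport</category>
</programme>"""


def normalised(programme_xml):
    """Return a comparable form of a programme with its categories sorted."""
    elem = ET.fromstring(programme_xml)
    # Categories are sorted by (lang, text) so their original order is ignored.
    categories = sorted(
        (c.get("lang", ""), c.text or "") for c in elem.findall("category")
    )
    # All other child elements keep their document order.
    others = [
        (c.tag, c.get("lang", ""), c.text or "")
        for c in elem
        if c.tag != "category"
    ]
    return dict(elem.attrib), others, categories


# The raw text differs, but the normalised forms compare equal.
print(PROGRAMME_A == PROGRAMME_B)
print(normalised(PROGRAMME_A) == normalised(PROGRAMME_B))
```

A plain diff of the two files reports a difference, whereas a comparison along these lines would not.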

I couldn't find any reference to this problem when searching with Google.
Is this a new bug, or am I misinterpreting the files?

Thanks,
Alistair
Alistair Grant
2015-01-09 10:59:22 UTC
Hi,

I have a patch for tv_sort which sorts category within programme and
enables me to pass the basic tests in XmltvValidation
(tv_validate_grabber and the basic tests) for tv_grab_huro. It will
hopefully also reduce the number of notadditive errors reported by
test_grabbers. There will still be some notadditive errors, e.g.
I've seen the URL for the same program change between fetches,
programs added, etc.

The code can be viewed at
https://github.com/akgrant43/XMLTV/blob/huro_retry/filter/tv_sort

The two relevant patches are:

* (code) https://github.com/akgrant43/XMLTV/commit/96a273e0d035f4c7f5cca90d23cc4750bbf3d135#diff-ccf2846cb934531f32ad418a2df748ba
* (comment) https://github.com/akgrant43/XMLTV/commit/98941a50f7c29544721f335741f28d5a4add8af4#diff-ccf2846cb934531f32ad418a2df748ba

I'm not a Perl programmer, so I suspect that the code could be written
much more elegantly, but it is working for me.

As with tv_grab_huro, I'm happy to do the work to get it back into
the CVS repository.

Thanks,
Alistair
h***@gmail.com
2015-01-11 17:02:40 UTC
Post by Alistair Grant
I'm attempting to run tv_grab_huro through the tests on
http://wiki.xmltv.org/index.php/XmltvValidation however it is
currently failing the third test, testing the difference of two sorted
files.
The two files appear to have the same content, however because tv_sort
doesn't sort categories within programmes, the diff command produces
[...]
Post by Alistair Grant
As can be seen above, the two files have the same programme entry
contents, but the categories are in a different order, which remains
after sorting.
Hi,

This should really be down to the grabber itself to do. Some grabbers deliberately output their multiple categories in a specific, but not sorted, order. Sorting them will break this.

For example, one grabber pulls both categories and sub-categories from its source. In its xml it outputs the categories before the sub-categories. The reason it does this is for downstream apps which can only handle one category; it is important they receive a proper category and not a sub-category. E.g. there is no point in the app getting "Crime" (sub-category) when it is missing the more important "Film" (category) because the list has been re-ordered by tv_sort.
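
As a rough illustration of the point above, here is a small Python sketch of a downstream app that keeps only the first category it sees. The function name first_category and the two-item list are hypothetical; they are not taken from any real grabber or application.

```python
def first_category(categories):
    """Mimic an app that can only handle one category: keep the first."""
    return categories[0] if categories else None


# Order as the grabber emits it: category first, then sub-category.
grabber_order = ["Film", "Crime"]

# Order after an alphabetical sort of the same list.
sorted_order = sorted(grabber_order)

print(first_category(grabber_order))  # Film
print(first_category(sorted_order))   # Crime
```

After sorting, the app would receive the sub-category instead of the category, which is the breakage described above.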

Cheers,
Geoff
Alistair Grant
2015-01-11 18:52:08 UTC
Post by h***@gmail.com
Post by Alistair Grant
I'm attempting to run tv_grab_huro through the tests on
http://wiki.xmltv.org/index.php/XmltvValidation however it is
currently failing the third test, testing the difference of two sorted
files.
The two files appear to have the same content, however because tv_sort
doesn't sort categories within programmes, the diff command produces
[...]
Post by Alistair Grant
As can be seen above, the two files have the same programme entry
contents, but the categories are in a different order, which remains
after sorting.
Hi,
This should really be down to the grabber itself to do. Some grabbers deliberately output their multiple categories in a specific, but not sorted, order. Sorting them will break this.
For example, one grabber pulls both categories and sub-categories from its source. In its xml it outputs the categories before the sub-categories. The reason it does this is for downstream apps which can only handle one category; it is important they receive a proper category and not a sub-category. E.g. there is no point in the app getting "Crime" (sub-category) when it is missing the more important "Film" (category) because the list has been re-ordered by tv_sort.
Thanks for your reply, and good point. I'm not even sure how
tv_grab_huro is being used by various applications.

How about adding this as an option to tv_sort that is disabled by
default, with help text that it is useful for the XmltvValidation
tests?

Thanks again,
Alistair
h***@gmail.com
2015-01-12 07:50:21 UTC
Post by Alistair Grant
Thanks for your reply, and good point. I'm not even sure how
tv_grab_huro is being used by various applications.
How about adding this as an option to tv_sort that is disabled by
default, with help text that it is useful for the XmltvValidation
tests?
I think you can get sorted order from tv_grab_huro simply by changing line 743 to

foreach (sort keys %CATMAP) {

The categories in the xml should then always be consistent (rather than being indeterminate).
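
The idea can be sketched in Python as follows (the grabber itself is Perl, where the order of keys on a hash is not guaranteed). The contents of CATMAP and the function emit_categories are assumptions for illustration only; the real %CATMAP in tv_grab_huro may look quite different.

```python
# Hypothetical mapping from English to Hungarian category names.
CATMAP = {"Sports": "Sport", "News": "Hírek"}


def emit_categories(catmap, matched):
    """Emit category lines in sorted key order, whatever order 'matched' has."""
    lines = []
    for english in sorted(catmap):  # the analogue of 'sort keys %CATMAP'
        if english in matched:
            lines.append(f'<category lang="en">{english}</category>')
            lines.append(f'<category lang="hu">{catmap[english]}</category>')
    return lines


# A set has no defined iteration order, but the output is stable anyway.
for line in emit_categories(CATMAP, {"Sports", "News"}):
    print(line)
```

Because the loop walks the keys in sorted order, two runs over the same data produce the same sequence of category elements.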

Cheers,
Geoff
Alistair Grant
2015-01-12 18:48:42 UTC
Post by h***@gmail.com
Post by Alistair Grant
Thanks for your reply, and good point. I'm not even sure how
tv_grab_huro is being used by various applications.
How about adding this as an option to tv_sort that is disabled by
default, with help text that it is useful for the XmltvValidation
tests?
I think you can get sorted order from tv_grab_huro simply by changing line 743 to
foreach (sort keys %CATMAP) {
The categories in the xml should then always be consistent (rather than being indeterminate).
OK, I've made the changes as suggested above and successfully ran
tv_validate_grabber a couple of times (once with the supplied
test.conf and once with my production configuration). As before, the
code can be viewed at:

* https://github.com/akgrant43/XMLTV/blob/huro_retry/grab/huro/tv_grab_huro.in

The patch can be viewed with:

* https://github.com/akgrant43/XMLTV/commit/44d53215ddb6865ff85e533c8fd7049db7958221#diff-0430f28879ec294f84b1cf139b10ad96

The file also contains the modifications for the retry, which are
still a work-in-progress.

Please let me know what you think (about the category sorting).

Thanks,
Alistair
h***@gmail.com
2015-01-13 07:18:14 UTC
Post by Alistair Grant
OK, I've made the changes as suggested above and successfully ran
tv_validate_grabber a couple of times (once with the supplied
test.conf and once with my production configuration). As before, the
* https://github.com/akgrant43/XMLTV/blob/huro_retry/grab/huro/tv_grab_huro.in
* https://github.com/akgrant43/XMLTV/commit/44d53215ddb6865ff85e533c8fd7049db7958221#diff-0430f28879ec294f84b1cf139b10ad96
The file also contains the modifications for the retry, which are
still a work-in-progress.
Please let me know what you think (about the category sorting).
Cool, thanks. Category sorting change committed into CVS, tv_grab_huro v1.54.
Alistair Grant
2015-01-13 07:45:32 UTC
Hi Geoff,
Post by h***@gmail.com
Post by Alistair Grant
OK, I've made the changes as suggested above and successfully ran
tv_validate_grabber a couple of times (once with the supplied
test.conf and once with my production configuration). As before, the
* https://github.com/akgrant43/XMLTV/blob/huro_retry/grab/huro/tv_grab_huro.in
* https://github.com/akgrant43/XMLTV/commit/44d53215ddb6865ff85e533c8fd7049db7958221#diff-0430f28879ec294f84b1cf139b10ad96
The file also contains the modifications for the retry, which are
still a work-in-progress.
Please let me know what you think (about the category sorting).
Cool, thanks. Category sorting change committed into CVS, tv_grab_huro v1.54.
Thanks very much for applying this!

Cheers,
Alistair
