[Xmltv-devel] tv_grab_sd_json: Total number of episodes missing for many TV series
Finn Ellebaek Nielsen (Appatra Limited)
2016-06-04 14:53:47 UTC
Dear all

XMLTV 0.5.68
tv_grab_sd_json 1.9, 2016/05/28 01:56:20

I'm trying out the Schedules Direct service as a replacement for the RadioTimes UK feed, which I've been using for 4 years and which sadly will be taken down on 16 Jun 2016.

It appears that this grabber often doesn't provide the total number of episodes for TV series, which is a crucial element in my usage. On a few occasions it's there but for most episodes it's missing.

Here's an example:

RadioTimes UK:

<programme start="20160615213000 +0100" stop="20160615223000 +0100" channel="UK_RT_105">
<title lang="en">Versailles</title>
...
<episode-num system="xmltv_ns">0.2/10.</episode-num>
</programme>

Schedules Direct tv_grab_sd_json:

<programme start="20160615203000 +0000" stop="20160615212500 +0000" channel="50059">
<title>Versailles</title>
...
<episode-num system="xmltv_ns">0.2.</episode-num>
<episode-num system="dd_progid">EP023565850003</episode-num>
</programme>

As you can see, RadioTimes UK provides "0.2/10." (total number of episodes = 10), Schedules Direct provides "0.2." and "EP023565850003" (total number of episodes unknown).
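
To make the comparison concrete, here is a minimal sketch of how a consumer might read the total out of an xmltv_ns value. It assumes the usual "season.episode.part" layout in which each field is "X/Y", with X zero-based and Y the total; the function name parse_xmltv_ns is hypothetical and is not part of XMLTV or the grabber.

```python
def parse_xmltv_ns(value):
    """Split an xmltv_ns string into (number, total) pairs.

    Returns three pairs, for season, episode and part. A missing
    number or total is reported as None. Numbers are left zero-based,
    exactly as they appear in the XML.
    """
    def pair(text):
        number, _, total = text.strip().partition("/")
        return (int(number) if number.strip() else None,
                int(total) if total.strip() else None)

    # Pad so that short values still yield three fields.
    fields = (value.split(".") + ["", "", ""])[:3]
    return [pair(field) for field in fields]


print(parse_xmltv_ns("0.2/10."))  # RadioTimes UK value
print(parse_xmltv_ns("0.2."))     # tv_grab_sd_json value
```

For the RadioTimes value the episode pair is (2, 10); for the tv_grab_sd_json value it is (2, None), which is the missing total described above.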

According to Schedules Direct, the information is there in the source:

{
"programID": "SH023565850000",
"resourceID": "12271345",
"titles": [
{
"title120": "Versailles"
}
],
...
"metadata": [
{
"Gracenote": {
"totalEpisodes": 10,
"totalSeasons": 1,
"season": 0,
"episode": 0
}
}
],

Is this something you're aware of and will be fixing?

Cheers

Finn

______________________________________________________________________
This email has been scanned by the Symantec Email Security.cloud service.
For more information please visit http://www.symanteccloud.com
______________________________________________________________________
Kevin Groeneveld
2016-06-04 23:59:51 UTC
Total episodes/seasons was only recently added to the SD-JSON data so it
has not had a lot of testing in the xmltv grabber.

I am a little confused about the example data you posted:

On Sat, Jun 4, 2016 at 10:53 AM, Finn Ellebaek Nielsen (Appatra Limited) wrote:
<programme start="20160615203000 +0000" stop="20160615212500 +0000" channel="50059">
<title>Versailles</title>
...
<episode-num system="xmltv_ns">0.2.</episode-num>
<episode-num system="dd_progid">EP023565850003</episode-num>
</programme>
The above data has a program ID of EP023565850003.
Post by Finn Ellebaek Nielsen (Appatra Limited)
{
"programID": "SH023565850000",
"resourceID": "12271345",
"titles": [
{
"title120": "Versailles"
}
],
But the above JSON has a program ID of SH023565850000.

If I download the data for EP023565850003 there is no totalEpisodes or
totalSeasons:

{
'showType' => 'Series',
'titles' => [
{
'title120' => 'Versailles'
}
],
'programID' => 'EP023565850003',
'metadata' => [
{
'Gracenote' => {
'episode' => 3,
'season' => 1
}
}
],

For the above, "0.2." seems correct.

That being said, the current tv_grab_sd_json grabber ignores totalEpisodes
if episode=0 and ignores totalSeasons if season=0. So for the example JSON
you posted it would still not include the totals in the XML output. In the
JSON, 0 means the value is unknown, whereas 0 in XMLTV means episode/season 1.
Post by Finn Ellebaek Nielsen (Appatra Limited)
Also note the programID of SH*: you shouldn't generate xmltv_ns data for
SH* programs, because SH* means Gracenote doesn't know any details.

The example you posted has an SH* programID. But since it does include
totalEpisodes and totalSeasons (even though season/episode are both
zero/unknown), it seems maybe these should be included in the XML. Maybe
something like "/1./10.". I am not sure if it is considered valid to leave
out the "X" in the "X/Y" notation. The DTD has examples of just "X" but
not "/Y".
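
As a rough illustration of the mapping discussed above, the following sketch formats a Gracenote metadata block as an xmltv_ns value. It is not the grabber's code: the function name xmltv_ns is hypothetical, and unlike the current grabber it emits a total even when the matching number is unknown, which yields the "/Y" form whose validity is still an open question.

```python
def xmltv_ns(meta):
    """Format Gracenote season/episode metadata as an xmltv_ns string.

    Assumption: in the SD-JSON data a value of 0 (or a missing key)
    means "unknown", while xmltv_ns numbers are zero-based.
    """
    def field(number, total):
        # Convert the one-based JSON number to zero-based, or leave it blank.
        text = str(number - 1) if number else ""
        if total:
            text += "/" + str(total)
        return text

    season = field(meta.get("season", 0), meta.get("totalSeasons", 0))
    episode = field(meta.get("episode", 0), meta.get("totalEpisodes", 0))
    return season + "." + episode + "."


# EP record from this thread: season 1, episode 3, no totals.
print(xmltv_ns({"season": 1, "episode": 3}))  # 0.2.

# SH record from this thread: totals known, season/episode unknown.
print(xmltv_ns({"totalEpisodes": 10, "totalSeasons": 1,
                "season": 0, "episode": 0}))  # /1./10.
```

Dropping the total whenever the number is 0 would reproduce the current behaviour described above.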

Kevin
Finn Ellebaek Nielsen (Appatra Limited)
2016-06-05 08:49:06 UTC
I was told by SD that the SH record holds metadata about the series and that the EP record holds metadata about the episode and that you need to combine the two.

I'm not aware whether the SH key is generic and there's supposed to be one record per season (with season = 1, season = 2 ... season = n) or whether it's specific to each season. If it's specific to each season I guess you can ignore season = 0 because you already know the season number from the episode information. Perhaps you need to work this out with SD.

Here's an example of where totalEpisodes is included in the XMLTV:

<programme start="20160616075500 +0000" stop="20160616083500 +0000" channel="47657">
<title>Jamie's Comfort Food</title>
<sub-title>The Ultimate Cheese Toastie</sub-title>
<category>Series</category>
<category>series</category>
<episode-num system="xmltv_ns">0.1/8.</episode-num>
<episode-num system="dd_progid">EP019817390002</episode-num>
</programme>

Robert Kulagowski
2016-06-05 16:19:44 UTC
Take a look at the similarities between these two:

EP023565850003

SH023565850000

In an EP, the last 4 digits are typically used as an episode number,
but don't correlate to S/E, or broadcast order, or anything that means
anything externally. In an SH, they will be "0000".

"EP" means that there is specific information for _this_ episode. It
will have guest stars, S/E, plot description for this episode, etc.

"SH" is the Show record. If we don't know specifics, because the
broadcaster hasn't shared it yet, but we know that _an_ episode of the
program is playing 3 weeks from now, there will be an SH record as a
placeholder. But, "SH" records are also used to store metadata about
the series, such as the total number of episodes. It will have
information about "you may also like..." recommendations.

So the middle set of characters after the first two and before the
last 4 are specific to the series.

If you want to know the total number of episodes for an EP, then you
need to pull the SH record. You can programmatically determine the SH
record programID based on the above pattern and description.
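
The pattern described above can be sketched as follows. This is only an illustration of the description in this message: the function name show_id_for is hypothetical, and it handles only the EP and SH prefixes mentioned here.

```python
def show_id_for(program_id):
    """Derive the SH (show) programID for an EP or SH programID.

    Assumption, taken from the description above: the ID is a
    two-letter prefix, a series-specific run of digits, and a final
    four digits that are "0000" in an SH record.
    """
    prefix, digits = program_id[:2], program_id[2:]
    if prefix not in ("EP", "SH") or len(digits) <= 4 or not digits.isdigit():
        raise ValueError("unexpected programID: " + program_id)
    # Keep the series-specific digits and zero the last four.
    return "SH" + digits[:-4] + "0000"


print(show_id_for("EP023565850003"))  # SH023565850000
```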
Kevin Groeneveld
2016-06-05 23:13:23 UTC
Post by Robert Kulagowski
If you want to know the total number of episodes for an EP, then you
need to pull the SH record. You can programmatically determine the SH
record programID based on the above pattern and description.
This is the first I have heard that you can programmatically generate an SH
ID from an EP ID to get further show information. Is that in the JSON API
documentation anywhere? Maybe I just missed it. I thought the SH IDs were
only used when exact episode information was not known and if there was an
EP record it would have all the information.

Although for the specific example of this thread I am still not sure this
would fully solve the problem.

For totalEpisodes the docs state:

'in an "EP" program this indicates the total number of episodes in this
season'

and

'In an "SH" program, it will indicate the total number of episodes in the
series.'

It is my understanding that the xmltv_ns format is supposed to be "total
number of episodes this season" as from the EP record and does not include
"total number of episodes in the series" as from the SH record. In the
EP023565850003 example there was no totalEpisodes field to include in the
xmltv_ns data.

For totalSeasons the docs state:

'totalSeasons: integer indicating the total number of seasons in the
series. *SH* programs only.'

Why not include this in the EP record as well?

If we really need to pull the SH record in addition to the EP record, how
should the grabber know when to update the SH record? Does the md5 in the
programs array change if either the EP record or SH record changes?


Kevin
