Discussion:
[Xmltv-devel] Add 'votes' attribute to star-rating element
h***@gmail.com
2014-04-10 16:58:32 UTC
May I propose a change to the DTD. Some websites have "star-rating" based on the number of votes cast by people viewing the website. In these cases it's important to know the number of voters to make useful sense of the rating score. E.g. a programme might have a score of "0" but that's because no votes have been cast. Similarly a score of "10" means little if the number of voters is "1"!

I propose adding an attribute to the star-rating element called "votes" which could have the number of votes cast.

e.g.
http://port.ro/inchisoarea_ingerilor_the_shawshank_redemption/pls/fi/films.film_page?i_perf_id=59569078&i_topic_id=1
Rating = 9,7/10 from 1650 votes

So rather than :
<star-rating>
<value>10 / 10</value>
</star-rating>

It would have :
<star-rating votes="1650">
<value>10 / 10</value>
</star-rating>


Likewise
http://port.hu/noddy_kalandjai_jatekvarosban_ii._make_way_for_noddy/pls/fi/films.film_page?i_perf_id=41219731&i_topic_id=1

would have
<star-rating votes="2">
<value>1 / 10</value>
</star-rating>

indicating that the film isn't necessarily that bad!


(FTAOD this would be in addition to the existing "system" attribute)
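
For illustration, a consuming app could read the proposed attribute roughly as follows. This is only a sketch: the "votes" attribute is the proposal above and not part of the current DTD, and read_star_rating is a made-up helper name.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<star-rating votes="1650">
  <value>10 / 10</value>
</star-rating>
"""

def read_star_rating(xml_text):
    """Return (score, out_of, votes); votes is None when the attribute is absent."""
    node = ET.fromstring(xml_text.strip())
    # The value is 'N / M'; whitespace around the slash is ignored by float().
    score, out_of = (float(part) for part in node.findtext("value").split("/"))
    votes = node.get("votes")  # proposed attribute, hypothetical
    return score, out_of, int(votes) if votes is not None else None

print(read_star_rating(SAMPLE))
```

An app that does not know about "votes" simply never asks for the attribute, so existing consumers would keep working unchanged.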

Rgds,
Geoff
Robert Eden
2014-04-10 17:39:50 UTC
Nice idea Geoff...

Would "Weight" or "Strength" be a better name than "votes"? That would
allow automated rating systems to give a confidence number. (It's
still up to the app to determine how useful a "system" is, of course).

So rather than :
<star-rating>
<value>10 / 10</value>
</star-rating>

It would have :
<star-rating system="Geoff votes" weight="1650">
<value>10 / 10</value>
</star-rating>
h***@gmail.com
2014-04-10 19:21:03 UTC
Post by Robert Eden
Nice idea Geoff...
Would "Weight" or "Strength" be a better name than "votes"? That would
allow automated rating systems to give a confidence number. (It's
still up to the app to determine how useful a "system" is, of course).
<star-rating system="Geoff votes" weight="1650">
<value>10 / 10</value>
</star-rating>
Yes, "weight" is a much better term - I like that. Allows for many more systems than simply votes.
Robert Eden
2014-04-10 20:20:45 UTC
Post by h***@gmail.com
Post by Robert Eden
Nice idea Geoff...
Would "Weight" or "Strength" be a better name than "votes"? That would
allow automated rating systems to give a confidence number. (It's
still up to the app to determine how useful a "system" is, of course).
<star-rating system="Geoff votes" weight="1650">
<value>10 / 10</value>
</star-rating>
Yes, "weight" is a much better term - I like that. Allows for many more systems than simply votes.
Maybe expand it further... do we want to include guidelines for weight?
Instead of 1 to infinity, say 0-9.

We can provide guidelines...
0 = untrusted
1 = a few random internet reviews (maybe even int(votes / 1000))
5 = lots of random internet reviews
9 = lots of "quality" professional reviews

I'm not looking to tie down a system (the terms "lots" and "quality" are
purposely left to interpretation) but to provide a way for apps to compare
weights by having a guideline.
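
A grabber for a vote-based site could derive such a weight along these lines. This is a sketch only: weight_from_votes and the 1000-votes-per-step default are assumptions taken from the int(votes / 1000) aside above, and it says nothing about how to score professional reviews.

```python
def weight_from_votes(votes, votes_per_step=1000):
    """Map a raw vote count onto the suggested 0-9 weight scale."""
    if votes <= 0:
        return 0  # no votes at all: untrusted
    # Integer division as in int(votes / 1000), capped at the top of the scale.
    return min(9, votes // votes_per_step)

for count in (2, 1650, 50000):
    print(count, weight_from_votes(count))
```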
Ben Bucksch
2014-04-10 20:53:48 UTC
I concur with Robert that some abstraction is necessary. We don't need
the particularities of each source calculated in each app.

I would, in fact, recommend that you simply filter them out on the
grabber level. I.e. if you (as the grabber author) feel that a rating is
not useful, because there are too few votes, then simply don't add the
rating at all. I think that's the easiest and also cleanest fix.
Post by Robert Eden
<star-rating system="Geoff votes" weight="1650">
<value>10 / 10</value>
</star-rating>
*If* you add the weight="", it should be an attribute on the <value>
node, not on <star-rating>.
Or you could add a new element <weight>.

Ben
Robert Eden
2014-04-10 22:31:22 UTC
Post by Ben Bucksch
I concur with Robert that some abstraction is necessary. We don't need
the particularities of each source calculated in each app.
I would, in fact, recommend that you simply filter them out on the
grabber level. I.e. if you (as the grabber author) feel that a rating is
not useful, because there are too few votes, then simply don't add the
rating at all. I think that's the easiest and also cleanest fix.
Post by Robert Eden
<star-rating system="Geoff votes" weight="1650">
<value>10 / 10</value>
</star-rating>
*If* you add the weight="", it should be an attribute on the <value>
node, not on <star-rating>.
Or you could add a new element <weight>.
Good point Ben... and I'll change my divinely inspired "weight" idea....
if it's on value (as it should be), how about "Quality" with the same
0-9 suggested scale.

Robert
Ben Bucksch
2014-04-10 23:59:34 UTC
Personally, I think these low-quality ratings shouldn't even be
added in the first place.

What is an app going to do with them? Show them in a different color?
What about when rating is the primary sorting criterion (for a list of
movies)? How would weight/quality come in?
And what about apps that don't understand the quality? They'll display
(or worse: use in their algorithm) random ratings based on a single user
or no vote.

I think the grabber is in the best position to judge what to do with it.

Ben
h***@gmail.com
2014-04-11 12:32:45 UTC
Post by Robert Eden
Maybe expand it further... do we want to include guidelines for weight?
Instead of 1 to infinity, say 0-9.
Good idea.
Post by Robert Eden
[snip suggested score definitions]
I think that's the best approach where the score is based on a mix of review entities (e.g. "internet" mixed with "professional").

We can go one better where a numeric score is displayed on the site, by calculating the population size required to give a confidence level.

Calculating the sample sizes to give various confidence levels (with a margin of error = 5%) suggests the following values:

confidence level   population
0.1                2
0.2                6
0.3                15
0.4                27
0.5                45
0.6                71
0.7                107
0.8                164
0.9                271
1                  664

(The statistical formula used to calculate these is largely independent of total population size).

I'm not suggesting we prescribe these actual values, but as "guidance" they have some merit. (Simply drop the decimal point to get 1 through 10, then shift down one to get 0 through 9?)
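
Figures close to the table above can be obtained from the usual sample-size formula for a proportion, n = z^2 * p * (1 - p) / e^2, with e = 0.05. The sketch below assumes that is the formula meant here, assumes the worst case p = 0.5, and treats the last row as 0.99, since a confidence of exactly 1 would need an infinite sample. Results may differ from the table by one depending on rounding.

```python
from statistics import NormalDist

def sample_size(confidence, margin=0.05, p=0.5):
    """Votes needed for a two-sided confidence level, ignoring population size."""
    # z is the standard normal quantile that leaves (1 - confidence) in the two tails.
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    return round(z * z * p * (1 - p) / margin ** 2)

for level in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.99):
    print(level, sample_size(level))
```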
Post by Ben Bucksch
I would, in fact, recommend that you simply filter them out on the
grabber level. I.e. if you (as the grabber author) feel that a rating is
not useful, because there are too few votes, then simply don't add the
rating at all. I think that's the easiest and also cleanest fix.
But that only gives you a yes/no response (i.e. whether to output or not) - it doesn't give the finer-grained 'confidence' in the value.

E.g. if I set an arbitrary limit at 10 votes then is 11 votes really the same as 11,000?

I feel a "confidence level" (a.k.a. "weight") is still required to make good use of the "value". For the same reason the source website includes it in the screen display! ;-)
Post by Ben Bucksch
*If* you add the weight="", it should be an attribute on the <value>
node, not on <star-rating>.
Semantically I agree with you but I think it should go on the "star-rating" to allow for source sites which have only an "icon" and not a "value".

e.g.
<star-rating system="tomatoes" quality="8">
<icon src="http://example.co.uk/icons/4-stars.png" />
</star-rating>
Post by Ben Bucksch
Personally, I think these low-quality ratings shouldn't even be
added in the first place.
[...]
Post by Ben Bucksch
I think the grabber is in the best position to judge what to do with it.
One is damned either way. I don't think it should be up to the grabber author to manipulate the data - who am I to say what the downstream app is going to do with it? Without an in-depth understanding of exactly what all apps are using it for, I can't say whether any arbitrary limits set in the grabber are 'fair and reasonable' or not.

No, I maintain it should be up to the app to determine what it displays - if it displays the wrong values then that's *its* problem not the grabber's.

Sure you can set a 'minimum' limit, but I dislike the simplistic binary approach of:
votes < minimum limit = no rating at all
votes > minimum limit = 100% confidence in the rating value displayed.

The first is ok, but the second is grossly misleading.

The source websites don't work like this; and for a good reason.

Rgds,
Geoff
Robert Eden
2014-04-19 20:51:00 UTC
Post by h***@gmail.com
Post by Ben Bucksch
*If* you add the weight="", it should be an attribute on the <value>
node, not on <star-rating>.
Semantically I agree with you but I think it should go on the "star-rating" to allow for source sites which have only an "icon" and not a "value".
e.g.
<star-rating system="tomatoes" quality="8">
<icon src="http://example.co.uk/icons/4-stars.png" />
</star-rating>
When I read the DTD, <value> is required, <icon> is optional. In
addition, the DTD states a grabber should try to "map whatever wacky system
your listings source uses to a number of stars". :) So if there is only
a 4-stars.png icon, the grabber should turn that into a value of 4.

I also agree with Ben that "weight" should go on <value>....

So how about this: (note, I'm not an XML expert, please correct me... I
assume CDATA is for character data, but I couldn't find a numeric
equivalent. Maybe "byte"?)

-------------------------------

<!-- 'Star rating' - many listings guides award a programme a score as
a quick guide to how good it is. The value of this element should be
'N / M', for example one star out of a possible five stars would be
'1 / 5'. Zero stars is also a possible score (and not the same as
'unrated'). You should try to map whatever wacky system your listings
source uses to a number of stars: so for example if they have thumbs
up, thumbs sideways and thumbs down, you could map that to two, one or
zero stars out of two. If a programme is marked as recommended in a
listings guide you could map this to '1 / 1'. Because there could be many
ways to provide star-ratings or recommendations for a programme, you can
specify multiple star-ratings. You can specify the star-rating system
used, or the provider of the recommendation, with the system attribute.
Whitespace between the numbers and slash is ignored.
The "weight" attribute of value can be used to provide the strength of the
recommendation: thousands of 5* ratings are "heavier" than five 5* ratings.
For simplicity "weight" should range from 0-9. Suggestions on computing
values are in the XMLTV Wiki.
-->

<!ELEMENT star-rating (value, icon*)>
<!ATTLIST star-rating system CDATA #IMPLIED>
<!ATTLIST value weight CDATA #IMPLIED>
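
On the CDATA question: DTD attribute types are all string-like (CDATA, NMTOKEN, enumerations and so on) and there is no numeric type, so a consumer would have to check the range itself. A rough sketch of how an app might read the proposed attribute follows; read_weighted_rating is a hypothetical helper and "weight" on <value> is only the proposal above.

```python
import xml.etree.ElementTree as ET

def read_weighted_rating(xml_text):
    """Return (system, value_text, weight); weight is an int 0-9 or None."""
    node = ET.fromstring(xml_text.strip())
    value = node.find("value")
    if value is None:
        raise ValueError("<value> is required inside <star-rating>")
    raw = value.get("weight")  # proposed attribute, optional (#IMPLIED)
    weight = None
    if raw is not None:
        weight = int(raw)
        if not 0 <= weight <= 9:
            raise ValueError(f"weight out of range: {weight}")
    return node.get("system"), value.text.strip(), weight

print(read_weighted_rating(
    '<star-rating system="tomatoes"><value weight="8">4 / 5</value></star-rating>'
))
```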

Karl Dietz
2014-04-11 06:54:29 UTC
Post by h***@gmail.com
May I propose a change to the DTD. Some websites have "star-rating" based on the number of votes cast by people viewing the website. In these cases it's important to know the number of voters to make useful sense of the rating score. E.g. a programme might have a score of "0" but that's because no votes have been cast. Similarly a score of "10" means little if the number of voters is "1"!
I propose adding an attribute to the star-rating element called "votes" which could have the number of votes cast.
I'm with Ben: the XMLTV concept is based heavily on the grabber making
sense of the source data, so the consuming application of the guide does
not have to add magic stuff to make sense of every source site. Our
consumers have a hard time with simple things like "more than one
rating", so let's not make it harder for them.

Here is an example of how I handle multiple ratings / vote counts per
item in NonameTV.
https://github.com/dekarl/nonametv/blob/master/lib/NonameTV/Augmenter/Tvdb.pm#L185

Based on "carefully adjusted magic numbers" the code decides which
rating to use as "the one true rating" if any.

And here's another site
https://github.com/dekarl/nonametv/blob/master/lib/NonameTV/Augmenter/Tmdb3.pm#L132

A vote count below the threshold would be encoded as "no rating" instead
of "rating of 0 / x".
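
The thresholding idea can be sketched in a few lines. This is not the NonameTV code (which is Perl and lives at the links above); rating_for_output and the MIN_VOTES value are made up for illustration.

```python
MIN_VOTES = 10  # hypothetical threshold; the "magic number" is up to the grabber author

def rating_for_output(average, votes, out_of=10, min_votes=MIN_VOTES):
    """Return the 'N / M' string to emit, or None to leave the rating out."""
    if votes < min_votes:
        return None  # too few votes: emit no rating rather than a misleading one
    return f"{round(average)} / {out_of}"

print(rating_for_output(9.7, 1650))
print(rating_for_output(1.0, 2))
```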

Regards,
Karl