h***@gmail.com
2014-05-01 07:54:51 UTC
While working on tv_imdb I've come across an issue with the way it currently handles episodes of a series. Basically it ignores episodes and lumps everything together under one title, so you get the entire series rolled up under the series name.
I believe this is wrong. E.g. try listing the actors for:
grep '"Doctor Who" (1963)' stage3.data
or
grep '"Coronation Street" (1960)' stage3.data
I think it's no wonder it needs so much memory to run ;-)
However, the final stage (creating the actual database) then ignores these records, since they don't match stage1.data, which *does* still contain the episode in the title!
grep '"Doctor Who" (1963)' stage1.data
So the net result is that "Doctor Who" (1963) has no data at all (except ratings) in the final database:
doctor%20who%20%281963%29 "Doctor Who" (1963) 1963 tv_series 0216306
0216306:<> <> <> 0..001221. 42 7.3 <> <>
I propose changing this so IMDB.pm becomes 'episode-aware' (will require changes to the file layout).
I can then subsequently amend the searching lookup (tv_imdb) so it properly looks for series title + episode title.
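As a rough illustration of what 'episode-aware' could mean, here is a minimal sketch (in Python for brevity, not the Perl of IMDB.pm) that splits a title into a series part and an episode part. It assumes the episode is appended to the series title in curly braces, e.g. '"Doctor Who" (1963) {An Unearthly Child (#1.1)}'; that layout, the function name split_title and the regex are assumptions for illustration only, not what IMDB.pm does today.

```python
import re

# Assumed title layout (not confirmed by the stage files quoted above):
#   "Series Name" (year) {Episode title (#season.episode)}
# The {...} part is optional; titles without it are treated as the series itself.
_TITLE_RE = re.compile(
    r'^(?P<series>"[^"]+" \(\d{4}(?:/[IVX]+)?\))'  # quoted series name + year
    r'(?: \{(?P<episode>.*)\})?$'                   # optional {episode} suffix
)

def split_title(title):
    """Return (series_title, episode_or_None) for one title string.

    Titles that do not look like a quoted series (e.g. films) are
    returned unchanged with None as the episode.
    """
    title = title.strip()
    m = _TITLE_RE.match(title)
    if m is None:
        return title, None
    return m.group("series"), m.group("episode")
```

With a split like this, each stage could key its records on (series, episode) rather than on the series title alone, so the stage3 records would line up with the stage1 titles again.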
Your thoughts please?
Rgds,
Geoff