super crash!
works ok with list1 / list2 / list3
but with my own lists, i get a crash. my lists may have unicode (or just non-english codepage) characters in the filenames. i don't need to compare these filenames, and would rather have them ignored.
attached is a sample filename which crashes
running sed "s/[^\x00-\x7F]//g" list.txt
seems to fix the issue and allows me to use the program on my list.
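In case it helps, here is a minimal Python sketch of the same workaround done at load time instead of with sed. I don't know what language the program is written in or where exactly it crashes, so `strip_non_ascii` and `read_list` are hypothetical helpers, not anything from the actual code.

```python
from typing import List


def strip_non_ascii(line: str) -> str:
    """Drop every character outside 7-bit ASCII, like the sed command does."""
    return line.encode("ascii", errors="ignore").decode("ascii")


def read_list(path: str) -> List[str]:
    """Read one filename per line without crashing on odd bytes.

    Assumption: the list is roughly UTF-8. errors="replace" turns any
    undecodable byte into U+FFFD, which strip_non_ascii then removes.
    """
    with open(path, encoding="utf-8", errors="replace") as handle:
        return [strip_non_ascii(line.rstrip("\n")) for line in handle]
```

This throws the non-ASCII characters away rather than comparing them, which is all I need for my lists.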
as for matching... well, there are problems. the year has to be matched. but i do like that years are sometimes matched, because movie years will sometimes be incorrect from filename to filename: a release date vs a production date or theatrical release date. so i don't know what i want here. i've seen years be off by 3 or even 8 years from what imdb claims. not sure this can be fixed, it's an imdb issue.
Alice
First list:
Alice [1982]
Alice [1991]
Second list:
Alice (1988).mkv
Alice in Wonderland
First list:
Alice in Wonderland [1999]
Alice in Wonderland [2010]
Second list:
Alice in Wonderland (1903).mkv
All Quiet on the Western Front
First list:
All Quiet on the Western Front [1979]
Second list:
All Quiet on the Western Front (1930).mkv
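One possible way to handle the year mismatches above is a tolerance instead of an exact year match. This is only a sketch: `extract_year`, `years_compatible` and the `tolerance` parameter are names I made up, and the regex assumes the year sits in square brackets or parentheses like in these examples.

```python
import re
from typing import Optional

# Assumption: a year is four digits starting with 18, 19 or 20, wrapped in [] or ().
YEAR_PATTERN = re.compile(r"[\[(]((?:18|19|20)\d{2})[\])]")


def extract_year(name: str) -> Optional[int]:
    """Return the first bracketed year in a filename, or None if there is none."""
    match = YEAR_PATTERN.search(name)
    return int(match.group(1)) if match else None


def years_compatible(a: str, b: str, tolerance: int = 1) -> bool:
    """True when both years are within `tolerance`, or when either is missing."""
    year_a, year_b = extract_year(a), extract_year(b)
    if year_a is None or year_b is None:
        return True  # nothing to compare, so the year cannot rule the pair out
    return abs(year_a - year_b) <= tolerance
```

With `tolerance=3`, `Alice [1991]` and `Alice (1988).mkv` would count as compatible, while the 1979 and 1930 versions of All Quiet on the Western Front would not. A large tolerance also lets real remakes through, so it probably needs to be a user option.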
but the entries below didn't match. they might be useful for fuzzy match testing?
list1
Child's Play [1988]
Child's Play 2 [1990]
Child's Play 3 [1991]
Child's Play Sidney Lumet, 1972
Necronomicon
Necronomicon: Book of Dead [1994]
The Necronomicon [2009]
RiffTrax - The Psychotronic Man (1979) mp4
The Psychotronic Man - The Psychotronic Man 1980 Movie FULL HD (480p_25fps_H264-128kbit_AAC) srt
The Phantom Creeps [1939]
ROTOR_divx avi
list2
01 Child’s Play (1988) 3 Commentaries.mkv
03 Childs Play 3 (1991).mkv
Necronomicon (1993).mkv
Psychotronic Man.mp4
R.O.T.O.R..mkv
Phantom Creeps.mp4
(it's why i wanted to ignore words like "the", because sometimes they use the "the" and sometimes not.)
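A rough sketch of the kind of normalization I have in mind for the test lists above, using only the Python standard library. The stop-word list, the extension list and the function names are my own assumptions, not how the program currently works.

```python
import re
import unicodedata
from difflib import SequenceMatcher

STOP_WORDS = {"the", "a", "an"}  # assumed list, extend as needed
EXTENSION = re.compile(r"[ .](mkv|mp4|avi|srt)$", re.IGNORECASE)
YEAR = re.compile(r"[\[(]\d{4}[\])]")


def normalize_title(name: str) -> str:
    """Reduce a filename to lowercase words without year, extension or stop words."""
    # Split accents off their base letters, then drop anything non-ASCII
    # (this also removes curly apostrophes such as the one in "Child’s").
    text = unicodedata.normalize("NFKD", name)
    text = text.encode("ascii", "ignore").decode("ascii").lower()
    text = EXTENSION.sub("", text)   # trailing ".mkv" or " avi"
    text = YEAR.sub(" ", text)       # "[1988]" or "(1988)"
    text = text.replace("'", "")     # "child's" -> "childs"
    words = re.findall(r"[a-z0-9]+", text)
    return " ".join(word for word in words if word not in STOP_WORDS)


def similarity(a: str, b: str) -> float:
    """Score two filenames from 0.0 to 1.0 after normalization."""
    return SequenceMatcher(None, normalize_title(a), normalize_title(b)).ratio()
```

With this, `The Phantom Creeps [1939]` and `Phantom Creeps.mp4` both normalize to `phantom creeps`, and `The Necronomicon [2009]` lines up with `Necronomicon (1993).mkv`. It does not solve everything in the lists: `R.O.T.O.R..mkv` becomes `r o t o r` while `ROTOR_divx avi` becomes `rotor divx`, and numbered prefixes like `01` or `03` stay in the title, so those cases would need extra rules or a similarity threshold.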