Messages - compn

1
still run into annoying stuff like this with fuzzy match though :(

Bruiser
First list:
Bruiser
Gunbuster (1988)
buster.1988.internal.bdrip.x264-ghouls.mkv
Second list:
Buster (1988).mkv.torrent

2
ah yes, i'm using the ahk 1.x script in that thread now. i can select text and paste it with middle click. i am never going back to ctrl+c / ctrl+v again!

i wonder if there is an updated script?

3
in one piece of software i use, it auto copies to clipboard anything i highlight.

i thought this was a neat feature, but why not make it universal to all software and operating system?

i see there is a tool to do it, autoclipX but it is not developed anymore and website is gone. i am not sure if it works in windows10 either. i havent tested it yet.

i saw autoclipx from this reddit thread while looking for a tool: https://www.reddit.c...ted_text_on_windows/

web archive of autoclipx website (last? version when it was freeware): https://web.archive....0/software/autoclipx

screenshot: https://images.sftcd...clipx-screenshot.png

(i am not interested in funding this program.)

it looks like its possible with AHK https://www.autohotk...lect-implementation/

4
no more crashing!
also it doesnt use a lot of ram,  even on large lists. i am impressed!

5
hmm. does this work on win2k ? https://superuser.co...op-folder-in-windows

congrats on using win2k this long. i was able to use win2k until about 2010. so i made it 10 years!   :up:

actually this program looks like what you want . maybe it can be modified. but i dont know about win2k https://www.donation...ex.php?topic=50889.0

7
 :Thmbsup: :Thmbsup: :Thmbsup:

1. command line interface if possible.


8
thanks vic

9
another string that causes crash.

"FÃ¥rö Document    [Documentary]   [1970].mkv"

can we get an update program that doesn't crash on these? :)

also this crashes: (i might edit and put more below)
"Movies\\[1971]\\David Lean   A Self Portrait   [Documentary]   [1971]"

10
program crashes on more characters:

Dark Lady of Kung Fu µûŸoÓ° (1981) º£°¶°æ.avi.torrent

11
here are some stress test lists. these are from youtube titles and it crashes. probably due to non english character.

12
fuzzy match works!

document the default ratio / partial ratio / token set etc from fuzzysharp into help menu?

alpha-4 has a weird bug where the matches result lists switch. its not important bug. i just report it.

Anthropophagous
First list:
Anthropophagous.1980.REMASTERED.1080p.BluRay.x264.DTS-FGT.mkv.torrent
Anthropophagus (1980) DTS (Custom 5.1 Mix).torrent
Anthropophagus 1980 1080p GBR Blu-ray AVC LPCM 2.0-CultFilmsT.torrent
Anthropophagus.1980.1080p.BluRay.x264-FAPCAVE.torrent
Anthropophagus.1980.BD.Baggerinc.torrent
ANTROPOPHAGUS.Mg.torrent
Antropophagus.torrent
Second list:
Anthropophagous [1980]

fuzzy matching is slower. i know there is nothing that can be done to speed this up, no worries. a progress/working indicator or progress bar would be nice. maybe just a little animated gif ?

14
having years reusable wont help though. at least not that i can see. these are all different films with the same titles.

see the alice movie results:
Alice
First list:
Alice [1982]
Alice [1991]
Second list:
Alice (1988).mkv

Alice in Wonderland
First list:
Alice in Wonderland [1999]
Alice in Wonderland [2010]
Second list:
Alice in Wonderland (1903).mkv

its best to have a mismatched-year list.  or its fine to just leave it as is for now. year matching is not high priority. and i like the year matching how it is right now. :)

i look forward to next updated version soon! i am in need of this program more than i thought. especially because of the crash filename bug!

15
no, i prefer just having a third output list with mismatched years.

relying on a 3rd party server will break always.
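A rough sketch of what a local "mismatched years" list could look like, with no server involved. This is Python, not the actual program's code; `split_title_year` and `compare_with_years` are made-up names, and it assumes the year always sits in ( ) or [ ] right after the title.

```python
import re
from collections import defaultdict

# Assumption: the year is a 19xx/20xx number wrapped in () or [].
YEAR = re.compile(r"[\[(]((?:19|20)\d{2})[\])]")

def split_title_year(entry):
    """Return (lowercased title, year or None) for one list entry."""
    m = YEAR.search(entry)
    if not m:
        return entry.strip().lower(), None
    return entry[:m.start()].strip().lower(), int(m.group(1))

def compare_with_years(first, second):
    """Split entries of `second` whose title is in `first` into
    (same year, different year) lists."""
    years_by_title = defaultdict(set)
    for entry in first:
        title, year = split_title_year(entry)
        years_by_title[title].add(year)
    matched, mismatched = [], []
    for entry in second:
        title, year = split_title_year(entry)
        if title not in years_by_title:
            continue  # title not in the first list at all
        if year in years_by_title[title]:
            matched.append(entry)
        else:
            mismatched.append(entry)
    return matched, mismatched
```

With the Alice example, `Alice (1988).mkv` would land in the mismatched list because the first list only has 1982 and 1991.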

16
i forgot to say that i am happy with your progress , and your designs so far.

i look forward to exclusion ability for better matching :)

i dont need fuzzy matching first , but you kept asking so i go for it :D

17
General Software Discussion / batch scripting
« on: May 19, 2024, 09:34 PM »
i was doing a lot of grepping and typing all the extra characters was taking a toll on my hands.

so i made a bat file

g.bat:
@echo off
grep -i %1 * | grep -i %2

now i'm wondering if i can make it so i just run it once, and i can just type the grep search words over and over again ?

what i mean is: instead of:
grep -i foo * | grep -i bar
grep -i foo1 * | grep -i bar1
and then, with bat file it looks like this:
g foo bar
g foo1 bar1

i can just do
g.bat
foo bar
(results for foo bar)
foo1 bar1
(results for foo1 bar1)
foo2 bar2
(results for foo2 bar2)

without having to type g each time.


also would be nice to be able to do g foo (it currently errors because the second grep is empty), or g foo bar zed hex whatever number of grep searches i need to do at once.

its been so long i cant remember what this stuff is even called in batch to search how to script it.
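One possible sketch of the "type the words over and over" idea, written in Python rather than batch. It is only an approximation of g.bat: it does plain case-insensitive substring matching instead of grep regexes, and the names `chained_grep`, `read_lines` and `repl` are made up for this example. Any number of words works, including just one.

```python
import sys
from pathlib import Path

def chained_grep(terms, lines):
    """Keep lines containing every term, case-insensitively
    (roughly: grep -i a | grep -i b | ...)."""
    lowered = [t.lower() for t in terms]
    return [line for line in lines if all(t in line.lower() for t in lowered)]

def read_lines(folder="."):
    """Return 'filename:line' strings for every readable file in folder."""
    out = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="replace")
        except OSError:
            continue  # skip files that cannot be read
        out.extend(f"{path.name}:{line}" for line in text.splitlines())
    return out

def repl(folder="."):
    """Read one query per line from stdin; each word is one chained filter."""
    for query in sys.stdin:
        terms = query.split()
        if not terms:
            continue
        for hit in chained_grep(terms, read_lines(folder)):
            print(hit)
```

To use it, add a final line `repl()` to the file, run it once, then type `foo bar`, `foo1 bar1` and so on, one query per line (Ctrl+Z then Enter ends the input on Windows).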

18
i thought up a solution to different films having different years.

i think having a "year mismatch" output list would do it :D

19
another feature:
a count of matched/unmatched strings would be nice
auto saving of matches/unmatched to default .txt names
drag and drop lists?
command line interface if possible ?

20
i think fuzzy matching will fix pretty much all these issues.

if you need more sample list just do a dir /a /s > list1.txt and compare it to the same list. good for stress testing mem and cpu usage as well.

21
fuzzy matching is the next feature.

22
super crash!

works ok with list1 / list2 / list3

but trying my lists, i get a crash. my lists may have unicode (or just non english codepage) characters in the filenames. i dont need to compare these filenames, and would rather have them ignored.

attached is a sample filename which crashes

running sed "s/[^\x00-\x7F]//g" list.txt
seems to fix the issue and allows me to use the program on my list.
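For anyone without sed, the same cleanup can be sketched in Python. `strip_non_ascii` does what the sed expression above does (deletes every character outside \x00-\x7F), and `ascii_only_lines` is the other option mentioned, ignoring such filenames entirely. Both function names are made up for this example.

```python
import re

# Same character class as the sed expression: anything outside 0x00-0x7F.
NON_ASCII = re.compile(r"[^\x00-\x7F]")

def strip_non_ascii(text):
    """Delete non-ASCII characters, keeping the rest of the line."""
    return NON_ASCII.sub("", text)

def ascii_only_lines(lines):
    """Drop whole lines that contain any non-ASCII character."""
    return [line for line in lines if line.isascii()]
```

Note that stripping changes the names (letters just vanish), so the cleaned list is only good for comparing, not for renaming files.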

as for matching... well. there are problems. the year has to be matched. but i do like that years are sometimes matched because sometimes movie years will be incorrect from filename to filename. a release date vs a production date or theatrical release date. so i dont know what i want here. i've seen years be off by 3 or even 8 years from what imdb claims. not sure this can be fixed, its an imdb issue

Alice
First list:
Alice [1982]
Alice [1991]
Second list:
Alice (1988).mkv

Alice in Wonderland
First list:
Alice in Wonderland [1999]
Alice in Wonderland [2010]
Second list:
Alice in Wonderland (1903).mkv

All Quiet on the Western Front
First list:
All Quiet on the Western Front [1979]
Second list:
All Quiet on the Western Front (1930).mkv

but these didnt match. the following might be useful for fuzzy match testing?

list1
Child's Play [1988]
Child's Play 2 [1990]
Child's Play 3 [1991]
Child's Play Sidney Lumet, 1972
Necronomicon
Necronomicon: Book of Dead [1994]
The Necronomicon [2009]
RiffTrax - The Psychotronic Man (1979) mp4
The Psychotronic Man -  The Psychotronic Man 1980 Movie FULL HD (480p_25fps_H264-128kbit_AAC)  srt
The Phantom Creeps [1939]
ROTOR_divx avi

list2
01 Child’s Play (1988) 3 Commentaries.mkv
03 Childs Play 3 (1991).mkv
Necronomicon (1993).mkv
Psychotronic Man.mp4
R.O.T.O.R..mkv
Phantom Creeps.mp4


(it's why i wanted to ignore words like "the", because sometimes they use the "the" and sometimes not.)
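A minimal sketch of how ignoring "the", years and extensions before comparing might work, using Python's standard difflib instead of FuzzySharp or FuzzyWuzzy. The ignore list, the extension list, the 0.8 threshold and the function names are all assumptions for illustration, not how the actual program does it.

```python
import re
from difflib import SequenceMatcher

STOP_WORDS = {"the", "a", "an"}                           # assumed ignore list
EXTENSIONS = {"mkv", "avi", "mp4", "srt", "torrent"}      # assumed extensions

def normalize(title):
    """Lowercase, drop apostrophes, years, punctuation, stop words, extensions."""
    t = title.lower().replace("\u2019", "'").replace("'", "")
    t = re.sub(r"\b(19|20)\d{2}\b", " ", t)   # drop 19xx / 20xx years
    t = re.sub(r"[^a-z0-9]+", " ", t)          # punctuation becomes spaces
    words = [w for w in t.split() if w not in STOP_WORDS and w not in EXTENSIONS]
    return " ".join(words)

def similarity(a, b):
    """Character-level similarity of the normalized titles, 0.0 to 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def best_match(title, candidates, threshold=0.8):
    """Return the most similar candidate, or None if nothing reaches threshold."""
    scored = [(similarity(title, c), c) for c in candidates]
    score, candidate = max(scored, default=(0.0, None))
    return candidate if score >= threshold else None
```

With this, "The Phantom Creeps [1939]" and "Phantom Creeps.mp4" both normalize to "phantom creeps". It will not rescue everything in the lists above: "R.O.T.O.R..mkv" turns into single letters and would still miss "ROTOR_divx avi".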

23
I see Vic is on the task, but I think what you're asking for is usually called fuzzy string matching.  Try a Web search, there seems plenty of Python work on it, and take a look at Comparing Strings Is Easy With FuzzyWuzzy.

thanks, i didnt know what it was called.

this stackoverflow post has some info about scoring fuzzy matches, and the output kinda sounds exactly what i was describing, although i dont know how useful that output would be until using it. and i wouldnt want the match percentage to be output in a new list anyhow.
https://stackoverflo...h-very-similar-names

Output:

print(results)
           sample_name             actual_name  score
0             jtsports           JT Sports LLC   79.0
1          tombaseball       Tom Baseball Inc.   81.0
2      context express     Context Express LLC   95.0
3            zb sicily           ZB Sicily LLC   95.0
4   lightening express  Lightening Express LLC   95.0
5           fire roads       Fire Road Express   86.0
6                  NaN             Earth Treks    NaN
7                  NaN           TS Sports LLC    NaN
8                  NaN        MM Baseball Inc.    NaN
9                  NaN     Contact Express LLC    NaN
10                 NaN           AB Sicily LLC    NaN
11                 NaN    Lightening Roads LLC    NaN

- First, the MovieList for direct movie comparison.
-paradisusvic (May 15, 2024, 07:38 PM)
this already exists i think. its just uniq -d

for example, with the lists i provided:

C:\>cat list* | sort | uniq -d
The.Last.Married.Couple.In.America.1980.720p.BluRay.x264.AAC-[YTS.MX].mp4
The.Osterman.Weekend.1983.720p.BluRay.x264.YIFY.mp4
The.Wild.Life.1984.720p.BluRay.x264.AAC-[YTS.MX].mp4

C:\>uniq --version
uniq (textutils) 2.1
C:\>cat --version
cat (GNU textutils) 2.0

but that wouldnt tell me which list has the duplicate because i had to concatenate the files.
all i really need is a fuzzy uniq for comparing two lists...
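The "which list has it" part (exact matches only, no fuzziness yet) can be sketched with Python sets by keeping the two lists separate instead of concatenating them. `compare_lists` is a made-up name for this example.

```python
def compare_lists(first, second):
    """Exact comparison that keeps track of which list each entry came from."""
    a, b = set(first), set(second)
    return {
        "both": sorted(a & b),         # what cat | sort | uniq -d would show
        "only_first": sorted(a - b),   # present only in the first list
        "only_second": sorted(b - a),  # present only in the second list
    }
```

One difference from `uniq -d`: a name repeated inside a single list is not reported as a duplicate here, only names that appear in both lists.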

and if i google fuzzy uniq theres 'funiq'

https://github.com/mjfisheruk/funiq

Funiq (fuzzy uniq) is a command line tool for performing fuzzy string matching against lists of words.

and the examples it is using to compare? movie titles! hahaha :finger pointing at brain.meme:

but the ability to exclude words and characters would make for better matches. i see some fuzzy matching toolkits use scrubbers to scrub the inputs first.

- Second, Fuzzy-matching for adding more/partial results.
Tokens + ignore list of words :Thmbsup:

fuzzy matching looks difficult :D

24
neat. hopefully i described it properly what i wanted :)

25
something like this except background color instead of text color. this took me way longer than i expected...

excluded strings: .mkv .avi .mp4 - ( ) [ ] dvdrip
list2 | list3
The Big Red One - Lee Marvin (1980).avi | 1980 - The Big Red One.mp4
The Black Cauldron {1985} ENG.DVDRIP.avi | 1985 - The Black Cauldron.mkv
The Burbs (1989) Tom Hanks .avi | 1989 - The Burbs.mkv
The Breakfast Club - 1985.avi | 1985 - The Breakfast Club.mkv
The Jewel of the Nile (1985) Kathlene Turner, Michael Douglas, Danny DeVito.avi | 1985 - The Jewel Of The Nile.mkv
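A small sketch of how the excluded strings above could be applied before matching: strip them out, split what is left into words, and check how many words the two names share. This is only an illustration in Python; `scrub` and `token_overlap` are made-up names and the scoring rule (shared words divided by the shorter name's word count) is an assumption.

```python
# The excluded strings from the example above.
EXCLUDED = [".mkv", ".avi", ".mp4", "-", "(", ")", "[", "]", "dvdrip"]

def scrub(name, excluded=EXCLUDED):
    """Lowercase the name, blank out excluded strings, return the word set."""
    s = name.lower()
    for token in excluded:
        s = s.replace(token, " ")
    return set(s.split())

def token_overlap(a, b):
    """Share of the shorter name's words that also appear in the other name."""
    ta, tb = scrub(a), scrub(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / min(len(ta), len(tb))
```

The Big Red One pair scores 1.0 this way, since every word of the list3 name is in the list2 name. The Black Cauldron pair scores lower because "{1985}" keeps its braces: { and } are not in the excluded list, so it never equals "1985".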
