Topic: Rename files based on a list of similar names (Read 8946 times)

Jabberwock

  • Participant
  • Joined in 2006
  • Posts: 76
Rename files based on a list of similar names
« on: June 27, 2010, 12:40 PM »
I sometimes have a bunch of, erm, obtained media files. The files themselves are fine, but the filenames are often a mess (especially when Polish characters are involved). On the other hand, for most of them I can obtain a nice, clean list of correct filenames. How can I get the two together automatically? I guess a Perl script that looks for the most similar string might do, but it is not exactly trivial...

Is there software that would rename each file to the most similar name from a list?

kfitting

  • Charter Member
  • Joined in 2005
  • Posts: 593
« Reply #1 on: June 27, 2010, 02:50 PM »
To get started, look up "Levenshtein distance." There are implementations of the algorithm in various languages (I found one in Python and converted it to VB). There are also variations that trade speed against fuzzy-matching complexity.

Stackoverflow
How the algorithm works
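
For reference, the Levenshtein distance between two strings is the smallest number of single-character insertions, deletions, and substitutions needed to turn one into the other. Below is a minimal Python sketch of the usual dynamic-programming approach. The function name `levenshtein` is only an illustrative choice, and the comparison works on code points, so accented characters stored in different Unicode normalization forms would count as different.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character edits needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the inner row short
    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep a matching character
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

Only two rows of the table are kept at a time, so memory use grows with the shorter string rather than with the product of both lengths.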
« Last Edit: June 27, 2010, 03:28 PM by kfitting »

skwire

  • Global Moderator
  • Joined in 2005
  • Posts: 5,286
« Reply #2 on: June 27, 2010, 03:12 PM »
1) You have a list of files with names you don't like.
2) You have a text listing of filenames that you do like.

How about an application with two listviews?  The list on the left shows your actual files and the one on the right is your text list.  For the listview on the right, you'd be able to move the entries up and down to match up against the ones on the left.  Once you have the names where you want them, click a button, and your original files are renamed.

skwire

« Reply #3 on: June 27, 2010, 03:29 PM »
Actually, my favourite file renamer, ReNamer, can already do what you want.  http://www.den4b.com/projects.php

[Image: 2010-06-27_152217.png]

Let me know if that doesn't cover your needs and I'll see about writing something custom for you.
« Last Edit: June 27, 2010, 03:31 PM by skwire »

AbteriX

  • Charter Honorary Member
  • Joined in 2005
  • Posts: 1,149
« Reply #4 on: June 27, 2010, 04:39 PM »
Quote from skwire: Actually, my favourite file renamer, ReNamer, can already do what you want.
Please note that the list of "right" names has to be in the same order as the files to be renamed are listed in, e.g., ReNamer.
Otherwise you get files with the right names but not the content the names suggest.

skwire

« Reply #5 on: June 27, 2010, 04:51 PM »
Quote from AbteriX: Please note that the list of "right" names has to be in the same order as the files to be renamed are listed in, e.g., ReNamer.
Otherwise you get files with the right names but not the content the names suggest.

Understood.  That's why I offered to write a custom app to be able to do what he wants.  However, most any decent text editor has keyboard shortcuts to move lines up and down so it should be a simple matter for Jabberwock to manipulate his text listing before pasting it into ReNamer.

ewemoa

  • Honorary Member
  • Joined in 2008
  • Posts: 2,922
« Reply #6 on: June 27, 2010, 05:01 PM »
Quote from skwire: However, most any decent text editor has keyboard shortcuts to move lines up and down so it should be a simple matter for Jabberwock to manipulate his text listing before pasting it into ReNamer.
ReNamer is my favorite too :)

FWIW, I sometimes sort before pasting into ReNamer and tend to sort with an editor.  I haven't tried with Notepad++, but I noticed that TextFX -> TextFX Tools has some line-sorting-related menu items.

Jabberwock

« Reply #7 on: June 27, 2010, 05:18 PM »
I am afraid the order is the big issue here.

The list I usually get is, e.g., a chronological list of shows. The filenames do not necessarily follow that order. I just realized that possibly the fastest solution would be to rearrange the list itself, i.e. sort it alphabetically. As most of the files have (or can be made to have) the correct first letter, rearranging the files (by means of a manual rename) could be quite easy.

Having said that, I had a look at the Levenshtein distance. I was familiar with the concept, but imagined that applying it would be rather complicated. In fact, it is not: I had a working Perl script within minutes... While it is quite rough around the edges, it did the test job quite well.

As I said, the script is rather amateurish, but if someone insists, I can put it here. By the way, Perl is great here as parsing through the filename list (and the list of files) is very fast. An application (e.g. VB) might be easier to use, but I suppose it would be much slower...

Innuendo

  • Charter Member
  • Joined in 2005
  • Posts: 2,266
« Reply #8 on: June 27, 2010, 08:03 PM »
Quote from Jabberwock: The list I usually get is, e.g., a chronological list of shows. The filenames do not necessarily follow that order.

Sounds like the source where you get these files needs to name them better in the first place.

AbteriX

« Reply #9 on: June 28, 2010, 04:38 AM »
Quote from Jabberwock: As I said, the script is rather amateurish, but if someone insists, I can put it here.
Yeah, let's see it, please :P

Jabberwock

« Reply #10 on: June 28, 2010, 04:46 AM »
OK, here it goes... It's very simple, but it has manual confirmation built-in - the distance measurement is not always correct. Any tips for improvement are welcome, of course.


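use strict;
use warnings;
use Text::Levenshtein qw(distance);

# Change to your file dir, naturally
my $dir = "N:/Cartoons/New";

# Your filename list text file here; it is read once, one name per line
open(my $in, '<', "Filenames.txt") or die "Cannot open Filenames.txt: $!";
chomp(my @names = <$in>);
close($in);

opendir(my $dh, $dir) or die "Cannot open $dir: $!";
# Keep plain files only (this also skips "." and "..")
my @files = grep { -f "$dir/$_" } readdir($dh);
closedir($dh);

foreach my $filename (@files) {
    my ($best, $dis);

    # Find the list entry with the smallest distance to this filename
    foreach my $name (@names) {
        my $d = distance($name, $filename);
        if (!defined($dis) || $d < $dis) {
            $dis = $d;
            # All my files are avi, you might want to check the extension, too
            $best = $name . ".avi";
        }
    }
    next unless defined $best;

    print "$filename = $best\n";
    print "OK? ";
    my $response = <STDIN>;
    last unless defined $response;
    chomp $response;
    if ($response eq "y") {
        # Rename inside $dir, not in the current working directory
        if (rename("$dir/$filename", "$dir/$best")) { print "renamed\n"; }
        else { warn "Could not rename $filename: $!\n"; }
    }
}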
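
One possible improvement to the script above, sketched here in Python with only the standard library: match on the name without its extension, ignore case, and keep whatever extension the file already has. Note that `difflib.get_close_matches` ranks candidates by a similarity ratio rather than by Levenshtein distance, so this is a swapped-in technique. The function `propose_renames`, its `cutoff` default, and the sample names are all made up for illustration. The sketch only proposes a mapping; nothing is renamed.

```python
import difflib
from pathlib import Path


def propose_renames(filenames, clean_names, cutoff=0.5):
    """Map each messy filename to the closest clean name, keeping the file's extension.

    Returns a dict {old_name: new_name}. Files with no candidate at or above
    the cutoff are left out, so they can be handled by hand.
    """
    # Compare lower-cased names so that case differences do not count against a match
    lowered = {name.lower(): name for name in clean_names}
    proposals = {}
    for old in filenames:
        path = Path(old)
        matches = difflib.get_close_matches(
            path.stem.lower(), list(lowered), n=1, cutoff=cutoff
        )
        if matches:
            proposals[old] = lowered[matches[0]] + path.suffix
    return proposals


# Hypothetical example data, not taken from the thread
files = ["bolek_i_lolek_01_kusza.avi", "REKSIO-poliglota.AVI"]
names = ["Bolek i Lolek - Kusza", "Reksio poliglota"]
for old, new in propose_renames(files, names).items():
    print(f"{old} -> {new}")
```

The proposals would still need the same manual confirmation step as the script, since the closest name is not always the right one, and two files can end up proposing the same target name.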


sajman99

  • Supporting Member
  • Joined in 2006
  • Posts: 664
« Reply #11 on: July 05, 2010, 01:57 PM »
Perhaps I'm a "dummy" for saying this, but it occurs to me that a dummy file generator could be of use in these file renaming situations. The dummy files would serve as temporary spacers to ensure the correct file sequence remains intact. Then a file renaming utility like ReNamer (my fave also) could process the files without disrupting the correct file order.

So for me the question becomes: is there an easy-to-use dummy file generator which can batch generate multiple files?

For example, if I am missing 725006, 725012, 725035, and 725057 from a series [725001-725100], then I would (1) batch generate those specific dummy files, (2) rename all files with ReNamer or other renaming utility, and (3) delete the temp dummy files.
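
The three steps above could be sketched in Python roughly as follows. This is only an illustration under some assumptions: the files are named by their number plus an extension, real media files are never empty (so zero-byte files can be treated as dummies), and the function names `create_placeholders` and `remove_empty_files` are made up here. The demo runs in a temporary folder so it touches no real files.

```python
import tempfile
from pathlib import Path


def create_placeholders(folder, first, last, extension=".avi"):
    """Step 1: create an empty file for every number in first..last that has no file yet."""
    folder = Path(folder)
    # A number counts as present if some file in the folder has it as its name stem
    present = {p.stem for p in folder.iterdir() if p.is_file()}
    created = []
    for number in range(first, last + 1):
        if str(number) not in present:
            placeholder = folder / f"{number}{extension}"
            placeholder.touch(exist_ok=False)  # refuse to touch an existing file
            created.append(placeholder)
    return created


def remove_empty_files(folder):
    """Step 3: delete zero-byte files, assuming only the dummies are empty."""
    removed = []
    for path in Path(folder).iterdir():
        if path.is_file() and path.stat().st_size == 0:
            path.unlink()
            removed.append(path)
    return removed


# Demo in a throwaway folder: 725003 and 725005 are "missing" from 725001-725005
with tempfile.TemporaryDirectory() as tmp:
    for n in (725001, 725002, 725004):
        (Path(tmp) / f"{n}.avi").write_bytes(b"data")
    made = create_placeholders(tmp, 725001, 725005)
    print([p.name for p in made])
    # Step 2 (renaming with ReNamer or similar) would happen here
    remove_empty_files(tmp)
```

Deleting by size rather than by name matters because the dummies get renamed along with everything else in step 2, so their original names no longer identify them.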

Uh...just trying to figure a quick and easy way to address these renaming problems.