Author Topic: Rename files based on a list of similar names  (Read 3147 times)
Jabberwock
Participant
« on: June 27, 2010, 12:40:48 PM »

I sometimes have a bunch of, erm, obtained media files. The files themselves are fine, but the filenames are often a mess (especially when Polish characters are involved). On the other hand, for most of them I can obtain a nice clean list of correct filenames. How can I get the two together automatically? I guess a Perl script that looks for the most similar string might do, but it is not exactly trivial...

Is there any software that would rename each file to the most similar name from a list?
Logged
kfitting
Charter Member
« Reply #1 on: June 27, 2010, 02:50:59 PM »

To get started, look up "Levenshtein distance." There are implementations in various languages (I found one in Python and converted it to VB), and there are variations that trade speed against fuzzy-matching complexity.

Stackoverflow
How the algorithm works
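
For anyone who wants to see the idea without following the links, here is a minimal Python sketch of the distance itself. It is not the VB port mentioned above (that code isn't shown in this thread), and the function name `levenshtein` is just an illustrative choice. It counts the fewest single-character insertions, deletions and substitutions needed to turn one string into the other, keeping only two rows of the usual table in memory.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the shorter string as the row
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]
```

A smaller result means the two names are closer, so the best candidate for a messy filename is the list entry with the lowest distance.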
« Last Edit: June 27, 2010, 03:28:37 PM by kfitting » Logged
skwire
Charter Member
Another Coding Snack request? Om nom nom...
« Reply #2 on: June 27, 2010, 03:12:30 PM »

1) You have a list of files with names you don't like.
2) You have a text listing of filenames that you do like.

How about an application with two listviews?  The list on the left shows your actual files and the one on the right is your text list.  For the listview on the right, you'd be able to move the entries up and down to match up against the ones on the left.  Once you have the names where you want them, click a button, and your original files are renamed.
Logged

skwire
Charter Member
« Reply #3 on: June 27, 2010, 03:29:53 PM »

Actually, my favourite file renamer, ReNamer, can already do what you want.  http://www.den4b.com/projects.php



Let me know if that doesn't cover your needs and I'll see about writing something custom for you.
« Last Edit: June 27, 2010, 03:31:57 PM by skwire » Logged

AbteriX
Charter Honorary Member
« Reply #4 on: June 27, 2010, 04:39:46 PM »

Quote: "Actually, my favourite file renamer, ReNamer, can already do what you want."
Please note that the list of "right" names has to be in the same order as the files to rename are listed in, e.g., ReNamer.
Otherwise you get files with the right names, but not the content the names suggest.
Logged

Greetings, Stefan.
skwire
Charter Member
« Reply #5 on: June 27, 2010, 04:51:02 PM »

Quote: "Please note that the list of "right" names has to be in the same order as the files to rename are listed in, e.g., ReNamer.
Otherwise you get files with the right names, but not the content the names suggest."

Understood. That's why I offered to write a custom app to do what he wants. However, almost any decent text editor has keyboard shortcuts to move lines up and down, so it should be a simple matter for Jabberwock to rearrange his text listing before pasting it into ReNamer.
Logged

ewemoa
Honorary Member
« Reply #6 on: June 27, 2010, 05:01:12 PM »

Quote: "However, most any decent text editor has keyboard shortcuts to move lines up and down so it should be a simple matter for Jabberwock to manipulate his text listing before pasting it into ReNamer."
ReNamer is my favorite too :)

FWIW, I sometimes sort before pasting into ReNamer and tend to sort with an editor.  I haven't tried with Notepad++, but I noticed that TextFX -> TextFX Tools has some line-sorting-related menu items.
Logged
Jabberwock
Participant
« Reply #7 on: June 27, 2010, 05:18:46 PM »

I am afraid the order is the big issue here.

The list I usually get is e.g. a chronological list of shows. The filenames do not necessarily follow that order. I just realized that possibly the fastest solution would be to rearrange the list itself, i.e. sort it alphabetically. As most of the files have (or can be made to have) the first letter correct, getting the files into the same order (by means of a few manual renames) could be quite easy.
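
To make the ordering issue concrete, here is a small Python sketch of what a list-based renamer effectively does: it pairs the i-th file with the i-th name. The helper `pair_by_position` and the example filenames are made up for illustration; this is not ReNamer's actual code.

```python
def pair_by_position(files, names):
    """Pair the i-th file with the i-th name, as a list-based renamer would."""
    if len(files) != len(names):
        raise ValueError(f"{len(files)} files but {len(names)} names")
    return list(zip(files, names))


# Hypothetical data: the folder lists alphabetically, the name list is chronological.
files = ["alpha_ep.avi", "zulu_ep.avi"]
names = ["Zulu Episode", "Alpha Episode"]

# Paired as-is, each file would get the other file's name.
print(pair_by_position(files, names))

# Sorting both sides the same way lines the pairs up again, provided the
# first letters of the messy names are already right.
print(pair_by_position(sorted(files, key=str.lower), sorted(names, key=str.lower)))
```

The length check matters too: if one file is missing, every pair after the gap shifts by one.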

Having said that, I had a look at the Levenshtein distance. I was familiar with the concept, but imagined applying it would be rather complicated. In fact, it is not - I had a working Perl script within minutes... While it is quite rough around the edges, it did the test job quite well.

As I said, the script is rather amateurish, but if someone insists, I can put it here. By the way, Perl is great here, as parsing through the filename list (and the list of files) is very fast. An application (e.g. in VB) might be easier to use, but I suppose it would be much slower...
Logged
Innuendo
Charter Member
« Reply #8 on: June 27, 2010, 08:03:43 PM »

Quote: "The list I usually get is e.g. chronological list of shows. The filenames do not necessarily follow that order."

Sounds like the source where you get these files needs to name them better in the first place.
Logged
AbteriX
Charter Honorary Member
« Reply #9 on: June 28, 2010, 04:38:36 AM »

Quote: "As I said, the script is rather amateurish, but if someone insists, I can put it here."
Yeah, let's see it, please :P
Logged

Greetings, Stefan.
Jabberwock
Participant
« Reply #10 on: June 28, 2010, 04:46:05 AM »

OK, here it goes... It's very simple, but it has manual confirmation built in, because the name with the smallest distance is not always the correct one. Any tips for improvement are welcome, of course.

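use strict;
use warnings;
use Text::Levenshtein qw(distance);

# Change to your file dir, naturally
my $dir = "N:/Cartoons/New";

# Your filename list text file here (read once, not once per file)
open(my $in, '<', "Filenames.txt") or die "Cannot open Filenames.txt: $!";
chomp(my @names = <$in>);
close($in);
@names = grep { length } @names;    # skip empty lines

opendir(my $dh, $dir) or die "Cannot open $dir: $!";
# All my files are avi, you might want to check the extension, too.
# The grep also drops "." and "..", which readdir returns.
my @files = grep { /\.avi$/i && -f "$dir/$_" } readdir($dh);
closedir($dh);

foreach my $filename (@files) {
    # Compare without the extension, since the list entries have none
    (my $stem = $filename) =~ s/\.avi$//i;

    # Find the list entry with the smallest distance to this file
    my ($best, $dis);
    foreach my $name (@names) {
        my $d = distance($name, $stem);
        if (!defined $dis || $d < $dis) {
            ($dis, $best) = ($d, "$name.avi");
        }
    }
    next unless defined $best;

    print "$filename = $best\n";
    print "OK? ";
    my $response = <STDIN>;
    last unless defined $response;    # stop on end of input
    chomp $response;
    if ($response eq "y") {
        # readdir returns bare names, so prefix the directory
        if (rename("$dir/$filename", "$dir/$best")) {
            print "renamed\n";
        } else {
            warn "Could not rename $filename: $!\n";
        }
    }
}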
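
As one possible improvement, here is a hedged Python sketch of the same matching step with a similarity cutoff, so that files with no reasonable match are left alone instead of being offered a bad rename. Note that it swaps in the standard library's `difflib.get_close_matches`, which ranks by a similarity ratio rather than Levenshtein distance. The function name `propose_renames` and the default cutoff of 0.6 are illustrative assumptions, not something from the script above.

```python
import difflib
from pathlib import PurePath


def propose_renames(files, clean_names, cutoff=0.6):
    """Map each messy filename to its closest clean name, or None if nothing is close."""
    # Compare case-insensitively, but remember the original spelling of each name.
    lowered = {name.lower(): name for name in clean_names}
    proposals = {}
    for filename in files:
        path = PurePath(filename)
        # Match on the name without its extension, since the list has none.
        match = difflib.get_close_matches(
            path.stem.lower(), list(lowered), n=1, cutoff=cutoff
        )
        # Keep the file's own extension on the proposed name.
        proposals[filename] = lowered[match[0]] + path.suffix if match else None
    return proposals
```

The result is only a proposal, so a confirmation step like the one in the Perl script still makes sense before anything is actually renamed. Nothing here prevents two files from being matched to the same clean name, so that would also need a check.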

Logged
sajman99
Supporting Member
« Reply #11 on: July 05, 2010, 01:57:39 PM »

Perhaps I'm a "dummy" for saying this, but it occurs to me that a dummy file generator could be of use in these file renaming situations. The dummy files would serve as temporary spacers to ensure the correct file sequence remains intact. Then a file renaming utility like ReNamer (my fave also) could process the files without disrupting the correct file order.

So for me the question becomes: is there an easy-to-use dummy file generator that can batch generate multiple files?

For example, if I am missing 725006, 725012, 725035, and 725057 from a series [725001-725100], then I would (1) batch generate those specific dummy files, (2) rename all files with ReNamer or other renaming utility, and (3) delete the temp dummy files.
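
Step (1) is easy enough to script. Below is a minimal Python sketch, assuming the files are named by their number plus an extension (e.g. `725006.avi`). The function name `create_placeholders` and the `.placeholder` suffix are made up for illustration; it is not an existing tool.

```python
from pathlib import Path


def create_placeholders(directory, first, last, suffix=".placeholder"):
    """Create an empty file for every number in [first, last] that has no file yet.

    Returns the created paths so they can be found and deleted later.
    """
    folder = Path(directory)
    # Numbers already present, judged by the filename without its extension.
    present = {p.stem for p in folder.iterdir() if p.is_file()}
    created = []
    for number in range(first, last + 1):
        if str(number) not in present:
            placeholder = folder / f"{number}{suffix}"
            placeholder.touch()  # empty file, acts only as a spacer
            created.append(placeholder)
    return created
```

Calling `create_placeholders("some/folder", 725001, 725100)` would fill the gaps in that series. Because the spacers sort by their number, they hold the missing positions in a name-sorted listing. If the renaming utility keeps extensions, the distinctive suffix should make the spacers easy to find and delete afterwards in step (3).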

Uh...just trying to figure out a quick and easy way to address these renaming problems.
Logged