remv: Rename files (and directories) with regular expressions

Tuxman:
The fan of old-school tools strikes again.  8)
(NANY 2019, anyone?)

Do you know what's missing in both POSIX and Windows? Right: A sane way to rename files/directories with a regular expression.
The usual approach on POSIX systems is:


ls * | sed -E 's/(.*)text(.*)to(.*)remove(.*)/mv "&" "\1\2\3\4"/' | sh
That is horrendously ugly and does not port well to Windows. Here is my slightly less horrendously ugly attempt to fix that:
https://code.rosaelefanten.org/remv

Enjoy.
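For readers who want to see the general idea outside of a shell pipeline, here is a minimal sketch in Python. It is not remv's implementation (remv is a separate program); the function names plan_renames and apply_renames and the simulate switch are made up for illustration. Note that Python writes group references as \1 rather than $1.

```python
import re
from pathlib import Path


def plan_renames(directory, pattern, replacement):
    """Return (old_path, new_path) pairs for entries whose name would change."""
    regex = re.compile(pattern)
    plan = []
    for entry in sorted(Path(directory).iterdir()):
        # Only the first match in each name is rewritten.
        new_name = regex.sub(replacement, entry.name, count=1)
        if new_name and new_name != entry.name:
            plan.append((entry, entry.parent / new_name))
    return plan


def apply_renames(plan, simulate=True):
    """Print every planned rename; touch the disk only if simulate is False."""
    for old, new in plan:
        print(f"{old.name} -> {new.name}")
        if not simulate:
            if new.exists():
                # Refuse to overwrite an existing entry.
                raise FileExistsError(new)
            old.rename(new)
```

Building the plan first and applying it in a second step keeps a dry run and a real run on exactly the same code path, so the simulated output cannot drift from what actually happens.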

yarond:
Nice idea, and certainly can be useful.

After a quick test on Win10, there are some problems:

* When providing an invalid input pattern, but trying to provide an option parameter, the shown error text quotes the parameter as the input regex (e.g. "remv -n [a $1" returns error message "remv could not parse your input regex: -n")
* Running valid regex without any options does nothing, running with -n (simulate) just writes some empty lines to console, running with -v (verbose) actually renames while also indeed being verbose. I haven't checked other options yet.
* In the replacement pattern, $0 isn't valid and is treated as a literal (e.g. "remv -v .* pre$0" will rename file1.ext to "pre$0" instead of "prefile1.ext", and will then fail to rename "otherfile.ext" because "pre$0" already exists). Of course there's the workaround of explicitly putting the matching pattern in parentheses, but that shouldn't be required; something like $0 should work for the entire match.
* When renaming files in the current directory (I haven't tested any folder/directory/move handling so far), the verbose display (which, as mentioned, seems to be a must for now) lists each file with ".\" before the name. That's... technically correct (the files are in the current folder), but not really expected (or helpful) when renaming files locally without any indication from the user that the location should be managed. Worse, the ".\" at the beginning of the file name also matches the input regex. So doing something like "remv -v (.*?)(\..*$) $1 $2" fails because it tries to rename the folder to "a.\" (instead of adding a space before the first period in the file name), which results in an appropriate "operation not permitted".
* This one is more of a feature request than a problem, but for file-name handling it's a very common/basic feature (and also partially related to the previous item): it should be possible to treat the file extension separately, with something like an optional parameter to specify whether or not it should be considered part of the regex. Otherwise renaming files without modifying the extension will require a heck of a lot of "(.*)(\..*$)", or somesuch, in the matching pattern and the replacement groups.

Tuxman:
1. Yes, thank you.
2.  :huh: I'll check that.
3. If you use invalid variables, that's not my fault.  :D
4. The possibility to include relative folder paths (to move whole structures) is actually a feature. How would you do that?
5. File extensions are technically not a separate part of a file name. Handling files with multiple endings (.tar.gz) could be tricky...

Note that I'll have to fix the Decision Sieve first anyway, so take the above comments as temporary ideas.  ;D

yarond:
Perfectly understandable. And thanks for the quick response.  :)

And of course what I present here are just my opinions, as much as I personally may think they're valid and correct ones.

3. If the regex engine you're using doesn't support something like this, then of course there's not much you can do. But most (all?) user tools for search & replace that I can quickly think of (and regex code/development libraries) allow something like a "fake" backreference to the full match. In my experience that's usually $0 or \0 (i.e. an implicit capturing group numbered before the first actual capturing group) in tools that otherwise use $n or \n. It's mostly a convenience thing, so not strictly necessary, but it's a very common one ("why do I explicitly need to indicate the group boundaries when this group is already very clearly defined, and is already a known individual entity to the engine?"). Though, again, if it's not something that your regex engine handles, then the current behavior is technically correct, and this downgrades to a feature request rather than a bug.
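To illustrate the whole-match reference described above, here is a small Python sketch. It is only an illustration, not remv's code: the helpers expand_dollar_template and rename_with_template are hypothetical names, and the sketch simply translates $0, $1, ... into Python's own \g<0>, \g<1>, ... syntax. For comparison, in ECMAScript-style replacement strings (the default format of C++ std::regex_replace) the whole match is written $&; whether $0 is accepted as well may depend on the implementation.

```python
import re

# Matches $0, $1, $12, ... in a replacement template.
_DOLLAR_REF = re.compile(r"\$(\d+)")


def expand_dollar_template(template):
    r"""Translate $0, $1, ... into Python's \g<0>, \g<1>, ... syntax."""
    # Double literal backslashes first so re.sub does not treat them as escapes.
    escaped = template.replace("\\", "\\\\")
    return _DOLLAR_REF.sub(lambda m: rf"\g<{m.group(1)}>", escaped)


def rename_with_template(name, pattern, template):
    """Apply pattern to name once, using a $n-style replacement template."""
    return re.sub(pattern, expand_dollar_template(template), name, count=1)
```

With this, rename_with_template("file1.ext", r".*", "pre$0") produces "prefile1.ext" without any explicit parentheses around the pattern.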

4. Oh, I like that as a feature, for sure. But I think it should apply only if the user explicitly wants to use it; otherwise the current directory should just be assumed.
If the user didn't explicitly provide the location, then the location shouldn't be matched by the input regex. If working on "file.ext" in the current directory, then it's called "file.ext", not ".\file.ext". If the user wants to move it somewhere, then they don't replace .\(.*) with c:\somewhere\$1, but rather (.*) with c:\somewhere\$1. This should carry over fine to sub-folders of the current folder.
If not using the current folder (or using sub-folders in the pattern) then the user needs to specify the path explicitly anyway, so no problems there. The ".\" should be a valid destination location (I didn't check if it is now), but shouldn't be matching on the input unless explicitly put there by the user.
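A minimal Python sketch of that behaviour, under stated assumptions: rename_target and its match_full_path switch are hypothetical names, not remv options, and forward slashes are used so the example runs the same everywhere (on Windows the prefix in question would be ".\"). By default only the final name component is offered to the regex, and the directory part is re-attached afterwards.

```python
import re
from pathlib import PurePosixPath


def rename_target(path, pattern, replacement, match_full_path=False):
    """Compute the new path; by default only the file name is matched."""
    if match_full_path:
        # The whole string, including any leading "./", is visible to the regex.
        return re.sub(pattern, replacement, path, count=1)
    p = PurePosixPath(path)
    new_name = re.sub(pattern, replacement, p.name, count=1)
    # Re-attach the directory; a leading "./" is dropped by pathlib.
    return str(p.parent / new_name)
```

With the pattern from the bug report, rename_target("./file1.ext", r"(.*?)(\..*$)", r"\1 \2") gives "file1 .ext", whereas match_full_path=True makes the first period of "./" match and gives " ./file1.ext".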

5. Since it's posted here as a Windows tool, file extensions are very much a separate part of a file name, both for the O/S and for the expectations of the very large majority of users. Files with ".tar.gz" are handled by programs that know how to open ".gz" files, which hopefully also know to notice that the base name ends with ".tar" and what it means (which for practical purposes I think anything that handles ".gz" is aware of). But as far as the O/S is concerned, when making file associations/icons/filters, that file has a ".gz" extension. And file renaming often includes the desire to change the file name without touching the extension.
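One way to picture the "last extension only" view is the following Python sketch. It is an assumption about how such an option could behave, not a description of any remv flag, and rename_keep_extension is a made-up name. Only the part before the final period is shown to the regex, so "archive.tar.gz" is treated as stem "archive.tar" plus extension ".gz".

```python
import re
from pathlib import PurePath


def rename_keep_extension(name, pattern, replacement):
    """Apply the regex to the stem only and re-attach the last extension."""
    p = PurePath(name)
    # p.stem is everything before the final period; p.suffix is the rest.
    new_stem = re.sub(pattern, replacement, p.stem, count=1)
    return new_stem + p.suffix
```

For example, rename_keep_extension("report.final.txt", r"\.", "_") yields "report_final.txt": the period that starts the extension is never seen by the pattern.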

Tuxman:
To come back to 3.: I use the standard C++ regex header. Technically, $0 is supposed to work.  :huh:

1. is fixed (I removed the reference).  ;D
2. Head -> wall.
4. is actually already in, just skip the -r flag and remv will gladly ignore directories.
5. is fixed as well (new flag "-E").
