Author Topic: IDEA: DeEntropizer - File and Folder Auto Organizer by Similarity and Patterns  (Read 3160 times)

tmpusr

(Moved from Desktop Teleporter thread http://www.donationc...dex.php?topic=6091.0
 where the problem of a messy desktop, where everything is habitually dropped, was discussed.)

Don't use soft links on the desktop. Use Desktop Shell NameSpace objects. Reason: soft-linked data will appear twice in your backups - unless the backup app is smart enough.

Shell NameSpace:
http://www.donationc...87.msg39733#msg39733

You can multiply the desktop drop area by using VirtuaWin.
http://www.donationc...72.msg43526#msg43526
You can have 9 screens accessible by mere mouse-to-screen-edge motion. (Some virtual window software surely supports even more. Do you know any that also supports switching screens with the mouse like VW?) Each could have a different folder opened full screen, if droppable area is what you're looking for. Having big 48+ pixel folder icons in a toolbar (autohiding or not) would give you several easy-to-access drop areas for manually categorizing stuff.

The habit of dropping semi-random stuff into an inbox to be dealt with later is featured in Getting Things Done as "collecting". After collecting, the stuff is prioritized into categories that lead to actions.

Merely automatically moving stuff from the desktop (or any folder you use for download/inbox) to another folder does little to decrease entropy. Automating the grouping of related files and folders into category folders does decrease entropy and expedites the creation of knowledge. The file managers I've tried still have only very rudimentary and ancient sorting tools. The concept of (fuzzy) similarity is not available (except in some fashion in the Fonts folder). Smart, virtual folders may change this.

An auto-categorizer app, DeEntropizer, that automatically organizes files and folders by similarity, could have features such as:

Fuzzy and relatively smart detection of similarities among:
file types
file names
folder names
file creation and/or modification times

Essentially a pattern recognition process.

Things that could be occurring behind the scenes:

Grouping into folders by file type:

1 EXEs are moved to folder "Executables"
   1.1 There, a folder is created for each. The folder name is either
      1.1.1 extracted from the description in the EXE
            (extracting the description: http://www.donationc...72.msg40418#msg40418)
      1.1.2 or taken from the file name
   1.2 Each EXE is moved into its folder

2 ZIPs and RARs are moved to folder "Archives"
   2.1 There they are extracted into folders
   2.2 Original files are deleted
   2.3 Unnecessary empty folders are removed
   2.4 Folder names are cleaned by a smart renamer

and so on for all file types. Exception: creating a folder per file by file name is obviously not useful for document files.
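The type-grouping steps above could be sketched roughly like this in Python. It only builds a move plan (nothing on disk is touched), names the per-EXE folder after the file name (option 1.1.2), and leaves out archive extraction and renaming. "Executables" and "Archives" are the folder names from the list; "Documents", the extension table and the function name plan_type_moves are my own assumptions for illustration.

```python
from pathlib import PurePath

# "Executables" and "Archives" come from the list above; "Documents" and the
# choice of extensions are assumptions made for this sketch.
TYPE_FOLDERS = {
    ".exe": "Executables",
    ".zip": "Archives",
    ".rar": "Archives",
    ".pdf": "Documents",
    ".doc": "Documents",
    ".txt": "Documents",
}
# Types that get one subfolder per file (step 1.1); documents are excluded.
PER_FILE_FOLDER = {".exe"}


def plan_type_moves(file_names):
    """Return {file name: destination path} without touching the disk."""
    plan = {}
    for name in file_names:
        path = PurePath(name)
        ext = path.suffix.lower()
        folder = TYPE_FOLDERS.get(ext)
        if folder is None:
            continue  # unknown type: leave the file in the inbox
        if ext in PER_FILE_FOLDER:
            # Folder named after the file name (option 1.1.2).
            dest = PurePath(folder) / path.stem / path.name
        else:
            dest = PurePath(folder) / path.name
        plan[name] = dest.as_posix()
    return plan
```

Returning a plan first means the user could preview the result before any file is actually moved.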

Within each file type folder:

3 Fuzzy grouping into folders
   3.1 If file or folder names have similarities
      3.1.1 they are moved under a common folder
         3.1.1.1 if "review" is present in many PDF names, the files are moved under folder "review". Typos are handled by fuzziness; "reveiw" or "rewiev" won't matter.
         3.1.1.2 if "daily" is present in many folder names, the folders are moved under folder "daily".

Grouping into folders by name:
http://www.donationc...43.msg41984#msg41984
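A minimal sketch of the fuzzy name grouping, assuming Python's standard difflib as the similarity measure (any other fuzzy matcher would do). The function names, the 4-letter minimum word length, the 0.8 threshold and the minimum of 3 files per group are all assumptions of mine, not part of the idea above. A heavier typo may score below the threshold, so the threshold would have to be tunable.

```python
import re
from collections import Counter
from difflib import SequenceMatcher


def tokens(name):
    """Sorted lower-case words of at least 4 letters from a file name stem."""
    stem = name.rsplit(".", 1)[0]
    return sorted(set(re.findall(r"[a-z]{4,}", stem.lower())))


def similar(a, b, threshold):
    """True if the two words are close enough to count as the same keyword."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def fuzzy_groups(names, min_files=3, threshold=0.8):
    """Map a keyword to the names containing it or a near-miss spelling."""
    counts = Counter(t for n in names for t in tokens(n))
    groups = {}
    grouped = set()
    # Most frequent spellings first, so typos attach to the common form.
    for keyword, _ in counts.most_common():
        members = [
            n for n in names
            if n not in grouped
            and any(similar(keyword, t, threshold) for t in tokens(n))
        ]
        if len(members) >= min_files:
            groups[keyword] = members
            grouped.update(members)
    return groups
```

Each file lands in at most one group here; a real tool might instead let the user pick when a name fits several keywords.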

4 Optionally grouping folders by
   4.1 Folder size: 300+ MB, 50-300 MB, 20-50 MB, 10-20 MB, 0-10 MB.
   4.2 Activity density (creation date). Example: a bunch of files have been created (usually downloaded) between 1830-1930 with little time between them. This pattern, the unusually small interval between creation/modification dates, would be recognized and they would be grouped under "20061116 1830-1930". Then some time has passed with few created files. Another rather continuous file creation stream occurs 2210-2240. They would be grouped too.

When you inspect the auto-grouped folders, you find that the first group consists of downloaded utilities and some related documentation (though they'd be under separate file type folders in this example, so it might be more useful if activity grouping took place before file type grouping); the next one is media files. The few files created between 1930 and 2210 occurred more than 15 minutes apart and aren't grouped. The algorithm for detecting groups would either adjust the interval automatically (perhaps within user-adjustable limits; for example, a stream of new files could have anything from a 1 sec to a 60 min interval) or use a user-set threshold, like 15 minutes: if files and folders are created within 15 minutes of the previous one, they are grouped; if more time has passed, the group is broken, and they are either left ungrouped or placed in the next group if one is detected.
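The fixed-threshold variant of activity grouping could look something like the sketch below. It assumes the timestamps have already been read (for example from the file system) and are passed in as (name, datetime) pairs; the function name and the min_size parameter are made up for illustration, and the automatic interval adjustment is not attempted. Group labels use the first and last timestamp of each burst, in the same style as "20061116 1830-1930".

```python
from datetime import datetime, timedelta


def activity_groups(stamped, max_gap=timedelta(minutes=15), min_size=2):
    """Split (name, datetime) pairs into bursts separated by more than max_gap."""
    ordered = sorted(stamped, key=lambda item: item[1])
    bursts, current = [], []
    for name, when in ordered:
        # A gap longer than max_gap since the previous file breaks the group.
        if current and when - current[-1][1] > max_gap:
            bursts.append(current)
            current = []
        current.append((name, when))
    if current:
        bursts.append(current)

    result = {}
    for burst in bursts:
        if len(burst) < min_size:
            continue  # isolated files stay ungrouped
        start, end = burst[0][1], burst[-1][1]
        label = f"{start:%Y%m%d %H%M}-{end:%H%M}"
        result[label] = [name for name, _ in burst]
    return result
```

A burst that runs past midnight would get a misleading label in this form; a real tool would need to handle that case.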


The problem I have (perhaps in understanding only) with virtual folders such as the ones in Vista or OS X is that they don't physically move files. You don't know which files are not included in any of them - which ones don't fit any search pattern. If you set up a bunch of virtual folders that present files and folders in the manner described above, which of the files and folders are not displayed at all? How can you know unless you do a file move operation and go back to your download/inbox to see which files and folders are left?
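One way around this without moving anything would be a "leftovers" report. Here is a rough sketch, assuming each virtual folder can be approximated by a list of wildcard patterns (the real search folders in Vista or OS X use richer queries, so this is only a stand-in). The function name and the example folder definitions are hypothetical.

```python
from fnmatch import fnmatch


def unmatched(names, folder_patterns):
    """Return the names that match no pattern of any virtual folder."""
    all_patterns = [p for patterns in folder_patterns.values() for p in patterns]
    return [
        n for n in names
        # Lower-case both sides so matching ignores case on every platform.
        if not any(fnmatch(n.lower(), p.lower()) for p in all_patterns)
    ]
```

For example, with folders defined as {"Reviews": ["*review*"], "Media": ["*.mp3", "*.avi"]}, a file called todo.txt would show up in the report because no folder displays it.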

Try picking up a few habits:

Create a folder for each download with full name and some description like "Company Software 1.0 - does this and that".

Categorize upon downloading into software type/media genre sub folders.

Download the most interesting items into a separate folder, named "!" for example:
http://www.donationc...75.msg39673#msg39673

FlashGot with WellGet works great.
« Last Edit: January 26, 2007, 03:26:38 PM by tmpusr »

mouser

we've talked about this before, there is some suggested software somewhere here but it's not exactly what we want..

i agree this is a nice idea -
also useful for things like when you have a giant directory full of downloaded pdfs, and want them to be auto sorted/filed.

would be nice to have a flexible file sorting tool..