Bvckup 2

apankrat:
Got any benchmark code you're willing to share? I'd be interested in trying it out on my own system, I'm afraid I didn't keep the stuff I wrote back then (and there were no threaded versions anyway).-f0dder (December 19, 2012, 06:01 PM)
--- End quote ---
Will do in a bit. I assume the command-line version is OK?-apankrat (December 19, 2012, 03:50 PM)
--- End quote ---
Sure thing. Would be nice with some source as well, but I can understand if it'll be too time-consuming to remove dependencies on code you want to keep private :)-f0dder (December 20, 2012, 02:03 PM)
--- End quote ---

Oki-doki.

bvckup2-demo2.exe - 32bit version
bvckup2-demo2-x64.exe - 64bit version, for nerds and rich kids

It's a console app that takes a path to be scanned and a few other parameters:


Syntax: bvckup2-demo2.exe [-v] [-D | -B] [-t <threads>] location-to-scan

  -v        verbose
  -D        depth first, scan child then sibling directories   (default)
  -B        breadth first, scan sibling then child directories
  -t        number of threads to use

  Thread count defaults to the number of CPU cores if not specified.

Scanning with all defaults should yield something like this -


C:\Temp>bvckup2-demo2.exe C:\
8 threads, 840 ms

In chatty (-v) mode it will look something like this -


C:\Temp>bvckup2-demo2.exe -v C:\

------ config ------
Location:    C:\
CPU cores:   8
Threads:     8
Depth first: Yes

----- scanning -----
   840 ms |     104024 files,      28599 folders


------ result ------
8 threads, 840 ms

I'll run some tests and post my numbers. f0dder, if you have time, let's see yours too :)
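
For readers without the .exe, here is a minimal Python sketch of what the -D / -B and -t options above describe: a pool of threads pulling directories from a shared queue, taken in LIFO order for depth first and FIFO order for breadth first. This is an assumption about the general technique, not the demo's actual source (which was not posted). The name `scan` and its parameters are made up for illustration, and Python threads will not reproduce the timings shown above.

```python
import os
import threading
from collections import deque


def scan(root, threads=None, depth_first=True):
    """Count files and folders under root using a shared work queue."""
    threads = threads or os.cpu_count() or 1   # mirrors "defaults to CPU cores"
    pending = deque([root])      # directories waiting to be listed
    cond = threading.Condition()
    active = 0                   # workers currently listing a directory
    totals = {"files": 0, "folders": 0}

    def worker():
        nonlocal active
        while True:
            with cond:
                while not pending and active:
                    cond.wait()
                if not pending:
                    cond.notify_all()   # nothing queued, nobody working: done
                    return
                # LIFO descends into children first, FIFO visits siblings first
                path = pending.pop() if depth_first else pending.popleft()
                active += 1
            files, subdirs = 0, []
            try:
                with os.scandir(path) as entries:
                    for entry in entries:
                        if entry.is_dir(follow_symlinks=False):
                            subdirs.append(entry.path)
                        else:
                            files += 1
            except OSError:
                pass                    # unreadable directory: skip it
            with cond:
                totals["files"] += files
                totals["folders"] += len(subdirs)
                pending.extend(subdirs)
                active -= 1
                cond.notify_all()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return totals["files"], totals["folders"]
```

Calling `scan("C:\\", threads=8)` returns a (files, folders) pair. The root itself is not counted as a folder, and symlinked directories are counted as files rather than followed.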

apankrat:
Updated verbose mode to dump the timing profile for FindFirstFileEx/FindNextFile/FindClose calls.
Please re-grab the .exe's from the original links.

The verbose output looks something like this now -


------ config ------
Location:    n:\
CPU cores:   8
Threads:     16
Depth first: Yes

----- scanning -----
 32625 ms |      97964 files,      13192 folders

------ stats -------
FindFirstFileEx:
     0 ms        2054  |    10 ms           1  |   100 ms           -
     1 ms        1070  |    20 ms        1754  |   200 ms           1
     2 ms         859  |    30 ms         972  |   300 ms           -
     3 ms         708  |    40 ms         548  |   400 ms           1
     4 ms         421  |    50 ms         558  |   500 ms           1
     5 ms         242  |    60 ms         409  |   600 ms           -
     6 ms         143  |    70 ms         425  |   700 ms           -
     7 ms         175  |    80 ms         405  |   800 ms           -
     8 ms         211  |    90 ms         315  |   900 ms           -
     9 ms         221  |   100 ms         307  |  1000 ms           -

FindNextFile:
     0 ms      135181  |    10 ms           -  |   100 ms           -
     1 ms        1186  |    20 ms         127  |   200 ms           -
     2 ms          60  |    30 ms         201  |   300 ms           -
     3 ms          12  |    40 ms         151  |   400 ms           -
     4 ms           6  |    50 ms         149  |   500 ms           -
     5 ms          12  |    60 ms         139  |   600 ms           -
     6 ms           5  |    70 ms          72  |   700 ms           -
     7 ms           9  |    80 ms          58  |   800 ms           -
     8 ms           8  |    90 ms          43  |   900 ms           -
     9 ms          16  |   100 ms          22  |  1000 ms           -

FindClose:
     0 ms       12412  |    10 ms           -  |   100 ms           -
     1 ms         756  |    20 ms           1  |   200 ms           -
     2 ms          16  |    30 ms           -  |   300 ms           -
     3 ms           1  |    40 ms           -  |   400 ms           -
     4 ms           3  |    50 ms           -  |   500 ms           -
     5 ms           -  |    60 ms           -  |   600 ms           -
     6 ms           -  |    70 ms           -  |   700 ms           -
     7 ms           1  |    80 ms           -  |   800 ms           -
     8 ms           -  |    90 ms           -  |   900 ms           -
     9 ms           -  |   100 ms           -  |  1000 ms           -

------ result ------
16 threads, 32625 ms

"20 ms   1754" means there were 1754 calls that took between 20 and 30 ms.

"500 ms   1" means there was 1 call that took between 500 and 600 ms.

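The bucketing rule described above can be sketched as follows. This is a guess at the logic based only on the two examples given: each row is labelled with its lower bound, with 1 ms rows below 10 ms, 10 ms rows below 100 ms, and 100 ms rows from there up. The function name `bucket_label` is hypothetical, and putting everything over 1000 ms into the last row is an assumption. The thread does not say how the second-column "100 ms" row differs from the third-column one, so the sketch ignores that row.

```python
def bucket_label(ms):
    """Return the lower bound (in ms) of the histogram row a call falls into."""
    if ms < 10:
        step = 1        # first column: rows 1 ms wide
    elif ms < 100:
        step = 10       # second column: rows 10 ms wide
    else:
        step = 100      # third column: rows 100 ms wide
    # Assumption: 1000 ms is the last row and absorbs anything slower.
    return min(int(ms) // step * step, 1000)


print(bucket_label(25))    # 20  -> counted in the "20 ms" row
print(bucket_label(547))   # 500 -> counted in the "500 ms" row
```
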
f0dder:
Cool :)

Too lazy to do proper testing with cold-cache right now (haven't got my testbox hooked up at the moment, and don't feel like rebooting my workstation a zillion times - sucks that Windows doesn't have a way to "discard read cache", alloc-boatloads-of-memory isn't reliable enough).

190k files, 20k folders. Relatively flat hierarchy (haven't measured nesting level, but average is probably 4).

For warm-cache, there are negligible differences between breadth- and depth-first, and the same goes for x86 vs x64. That was kinda expected, though :). The speed difference between 1 and 8 threads is a factor of 3; after 4 threads there's no quantifiable speed increase (quad-core i7 with Hyper-Threading - wonder where the bottleneck is, HT itself or some OS locks?). ~1200 vs 400 milliseconds, though, so not something that matters a lot - at least for relatively modest filesystems and higher-end systems :)

Cold-cache tests are what interest me most, anyway, since those tend to be slow. I'll see if I can find some time & energy to hook up my testbox and run some tests on it - the testbox also has the benefit of being a modest dual-core with a slow disk, compared to my workstation which is a quad-core i7 with an SSD and a VelociRaptor :-)

apankrat:
I've tried running a few cold-cache tests and got numbers that are wildly different - from 131 seconds to 70 to 24. This is on a box that was shut down, then powered on (with network connection disabled, with Windows Search and Windows Update services disabled, virtually no active tasks in the Task Manager and no resident antivirus/malware apps... so theoretically there's nothing that would actively populate disk index cache on boot). I will see what I've missed in a bit and re-run the tests.

With regard to getting a 3x speedup on a warm cache - what's not to like? :) Arguably, this is the use case for real-time backups, with an operational profile consisting largely of frequent scans of a small number of selected locations.

apankrat:
Scanned C:\ with about 100K files in 20K directories.
HDD + W7/64bit + 4 cores.


                 COLD       WARM

   1 thread   -  132 sec    2.55 sec
   4 threads  -  102 sec    1.24 sec
   8 threads  -   96 sec    1.32 sec
  64 threads  -   88 sec    1.44 sec
 256 threads  -   84 sec    1.52 sec
 512 threads  -   82 sec      -


In other words, parallel scanning yields a ~2x speedup for warm caches and a ~1.5x speedup for cold caches.
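
As a quick check of that arithmetic, the snippet below recomputes the best-case speedup from the table above. The timings are copied from the table; the helper name `best_speedup` is just for illustration.

```python
# Scan times in seconds, keyed by thread count (from the table above).
cold = {1: 132, 4: 102, 8: 96, 64: 88, 256: 84, 512: 82}
warm = {1: 2.55, 4: 1.24, 8: 1.32, 64: 1.44, 256: 1.52}


def best_speedup(timings):
    """Return (thread count of the fastest run, speedup over 1 thread)."""
    threads, fastest = min(timings.items(), key=lambda kv: kv[1])
    return threads, timings[1] / fastest


for name, timings in (("cold", cold), ("warm", warm)):
    threads, factor = best_speedup(timings)
    print(f"{name}: best at {threads} threads, {factor:.2f}x")
# cold: best at 512 threads, 1.61x
# warm: best at 4 threads, 2.06x
```

Note that the warm-cache numbers get slightly worse past 4 threads, while the cold-cache numbers keep improving slowly all the way to 512.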
