Author Topic: Binary Patching for "Similar" Large Files  (Read 5932 times)

ewemoa

  • Honorary Member
  • Joined in 2008
  • Posts: 2,922
Binary Patching for "Similar" Large Files
« on: April 23, 2013, 12:11 AM »
Any recommendations for patching guest OS images that may not differ by much?

I've tried xdelta, xdelta3, bsdiff, and beat so far and discovered that:

  1. xdelta (1.x series) created a very small patch (a bit over 70,000 bytes) when the difference was between a .vdi file and a .vmdk file (vdi being somewhat over 810,000,000 bytes and vmdk being somewhat less than 745,000,000 bytes) -- here the vmdk file was created via conversion from the vdi file.

  2. xdelta3 (without tweaking of parameters) for the same scenario created an enormous file (a bit under 280,000,000 bytes) by comparison

  3. bsdiff exited with a message about not being able to allocate memory...

  4. beat created a large file (I think it was over 500,000,000 bytes) in linear mode and killed my X session in delta mode...(actually, it may have been systemd that killed my X session)

(AFAICT, xdelta3 does not process patches generated with xdelta 1.x...)
« Last Edit: April 23, 2013, 07:24 PM by ewemoa »

ewemoa

Re: Binary Patching for "Similar" Large Files
« Reply #1 on: April 23, 2013, 02:15 AM »
Some additional bits on xdelta3 with command line tweaks:

  -B268435456 results in a patch a little under 139,000,000 bytes
  -B536870912 results in a patch a little over 103,000,000 bytes
  -B536870913 results in a patch a little over 56,000 bytes
 
So it looks like in this case, comparable sizes (to xdelta 1.x) are obtainable by tweaking the -B option.
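For anyone wondering what those -B numbers are in friendlier units, here is a quick Python sketch. The `describe` helper is my own name for illustration and is not part of xdelta3; it only does the unit conversion and does not explain why the patch size changes so sharply at the last value.

```python
# -B values tried above, in bytes.
B_VALUES = [268435456, 536870912, 536870913]


def describe(n: int) -> str:
    """Render a byte count as whole MiB plus any leftover bytes."""
    mib, rest = divmod(n, 1024 * 1024)
    return f"{mib} MiB" if rest == 0 else f"{mib} MiB + {rest} B"


for n in B_VALUES:
    print(f"-B{n} = {describe(n)}")
```

So the first two values are exactly 256 MiB and 512 MiB, and the third is 512 MiB plus a single byte.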
« Last Edit: April 23, 2013, 04:43 AM by ewemoa »

mouser

  • First Author
  • Administrator
  • Joined in 2005
  • Posts: 40,896
Re: Binary Patching for "Similar" Large Files
« Reply #2 on: April 23, 2013, 08:45 AM »
that is a pretty magic B option.

f0dder

  • Charter Honorary Member
  • Joined in 2005
  • Posts: 9,153
  • [Well, THAT escalated quickly!]
Re: Binary Patching for "Similar" Large Files
« Reply #3 on: April 23, 2013, 04:17 PM »
Quote from: mouser
"that is a pretty magic B option."

Yeah, that does look pretty incomprehensible O_o
- carpe noctem

ewemoa

Re: Binary Patching for "Similar" Large Files
« Reply #4 on: April 23, 2013, 07:21 PM »
FWIW, I found some explanation of tweaking at:

  https://code.google.com/p/xdelta/wiki/TuningMemoryBudget

Source buffer size

The encoder uses a buffer for the source input (of size set by the command-line flag -B). To ensure the source input is read sequentially, with no backward seeks, the encoder maintains the source horizon at half the source buffer size ahead of the input position. A source copy will not be found if it lies more than half the source buffer size away from its absolute position in the input stream.

For large files, -B may need to be raised. The default is 64MB. This means data should not shift more than 32MB, that is, not more than 32MB should be added or removed from the source.

The minimum value of -B is 16KB.

The source file is not mmaped, it is read into the source buffer (Xdelta-1.x used mmap()).

Note: all flags are set in bytes, so for example to set a 512MB source buffer you must pass -B536870912.
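To make the arithmetic in that wiki text concrete, here is a rough Python sketch, assuming the rules quoted above (flag given in bytes, 16KB minimum, copies found only within half the buffer size). The function names and the file names in the usage lines are hypothetical; the command list uses only the `-e`, `-s`, and `-B` options of xdelta3 and is built without being run.

```python
from typing import List

MIB = 1024 * 1024
MIN_B = 16 * 1024  # minimum -B value, per the wiki text quoted above


def source_buffer_bytes(mib: int) -> int:
    """Convert a buffer size in MiB to the byte count that -B expects."""
    n = mib * MIB
    if n < MIN_B:
        raise ValueError("-B must be at least 16KB")
    return n


def max_shift_bytes(buffer_bytes: int) -> int:
    """Largest data shift still covered: half the source buffer (per the wiki)."""
    return buffer_bytes // 2


def encode_command(source: str, target: str, patch: str, buffer_bytes: int) -> List[str]:
    """Build (but do not run) an xdelta3 encode command line."""
    return ["xdelta3", "-e", f"-B{buffer_bytes}", "-s", source, target, patch]


# Hypothetical file names, for illustration only.
b = source_buffer_bytes(512)
print(b, max_shift_bytes(b) // MIB)
print(" ".join(encode_command("guest.vdi", "guest.vmdk", "guest.patch", b)))
```

With the 64MB default this gives the 32MB shift limit mentioned in the quote, and a 512MB buffer gives -B536870912 with a 256MB limit.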

ewemoa

Re: Binary Patching for "Similar" Large Files
« Reply #5 on: April 23, 2013, 08:03 PM »
On a peripheral note, found the following info on the .vdi and .vmdk formats:

  .vdi - All About VDIs
  .vmdk - Virtual Disk Format 5.0

f0dder

Re: Binary Patching for "Similar" Large Files
« Reply #6 on: April 24, 2013, 07:53 AM »
So the -B option is buffer size, given in bytes? Then it sorta makes sense.

I wonder why they moved away from mmap.