

Binary Patching for "Similar" Large Files


ewemoa:
Any recommendations for patching guest OS images that may not differ by much?

I've tried xdelta, xdelta3, bsdiff, and beat so far and discovered that:

  1. xdelta (1.x series) created a very small patch (a bit over 70,000 bytes) between a .vdi file (somewhat over 810,000,000 bytes) and a .vmdk file (somewhat under 745,000,000 bytes); the vmdk file was created by converting the vdi file.

  2. xdelta3 (without tweaking of parameters) created an enormous patch by comparison for the same pair of files (a bit under 280,000,000 bytes)

  3. bsdiff exited with a message about not being able to allocate memory...

  4. beat created a large file (I think it was over 500,000,000 bytes) in linear mode and killed my X session in delta mode...(actually, it may have been systemd that killed my X session)

(AFAICT, xdelta3 does not process patches generated with xdelta 1.x...)

ewemoa:
Some additional bits on xdelta3 with command line tweaks:

  -B268435456 results in a patch a little under 139,000,000 bytes
  -B536870912 results in a patch a little over 103,000,000 bytes
  -B536870913 results in a patch a little over 56,000 bytes
 
So it looks like in this case, comparable sizes (to xdelta 1.x) are obtainable by tweaking the -B option.
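
For reference, the three -B values above are 256 MiB, 512 MiB, and 512 MiB plus one byte. Below is a rough Python sketch of how such values could be computed and put on a command line. The helper names and file names are made up for illustration, and the argument order assumes the usual `xdelta3 -e -s SOURCE TARGET PATCH` form; it only builds the argument list and does not run anything.

```python
MIB = 1024 * 1024  # -B is given in bytes, so MiB values must be multiplied out


def mib_to_bytes(mib):
    """Convert a size in MiB to the byte count expected by -B."""
    return mib * MIB


def xdelta3_encode_args(source, target, patch, source_buffer_bytes):
    """Build (but do not run) an xdelta3 encode command line.

    Hypothetical helper; assumes `xdelta3 -e -B<bytes> -s SOURCE TARGET PATCH`.
    """
    return ["xdelta3", "-e", f"-B{source_buffer_bytes}", "-s", source, target, patch]


if __name__ == "__main__":
    # The three values tried above, broken into whole MiB plus leftover bytes.
    for value in (268435456, 536870912, 536870913):
        whole, extra = divmod(value, MIB)
        print(f"-B{value} = {whole} MiB + {extra} bytes")

    # Placeholder file names, not the actual images from this thread.
    print(" ".join(xdelta3_encode_args("guest.vdi", "guest.vmdk", "guest.patch",
                                       mib_to_bytes(512) + 1)))
```

The resulting list could be handed to `subprocess.run` if xdelta3 is installed.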

mouser:
that is a pretty magic B option.

f0dder:
Quote from mouser (April 23, 2013, 08:45 AM):
that is a pretty magic B option.
--- End quote ---
Yeah, that does look pretty incomprehensible O_o

ewemoa:
FWIW, I found some explanation of tweaking at:

  https://code.google.com/p/xdelta/wiki/TuningMemoryBudget

Source buffer size

The encoder uses a buffer for the source input (of size set by the command-line flag -B). To ensure the source input is read sequentially, with no backward seeks, the encoder maintains the source horizon at half the source buffer size ahead of the input position. A source copy will not be found if it lies more than half the source buffer size away from its absolute position in the input stream.

For large files, -B may need to be raised. The default is 64MB. This means data should not shift more than 32MB, that is, not more than 32MB should be added or removed from the source.

The minimum value of -B is 16KB.

The source file is not mmaped, it is read into the source buffer (Xdelta-1.x used mmap()).

--- End quote ---

Note: all flags are set in bytes, so for example to set a 512MB source buffer you must pass -B536870912.

--- End quote ---
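
The rule of thumb in the quoted wiki text (data should not shift by more than half of -B) can be written down as a small Python sketch. The function and constant names are made up for illustration, and the numbers only restate what the wiki page says (64MB default, 16KB minimum, taking MB as 1024*1024 bytes to match the 512MB example); this is not a description of how xdelta3 behaves internally.

```python
KIB = 1024
MIB = 1024 * KIB

DEFAULT_SOURCE_BUFFER = 64 * MIB  # default -B according to the quoted wiki page
MIN_SOURCE_BUFFER = 16 * KIB      # minimum -B according to the quoted wiki page


def max_shift(source_buffer_bytes):
    """Largest data shift (in bytes) the rule of thumb allows for a given -B."""
    return source_buffer_bytes // 2


def min_source_buffer(shift_bytes):
    """Smallest -B (in bytes) the rule of thumb suggests for a given shift."""
    return max(MIN_SOURCE_BUFFER, 2 * shift_bytes)


if __name__ == "__main__":
    print("default -B allows a shift of", max_shift(DEFAULT_SOURCE_BUFFER) // MIB, "MiB")
    # Example: data expected to move by up to 100 MiB between the two files.
    print("suggested flag: -B%d" % min_source_buffer(100 * MIB))
```

This is only an estimate for a starting value; as the numbers earlier in the thread show, patch size can change sharply with small changes to -B, so trying a few values is still worthwhile.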
