Patching strings

I was looking for a way to patch strings using the Patchfile methods. Patchfile can take two different files and generate a patch from them. Is there a way to do this with strings? It looks like it only accepts files.

I wanted to try out this method (SpringTa people talked about it).
Basically:
serialize the game state into a string A
send string A to the client
update the game state
serialize it again into another string B
compute the patch P1 between A and B
send P1 to the client
the client can now apply P1 to A and get B
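
The steps above can be sketched in Python as follows. This is only an illustration of the round trip, not the Patchfile algorithm: it swaps in the standard library's difflib to compute the edits, and the names make_patch, apply_patch and the sample game state are made up for the example.

```python
import difflib
import json

def make_patch(a: str, b: str) -> list:
    """Return a list of edit operations that turn string a into string b."""
    ops = []
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged spans are not sent
        # Replace a[i1:i2] with b[j1:j2] (empty text means a deletion).
        ops.append((i1, i2, b[j1:j2]))
    return ops

def apply_patch(a: str, ops: list) -> str:
    """Rebuild b from a and the operations produced by make_patch."""
    out = []
    pos = 0
    for i1, i2, text in ops:
        out.append(a[pos:i1])  # copy the unchanged span before this edit
        out.append(text)       # then the replacement text
        pos = i2
    out.append(a[pos:])        # copy whatever follows the last edit
    return "".join(out)

# Server side: serialize state A, update the state, serialize state B, diff.
state = {"player": {"x": 10, "y": 20, "hp": 100}, "tick": 1}  # hypothetical state
string_a = json.dumps(state, sort_keys=True)
state["player"]["x"] = 11
state["tick"] = 2
string_b = json.dumps(state, sort_keys=True)
patch_p1 = make_patch(string_a, string_b)

# Client side: already holds string_a, receives patch_p1, reconstructs string_b.
assert apply_patch(string_a, patch_p1) == string_b
```

The patch here is a plain list of (start, end, replacement) tuples; it would still need to be serialized before going over the network.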

The Patchfile public interface only accepts filenames, but internally it does operate on C++ streams. So you could extend Patchfile with a new pair of methods, for instance build_string() and apply_string(), that take strings and create ostringstreams and istringstreams instead of ofstreams and ifstreams.

Or, you could just write your strings to temporary files and then operate on temporary files. In a modern operating system (Windows, Linux) this will pretty much be exactly as fast anyway, since the operating system will do a good job of caching the data and will probably never even write it to disk.

David

After some testing, it turns out that Patchfile is quite slow.
Generating a patch for two 3K data blobs results in about 200 bytes of changes: 50 bytes just for the header, so 150 bytes of actual changes. But it takes 0.1 seconds to do so, which is slow if one has to do it every second to send the patch over the network. By comparison, building the 3K data blobs takes about 0.006 seconds, and that is serializing about 50 Python classes full of mixed data.

Now here is my question: could the slowdown be caused by writing the blobs to files first? Could it be some simple mistake in the patcher that just makes this slow? Could it be fixed or sped up? Or should I try to make my own “patching” algorithm based on the data format? Because I know the format of the data, I could use that to patch it smartly in different ways.

EDIT:
Surprisingly, it takes 0.36 sec (only 3.5x longer) to build a patch for a 1.2Meg file (400x larger). Maybe there is some sort of fixed overhead that is constantly eating up the 0.1 seconds.

Patching two identical 3-byte files takes 0.08 sec.
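
To put a number on that suspected overhead, one rough approach is to fit time = overhead + rate * size through two of the measurements. The sketch below does that with the figures quoted in this post, treating 3K as 3,000 bytes and 1.2Meg as 1,200,000 bytes (an assumption). The helper names time_call and estimate_fixed_overhead are made up for this example.

```python
import time

def time_call(func, *args, repeats=5):
    """Return the best wall-clock time in seconds over several runs of func(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best

def estimate_fixed_overhead(small_time, large_time, small_size, large_size):
    """Fit time = overhead + rate * size through two (size, time) measurements."""
    rate = (large_time - small_time) / (large_size - small_size)
    overhead = small_time - rate * small_size
    return overhead, rate

# Figures from this thread: 0.1 s for 3K and 0.36 s for 1.2Meg.
overhead, rate = estimate_fixed_overhead(0.1, 0.36, 3_000, 1_200_000)
print(f"fixed overhead ~ {overhead:.3f} s, per-byte cost ~ {rate:.2e} s")
```

With these numbers the fit puts almost all of the 0.1 seconds into the fixed term, which is in the same range as the 0.08 sec measured for 3-byte files. time_call can be used to collect fresh measurements for other sizes.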

The patcher is designed for offline computation of large patchfiles, for the purpose of upgrading a downloadable game to the latest version. The problem it is trying to solve is: generate a patchfile within a reasonable timeframe (less than an hour, say) for arbitrarily large data sets. Given this constraint, attempt to make the smallest patchfile that can reasonably be determined in that timeframe.

Note that finding the smallest patchfile for a particular data set is a mathematically intractable problem. To answer this definitively, you would have to exhaustively search the entire problem space, which could take thousands of years for some large data sets. So our patcher uses a few heuristics to short-circuit this search considerably, and is capable of generating decently small patchfiles within about half an hour or so, even for very large data sets.

But it was never designed to generate tiny patchfiles in real time. That’s a completely different problem. If you require this feature, you’re probably better off writing a different algorithm to generate these patchfiles based on the knowledge you have about the nature of the data.

David
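
As one possible shape for such a data-aware algorithm, the sketch below assumes the game state can be flattened into a dictionary of field names to values, and sends only the fields that changed. This is an assumption about the data, not something the patcher does; diff_state, apply_delta and the sample keys are hypothetical.

```python
def diff_state(old: dict, new: dict) -> dict:
    """Return only the fields that changed, plus the keys that disappeared."""
    changed = {k: v for k, v in new.items() if k not in old or old[k] != v}
    removed = [k for k in old if k not in new]
    return {"changed": changed, "removed": removed}

def apply_delta(old: dict, delta: dict) -> dict:
    """Rebuild the new state from the old state and a delta from diff_state."""
    result = dict(old)            # leave the caller's copy untouched
    for k in delta["removed"]:
        del result[k]
    result.update(delta["changed"])
    return result

# Hypothetical flattened game state before and after one update.
state_a = {"unit_1.x": 10, "unit_1.y": 20, "unit_2.hp": 100}
state_b = {"unit_1.x": 11, "unit_1.y": 20, "unit_3.hp": 50}

delta = diff_state(state_a, state_b)
assert apply_delta(state_a, delta) == state_b
```

Because the comparison is a single pass over the keys, the cost grows with the number of fields rather than with a search over byte offsets, at the price of only working for data laid out this way.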