For my web-based PHP regex find/replace doohickey, I need to match individual back references (captured groups) and wrap a tag around each one, so that each can be colored separately from the rest of the match. At first this seems easy enough, but not all of a regex match is necessarily inside a back reference. So I need to replace the back reference, and only the back reference, while preserving the rest of the match around it. For example, if I were to search the text
fish this fish fish
looking for
.*?(?<=this )(fish).*
I'd match everything, capturing the second instance of fish into the back reference. I can't simply take the match and run a replace for fish in order to apply the highlighting, because then I'd end up with 3 highlighted "fish", 2 of which weren't supposed to be. I also can't simply return the back reference with the markup, because that would drop the parts of the match that sit outside the back reference.
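To make the over-highlighting concrete, here is a minimal PHP sketch of the naive replace. The `<b>` tag is only a stand-in for whatever markup is actually used, and the variable names are made up for illustration.

```php
<?php
$text = 'fish this fish fish';

// $m[0] is the whole match, $m[1] is the captured "fish".
preg_match('/.*?(?<=this )(fish).*/', $text, $m);

// Naive approach: replace the captured text wherever it occurs in the match.
$naive = str_replace($m[1], '<b>' . $m[1] . '</b>', $m[0]);

echo $naive, "\n";
// <b>fish</b> this <b>fish</b> <b>fish</b>
```

All three occurrences get wrapped, even though only the one at offset 10 was captured.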
My initial solution was to run the original find pattern over the match to get the back references, using an extra flag to have it return the offset of each one. That gives me the location of the text within the string, and the length comes from the captured text itself. Working backwards from the last back reference, so that earlier offsets within the string aren't disturbed, it wraps each back reference without losing context or data. Perfect.
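Roughly, that approach looks like the sketch below. It assumes the "extra flag" is `PREG_OFFSET_CAPTURE`, that the groups are not nested and occur left to right in the subject, and it uses a hypothetical helper name (`highlight_groups`) and hypothetical `g1`, `g2`, ... class names.

```php
<?php
// Hypothetical helper: wrap each captured group in a <span>, last group first.
function highlight_groups(string $pattern, string $text): string
{
    // With PREG_OFFSET_CAPTURE every entry of $m is [captured text, byte offset].
    if (!preg_match($pattern, $text, $m, PREG_OFFSET_CAPTURE)) {
        return $text;
    }

    // Go backwards so that wrapping a later group never shifts an earlier offset.
    for ($i = count($m) - 1; $i >= 1; $i--) {
        [$value, $offset] = $m[$i];
        if ($offset < 0) {
            continue; // this group did not take part in the match
        }
        $wrapped = '<span class="g' . $i . '">' . $value . '</span>';
        $text = substr_replace($text, $wrapped, $offset, strlen($value));
    }

    return $text;
}

echo highlight_groups('/.*?(?<=this )(fish).*/', 'fish this fish fish'), "\n";
// fish this <span class="g1">fish</span> fish
```

Offsets and lengths here are both in bytes, so they stay consistent with each other even for multibyte text.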
. . . until back references are nested. In this example:
(.*?(?<=this )(fish).*)
back reference 1 would be "fish this fish fish" and back reference 2 would be "fish". Here's where the problem surfaces.
If I wrap back reference 2 in the markup first, then when I apply back reference 1's markup the end tag lands in the wrong place: the string has grown, so the length originally calculated for back reference 1 no longer applies. If I wrap back reference 1 first, I get the same problem, because back reference 2's offset is now stale. I've exhausted a bunch of complex attempts to compensate for this, and I'm sure there's some obvious, simple solution I'm overlooking. Any fresh perspectives on the best way to mark up nested groups while preserving the integrity of the returned text?
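One possible direction, offered only as a sketch: instead of replacing whole groups in place, treat every opening and closing tag as its own insertion point measured against the original, unmodified string, then rebuild the output in a single left-to-right pass. Nothing is ever measured against a string that has already grown. The helper name `highlight_nested` and the `g1`, `g2`, ... class names are hypothetical, `PREG_OFFSET_CAPTURE` is assumed to be the offset flag, and empty captures are skipped for simplicity.

```php
<?php
// Hypothetical helper: wrap possibly nested groups using tag insertion points.
function highlight_nested(string $pattern, string $text): string
{
    if (!preg_match($pattern, $text, $m, PREG_OFFSET_CAPTURE)) {
        return $text;
    }

    // Each entry: [position in original text, kind (0 = close, 1 = open), group, tag].
    $inserts = [];
    for ($i = 1; $i < count($m); $i++) {
        [$value, $offset] = $m[$i];
        if ($offset < 0 || $value === '') {
            continue; // unmatched or empty group: nothing to wrap
        }
        $inserts[] = [$offset, 1, $i, '<span class="g' . $i . '">'];
        $inserts[] = [$offset + strlen($value), 0, $i, '</span>'];
    }

    // Order by position. At the same position, closing tags come before opening
    // tags; outer groups open first and inner groups close first.
    usort($inserts, function (array $a, array $b): int {
        if ($a[0] !== $b[0]) {
            return $a[0] <=> $b[0];
        }
        if ($a[1] !== $b[1]) {
            return $a[1] <=> $b[1];
        }
        return $a[1] === 1 ? $a[2] <=> $b[2] : $b[2] <=> $a[2];
    });

    // Rebuild the string, copying original text between insertion points.
    $out = '';
    $cursor = 0;
    foreach ($inserts as $ins) {
        $out .= substr($text, $cursor, $ins[0] - $cursor) . $ins[3];
        $cursor = $ins[0];
    }

    return $out . substr($text, $cursor);
}

echo highlight_nested('/(.*?(?<=this )(fish).*)/', 'fish this fish fish'), "\n";
// <span class="g1">fish this <span class="g2">fish</span> fish</span>
```

Because group numbers follow the order of opening parentheses, an outer group always has a lower number than the groups nested inside it, which is what the tie-breaking in the sort relies on.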