

Text comparison tool with most robust similarity/moved block detection


irkregent:
I just ran across another diff tool that you might want to try:

True Human Design - Diffinity Official Homepage
http://truehumandesign.se/s_diffinity.php

David.P:
Hi all,

I was just about to report that I tested some online and off-line (*.exe) diff tools again. The installed tools, like the latest versions of WinMerge and Meld, stood no chance against the best online tools. Both WinMerge and Meld fail completely as soon as the texts become even slightly dissimilar.

I checked about a dozen online comparison tools, and the best ones (by far) still seem to be Vroniplag (great multi-color highlighting of identical text blocks, including moved block detection) and wikEd diff (highly configurable, also with moved block detection). Vroniplag even lets you copy the results, including all the color highlighting, into Word, but only if the comparison is done in Internet Explorer. In both Chrome and Firefox, this does not seem to work properly.
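For anyone who wants to play with the moved block idea, here is a rough Python sketch. It is not how Vroniplag or wikEd diff work internally (I don't know their algorithms); it is just a simple heuristic, and the function names `common_blocks` and `moved_blocks` are made up for this example. It finds runs of at least `min_words` identical words in both texts and flags a run as "moved" when it appears earlier in text B than a run that preceded it in text A.

```python
def common_blocks(a_words, b_words, min_words=3):
    """Greedily find runs of >= min_words words that occur in both word lists.

    Returns a list of (start_in_a, start_in_b, length) tuples, ordered by
    their position in a_words. Only the first occurrence in b_words is used.
    """
    index = {}
    for j in range(len(b_words) - min_words + 1):
        # setdefault keeps the first position of each word run in B
        index.setdefault(tuple(b_words[j:j + min_words]), j)

    blocks = []
    i = 0
    while i <= len(a_words) - min_words:
        j = index.get(tuple(a_words[i:i + min_words]))
        if j is None:
            i += 1
            continue
        # Extend the match word by word for as long as both texts agree
        length = min_words
        while (i + length < len(a_words) and j + length < len(b_words)
               and a_words[i + length] == b_words[j + length]):
            length += 1
        blocks.append((i, j, length))
        i += length
    return blocks


def moved_blocks(blocks):
    """Flag blocks that start earlier in B than a block seen before them in A."""
    moved = []
    furthest = -1
    for a_start, b_start, length in blocks:
        if b_start < furthest:
            moved.append((a_start, b_start, length))
        else:
            furthest = b_start
    return moved


if __name__ == "__main__":
    a = "one two three four five six seven eight".split()
    b = "five six seven eight one two three four".split()
    blocks = common_blocks(a, b)
    print("common:", blocks)
    print("moved: ", moved_blocks(blocks))
```

Which of two swapped blocks gets called "moved" is arbitrary here; a real tool would presumably make a smarter choice, and would tolerate small differences inside a block.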

Copyleaks.com seems to be even more sophisticated, since it at least tries to detect similarity when there are small differences such as OCR errors. Also, you can click on a text passage and it will highlight the corresponding passage in the other text. However, it a) is terribly slow compared to the other two mentioned above, b) has no multi-color coding for a quick overview, c) breaks longer texts into multiple pages that you have to click through, and d) doesn't let you quickly edit one or both of your texts and redo the comparison with the edited texts.

I have yet to discover a service or tool that uses some kind of fuzzy logic or artificial intelligence to spot and visibly highlight similarities in a "stepless" manner, for example by using some sort of heat map color highlighting of the similarities.
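To make the "stepless" idea a bit more concrete, here is a minimal Python sketch using the standard `difflib` module. It is only an illustration under simple assumptions, not what any of the tools above do: the sentence splitter is naive, the helper names `split_sentences`, `heat_scores` and `heat_bar` are invented for this example, and a text bar stands in for a real color heat map. Each sentence of text A gets a score between 0 and 1 for its best match in text B, so near-matches (e.g. with OCR errors) show up as "warm" rather than as no match at all.

```python
import difflib
import re


def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def heat_scores(text_a, text_b):
    """For each sentence of text_a, return (sentence, best similarity in [0, 1])."""
    candidates = split_sentences(text_b)
    scores = []
    for sentence in split_sentences(text_a):
        best = max(
            (difflib.SequenceMatcher(None, sentence.lower(), c.lower()).ratio()
             for c in candidates),
            default=0.0,  # text_b had no sentences at all
        )
        scores.append((sentence, best))
    return scores


def heat_bar(score, width=10):
    # Text stand-in for a color gradient: more '#' means more similar
    filled = round(score * width)
    return "#" * filled + "." * (width - filled)


if __name__ == "__main__":
    a = "The quick brown fox jumps over the lazy dog. Completely unrelated words here."
    b = "The qu1ck brown fox jumps ovcr the lazy dog. Something else entirely."
    for sentence, score in heat_scores(a, b):
        print(f"{heat_bar(score)} {score:.2f}  {sentence}")
```

Comparing every sentence against every other sentence is slow for long texts, so this is only suitable for experimenting with short samples.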

Cheers
David

Shades:
ExamDiff Pro and BCompare are good off-line comparison tools. Both can also keep track of changes in different folders, and both come with a decent built-in text editor. The latest versions of these packages should have no problem applying fuzzy logic to the content being compared.

Or did I misunderstand and you only consider on-line diffing solutions?

David.P:
Thanks, I see now that at least ExamDiff Pro says that it has some sort of fuzzy comparison.

In the meantime I found Similarity Texter, from the makers of Vroniplag. IMHO this tool blows everything else out of the water, at least when you're looking for text comparison with lots of moved blocks.

I will try ExamDiff Pro and BeyondCompare and report back.

Additionally, I am now looking for a tool that can find and highlight repeated blocks of text within a single file. So far, the only solution I found is Textanz:

[Screenshot: Textanz]

Textanz, however, highlights repeated blocks in only one color, which makes it very tedious to follow when there are lots of (possibly nested) repetitions.
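As a rough illustration of what I mean, here is a small Python sketch (not how Textanz works, as far as I know). It finds every run of `min_words` words that occurs more than once in a single text and gives each distinct run its own color name. The function names and the palette are made up for this example, and it does not merge overlapping runs into longer blocks or handle nesting.

```python
from collections import defaultdict


def repeated_blocks(text, min_words=4):
    """Map each repeated run of min_words words to its start positions (word index)."""
    words = text.lower().split()
    positions = defaultdict(list)
    for i in range(len(words) - min_words + 1):
        positions[tuple(words[i:i + min_words])].append(i)
    # Keep only the runs that occur at least twice
    return {" ".join(run): starts for run, starts in positions.items() if len(starts) > 1}


def assign_colors(blocks, palette=("red", "green", "blue", "orange", "purple")):
    """Give each distinct repeated run its own color, cycling through the palette."""
    return {block: palette[i % len(palette)] for i, block in enumerate(sorted(blocks))}


if __name__ == "__main__":
    sample = "alpha beta gamma delta one two alpha beta gamma delta three"
    blocks = repeated_blocks(sample)
    for block, color in assign_colors(blocks).items():
        print(f"{color:>7}: '{block}' at word positions {blocks[block]}")
```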

David.P:
Aaaarggghhhh, this is 2019 and BeyondCompare still can't wrap text to the width of the respective window  :down: :down: :down: :down: :down: :down: :down: :mad: :mad: :mad: :mad: :mad: :mad:

This unfortunately makes BeyondCompare completely UNUSABLE for text analysis (other than for program code, possibly).

The same applies to ExamDiff Pro. While ExamDiff Pro can (sort of) wrap text, the output is not practical for text analysis.

Summing up my experience: while practically all off-line tools like the ones mentioned above might be well suited for comparing program code versions, they are completely and utterly unusable for text analysis.

This is what the comparison output of a typical off-line program looks like:
[Screenshot: comparison output of a typical off-line program]

Below is what the comparison output of Similarity Texter looks like, and this is what I'm after:
[Screenshot: comparison output of Similarity Texter]
