topbanner_forum
  *

avatar image

Welcome, Guest. Please login or register.
Did you miss your activation email?

Login with username, password and session length
  • Saturday December 14, 2024, 12:43 am
  • Proudly celebrating 15+ years online.
  • Donate now to become a lifetime supporting member of the site and get a non-expiring license key for all of our programs.
  • donate

Author Topic: Comparison of the contents of 2 xls files to isolate missing records  (Read 13543 times)

beethoven

  • Supporting Member
  • Joined in 2006
  • **
  • Posts: 46
    • View Profile
    • Donate to Member
First, not sure if my request is suitable for here - please tell me if this is not of any general interest, too boring or too complicated.

My little feature, program or coding snack request is as follows:

I have two small xls files that contain output from different sources. While the underlying transactions are the same, I can't easily compare the files to see if any entry is only appearing in one file. I am looking for a simple procedure to compare the data in file 1 with the data in file 2 and highlight or flag or create an output file of any record that is not in both files.
Having studied the files, the only common information in both files is 1) an amount and 2) a time stamp.  Problem is that the amounts are not unique and the timestamps between file 1 and file 2 are not identical but vary by up to 1-2 minutes. So any comparison would have to be based on a) amount being identical and timestamp being within 2 minutes for the record in question.
I attach two files as samples ( some columns have been cleared or modified for privacy reasons) but these columns should be ignored anyway as they do not appear in both files. In my files two transactions are appearing only in the AOE13 File namely Order ID 40101 and 40103 - these are the files I would like to see flagged or similar. Sorry but since I can't upload xls files, I chose jpg for illustration.

nudone

  • Cody's Creator
  • Columnist
  • Joined in 2005
  • ***
  • Posts: 4,119
    • View Profile
    • Donate to Member
I've used Beyond Compare 3 to compare spreadsheets before. Available here (not sure what the free/trial version does): http://www.scootersoftware.com/

I appreciate this wouldn't be a solution without the spreadsheets being modified first, as I think Beyond Compare would still need the layout of the columns to be relatively similar.

Having said that. I've also used Excel's own features to compare two columns and colour the discrepancies. I can't remember what you have to do though. It was a method I found in Google so it must be a common problem/solution.

Again, not a one click solution, you'd have to copy and paste the columns around that you need to compare.

4wd

  • Supporting Member
  • Joined in 2006
  • **
  • Posts: 5,644
    • View Profile
    • Donate to Member
Perhaps this page may help: How to Compare Two Excel Spreadsheets & Highlight What Is Missing

Otherwise, WinMerge, (free), has a prediffer Excel plugin.

nudone

  • Cody's Creator
  • Columnist
  • Joined in 2005
  • ***
  • Posts: 4,119
    • View Profile
    • Donate to Member
Perhaps this page may help: How to Compare Two Excel Spreadsheets & Highlight What Is Missing

That looks like an even better Excel feature than the one I messed about with. Will bookmark that.  :up:

beethoven

  • Supporting Member
  • Joined in 2006
  • **
  • Posts: 46
    • View Profile
    • Donate to Member
Guys, thanks for the links and advice - I will study this more closely tomorrow.  I am not sure though if these programs will help me as it seems they pick up on minor and exact differences between files. My files are substantially different even though they relate to the same records. Only two data points can be directly compared and as I mentioned, for the time factor to be considered matching, the time has a variance of 1-2 minutes.

rjbull

  • Charter Member
  • Joined in 2005
  • ***
  • default avatar
  • Posts: 3,205
    • View Profile
    • Donate to Member
I suspect "loose" comparison may prove hard to find.  In addition to the free programs above, there's also Florencesoft™ DiffEngineX™ for comparing Excel worksheets.  It's payware, but has been on Bits du Jour a couple of times.  It would be worth seeking out the Bits comments.

beethoven

  • Supporting Member
  • Joined in 2006
  • **
  • Posts: 46
    • View Profile
    • Donate to Member
yeah, not being that tech savvy I probably underestimated the difficulty.  I thought this was more a long the lines of some formula like if value in column K is identical in sheet A & sheet B based on time stamp in column B and  column I is some code 9 (varying by up to 2 minutes) then format cell  in red -- obviously some cut an and paste is still necessary to position the relevant fields in a better way. Or if necessary a lot of information can be cut out leaving only the bare minimum of data fields.  The original idea or hope of course was to use the original files and to minimise prior handling.

4wd

  • Supporting Member
  • Joined in 2006
  • **
  • Posts: 5,644
    • View Profile
    • Donate to Member
The only easy way, (relative term), would be through VBA I think, (unless you want to export the sheets to another format, eg CSV, and use another program).

eg.
1) Grab amount entry from spreadsheet A, (X)
2) Look for a matching amount entry in spreadsheet B, (Y)
3) If X = Y then check to see if the time between them is less than 2 minutes
4) If it is, mark both entries as OK and go back to step 1.
5) If time doesn't match, go back to step 2.
6) If at the end of all values in spreadsheet B you haven't found a suitable match then flag the entry in spreadsheet A and go to step 1 until you've cycled through all entries in spreadsheet A.

The problem with that is if multiple entries occur within the space of a couple of minutes with the same amount.

It would probably make things a lot easier if you could synchronise the time between the two systems, eg. install a NTP server on one and have the other set its time from it.

AndyM

  • Charter Member
  • Joined in 2006
  • ***
  • Posts: 616
    • View Profile
    • Donate to Member
What 4wd said, you could do this with VBA.

Ask over in the Excel forum at VBAExpress:

http://www.vbaexpress.com/forum/

Target

  • Honorary Member
  • Joined in 2006
  • **
  • Posts: 1,832
    • View Profile
    • Donate to Member
just to confirm my understanding, the points of comparison are as follows

by timestamp (with a tolerance of +/- 2 minutes), then if you get a 'match' by value

or look for a transaction that is within a corresponding envelope, then see if the transaction value matches...

and do you want to see the result in a separate worksheet/book, or within the individual sheets?

And you should be able to attach some sample files if you archive (zip?) them first (that would make it easier to try and come up with an answer)