Advice on manipulating a flat file



I have been given the 'pleasure' of masking data in a text file that is 160MB in size with 1.25M records.  The format is (yes the delimiter is ¦¦):


Can anybody recommend the easiest way to mask/amend certain columns for all the records? For example, I'd like to replace the FIRST_NAME, LAST_NAME and address columns with dummy data (preferably with sequential numbering, but that's not essential).

So it would end up like this:

12343¦¦F_NAME1¦¦LAST_NAME1¦¦Address 1¦¦Address 1¦¦Address 1¦¦Address 1¦¦Address 1¦¦Address 1¦¦Address 1¦¦P_CODE1,25/05/1967¦¦MARRIED¦¦NULL
12343¦¦F_NAME2¦¦LAST_NAME2¦¦Address 2¦¦Address 2¦¦Address 2¦¦Address 2¦¦Address 2¦¦Address 2¦¦Address 2¦¦P_CODE2,02/08/1998¦¦SINGLE¦¦NULL

I tried using Excel 2010, but it can only load 1,048,576 rows.


CS-Calc claims to be able to work with 12 million rows, so it might be worth a try (if that's a possibility).

A simple regex script (Python, Perl, etc.) would make quick work of it.
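For anyone landing here later, a rough Python sketch of that idea follows. It uses a plain split on the ¦¦ delimiter instead of a regex, and streams the file one line at a time so the 160MB size is not a problem. The column positions (first name at index 1, last name at index 2, seven address columns at indexes 3 to 9) are guessed from the sample output above, and the function names and the UTF-8 default encoding are my own assumptions, so adjust them to the real layout. The postcode column is left untouched here.

```python
DELIM = "¦¦"

# Assumed column positions (0-based), inferred from the sample output in
# the question; change these to match the real file layout.
FIRST_NAME_COL = 1
LAST_NAME_COL = 2
ADDRESS_COLS = range(3, 10)  # seven address columns, indexes 3..9


def mask_record(line, n, delim=DELIM):
    """Return `line` with the name and address fields replaced by dummy
    values numbered with `n`. The original line ending is preserved."""
    body = line.rstrip("\r\n")
    ending = line[len(body):]
    fields = body.split(delim)

    replacements = {
        FIRST_NAME_COL: f"F_NAME{n}",
        LAST_NAME_COL: f"LAST_NAME{n}",
    }
    for col in ADDRESS_COLS:
        replacements[col] = f"Address {n}"

    # Skip any column a short or malformed record does not have.
    for col, value in replacements.items():
        if col < len(fields):
            fields[col] = value

    return delim.join(fields) + ending


def mask_file(src_path, dst_path, encoding="utf-8"):
    """Stream src_path to dst_path, masking every record.

    The encoding is an assumption; a file containing '¦' may well be
    cp1252 or latin-1 instead. Returns the number of records written.
    """
    count = 0
    # newline="" keeps the original line endings untouched.
    with open(src_path, encoding=encoding, newline="") as src, \
         open(dst_path, "w", encoding=encoding, newline="") as dst:
        for n, line in enumerate(src, start=1):
            dst.write(mask_record(line, n))
            count = n
    return count
```

Calling mask_file("input.txt", "masked.txt") (both file names are placeholders) writes a masked copy and leaves the original alone. If the file has a header row, it would need to be copied through unchanged before the loop starts.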

A colleague put together a SQL SSIS package to do this for me, so problem resolved. Thanks.

