Extract REGEX matches from multiple text files
Ath:
If you are searching for a regex within a regex, 'You Are Doing It Wrong' (TM).
Your initial requirement was to find and extract content using a regex, but now you need parts of that regex to be split out? That can be done with a single regex, by grouping the parts you need to split out.
And for this whole exercise to make any sense, where is the variable part of the data to find? When searching for explicit text(s), a count would suffice...
Please provide a complete example, with actual data (not an entire file!), clearly marking the stuff you need to extract, of what you want to achieve, not how you think it could/should be solved.
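To illustrate what "grouping" means here, a minimal Python sketch follows. The sample line and the field names (item, colour, cost) are made up purely for illustration, since the real data has not been posted; the point is only that one pattern can return several parts separately.

```python
import re

# Hypothetical sample line, invented for illustration only.
text = "item=widget; colour=red; cost=1.23"

# One pattern, three capture groups: each (...) is returned as its own value,
# so there is no need to run a second regex on the result of the first.
pattern = re.compile(r"item=(\w+); colour=(\w+); cost=([\d.]+)")

match = pattern.search(text)
if match:
    item, colour, cost = match.groups()
    print(item, colour, cost)
```

The same idea should carry over to any regex engine that supports capture groups; only the syntax for reading the groups back differs.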
wraith808:
[Quoting Ath's post above, August 05, 2018, 06:20 AM]
Sounds like the same problem that I face at work. Except I get paid to deal with the frustration.
kalos:
[Quoting Ath's post above, August 05, 2018, 06:20 AM]
Indeed, I now realised it!
I will try to provide an example in a bit.
kalos:
The format of the data is like this (the only difference is that the real data is multi-line rather than single-line as in this example):
prod1
blah
specs=a
blah
price=b
blah
prod2
blah
specs=c
blah
price=d
blah
So I want the output to be a CSV like:
prod1; a; b
prod2; c; d
So I was thinking of first using a regex to capture, in a variable, the first area of the text that belongs to a prod, which here is the first six lines (I cannot use the number of lines to distinguish the records, as it varies).
Then it would extract a and b from that variable by matching the specs and price regex 'within' prod1 variable, so that I can distinguish them from prod2.
And then loop to complete the conversion.
Hope this helps?
So my understanding is that I cannot search for a regex that will match "specs=.+?" or something because I won't be able to distinguish this for prod1, prod2, etc.
At the same time, I cannot match the regex "prod1.+specs=.+?" because I don't know the exact text for prod1 (it's an xml attribute that is called prodID, but the value can be anything).
Do you have any idea on how to process this?
Ath:
This example is totally useless. >:(
Please extract 2 or 3 of those (complete) product records from your actual data file. Optionally replace confidential stuff (data, prices) with aaaaa, bbbbb, 1.23, etc., but leave the structure exactly as it is!
Then post that here.
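For what it's worth, if the real file did look like the simplified layout above, a single regex with three groups would probably be enough. Below is a rough Python sketch under one big assumption that the real data may well break: every record starts on a line beginning with the literal text "prod". The names `sample`, `record` and `rows` are just illustrative.

```python
import re

# The simplified layout as posted in this thread, NOT the real file.
sample = """prod1
blah
specs=a
blah
price=b
blah
prod2
blah
specs=c
blah
price=d
blah
"""

# ASSUMPTION: a record starts on a line beginning with "prod".
# MULTILINE makes ^ and $ match at line boundaries; DOTALL lets the lazy
# .*? skip over the unrelated lines in between.
record = re.compile(
    r"^(prod\S*)$"         # group 1: the product line
    r".*?^specs=(.*?)$"    # group 2: value after "specs="
    r".*?^price=(.*?)$",   # group 3: value after "price="
    re.MULTILINE | re.DOTALL,
)

# One output row per record, using the "; " separator from the example.
rows = ["; ".join(m.groups()) for m in record.finditer(sample)]
print("\n".join(rows))
```

One caveat: if a record is missing its specs or price line, the lazy skip will run on into the next record and pair up the wrong values, which is one more reason the real structure matters.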