Extract REGEX matches from multiple text files

4wd:
Also, I want to run several regex matches sequentially, each with its own references, one by one, and append each result to the output file.
-kalos (August 10, 2018, 09:43 AM)
--- End quote ---
You have to make clear whether the results from the separate queries have any positional relation to each other, or can the queries be run one after the other and the output of the second, third, etc., runs appended to the first regex run?
-Ath (August 10, 2018, 01:37 PM)
--- End quote ---

Or to put it another way:

* STOP trying to describe what you want to happen, (because you're not very good at it).
* Provide sufficient sample input data of any kind whether it's real or made up (as long as it represents the real format).
* Provide an example of the output using the input data that shows what you're trying to achieve.
* PROVIDE relevant feedback, something you consistently fail to do, (eg. 1, 2, 3, 4, etc, etc, etc).
DO NOT give us separate examples of two disparate data types, (e.g. XML <> and CDATA []), without showing how they relate to one another within the same file.

Until you can do that we're just running around in circles and it's pointless continuing this thread, as such I'm out of here until the above happens.

kalos:
OK, so the input is:
<html:products>
    <html:prod id="prod1">
      <html:referenceData>
        <html:product>
          <html:classificationType>PRD</html:classificationType>
          <html:productType>PRD_XE</html:productType>
          <html:productId>10004</html:productId>
          <html:assignedDate>2018-07-23</html:assignedDate>
        </html:product>
        <html:book>
          <html:name>REPAIRS</html:name>
          <html:legalEntity>REP_XE</html:legalEntity>
          <html:location>ED</html:location>
        </html:book>
      </html:referenceData>
   </html:prod>

The above continues to prod2 etc.

The output of the data would be:
prod1; PRD; PRD_XE; 10004; 2018-07-23; REPAIRS; REP_XE; ED
Then a new line would start with:
prod2; etc


However, I want to convert the input data into a string, because I may need to match longer substrings than e.g. "<html:classificationType>(.+?)</html:classificationType>"
Also, I think there may be duplicates for each prod, e.g. more than one assignedDate node with different values, so MatchAll would be best.
thanks!
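
For illustration only, here is a rough Python sketch of the input-to-output mapping described above. It is not the script used earlier in this thread. It assumes the whole file is read into one string, that each record sits between `<html:prod id="...">` and `</html:prod>`, and that the entity tag is `legalEntity` on both sides. The names `SAMPLE`, `FIELDS`, `PROD_RE` and `extract_rows` are made up for the example. Using `findall` on each field means a repeated node, such as a second assignedDate, simply adds another value to the line.

```python
import re

# Made-up sample in the shape posted above. The closing </html:products> and
# the <html:legalEntity> opening tag are assumptions added to make it complete.
SAMPLE = """<html:products>
    <html:prod id="prod1">
      <html:referenceData>
        <html:product>
          <html:classificationType>PRD</html:classificationType>
          <html:productType>PRD_XE</html:productType>
          <html:productId>10004</html:productId>
          <html:assignedDate>2018-07-23</html:assignedDate>
        </html:product>
        <html:book>
          <html:name>REPAIRS</html:name>
          <html:legalEntity>REP_XE</html:legalEntity>
          <html:location>ED</html:location>
        </html:book>
      </html:referenceData>
   </html:prod>
</html:products>"""

# Fields to pull from each prod block, in output order (names from the post).
FIELDS = [
    "classificationType",
    "productType",
    "productId",
    "assignedDate",
    "name",
    "legalEntity",
    "location",
]

# One prod block: capture the id attribute and everything up to </html:prod>.
# re.DOTALL lets "." cross line breaks, so the text is treated as one string.
PROD_RE = re.compile(r'<html:prod id="([^"]+)">(.*?)</html:prod>', re.DOTALL)


def extract_rows(text):
    """Return one 'id; value; value; ...' line per prod block."""
    rows = []
    for prod_id, body in PROD_RE.findall(text):
        values = [prod_id]
        for field in FIELDS:
            pattern = rf"<html:{field}>(.*?)</html:{field}>"
            # findall keeps every occurrence, so duplicate nodes are not lost.
            values.extend(v.strip() for v in re.findall(pattern, body, re.DOTALL))
        rows.append("; ".join(values))
    return rows


if __name__ == "__main__":
    for row in extract_rows(SAMPLE):
        print(row)
```

Run against the sample, this prints `prod1; PRD; PRD_XE; 10004; 2018-07-23; REPAIRS; REP_XE; ED`. For a real file, the text could be read with `open(path, encoding="utf-8").read()` and the rows written to an output file opened in append mode, so that later runs add to the earlier results.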

wraith808:
So it's always xml and it's always that schema?  And you're just worried about duplicates?

kalos:
So it's always xml and it's always that schema?  And you're just worried about duplicates?
-wraith808 (August 13, 2018, 07:18 AM)
--- End quote ---


Yeah, for now it looks like that.

wraith808:
And one last question... when you say duplicate, you mean the whole record is duplicated?  Or just some of the fields, i.e. productID or prod id?
