OK, so the input is:
<html:products>
<html:prod id="prod1">
<html:referenceData>
<html:product>
<html:classificationType>PRD</html:classificationType>
<html:productType>PRD_XE</html:productType>
<html:productId>10004</html:productId>
<html:assignedDate>2018-07-23</html:assignedDate>
</html:product>
<html:book>
<html:name>REPAIRS</html:name>
<html:legalEntity>REP_XE</html:legalEntity>
<html:location>ED</html:location>
</html:book>
</html:referenceData>
</html:prod>
The above continues to prod2 etc.
The output of the data would be:
prod1; PRD; PRD_XE; 10004; 2018-07-23; REPAIRS; REP_XE; ED
Then a new line would start with:
prod2; etc
-kalos
That finally makes some sense.
Here is an example solution for putting that into a .csv formatted file.
You didn't give the specification for that html: namespace, though. (But as it's the only namespace used, it can be filtered out for data extraction.)
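As a rough illustration of the idea (not the solution originally posted), here is a minimal Python sketch that filters out the html: prefix, parses the result, and builds one semicolon-separated line per prod. The function names `products_to_rows` and `rows_to_text` are made up for this example. The sample uses `legalEntity` for both the opening and closing tag and adds a closing `</html:products>`; both are assumptions about the real data.

```python
import xml.etree.ElementTree as ET

# Sample input. Assumptions: legalEntity has matching tags and the
# root element is closed, neither of which is confirmed by the post.
SAMPLE = """<html:products>
<html:prod id="prod1">
<html:referenceData>
<html:product>
<html:classificationType>PRD</html:classificationType>
<html:productType>PRD_XE</html:productType>
<html:productId>10004</html:productId>
<html:assignedDate>2018-07-23</html:assignedDate>
</html:product>
<html:book>
<html:name>REPAIRS</html:name>
<html:legalEntity>REP_XE</html:legalEntity>
<html:location>ED</html:location>
</html:book>
</html:referenceData>
</html:prod>
</html:products>"""


def products_to_rows(xml_text):
    """Return one list of values per <prod>, starting with its id."""
    # The html: prefix is never declared, so a strict parser would
    # reject it. It is the only prefix used, so simply drop it.
    plain = xml_text.replace("<html:", "<").replace("</html:", "</")
    root = ET.fromstring(plain)
    rows = []
    for prod in root.iter("prod"):
        row = [prod.get("id", "")]
        # Walk the subtree in document order and keep leaf elements only.
        for elem in prod.iter():
            if len(elem) == 0 and elem.text and elem.text.strip():
                row.append(elem.text.strip())
        rows.append(row)
    return rows


def rows_to_text(rows):
    """Join values with '; ' and put each prod on its own line."""
    return "\n".join("; ".join(row) for row in rows)


if __name__ == "__main__":
    print(rows_to_text(products_to_rows(SAMPLE)))
```

Because every leaf element is collected in document order, a prod with a repeated node (say, two assignedDate values) would simply get extra columns. Values that themselves contain a semicolon would need quoting, which this sketch does not attempt.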
But, earlier in this thread you wrote this:
<CATALOG>
<PLANT>
<COMMON>Bloodroot</COMMON>
<BOTANICAL>Sanguinaria canadensis</BOTANICAL>
<ZONE>4</ZONE>
<LIGHT>Mostly Shady</LIGHT>
<PRICE>$2.44</PRICE>
<AVAILABILITY>031599</AVAILABILITY>
</PLANT>
-kalos
Well guys, the data is what I posted in my last post (Plants),
-kalos
It doesn't even look a teensy bit like this new data you've given just now. Are you playing us?
However, I want to convert the input data into a string, because I may need to match longer substrings than, e.g., "<html:classificationType>(.+?)</html:classificationType>"
-kalos
You are talking b.s. here.
Also, I think there may be duplicates for each prod, e.g. more than one assignedDate node with different values, so MatchAll would be best.
-kalos
This doesn't make sense without an example, and MatchAll is inappropriate here.
extract the appropriate regex
-kalos
PLEASE STOP TELLING US HOW TO SOLVE YOUR CHALLENGE! (This could have been bigger and in red, but I'm trying to stay nice, so I didn't.) If you want to learn regex, go get a book or an online course; there are plenty here and here. And stop feeding us XML.
When handling XML, regexes are usually not involved, unless the data elements contain 'complex', somewhat structured data that needs to be broken down.
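To make that concrete, here is a small Python sketch using the Plants data quoted earlier in the thread. The parser does all the structural work, and a regex appears only to break down the text inside one element. The closing `</CATALOG>` tag, the function names `plant_records` and `split_price`, and the assumption that PRICE always looks like `$2.44` are mine, not from the thread.

```python
import re
import xml.etree.ElementTree as ET

# The PLANT record from the thread; the closing </CATALOG> is added
# here so the fragment parses on its own.
CATALOG = """<CATALOG>
<PLANT>
<COMMON>Bloodroot</COMMON>
<BOTANICAL>Sanguinaria canadensis</BOTANICAL>
<ZONE>4</ZONE>
<LIGHT>Mostly Shady</LIGHT>
<PRICE>$2.44</PRICE>
<AVAILABILITY>031599</AVAILABILITY>
</PLANT>
</CATALOG>"""


def plant_records(xml_text):
    """Parse the catalog into one dict per PLANT, with no regex involved."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: child.text for child in plant}
        for plant in root.findall("PLANT")
    ]


def split_price(price_text):
    """Break a price such as '$2.44' into (dollars, cents).

    This is the one place a regex is reasonable: the structure sits
    inside the element's text, not in the XML markup itself.
    """
    match = re.fullmatch(r"\$(\d+)\.(\d{2})", price_text)
    if match is None:
        raise ValueError(f"unexpected price format: {price_text!r}")
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    for record in plant_records(CATALOG):
        print(record["COMMON"], split_price(record["PRICE"]))
```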
I have this assignment for you: read the entire thread from the OP to the end and formulate an answer to every unanswered question we asked you. (Just quote the question and type the answer below the quote.)
After all the answers are given, you can ask 1 new question. As 4wd already stated, and as you said yourself in other words, you aren't good at answering questions, but answering them is required for other people to help you solve your challenge/quest.