Extract REGEX matches from multiple text files
Ath:
What could be the problem?
-kalos (September 12, 2018, 05:20 AM)
--- End quote ---
You haven't shared the file, so we'll never know, unless...
kalos:
What could be the problem?
-kalos (September 12, 2018, 05:20 AM)
--- End quote ---
You haven't shared the file, so we'll never know, unless...
-Ath (September 12, 2018, 10:10 AM)
--- End quote ---
I made it work like that:
gci FILEPATH | sls -AllMatches '<html:productType>(.+?)<\/html:productType>' | % { $_.Matches } | % { $_.Groups[1].Value } >> FILEPATH\out.txt
But I don't know how I made it work lol, can you spot the error? Also, I know I asked before, but can you point me to somewhere that explains % { $_.Matches } | % { $_.Groups[1].Value } ?
I think % means 'for every' and $_.Matches is the object variable of the matches, while $_.Groups[1].Value is the content value of the matches objects, right? But what is [1]?
UPDATE: it seems both work, but which would be better?
Thanks!
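For reference, the one-liner above can be unpacked roughly as follows. This is a minimal sketch, not anyone's posted solution: the sample string and the variable names ($line, $pattern, $matchInfo, $whole, $inner) are made up for illustration. gci, sls and % are the built-in aliases for Get-ChildItem, Select-String and ForEach-Object.

```powershell
# Hypothetical sample text standing in for one line of a real file.
$line    = '<html:productType>Product1</html:productType><html:productType>Product2</html:productType>'
$pattern = '<html:productType>(.+?)</html:productType>'

# Select-String (alias sls) emits one MatchInfo object per matching input line.
# With -AllMatches, its .Matches property holds every regex match on that line.
$matchInfo = $line | Select-String -AllMatches $pattern

# ForEach-Object (alias %) runs the script block once per pipeline item;
# $_ is the current item. The first % unrolls .Matches into single Match objects.
# Groups[0] is the whole match, tags included.
$whole = $matchInfo | ForEach-Object { $_.Matches } | ForEach-Object { $_.Groups[0].Value }

# Groups[1] is the first parenthesised capture, here the (.+?) part.
$inner = $matchInfo | ForEach-Object { $_.Matches } | ForEach-Object { $_.Groups[1].Value }

$whole
$inner
```

So [1] picks the first capture group of each match; [0] would return the full tag instead. Select-String is case-insensitive by default, so productType and producttype match the same text.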
kalos:
Guys, after I search for regex matches in a text, how can I group the matches to separate files, by same reference inside the regex match?
For example, for every regex match <html:producttype>(.+?)</html:producttype>, I want to output to a separate file all the matches where the (.+?) is the same.
Any idea? Also, please explain the strategy/pseudocode to see how that would work.
Ath:
I want to output to a separate file all the matches where the (.+?) is the same.
-kalos (September 17, 2018, 05:03 AM)
--- End quote ---
Run a second command on your previous output
--- Code: PowerShell ---
gci FILEPATH\out.txt|group|select Count,Name >FILEPATH\out-counted.txt
The code is also the pseudo code.
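A small sketch of what the group and select steps do, using an in-memory list instead of a file. The values and the names $values and $counted are hypothetical. Note that Group-Object works on whatever objects it receives: gci passes along the file object itself, so to group the lines of out.txt the content would probably need to be read first with Get-Content (alias gc).

```powershell
# Hypothetical stand-in for the lines of out.txt.
$values = 'Product1', 'Product1', 'Product2', 'Product1', 'Product3'

# Group-Object (alias group) buckets identical items together.
# Select-Object (alias select) keeps only the Count and Name of each bucket.
$counted = $values | Group-Object | Select-Object Count, Name

$counted
```

Each output row pairs a distinct value with the number of times it occurred.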
kalos:
gci FILEPATH\out.txt|group|select Count,Name >FILEPATH\out-counted.txt
-Ath (September 17, 2018, 12:59 PM)
--- End quote ---
No, you misunderstood. I don't want to count the matches. I want to group them and write each group to its own file.
For example, I will search for my regex:
<html:producttype>(.+?)</html:producttype>
The possible matches will be:
<html:producttype>Product1</html:producttype>
<html:producttype>Product1</html:producttype>
<html:producttype>Product2</html:producttype>
<html:producttype>Product1</html:producttype>
<html:producttype>Product3</html:producttype>
etc
I want the script to create one file for each group of matches where the (.+?) is the same, so:
1 file that contains:
<html:producttype>Product1</html:producttype>
<html:producttype>Product1</html:producttype>
<html:producttype>Product1</html:producttype>
1 file that contains:
<html:producttype>Product2</html:producttype>
and 1 file that contains:
<html:producttype>Product3</html:producttype>
Thanks!
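One possible strategy, sketched under assumptions: collect every match, group the matches by the captured (.+?) value, then write each group to a file named after that value. The sample lines, the output folder $outDir and the .txt naming are all hypothetical, and the sketch assumes the captured text is valid as a file name.

```powershell
# Hypothetical sample input; in practice these lines would come from the real files.
$lines = @(
    '<html:producttype>Product1</html:producttype>'
    '<html:producttype>Product1</html:producttype>'
    '<html:producttype>Product2</html:producttype>'
    '<html:producttype>Product1</html:producttype>'
    '<html:producttype>Product3</html:producttype>'
)
$pattern = '<html:producttype>(.+?)</html:producttype>'

# Assumed output folder (under the temp directory so the sketch is safe to run).
$outDir = Join-Path ([System.IO.Path]::GetTempPath()) 'producttype-groups'
New-Item -ItemType Directory -Path $outDir -Force | Out-Null

# Step 1: find all matches. Step 2: group them by the first capture group.
$groups = $lines |
    Select-String -AllMatches $pattern |
    ForEach-Object { $_.Matches } |
    Group-Object { $_.Groups[1].Value }

# Step 3: one file per group, containing the full matched text of each member.
foreach ($g in $groups) {
    $file = Join-Path $outDir ($g.Name + '.txt')
    $g.Group | ForEach-Object { $_.Value } | Set-Content -Path $file
}
```

With the sample input this should leave three files in $outDir, one per distinct product type. For real data, $lines could be replaced by the gci FILEPATH | sls pipeline used earlier in the thread.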