Extract REGEX matches from multiple text files

Ath:
What could be the problem?
-kalos (September 12, 2018, 05:20 AM)
--- End quote ---
You haven't shared the file, so we'll never know, unless...

kalos:
What could be the problem?
-kalos (September 12, 2018, 05:20 AM)
--- End quote ---
You haven't shared the file, so we'll never know, unless...
-Ath (September 12, 2018, 10:10 AM)
--- End quote ---

I made it work like that:
gci FILEPATH | sls -AllMatches '<html:productType>(.+?)<\/html:productType>' | % { $_.Matches } | % { $_.Groups[1].Value } >> FILEPATH\out.txt

But I don't know how I made it work lol, can you spot the error? Also, I know I asked before, but can you point me to somewhere that explains % { $_.Matches } | % { $_.Groups[1].Value } ?
I think % means 'for every' and $_.Matches is the object variable of the matches, while $_.Groups[1].Value is the content value of the matches objects, right? But what is [1]?

UPDATE: it seems both work, but which would be better?
Thanks!
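
For reference, here is a minimal sketch of what each stage of that pipeline produces. It assumes a made-up sample string ($text is hypothetical, not data from this thread) and spells out the aliases: sls is Select-String and % is ForEach-Object.

```powershell
# Hypothetical sample input; only the tag name comes from the thread.
$text = '<html:productType>Product1</html:productType><html:productType>Product2</html:productType>'

# Select-String emits one MatchInfo object per matching input line.
# With -AllMatches, its Matches property holds every regex match on that line.
$matchInfo = $text | Select-String -AllMatches '<html:productType>(.+?)</html:productType>'

# ForEach-Object (alias %) runs the script block once per pipeline item;
# $_ is the current item.
$rows = $matchInfo | ForEach-Object { $_.Matches } | ForEach-Object {
    # Groups[0] is the whole match, tags included.
    # Groups[1] is the text captured by the first (...) in the pattern.
    [pscustomobject]@{
        Whole   = $_.Groups[0].Value
        Capture = $_.Groups[1].Value
    }
}

$rows | Format-List
```

So [1] is an index into the match's capture groups: index 0 is always the full match, and index 1 is the first parenthesized part of the pattern, here the (.+?) between the tags.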

kalos:
Guys, after I search for regex matches in a text, how can I split the matches into separate files, grouped by the value captured inside the regex match?

For example, for every regex match <html:producttype>(.+?)</html:producttype>, I want to output to a separate file all the matches where the (.+?) is the same.

Any idea? Also, please explain the strategy/pseudocode to see how that would work.

Ath:
I want to output to a separate file all the matches where the (.+?) is the same.
-kalos (September 17, 2018, 05:03 AM)
--- End quote ---
Run a second command on your previous output

--- Code: PowerShell ---
gc FILEPATH\out.txt|group|select Count,Name >FILEPATH\out-counted.txt

The code is also the pseudo code.

kalos:
gc FILEPATH\out.txt|group|select Count,Name >FILEPATH\out-counted.txt
-Ath (September 17, 2018, 12:59 PM)
--- End quote ---

No, you misunderstood. I don't want to count the matches. I want to group them and write each group to its own file.

For example, I will search for my regex:
<html:producttype>(.+?)</html:producttype>
The possible matches will be:
<html:producttype>Product1</html:producttype>
<html:producttype>Product1</html:producttype>
<html:producttype>Product2</html:producttype>
<html:producttype>Product1</html:producttype>
<html:producttype>Product3</html:producttype>
etc

I want the script to create one file per distinct value of (.+?), each containing all the matches with that value, so:
1 file that contains:
<html:producttype>Product1</html:producttype>
<html:producttype>Product1</html:producttype>
<html:producttype>Product1</html:producttype>
1 file that contains:
<html:producttype>Product2</html:producttype>
and 1 file that contains:
<html:producttype>Product3</html:producttype>

Thanks!
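
One possible strategy, sketched under assumptions: collect every match, group the match objects by the captured value (Groups[1]), then write the full text of each group's matches to a file named after that value. The function name Split-MatchesByCapture and the parameters $InputDir, $OutputDir and $Pattern are hypothetical names chosen for illustration.

```powershell
function Split-MatchesByCapture {
    param(
        [string]$InputDir,    # folder with the text files to search (hypothetical name)
        [string]$OutputDir,   # folder that receives one file per distinct value
        [string]$Pattern = '<html:producttype>(.+?)</html:producttype>'
    )

    # Create the output folder if it does not exist yet.
    # Keep it outside $InputDir so a rerun does not scan its own output.
    New-Item -ItemType Directory -Path $OutputDir -Force | Out-Null

    Get-ChildItem -Path $InputDir -File |
        Select-String -AllMatches -Pattern $Pattern |
        ForEach-Object { $_.Matches } |
        # Group the match objects by the text captured by (.+?).
        # Group-Object compares names case-insensitively by default.
        Group-Object { $_.Groups[1].Value } |
        ForEach-Object {
            # Replace characters that are not allowed in Windows file names.
            # Two values that differ only in such characters would share a file name.
            $safeName = $_.Name -replace '[\\/:*?"<>|]', '_'
            $outFile  = Join-Path $OutputDir "$safeName.txt"

            # $_.Group holds every match in this group; Value is the whole match.
            $_.Group | ForEach-Object { $_.Value } | Set-Content -Path $outFile
        }
}
```

It could be called like this, with placeholder paths:

```powershell
Split-MatchesByCapture -InputDir 'C:\data\in' -OutputDir 'C:\data\out'
```

With the example matches above, this would produce Product1.txt with three lines, Product2.txt with one line and Product3.txt with one line.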
