Extract REGEX matches from multiple text files

4wd:

--- Code: PowerShell ---
$outfile = 'K:\output.txt'
$regex = '<dsf:tsdfgd trsdfge=\"urn:x-ssdfgs-dfg-com:isdfgc/tg4r3e-i4d\" id=\"OsdfgsdfD\">'
$items = Get-ChildItem -Path *.txt       # *.txt , *.foo , *.whatever
for ($i = 0; $i -lt $items.Count; $i++) {
  Select-String -Path $items[$i] -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Value } >> $outfile
}

kalos:
Could you please show me how to do this in AHK? I am not familiar with that language, unless you can point me to explanations of these commands.

Ath:
The scripting language used is Microsoft's PowerShell, the intended successor to cmd and its relatively limited batch (.bat/.cmd) scripting language. It comes preinstalled with Win10, Win8.1 and Win8, and can easily be installed on older Windows versions.

Copy the script to a file with a .ps1 extension, change the 1st line to your desired results file, and in the 3rd line change *.txt to the extension of your data files. Then press the Start button, start typing powershell to find it, and run the script from the directory where your data files are.
Largish files are no issue for PowerShell.
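
The steps above can be tried out safely before pointing the script at real data. What follows is only a minimal sketch: the temporary folder, the sample files and the `id="..."` pattern are made up for the demo and are not from the posted script. It uses foreach and the file's full path instead of the index loop, which should behave the same way.

```powershell
# Hypothetical demo folder and sample data (assumptions, not from the posted script).
$demo = Join-Path ([System.IO.Path]::GetTempPath()) 'regex-demo'
New-Item -ItemType Directory -Path $demo -Force | Out-Null
Set-Content -Path (Join-Path $demo 'a.txt') -Value 'id="A1" and id="B2"'
Set-Content -Path (Join-Path $demo 'b.txt') -Value 'nothing here', 'id="C3"'

# The results file gets a .log extension so the *.txt filter does not pick it up.
$outfile = Join-Path $demo 'output.log'
$regex   = 'id="\w+"'                     # made-up pattern for the demo
if (Test-Path $outfile) { Remove-Item $outfile }

# Same pipeline as the posted script: every match in every file is appended to $outfile.
foreach ($file in Get-ChildItem -Path (Join-Path $demo '*.txt')) {
    Select-String -Path $file.FullName -Pattern $regex -AllMatches |
        ForEach-Object { $_.Matches } |
        ForEach-Object { $_.Value } >> $outfile
}

# Read the results back, one match per line.
$results = @(Get-Content -Path $outfile)
$results
```

If Windows refuses to run a saved .ps1 file because of the execution policy, starting it as powershell -ExecutionPolicy Bypass -File .\extract.ps1 is one common way to allow a single run (extract.ps1 is just a placeholder name here).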

kalos:
Very interesting!

Do you know a good site that explains the structure of the script you posted and the definition/usage of the commands along with examples?

Also, does this script load the whole text of the file into memory to perform its operations? That would be a problem for a 25GB file.

kalos:
Can you please explain this bit word by word:

$items = Get-ChildItem -Path *.txt       # *.txt , *.foo , *.whatever
for ($i = 0; $i -lt $items.Count; $i++) {
  Select-String -Path $items[$i] -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Value } >> $outfile
}

Also, I need to append to the output file several regex matches/returns, how do I do that?
Also, if I specify a regex match, how do I specify what I want to be returned from this match?
