Topic: Extract REGEX matches from multiple text files

kalos

Re: Extract REGEX matches from multiple text files
« Reply #75 on: August 20, 2018, 09:36 AM »
How can I do that?
That's why we asked more specific questions, but you never answered them.
So then I gave you the assignment of answering all our unanswered questions, but you haven't done that up until now, so basically, we are waiting (but not holding our breath) for your answers, before accepting new questions. :(


OK I start again:

That finally makes some sense. Here is an example solution for putting that into a .csv formatted file.

The problem with that is that I do not always know the node tree hierarchy and also it may change per record! That's why I cannot use the node tree hierarchy to extract a value, but I can use a guess of it, if that helps, eg //NODE1/*/NODE3/ ?

It doesn't even look a teensy bit like this new data you've given just now, are you playing us?

It looks the same to me??? Only the attribute names and values change. But again, the records do not contain the same attributes and in the same order. There can be some basic rules that all records follow, but unfortunately the data structure is not consistent, that's why I want to use regex, to include some fuzziness in matching!

Also, I still have not figured out how to make Powershell match . any character including newline... Any hint?
« Last Edit: August 20, 2018, 10:22 AM by kalos »

4wd

« Reply #76 on: August 20, 2018, 07:15 PM »
Also, I still have not figured out how to make Powershell match . any character including newline... Any hint?

Learn to use Powershell's built-in help system:
Code: PowerShell
Get-Help about_comparison_operators

Learn to use Google:
http://lmgtfy.com/?q...er+including+newline
http://lmgtfy.com/?q...egex+match+multiline
http://bfy.tw/JV48

kalos

« Reply #77 on: August 21, 2018, 04:02 AM »
 :up: but I cannot see in the list of operators the OR  :tellme:

4wd

« Reply #78 on: August 21, 2018, 07:49 AM »
but I cannot see in the list of operators the OR  :tellme:

Code: PowerShell
Get-Help about_Logical_Operators

kalos

« Reply #79 on: August 21, 2018, 08:31 AM »
but I cannot see in the list of operators the OR  :tellme:

Code: PowerShell
Get-Help about_Logical_Operators

I did that, but I get this:

PS H:\> Get-Help about_Logical_Operators
Get-Help : Get-Help could not find about_Logical_Operators in a help file in this session. To download updated help
topics type: "Update-Help". To get help online, search for the help topic in the TechNet library at
http://go.microsoft.com/fwlink/?LinkID=107116.
At line:1 char:1
+ Get-Help about_Logical_Operators
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ResourceUnavailable: (:) [Get-Help], HelpNotFoundException
    + FullyQualifiedErrorId : HelpNotFound,Microsoft.PowerShell.Commands.GetHelpCommand

kalos

« Reply #80 on: August 21, 2018, 08:43 AM »
Also, I still have not figured out how to make Powershell match . any character including newline... Any hint?

Learn to use Powershell's built-in help system:
Code: PowerShell
Get-Help about_comparison_operators

Learn to use Google:
http://lmgtfy.com/?q...er+including+newline
http://lmgtfy.com/?q...egex+match+multiline
http://bfy.tw/JV48

I tried that, but it is not clear whether (?s) goes:
at the beginning of the regex, after the '
at the beginning of the regex, before the '
or just before the .

Any idea?
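
For what it's worth, inline options such as (?s) are part of the regex itself, so they go inside the quotes, at the very start of the pattern. A minimal sketch, with $text as a made-up two-line sample rather than real data:

```powershell
# Hypothetical sample: one string that contains a line break.
$text = "<a>first`nsecond</a>"

# Without (?s), '.' stops at the newline, so this does not match.
$plain = $text -match '<a>.+</a>'

# With (?s) inside the quotes, at the start of the pattern, '.' also matches newlines.
$singleLine = $text -match '(?s)<a>.+</a>'

"Without (?s): $plain"
"With (?s):    $singleLine"
```

The option applies to the whole pattern from where it appears, which is why it is normally written first.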

wraith808

« Reply #81 on: August 21, 2018, 09:04 AM »
Code: PowerShell
Get-Help about_*

if you don't see it there,

Code: PowerShell
get-help update-help

kalos

« Reply #82 on: August 21, 2018, 09:42 AM »
Thanks, but it needs me to run it as admin, which I cannot.

Any idea why the below does not work?

(gc *.xml) -match '(?s)<\?xml\ version="1\.0"\ encoding="UTF-8"\?>.+?</dbts:PmryObj>'
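
One likely reason, assuming the files span several lines: Get-Content (gc) without -Raw returns an array of lines, and -match applied to an array tests each line separately, so (?s) never gets to see a newline. A sketch with a throwaway temp file; the file name and tag names are invented for illustration:

```powershell
# Hypothetical test file; the tag names are placeholders, not the real data.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'gc-raw-demo.xml'
Set-Content -Path $path -Value '<rec>', '  <val>1</val>', '</rec>'

$pattern = '(?s)<rec>.+?</rec>'

# Default: an array of three lines. -match filters the array line by line,
# and no single line contains both <rec> and </rec>.
$byLine = @((Get-Content -Path $path) -match $pattern)

# -Raw (PowerShell 3.0 or later): one string, so the pattern can span lines.
$asOne = (Get-Content -Path $path -Raw) -match $pattern

"Lines matched individually: $($byLine.Count)"
"Whole file matched: $asOne"

Remove-Item -Path $path
```

With *.xml the same applies across files: the lines of every file end up in one array, which is another reason to handle one file at a time.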

Ath

« Reply #83 on: August 21, 2018, 12:26 PM »
Any idea why the below does not work?
As usual you are asking half questions without *any* documentation. And you still haven't answered all previous questions, as requested, (and even have asked new questions in the half-baked 'answer') so in my book, you're not yet ready to ask new questions.

wraith808

« Reply #84 on: August 21, 2018, 05:16 PM »
Thanks, but it needs me to run it as admin, which I cannot.


TOPIC
    about_Logical_Operators

SHORT DESCRIPTION
    Describes the operators that connect statements in Windows PowerShell.


LONG DESCRIPTION
    The Windows PowerShell logical operators connect expressions and
    statements, allowing you to use a single expression to test for multiple
    conditions.


    For example, the following statement uses the and operator and
    the or operator to connect three conditional statements. The statement is
    true only when the value of $a is greater than the value of $b, and
    either $a or $b is less than 20.


        ($a -gt $b) -and (($a -lt 20) -or ($b -lt 20))


    Windows PowerShell supports the following logical operators.


        Operator  Description                      Example
        --------  ------------------------------   ------------------------
        -and      Logical and. TRUE only when      (1 -eq 1) -and (1 -eq 2)
                  both statements are TRUE.         False


        -or       Logical or. TRUE when either     (1 -eq 1) -or (1 -eq 2)
                  or both statements are TRUE.     True


        -xor      Logical exclusive or. TRUE       (1 -eq 1) -xor (2 -eq 2)
                  only when one of the statements  False
                  is TRUE and the other is FALSE.


        -not      Logical not. Negates the         -not (1 -eq 1)
                  statement that follows it.       False


        !         Logical not. Negates the         !(1 -eq 1)
                  statement that follows it.       False
                  (Same as -not)


    Note: The previous examples also use the equal to comparison
          operator (-eq). For more information, see about_Comparison_Operators.
          The examples also use the Boolean values of integers. The integer 0
          has a value of FALSE. All other integers have a value of TRUE.


    The syntax of the logical operators is as follows:


        <statement> {-AND | -OR | -XOR} <statement>
        {! | -NOT} <statement>


    Statements that use the logical operators return Boolean (TRUE or FALSE)
    values.


    The Windows PowerShell logical operators evaluate only the statements
    required to determine the truth value of the statement. If the left operand
    in a statement that contains the and operator is FALSE, the right operand
    is not evaluated. If the left operand in a statement that contains
    the or statement is TRUE, the right operand is not evaluated. As a result,
    you can use these statements in the same way that you would use
    the If statement.


SEE ALSO
    about_Operators
    Compare-Object
    about_Comparison_operators
    about_If

If you can't run an elevated command prompt... good luck in this endeavor.  You're going to need it.
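
The short-circuit behaviour described in the help text can be seen with a small sketch. Test-Right is a made-up helper that only records whether it was called, and the values of $a and $b are arbitrary:

```powershell
# Hypothetical helper: notes that it ran, then returns $true.
$script:calls = 0
function Test-Right {
    $script:calls++
    return $true
}

$a = 15
$b = 10

# The example from the help text: $a is greater than $b, and one of them is below 20.
$combined = ($a -gt $b) -and (($a -lt 20) -or ($b -lt 20))

# Left operand of -and is FALSE, so Test-Right is never called.
$andResult = ($a -lt $b) -and (Test-Right)

# Left operand of -or is TRUE, so Test-Right is skipped again.
$orResult = ($a -gt $b) -or (Test-Right)

"combined = $combined, and = $andResult, or = $orResult, calls = $script:calls"
```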

kalos

« Reply #85 on: August 22, 2018, 03:55 AM »
Any idea why the below does not work?
As usual you are asking half questions without *any* documentation. And you still haven't answered all previous questions, as requested, (and even have asked new questions in the half-baked 'answer') so in my book, you're not yet ready to ask new questions.

But I need ad-hoc answers, it's not about a specific thing I try to achieve, but mostly to learn

4wd

« Reply #86 on: August 22, 2018, 04:44 AM »
But I need ad-hoc answers, it's not about a specific thing I try to achieve, but mostly to learn

And yet every answer given here can be found on Google ... if you're going to learn anything, learn to ask the right questions.

Input
<?xml version="1.0" encoding="ISO8859-1" ?>
<html:products>
    <html:prod id="prod1">
      <html:referenceData>
        <html:product>
          <html:classificationType>PRD</html:classificationType>
          <html:productType>PRD_XE</html:productType>
          <html:productId>10004</html:productId>
          <html:assignedDate>2018-07-23</html:assignedDate>
        </html:product>
        <html:book>
          <html:name>REPAIRS</html:name>
          <html:Entity>REP_XE</html:legalEntity>
          <html:location>ED</html:location>
        </html:book>
      </html:referenceData>
   </html:prod>
    <html:prod id="prod2">
      <html:referenceData>
        <html:product>
          <html:classificationType>PRD2</html:classificationType>
          <html:productType>PRD_XE2</html:productType>
          <html:productId>10005</html:productId>
          <html:assignedDate>2018-12-23</html:assignedDate>
        </html:product>
        <html:book>
          <html:name>REPAIRS2</html:name>
          <html:Entity>REP_XE2</html:legalEntity>
          <html:location>ED2</html:location>
        </html:book>
      </html:referenceData>
   </html:prod>
    <html:prod id="prod3">
      <html:referenceData>
        <html:product>
          <html:classificationType>PRD3</html:classificationType>
          <html:productType>PRD_XE3</html:productType>
          <html:productId>10014</html:productId>
          <html:assignedDate>2013-07-23</html:assignedDate>
        </html:product>
        <html:book>
          <html:name>REPAIRS3</html:name>
          <html:Entity>REP_XE3</html:legalEntity>
          <html:location>ED3</html:location>
        </html:book>
      </html:referenceData>
   </html:prod>
    <html:prod id="prod4">
      <html:referenceData>
        <html:product>
          <html:classificationType>PRD4</html:classificationType>
          <html:productType>PRD_XE4</html:productType>
          <html:productId>10567</html:productId>
          <html:assignedDate>2010-07-23</html:assignedDate>
        </html:product>
        <html:book>
          <html:name>REPAIRS4</html:name>
          <html:Entity>REP_XE4</html:legalEntity>
          <html:location>ED4</html:location>
        </html:book>
      </html:referenceData>
   </html:prod>
    <html:prod id="prod5">
      <html:referenceData>
        <html:product>
          <html:classificationType>PRD5</html:classificationType>
          <html:productType>PRD_XE5</html:productType>
          <html:productId>10004890</html:productId>
          <html:assignedDate>2015-05-15</html:assignedDate>
        </html:product>
        <html:book>
          <html:name>REPAIRS5</html:name>
          <html:Entity>REP_XE5</html:legalEntity>
          <html:location>ED5</html:location>
        </html:book>
      </html:referenceData>
   </html:prod>
</html:products>


Code: PowerShell
gc "test.xml" -Raw | sls '(?smi)(<html:prod\s.+?/html:prod>)' -AllMatches | % {$_.Matches} | % { ((((($_.Value) -replace '(<[^>]+>|\s)', '; ') -replace '`r', '') -replace '`n', '') -replace '(;\s)(;\s)+', '$1').Trim('; ') }

Output
PRD; PRD_XE; 10004; 2018-07-23; REPAIRS; REP_XE; ED
PRD2; PRD_XE2; 10005; 2018-12-23; REPAIRS2; REP_XE2; ED2
PRD3; PRD_XE3; 10014; 2013-07-23; REPAIRS3; REP_XE3; ED3
PRD4; PRD_XE4; 10567; 2010-07-23; REPAIRS4; REP_XE4; ED4
PRD5; PRD_XE5; 10004890; 2015-05-15; REPAIRS5; REP_XE5; ED5


  • Stop trying to put it all on one line, (just because I do doesn't mean you should).
  • Stop using Powershell shortcuts until you understand what they are because they make the source harder to read, (and yes, I used them for a reason).
  • No, the output is not exactly what you wanted - not my problem since getting any coherent information is like extracting a 3 course meal from a lump of granite and takes just as long.
  • If it doesn't work on the files you have - again, not my problem - see above reason.

Your homework to increase your knowledge: Render the above one line into a multi-line Powershell script with no command shortcuts.
Non-optional extra: Tell us what the RegEx is doing.
Optional extra: Fix it so you get the prod value from the input data at the start of the output lines.
Optional extra: Make it process multiple files without using gc *.xml anywhere in it.

If it doesn't work on your data, you tell us why, don't ask us, we're not mind readers.

I'm done.
« Last Edit: August 22, 2018, 07:57 AM by 4wd, Reason: I made a boo boo ... where is it? »
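
For readers following along, one possible way to unroll the one-liner above with full cmdlet names instead of aliases. This is only a sketch: it runs on a trimmed two-record copy of the sample held in a here-string rather than on test.xml, the variable names are invented, and the two -replace steps for '`r' and '`n' are left out because \s already covers line breaks.

```powershell
# Trimmed, hypothetical copy of the sample input (two records, fewer fields).
$xml = @'
<html:products>
    <html:prod id="prod1">
        <html:productType>PRD_XE</html:productType>
        <html:productId>10004</html:productId>
    </html:prod>
    <html:prod id="prod2">
        <html:productType>PRD_XE2</html:productType>
        <html:productId>10005</html:productId>
    </html:prod>
</html:products>
'@

# (?s) lets '.' cross line breaks, (?i) ignores case; .+? stops at the first closing tag.
$recordPattern = '(?si)<html:prod\s.+?/html:prod>'

# The whole text is piped in as one string, so a match can span several lines.
$found = $xml | Select-String -Pattern $recordPattern -AllMatches

$lines = foreach ($record in $found.Matches) {
    # Turn every tag and every whitespace character into a '; ' separator.
    $flat = $record.Value -replace '(<[^>]+>|\s)', '; '
    # Collapse runs of separators into one, then trim the ends.
    $flat = $flat -replace '(;\s)(;\s)+', '$1'
    $flat.Trim('; ')
}

$lines
```

Splitting the work into named steps makes it easier to inspect $flat after each replacement when the real data behaves differently from the sample.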

tomos

« Reply #87 on: August 22, 2018, 04:52 AM »
.. I need ad-hoc answers, it's not about a specific thing I try to achieve, but mostly to learn
it's nice to see the enthusiasm for learning :up:

Regarding the responses you're getting here - I know nothing about the topic but can see you're being given a big opportunity to learn how to approach things, how to tackle a problem, how to learn.

Can you tell us:
why don't you take the experts' advice?
why don't you answer their questions?
Tom

4wd

« Reply #88 on: August 22, 2018, 06:30 AM »
Thanks, but it needs me to run it as admin, which I cannot.

And it's taken 4 pages to find that out - something that should have been stated earlier.

Any idea why the below does not work?

(gc *.xml) -match '(?s)<\?xml\ version="1\.0"\ encoding="UTF-8"\?>.+?</dbts:PmryObj>'

Sure.

Q: What's the input data?
A: We don't know.

Q: What's the command output?
A: We don't know.

Q: What version of Powershell are you using?
A: We don't know.

Q: What OS are you using, (including architecture)?
A: We don't know.

Q: What are the statistics of the input file, (eg. size)?
A: We don't know.

Q: Why the hell are you trying to process all files at once instead of one at a time?
A: We don't know.

etc, etc, etc, etc ... for 4 pages.

Idea: We don't know.
Why: See point 1 here.
« Last Edit: August 22, 2018, 06:40 AM by 4wd, Reason: Yeah OK, I was not really done .... but I am now. »
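
Most of the environment questions above can be answered from the PowerShell prompt itself, without admin rights. A rough sketch; the sample file is created on the fly purely as a stand-in for a real input file:

```powershell
# PowerShell version, from the automatic variable $PSVersionTable.
$psVersion = $PSVersionTable.PSVersion

# OS version string and architecture, via .NET (Is64BitOperatingSystem needs .NET 4 or later).
$os = [System.Environment]::OSVersion.VersionString
$is64 = [System.Environment]::Is64BitOperatingSystem

# Size of an input file in bytes; this temp file is a hypothetical stand-in.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'stats-demo.xml'
Set-Content -Path $path -Value '<root/>'
$size = (Get-Item -Path $path).Length
Remove-Item -Path $path

"PowerShell $psVersion on $os (64-bit OS: $is64), sample file: $size bytes"
```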

Ath

« Reply #89 on: August 22, 2018, 07:05 AM »
+1

But I need ad-hoc answers, it's not about a specific thing I try to achieve, but mostly to learn

In that case be as clear as you can be, by asking fully documented questions, meaning: (and I've said this before)
  • provide a complete file with input data, if you have to anonymize it, then only the data should be altered, not the structure
  • ask as explicitly and unambiguously as possible
  • give an example of the desired/expected output, based on the input data

This entire thread is full of examples of you not following these business-standard rules...

wraith808

« Reply #90 on: August 22, 2018, 08:40 AM »
Did you look up how to get the answers on Google for the ones related to your environment?  And several of those are things that you would know.  You know what your input data is.  You know what you would expect as an output.  You know this stuff or can get it.  Which leads people to believe that you're not trying to give us the information.  And so why waste time with incomplete information?  Learning difficulties is just an excuse for all of the questions that you posted.  None of those have to do with learning.

I literally highlighted what you typed for "What version of Powershell are you using" right clicked on it, searched in Bing, and came up with the answer.  There's no reason you couldn't do the same.

http://lmgtfy.com/?s...rshell+are+you+using

https://www.bing.com...rshell+are+you+using

4wd

« Reply #91 on: August 24, 2018, 06:37 AM »
Code: PowerShell
Clear-Host
$products = (Get-Content "test.xml" -Raw) -xxxxx '(....)^.*?(..........................)'
for ($i = 1; $i -lt $products.Count; $i += 2) {
  $products[$i] -xxxxx '(.........)(...)(...)' | Foreach { Write-Host ($Matches[0] + (((($products[$i] -replace '(<[^>]+>|\s)', '; ' ) -replace '`r', '') -replace '`n', '') -replace '(;\s)(;\s)+', '$1').TrimEnd('; ')) }
}

Output
prod1; PRD; PRD_XE; 10004; 2018-07-23; REPAIRS; REP_XE; ED
prod2; PRD2; PRD_XE2; 10005; 2018-12-23; REPAIRS2; REP_XE2; ED2
prod3; PRD3; PRD_XE3; 10014; 2013-07-23; REPAIRS3; REP_XE3; ED3
prod4; PRD4; PRD_XE4; 10567; 2010-07-23; REPAIRS4; REP_XE4; ED4
prod5; PRD5; PRD_XE5; 10004890; 2015-05-15; REPAIRS5; REP_XE5; ED5


-xxxxx = An operator

'(....)^.*?(..........................)' = A RegEx, number of dots represents number of characters in it.

'(.........)(...)(...)' = A RegEx, number of dots represents number of characters in it.


Mental exercise complete ...

Single Line Version
Code: PowerShell
gci *.xml | % {(gc $_ -Raw) -xxxxx '(....)^.*?(............................)' | % { if ($_ -xxxxx '(.........)(...)(...)') { ($matches[0] + (((($_ -replace '(<[^>]+>|\s)', '; ' ) -replace '`r', '') -replace '`n', '') -replace '(;\s)(;\s)+', '$1').TrimEnd('; ')) } }}

« Last Edit: August 29, 2018, 12:07 AM by 4wd »

4wd

« Reply #92 on: August 24, 2018, 06:58 AM »
Maybe we should just shut this down as it's getting a bit heated, and I think that everyone is done.
Agree. Moderator, please move this thread to underground.

Nah, just remove posts 94-97, 100-104 and then lock the thread - there is useful information in the various posts which won't be seen if moved to the Underground.

wraith808

« Reply #93 on: August 24, 2018, 03:14 PM »
Cleaned up the thread as requested.

4wd

« Reply #94 on: August 24, 2018, 03:32 PM »
Thanks Wraith.

Regarding the script above:
  • There's enough information within it to discern what the two missing operators are.
  • The first RegEx has been given within this thread.
  • The second RegEx will require a little lateral thinking, searching, and experimentation - just like the third did.

4wd

« Reply #95 on: August 25, 2018, 11:28 PM »
Code: PowerShell
Get-Help about_*

Thanks, didn't know there were so many  ;D

kalos

« Reply #96 on: September 11, 2018, 11:08 AM »
Guys, can anyone tell me the command that will find all the regex matches, isolate a specific part of each regex match and output all of them in a file?

I have the regex, but I don't know how to indicate a part in it.
The regex is this: "<html:productType>(.+?)</html:productType>"
I used the parentheses to isolate the part of the regex that I want to be output in the file.

What should the whole command be?

I found online and wrote this:
[regex]::match($s,"<html:productType>(.+?)</html:productType>").Groups[1].Value
But I don't know where you specify the source text or if it is correct. Any hint?

Thanks!

PS: It is really a nightmare to do some simple stuff in Powershell. There is very poor and incomplete documentation. Do you think there could be any other solution? Python maybe or anything else? I need it to work with big data though and if it has GUI it would be nice. Also, it needs to be free for commercial and any use.
« Last Edit: September 11, 2018, 11:37 AM by kalos »
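
On the question of where the source text goes: in [regex]::Match($s, ...) the first argument, $s, is the text to search, and Groups[1] is whatever the first pair of parentheses captured. [regex]::Matches (plural) returns every hit rather than only the first. A sketch, assuming the text is already in a string; the sample text and the output path are invented:

```powershell
# Hypothetical source text; in practice this could come from Get-Content -Raw.
$s = @'
<html:productType>PRD_XE</html:productType>
<html:productType>PRD_XE2</html:productType>
'@

$pattern = '<html:productType>(.+?)</html:productType>'

# Groups[0] is the whole match, Groups[1] is the part inside the parentheses.
$values = foreach ($m in [regex]::Matches($s, $pattern)) {
    $m.Groups[1].Value
}

# Write all captured values to a file, one per line (overwrites any existing file).
$outFile = Join-Path ([System.IO.Path]::GetTempPath()) 'productTypes.txt'
Set-Content -Path $outFile -Value $values

Get-Content -Path $outFile
```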

kalos

« Reply #97 on: September 12, 2018, 04:23 AM »
Anyone please?

Ath

« Reply #98 on: September 12, 2018, 05:07 AM »

kalos

« Reply #99 on: September 12, 2018, 05:20 AM »
Any hint?
We've been here before: http://www.donationc....msg422274#msg422274

Ah great thanks!

I tested it and there is an issue. I searched the file and there is only one instance of <html:productType>(.+?)</html:productType>
However, the output file contains the captured value (.+?) twice. What could be the problem?

Thanks!

gci C:\XML.xml | % { sls $_.Name -Pattern '<html:productType>(.+?)<\/html:productType>' -a | % { $_.Matches } | % { $_.Groups[1].Value } >> C:\out.txt }
« Last Edit: September 12, 2018, 08:46 AM by kalos »
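
Hard to say without seeing the file, but one thing worth ruling out: >> appends, so if C:\out.txt is left over from an earlier run, each new run adds the value again. A sketch of that effect with a temp file; the path and the value are made up:

```powershell
# Hypothetical output file and value, only to show append versus overwrite.
$out = Join-Path ([System.IO.Path]::GetTempPath()) 'append-demo.txt'
if (Test-Path -Path $out) { Remove-Item -Path $out }

# Two runs of the same command with >>: the value ends up in the file twice.
'PRD_XE' >> $out
'PRD_XE' >> $out
$appended = @(Get-Content -Path $out).Count

# Set-Content replaces the file, so repeated runs leave a single copy.
'PRD_XE' | Set-Content -Path $out
'PRD_XE' | Set-Content -Path $out
$overwritten = @(Get-Content -Path $out).Count

"After two runs with >>: $appended line(s); with Set-Content: $overwritten line(s)"

Remove-Item -Path $out
```

If the duplicate survives deleting the output file first, the cause lies elsewhere and the input file itself would need a closer look.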