Topic: Regexp help (Read 7329 times)

Josh

  • Charter Honorary Member
  • Joined in 2005
  • ***
  • Points: 45
  • Posts: 3,411
Regexp help
« on: November 04, 2014, 07:00 PM »
Alright, I am trying to create a regexp that does the following:

Select an IP address from a line, but only if the line starts with a pre-defined text string. There will be text between that string and the IP address. I want to ignore everything on the line except the IP address. I know this is doable, but it is driving me nuts.

Any ideas?

Renegade

  • Charter Member
  • Joined in 2005
  • ***
  • Posts: 13,291
  • Tell me something you don't know...
Re: Regexp help
« Reply #1 on: November 04, 2014, 07:23 PM »
^definedstring.+([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).+$

Or perhaps:

^definedstring.+([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+).+$

Either one should do it, or at least give you enough to test and tweak. You'll end up with one capture group containing the IP address.
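
One thing to watch for when testing these: the `.+` in front of the capture group is greedy, so it can swallow the leading digits of the first octet. A minimal Python sketch of that behaviour follows. The sample line and the lazy variant are assumptions for illustration, not something from the posts above.

```python
import re

# Pattern from the post above, unchanged.
GREEDY = re.compile(
    r"^definedstring.+([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).+$"
)

# Hypothetical tweak: lazy ".+?" before the group, and ".*" after it so the
# address may also sit at the very end of the line.
LAZY = re.compile(
    r"^definedstring.+?([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*$"
)

line = "definedstring some text [192.168.10.25]"  # made-up sample line

greedy_ip = GREEDY.search(line).group(1)
lazy_ip = LAZY.search(line).group(1)

print(greedy_ip)  # 2.168.10.25
print(lazy_ip)    # 192.168.10.25
```

The greedy version still matches the line, it just hands back a truncated first octet, which is easy to overlook in a quick test.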

Slow Down Music - Where I commit thought crimes...

Freedom is the right to be wrong, not the right to do wrong. - John Diefenbaker

Renegade

Re: Regexp help
« Reply #2 on: November 04, 2014, 07:40 PM »
This is better:

^definedstring.+([1-9][0-9]{0,2}\.[1-9][0-9]{0,2}\.[1-9][0-9]{0,2}\.[1-9][0-9]{0,2}).+$
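
A caveat worth checking before relying on this version: `[1-9][0-9]{0,2}` requires every octet to start with a non-zero digit, so an address containing a `0` octet is not matched at all. A small Python check, using made-up sample lines:

```python
import re

# Pattern from the post above, split across two string literals for length.
PATTERN = re.compile(
    r"^definedstring.+([1-9][0-9]{0,2}\.[1-9][0-9]{0,2}"
    r"\.[1-9][0-9]{0,2}\.[1-9][0-9]{0,2}).+$"
)

with_zero_octets = "definedstring host [10.0.0.1] up"       # hypothetical line
without_zero_octets = "definedstring host [172.16.5.9] up"  # hypothetical line

zero_match = PATTERN.search(with_zero_octets)
plain_match = PATTERN.search(without_zero_octets)

print(zero_match)               # None
print(plain_match is not None)  # True
```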


Josh

Re: Regexp help
« Reply #3 on: November 04, 2014, 07:58 PM »
So... here is a single line of what I am trying to filter:

SOME DATE TIME GROUP HERE :: CtrlChan Some more text here [192.192.192.192]

I want to extract the IP address from any line that contains CtrlChan.

Josh

Re: Regexp help
« Reply #4 on: November 04, 2014, 08:01 PM »
Sample line

2014-10-10 19:28:01.110::ErrorMsg::CtrlChan Error_Decode [192.192.192.192]

I want to extract the IP address if, and only if, the line contains CtrlChan. The files are very large (200+ MB each) and contain a large number of different entries.
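
For files of that size it may be simplest to stream the log one line at a time and only run the regex on lines that contain the marker. A rough Python sketch follows. The function name `extract_ips` and the assumption that the address always sits inside square brackets are guesses based only on the sample line above.

```python
import re
from typing import Iterable, Iterator

# Loose dotted quad inside square brackets, as in the sample line.
BRACKETED_IP = re.compile(r"\[([0-9]{1,3}(?:\.[0-9]{1,3}){3})\]")


def extract_ips(lines: Iterable[str], marker: str = "CtrlChan") -> Iterator[str]:
    """Yield the bracketed address from every line that contains `marker`."""
    for line in lines:
        if marker not in line:  # cheap substring test before the regex
            continue
        match = BRACKETED_IP.search(line)
        if match:
            yield match.group(1)


sample = [
    "2014-10-10 19:28:01.110::ErrorMsg::CtrlChan Error_Decode [192.192.192.192]",
    "2014-10-10 19:28:02.000::Info::DataChan Started [10.1.1.1]",
]
found = list(extract_ips(sample))
print(found)  # ['192.192.192.192']
```

An open file object can be passed in place of `sample`, since iterating over a file yields one line at a time and avoids loading the whole 200+ MB into memory.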

Renegade

Re: Regexp help
« Reply #5 on: November 04, 2014, 08:20 PM »
That helps.

This works, but you must specify that the search is multiline in order for the ^ and $ to operate properly:

^.+CtrlChan .+[^0-9]([1-9][0-9]{0,2}\.[1-9][0-9]{0,2}\.[1-9][0-9]{0,2}\.[1-9][0-9]{0,2}).+$

Here's the result in Expresso:

[Attached image: Screenshot - 2014_11_05 , 1_19_43 PM.png]
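
To illustrate the multiline point outside Expresso, here is a short Python sketch that runs the same pattern with and without the multiline flag (`re.MULTILINE` in Python). The three log lines are made up around the sample line from the thread.

```python
import re

# Pattern from the post above, split across two string literals for length.
PATTERN = (
    r"^.+CtrlChan .+[^0-9]([1-9][0-9]{0,2}\.[1-9][0-9]{0,2}"
    r"\.[1-9][0-9]{0,2}\.[1-9][0-9]{0,2}).+$"
)

text = (
    "2014-10-10 19:28:00.000::Info::DataChan Started [10.1.1.1]\n"
    "2014-10-10 19:28:01.110::ErrorMsg::CtrlChan Error_Decode [192.192.192.192]\n"
    "2014-10-10 19:28:02.000::Info::Other nothing here\n"
)

# Without the flag, ^ only matches at the very start of the whole text.
single_line_mode = re.findall(PATTERN, text)
# With the flag, ^ and $ match at the start and end of every line.
multi_line_mode = re.findall(PATTERN, text, flags=re.MULTILINE)

print(single_line_mode)  # []
print(multi_line_mode)   # ['192.192.192.192']
```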


Renegade

Re: Regexp help
« Reply #6 on: November 04, 2014, 08:25 PM »
And here's the analysis to help explain what's happening in there:

[Attached image: Screenshot - 2014_11_05 , 1_24_04 PM.png]


4wd

  • Supporting Member
  • Joined in 2006
  • **
  • Posts: 5,644
Re: Regexp help
« Reply #7 on: November 04, 2014, 08:33 PM »
[Attached image: 2014-11-05_13-31-34.png]

With a little help from here.

NOTE: Unlike @Ren's, it doesn't check whether the IP is in a valid range, but on the other hand it won't miss an entry just because its IP is invalid.
« Last Edit: November 04, 2014, 09:06 PM by 4wd, Reason: Cos I'm stoopid »

Josh

Re: Regexp help
« Reply #8 on: November 04, 2014, 08:49 PM »
Renegade and 4wd, I owe you both big time! You both just helped me save a large amount of time by not having to search these files manually. Thanks!

4wd

Re: Regexp help
« Reply #9 on: November 04, 2014, 08:58 PM »
Actually, @Ren's does pick up invalid IPs ... Sorry  :-[

Honestly @Ren, you should have limited the number range it'll match on then I wouldn't have been wrong :P

Renegade

Re: Regexp help
« Reply #10 on: November 04, 2014, 09:25 PM »
Yeah... generally you can be pretty sloppy when checking for IP addresses, though, as they're unlikely to be logged in a malformed way, and few other strings will match by mistake. One that can is a version number with major, minor, revision, and build parts, e.g. 10.23.34.987.
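
If false positives like that version string ever matter, one option is to keep the sloppy regex and validate each candidate afterwards. A minimal Python sketch using the standard `ipaddress` module follows. The helper name `is_valid_ipv4` and the sample text are assumptions for illustration.

```python
import ipaddress
import re

# Deliberately loose: any four dot-separated groups of 1-3 digits.
LOOSE_QUAD = re.compile(r"[0-9]{1,3}(?:\.[0-9]{1,3}){3}")


def is_valid_ipv4(candidate: str) -> bool:
    """Return True only if the standard library accepts it as IPv4."""
    try:
        ipaddress.IPv4Address(candidate)
    except ValueError:
        return False
    return True


text = "build 10.23.34.987 connected from 192.192.192.192"
candidates = LOOSE_QUAD.findall(text)
valid = [c for c in candidates if is_valid_ipv4(c)]

print(candidates)  # ['10.23.34.987', '192.192.192.192']
print(valid)       # ['192.192.192.192']
```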

Here are some URLs for regex to actually do IP addresses properly:

http://www.shellhack...in-a-File-Using-Grep

http://www.mkyong.co...-regular-expression/

http://stackoverflow...-strings-using-regex

They're kind of ugly with a lot of pipes.
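
For reference, a strict pattern of that general shape looks something like the Python sketch below. It is a commonly seen form, not necessarily the exact regex from any of the linked pages, and the names `OCTET` and `STRICT_IPV4` are made up here.

```python
import re

# One octet in the range 0-255: 250-255 | 200-249 | 100-199 | 0-99
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"
# Four octets separated by dots; {{3}} is a literal {3} inside the f-string.
STRICT_IPV4 = re.compile(rf"{OCTET}(?:\.{OCTET}){{3}}")

checks = {
    candidate: STRICT_IPV4.fullmatch(candidate) is not None
    for candidate in ["192.192.192.192", "10.0.0.1", "10.23.34.987", "256.1.1.1"]
}
print(checks)
```

Using `fullmatch` on an already extracted candidate sidesteps the question of what may sit on either side of the address in the log line.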

@4wd - Your regex there is elegant and terse. It's one of the things that I tend to avoid as I find it's simply easier to read when being a bit more verbose. I also try to avoid a lot of special matching characters, as a lot of the regex I write is used in EditPlus, whose regex support is not really the best, so it encourages being verbose rather than elegant.

4wd

Re: Regexp help
« Reply #11 on: November 04, 2014, 10:42 PM »
Quote from Renegade: "@4wd - Your regex there is elegant and terse."

The IP bit wasn't mine, it came from here.

Quote from Renegade: "It's one of the things that I tend to avoid as I find it's simply easier to read when being a bit more verbose."

I find the opposite :)

I have more chance of understanding it if I'm not going cross-eyed trying to take in 50+ characters at once ;D
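
One possible middle ground between the two styles, where the regex engine supports it, is a free-spacing (verbose) mode that lets a short pattern be spread over commented lines. A Python sketch with `re.VERBOSE` follows. The pattern is an illustration written for the sample line in this thread, not either poster's regex.

```python
import re

LOG_LINE = re.compile(
    r"""
    ^ .* CtrlChan \b      # the line must mention CtrlChan somewhere
    .*? \[                # skip ahead, lazily, to an opening bracket
    (                     # capture group 1: the dotted quad
        [0-9]{1,3}
        (?: \. [0-9]{1,3} ){3}
    )
    \]                    # closing bracket
    """,
    re.VERBOSE,
)

line = "2014-10-10 19:28:01.110::ErrorMsg::CtrlChan Error_Decode [192.192.192.192]"
match = LOG_LINE.search(line)
ip = match.group(1) if match else None
print(ip)  # 192.192.192.192
```

In verbose mode unescaped whitespace in the pattern is ignored, so a literal space would have to be written as `[ ]` or `\ `.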