Author Topic: bug in regex processing? or bug in my regex?  (Read 5142 times)

mysteryman

  • Supporting Member
  • Joined in 2006
  • Posts: 16
bug in regex processing? or bug in my regex?
« on: December 30, 2007, 04:57 PM »
This is related to the shipmenttracking.alias I updated last night (well, this morning). Before I posted the alias file, I tried to combine the keyword-based and regex-based patterns into one single regex to simplify future updates (in case UPS, etc. change their check URL).
The regex for UPS that I believed would work (and kind of does) is this:

^(?:ups (.*)|(1Z.*))
-regex

My understanding is that this regex should return either (.*) or (1Z.*), but not both, as $$1. However, look at the results of what it actually does, and feel free to test:

$$1=1Z1
-ups 1Z1
See, this works: it detects that the first alternative matches, then skips to the end of the (?:) group.

$$1= | $$2=1Z1
-1Z1
This one does not work. It appears that every ( ) that does not start with (?: bumps the variable counter by one, and that this happens regardless of whether the group matches. The first (.*) gets counted as 1 even though it does not match (as shown by the empty $$1), when it should be ignored (correct me if I'm wrong). The engine then proceeds to check the other half of the (?:). That half does match, but instead of returning the match in $$1 like it should (again, correct me if I'm wrong), it returns it in $$2, which makes it very difficult to make one alias work for both forms.


If there is another way to do this, or if I am using incorrect syntax for the regexp, please let me know...
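For anyone who wants to reproduce this outside FARR, here is a minimal sketch in Python. It assumes Python's `re` module is a fair stand-in for FARR's regex engine, which has not been verified; `captured_groups` is a hypothetical helper name. Groups 1 and 2 play the role of $$1 and $$2, and an unmatched group comes back as `None` rather than an empty string.

```python
import re

# The combined pattern from the post: two capturing groups inside one
# non-capturing (?:...) alternation.
PATTERN = re.compile(r"^(?:ups (.*)|(1Z.*))")

def captured_groups(text):
    """Hypothetical helper: return the numbered groups, or None if nothing matches."""
    match = PATTERN.match(text)
    return match.groups() if match else None

# Group numbers follow the order of the opening parentheses in the pattern,
# not which alternative happened to match.
print(captured_groups("ups 1Z1"))  # ('1Z1', None)
print(captured_groups("1Z1"))      # (None, '1Z1')
```

So at least in Python the second input fills group 2 and leaves group 1 empty, which lines up with the $$1= | $$2=1Z1 result above.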

f0dder

  • Charter Honorary Member
  • Joined in 2005
  • Posts: 9,153
Re: bug in regex processing? or bug in my regex?
« Reply #1 on: December 30, 2007, 05:21 PM »
Aren't you supposed to have parens around the whole OR-expression? Christ, my regex-fu is dusty, and I'm too lazy to go into the living-room to pick up my copy of Mastering Regular Expressions. But IIRC it should look something like...
^(?:ups ((.*)|(1Z.*)))
-regex
- carpe noctem

mysteryman

Re: bug in regex processing? or bug in my regex?
« Reply #2 on: December 30, 2007, 05:25 PM »
Yes, it should, except you missed the earlier (?: which IS the opening parenthesis for the OR. It is a special one that prevents capturing/recording what it matched (i.e., so you will capture "1Z1", NOT "ups 1Z1", from the string ups 1Z1).

It's a little easier to read if you break it up (yes, I know it would not work written like this, but it does make more sense):

^(?:
  ups (.*)
|
  (1Z.*)
)

Make more sense now? It at least does to me, but I could still be wrong.

EDIT: Oh, and btw, if you reverse the order of the two alternatives in the regexp, the results reverse too: the bare 1Z form works, and the ups form does not... added colors too
« Last Edit: December 30, 2007, 05:27 PM by mysteryman »
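As a side note, some engines can take the broken-up layout directly. The sketch below uses Python's `re.VERBOSE` flag for that and also checks the reversed-order observation from the EDIT. Python is only standing in for FARR's engine here (an untested assumption), and `BROKEN_UP` and `REVERSED` are names made up for this example.

```python
import re

# re.VERBOSE ignores whitespace and allows comments inside the pattern,
# so the literal space after "ups" has to be escaped.
BROKEN_UP = re.compile(
    r"""
    ^(?:
        ups\ (.*)   # group 1: everything after the ups keyword
      |
        (1Z.*)      # group 2: a bare tracking number starting with 1Z
    )
    """,
    re.VERBOSE,
)

# Same alternatives in the opposite order: the group numbers swap with them.
REVERSED = re.compile(r"^(?:(1Z.*)|ups (.*))")

for text in ("ups 1Z1", "1Z1"):
    print(text, BROKEN_UP.match(text).groups(), REVERSED.match(text).groups())
```

In both patterns the group number is fixed by where the parenthesis sits in the pattern, so swapping the alternatives just swaps which input lands in group 1.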

jgpaiva

  • Global Moderator
  • Joined in 2006
  • Posts: 4,727
Re: bug in regex processing? or bug in my regex?
« Reply #3 on: January 02, 2008, 10:42 AM »
I can see your problem, mysteryman! (Oh, and thanks for teaching me how to use the ?: after a parenthesis, I had never understood how that worked.)

The problem here is that both the non-matched and the matched groups get numbers, and you'd like only the matched groups to be numbered.
I think this requires someone with a higher "regex-fu" (as f0dder called it) than I have, to tell us how this is supposed to work by the regex rules.

BTW: do you know how other programs would do it? (I couldn't find any program that does parsing by groups like FARR does :()
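As one data point on how other programs behave: Python's `re` module also numbers both groups and leaves the unmatched one empty (`None`). Two possible workarounds are sketched below. Whether FARR's regex engine supports the lookahead `(?=...)` used in the first one is an assumption that has not been checked, and `tracking_number` is a hypothetical helper name.

```python
import re

# Workaround 1: a single capturing group. The alternation only decides
# what to skip ("ups ") or what must come next (1Z, checked via lookahead).
SINGLE_GROUP = re.compile(r"^(?:ups |(?=1Z))(.*)")

# Workaround 2: keep the original two-group pattern and coalesce afterwards.
TWO_GROUPS = re.compile(r"^(?:ups (.*)|(1Z.*))")

def tracking_number(text):
    """Hypothetical helper: return the text from whichever group matched."""
    match = TWO_GROUPS.match(text)
    if match is None:
        return None
    first, second = match.groups()
    return first if first is not None else second

for text in ("ups 1Z1", "1Z1"):
    print(text, "->", SINGLE_GROUP.match(text).group(1), tracking_number(text))
```

The first form would let a single alias use $$1 for both inputs if the engine accepts lookahead. The second only helps where the caller can run code after the match, which may not apply to an alias file.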