Author Topic: How to identify email address at the end of lines with RegEx?  (Read 4009 times)

kalos

  • Member
  • Joined in 2006
  • Posts: 1,751
Hello!

How to identify space + email address at the end of lines with RegEx?

e.g.

*anything* [email protected]

How can I match these email addresses? The email can have various stuff e.g. dots, underscores etc.

Thanks!

wraith808

  • Supporting Member
  • Joined in 2006
  • Posts: 10,751
Re: How to identify email address at the end of lines with RegEx?
« Reply #1 on: June 04, 2020, 10:53 AM »
Code:
\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b$
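A minimal Python sketch of how a pattern like this might be applied, assuming the obfuscated part of the posted regex reads `+@` (the usual form of this well-known pattern). The function name and the sample addresses are made up for illustration. Because the character classes list only `A-Z`, the pattern needs a case-insensitive flag, and `$` only matches at the end of every line when multiline mode is on.

```python
import re

# Assumed reading of the posted pattern. IGNORECASE is needed because the
# classes list only A-Z; MULTILINE lets $ match at the end of each line.
EMAIL_AT_EOL = re.compile(
    r"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b$",
    re.IGNORECASE | re.MULTILINE,
)

def emails_at_line_ends(text):
    """Return every address that sits at the very end of a line."""
    return EMAIL_AT_EOL.findall(text)

# Hypothetical sample input, not taken from the thread.
sample = "first line jane.doe@example.com\nno address here\nx y bob_smith@mail.example.org"
print(emails_at_line_ends(sample))
```

Note that a line with trailing whitespace after the address will not match, since `$` has to come directly after the top-level domain.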

erikts

  • Supporting Member
  • Joined in 2007
  • Posts: 186

kalos

Re: How to identify email address at the end of lines with RegEx?
« Reply #3 on: June 05, 2020, 04:01 AM »
Code:
\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b$


Thanks, but why does the following not work?
Code:
\b.+?.com\b$

wraith808

Re: How to identify email address at the end of lines with RegEx?
« Reply #4 on: June 05, 2020, 08:12 AM »
Thanks, but why does the following not work?

That's an exercise I'll leave to the asker.  :Thmbsup:

kalos

Re: How to identify email address at the end of lines with RegEx?
« Reply #5 on: June 05, 2020, 10:13 AM »
I don't understand, the below matches " text1 text2 [email protected]".

Why does it include spaces in between, since it is enclosed in \b?

\b(.+?\.com)\b
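A short Python sketch of what is going on with this pattern. `\b` only asserts a word boundary at the two ends of the match; it says nothing about what is in between. The `.` matches spaces as well, and the lazy `+?` only limits how far the match grows to the right, so the engine still starts at the first position where a match is possible. The sample line and address below are hypothetical, and swapping `.` for `\S` is one possible fix, not the only one.

```python
import re

# Hypothetical sample line; the address is made up.
line = " text1 text2 someone@example.com"

# "." also matches spaces, so the match starts at the first word boundary
# and keeps growing until ".com" followed by a boundary is found.
loose = re.search(r"\b(.+?\.com)\b", line)
print(repr(loose.group(1)))  # 'text1 text2 someone@example.com'

# "\S" (non-whitespace) keeps the match inside a single token.
tight = re.search(r"(\S+\.com)\b", line)
print(repr(tight.group(1)))  # 'someone@example.com'
```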

Ath

  • Supporting Member
  • Joined in 2006
  • Posts: 3,489
Re: How to identify email address at the end of lines with RegEx?
« Reply #6 on: June 05, 2020, 12:20 PM »
Why are you trying to use something you don't know how to construct (you are not even checking for the mandatory @ and .  >:(), if there are several working examples and links in posts above yours? Please use the advice that's given, for free, by ppl that actually know what they are talking about.

mouser

  • First Author
  • Administrator
  • Joined in 2005
  • Posts: 40,396
Re: How to identify email address at the end of lines with RegEx?
« Reply #7 on: June 05, 2020, 12:57 PM »
You might also check out my program, Regex Captor:
https://www.donation...ndex.php?topic=45497

kalos

Re: How to identify email address at the end of lines with RegEx?
« Reply #8 on: June 07, 2020, 08:22 AM »
OK so what is the regex to capture the piece of text that starts with space and ends with the end of the line and can contain any character?

Ath

Re: How to identify email address at the end of lines with RegEx?
« Reply #9 on: June 07, 2020, 10:47 AM »
OK so what is the regex to capture the piece of text that starts with space and ends with the end of the line and can contain any character?

That would look like this:
.*\s+(\S+)$
What this does is:
.* : any character, 0 or more (greedy)
\s+ : white-space, 1 or more consecutive
(  : start group
\S+  : not white-space, 1 or more consecutive
)  : end group
$  : end of line marker

You will have to get the group 1 value for your result, input like this:
a piece of text at the end of the line
will give you the word 'line' as a result
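For anyone who wants to try this outside a regex tester, here is a minimal Python sketch of the same pattern. The helper name `last_token` is made up for illustration.

```python
import re

# Same pattern as above: group 1 is the last run of non-whitespace characters,
# provided at least one whitespace character comes before it.
LAST_TOKEN = re.compile(r".*\s+(\S+)$")

def last_token(line):
    """Return group 1 of the pattern, or None when the line does not match."""
    m = LAST_TOKEN.match(line)
    return m.group(1) if m else None

print(last_token("a piece of text at the end of the line"))  # line
print(last_token("singleword"))  # None, no whitespace before the token
```

A line that consists of a single word gives no match at all, because `\s+` requires at least one whitespace character in front of the captured group.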

If you are still searching for your original request, I took the first regex from https://emailregex.com, dropped it in the https://regex101.com regex tester, fixed the issues it reported (escape a few slashes because of the PCRE regex engine), and wrapped it with the construction I just showed here, and came up with this:
.*\s+((?:[a-z0-9!#$%&'*+\/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\]))$
If you feed that this text:
test for email address with plain text prefix test@email.com
The only group that's there produces 'test@email.com' as a result.

Did I complete your assignment with this? ;)

wraith808

Re: How to identify email address at the end of lines with RegEx?
« Reply #10 on: June 07, 2020, 11:54 AM »
That first regex that I put in works on that phrase just fine.  I didn't include the space in the regex because I thought that was just a miscommunication, after all, who'd want a space before the e-mail that they capture?

Ath

Re: How to identify email address at the end of lines with RegEx?
« Reply #11 on: June 07, 2020, 12:33 PM »
That first regex that I put in works on that phrase just fine.
I fully agree, but as the OP has trouble reading (or understanding?) replies to his questions, I sort of tried to blow him off his socks, as all previous replies probably were too 'easy' :huh:

I didn't include the space in the regex
Well, there has to be some separator between any content and the e-mail address, and that's most likely a space.

wraith808

Re: How to identify email address at the end of lines with RegEx?
« Reply #12 on: June 07, 2020, 05:23 PM »
Well, there has to be some separator between any content and the e-mail address, and that's most likely a space.

No, I mean that I didn't include it in the resulting match.  It would match like yours, just selecting the e-mail address.

4wd

  • Supporting Member
  • Joined in 2006
  • Posts: 5,412
Re: How to identify email address at the end of lines with RegEx?
« Reply #13 on: June 07, 2020, 11:03 PM »
It would be helpful to know where this is going to be used, e.g. in a spreadsheet, PowerShell, JavaScript, etc. There are system calls that could possibly do a better job of validating an email than a single RegEx, although you still need to separate a suspected email address out of the input.

In PowerShell you could cast the assumed email address to the MailAddress type and let the system validate it. Email addresses are no longer restricted to just ASCII characters.

eg. Valid email addresses:
[email protected]🚌.org
[email protected]やる.net
[email protected]

Plain old ASCII valid email addresses:
John.Doe(Donationcoder)@nowhere.net
[email protected]
"John..Doe"@nowhere.net
"John. .Doe"@nowhere.net
"[email protected]"@nowhere.net

PS. The RegEx's given above fail the simplest email address: [email protected]  ;)

Maybe you'd need to (roughly):
  • look for a @ within the last 256 characters of the line, (max length of the domain part is 255);
  • see if there are any pairs of quotes within the maximum allowed length of the local part, (64 chars), if there are you'd need to allow a lot more ASCII characters than you normally would, (eg. spaces, @);
  • if there is an odd number of quotes then check that the 'odd' ones are escaped (\\);
  • if there are no quotes in that 64 chars then take the chars at the first space backwards from the @;
  • pass what you have to the system for validation.

Probably a lot more needs to be done in there before you can be sure you pick up anything that might be an email address.
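A very rough Python sketch of a simplified version of the steps above, under several assumptions: it only looks at the last whitespace-delimited token (so quoted local parts containing spaces are not handled), it uses the 64 and 255 character limits mentioned in the list, and it uses `email.utils.parseaddr` from the standard library as a stand-in for "pass it to the system". `parseaddr` is a lenient parser rather than a strict validator, and the function name `trailing_email_candidate` is made up.

```python
import re
from email.utils import parseaddr

def trailing_email_candidate(line):
    """Return the last token of the line if it looks like an address, else None."""
    # Take the last run of non-whitespace characters at the end of the line.
    m = re.search(r"(?:^|\s)(\S+)$", line)
    if not m:
        return None
    token = m.group(1)

    # Split on the last @ and apply the length limits from the list above.
    local, sep, domain = token.rpartition("@")
    if not sep or not local or not domain:
        return None
    if len(local) > 64 or len(domain) > 255:
        return None

    # Hand the candidate to the standard library parser (lenient, not a validator).
    _, addr = parseaddr(token)
    return addr if addr == token else None

print(trailing_email_candidate("plain text prefix test@email.com"))
print(trailing_email_candidate("no address here"))
```

Comparing the parsed address with the original token is a conservative choice: anything the parser rewrites (comments, display names) is rejected instead of being silently altered.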
« Last Edit: June 08, 2020, 06:00 AM by 4wd »

wraith808

Re: How to identify email address at the end of lines with RegEx?
« Reply #14 on: June 08, 2020, 07:58 AM »
PS. The RegEx's given above fail the simplest email address: [email protected] 

Yeah, I assumed that he would have a tld, and didn't want to put too much into it as I was sure that it wouldn't be enough in any case.  :huh: :-[ :D :Thmbsup:

4wd

Re: How to identify email address at the end of lines with RegEx?
« Reply #15 on: June 08, 2020, 07:09 PM »
... I was sure that it wouldn't be enough in any case.

How can you say that?
😏
🤣

kalos

Re: How to identify email address at the end of lines with RegEx?
« Reply #16 on: June 17, 2020, 05:42 AM »
\S

That was very useful.

erikts

Re: How to identify email address at the end of lines with RegEx?
« Reply #17 on: August 27, 2020, 11:43 PM »
I would like to share a book on regex for beginners : Regular Expressions for Regular Folk via Hacker News.

This is an experimental “book” about regular expressions. It is largely visual and example-based, as opposed to most regex resources I found while I was learning. I also attempted to choose test cases that highlight some common gotchas. I think it’ll be worth your time.

This book’s intended audience is regex beginners. Some programming experience is assumed. It does not go into advanced regex concepts like engine backtracking and recursive regexes—at least not at the moment.

This is also an open source project, and contributions are welcome.