Author Topic: Maybe someone can help with regular expressions  (Read 4967 times)

Hilario

  • Supporting Member
  • Joined in 2017
  • Posts: 32
Maybe someone can help with regular expressions
« on: December 05, 2019, 06:34 AM »
Hello,
I am trying to define a text conversion but can't work out how to do it. If someone can help, it would be wonderful.

For example, I have this text:
¿Sabrias evitar los problemas que te genera la documentación en tu negocio? ¡virtualizate! 

and want to convert it to:
sabrias_evitar_los_problemas_que_te_genera_la_documentacion_en_tu_negocio_virtualizate

That means:
1- no capital letters
2- every space converted to "_"
3- all punctuation such as ¿ ? ¡ ! . ; (etc.) deleted
4- accented characters replaced by the plain letter: á/à by a, é/è by e, and the same for í/ì, ó/ò, ú/ù

I hope someone can answer this. Thanks in advance to the good Samaritan ;-)

c-sanchez

  • Participant
  • Joined in 2018
  • Posts: 46
Re: Maybe someone can help with regular expressions
« Reply #1 on: December 08, 2019, 08:56 AM »
Why do you need to use RegEx? I have tried to understand regex, but I find it really difficult and annoying, and many programmers avoid it when they can, which is probably why no one has answered you yet.
I think with ordinary code you can do this more easily, and in a "human readable" way :P

So, do you need this to replace text in files, or something like that?
Or to use with something like PHP, server side?
Or maybe JavaScript, client side?

In any case, all of it is doable without RegEx.
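
To illustrate the no-RegEx route, here is a minimal sketch in Python. The choice of Python and the function name slugify are assumptions made for this example, not anything from the original question. It uses only the standard library: unicodedata splits an accented letter into its base letter plus a combining mark, and the mark is then dropped.

```python
import unicodedata


def slugify(text: str) -> str:
    """Hypothetical helper: apply the four rules from the first post."""
    # Rule 1: no capital letters.
    text = text.lower()

    # Rule 4: decompose accented letters (e.g. "ó" becomes "o" plus a
    # combining accent) and drop the combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))

    # Rule 3: keep letters and digits, keep whitespace as a word break,
    # and silently drop everything else (punctuation, symbols).
    kept = []
    for ch in plain:
        if ch.isalnum():
            kept.append(ch)
        elif ch.isspace():
            kept.append(" ")

    # Rule 2: join the words with "_". split() also swallows leading,
    # trailing and repeated spaces, so no stray underscores are produced.
    return "_".join("".join(kept).split())
```

Calling slugify on the example sentence from the first post should give sabrias_evitar_los_problemas_que_te_genera_la_documentacion_en_tu_negocio_virtualizate. Note that this approach also turns ñ into n and ç into c, which may or may not be wanted.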

mouser

  • First Author
  • Administrator
  • Joined in 2005
  • Posts: 40,900
Re: Maybe someone can help with regular expressions
« Reply #2 on: December 08, 2019, 09:19 AM »
For replacement work like this, c-sanchez is probably right: you may be better off using a language like Python to do the search and replace.

Ath

  • Supporting Member
  • Joined in 2006
  • Posts: 3,612
Re: Maybe someone can help with regular expressions
« Reply #3 on: December 08, 2019, 11:23 AM »
If you insist on using regexes for the conversion, then sed can do it. When downloading sed for Windows, be sure to get the latest updated version. A GNU version of sed is required, as BSD sed doesn't support the lowercase conversion \L.
This script should be saved as an ASCII/ANSI file for the y command to work properly. When it is saved as UTF-8, sed will complain that the original and replacement strings are not the same length :huh:
# Save this file as ANSI or the y command will cause an error
# Convert to lowercase
s:(.*):\L\1:
# Convert all spaces to _
s:[[:space:]]:_:g
# Remove special characters (][ must be first in range, - must be last in range!)
s:[][?¿/.>,<;\:'"!¡@#$^&*\(\)+=\{\}|-]::g
# Replace diacritics by non-diacritics (to be completed)
y:äáàâëéèêüúùûïíìîöóòôÿýçñ:aaaaeeeeuuuuiiiiooooyycn:

I tried to include all the diacritics I could enter on my US-International keyboard layout, but I may have left out some you need. Please add whatever is missing.
Separator character used is : on all script lines

Command-line should be something like:
sed -r -f above_script.txt <input.txt
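
For comparison, the same four steps can be sketched with Python's re module. This is only an illustration under assumptions: the names to_web_form, PUNCTUATION and DIACRITICS are made up for this example, the punctuation set is roughly the one from the script above, and the strip() call is an addition that the sed script does not have (without it, a trailing space in the input ends up as a trailing "_").

```python
import re

# Roughly the characters removed by the third sed command.
PUNCTUATION = "][?¿/.>,<;:'\"!¡@#$^&*()+={}|-"
# re.escape() takes care of the characters that are special inside [...].
PUNCTUATION_RE = re.compile("[" + re.escape(PUNCTUATION) + "]")

# Same pairs as the sed y command: each character in the first string
# maps to the character at the same position in the second string.
DIACRITICS = str.maketrans(
    "äáàâëéèêüúùûïíìîöóòôÿýçñ",
    "aaaaeeeeuuuuiiiiooooyycn",
)


def to_web_form(text: str) -> str:
    """Hypothetical helper mirroring the sed script step by step."""
    text = text.strip()                  # addition: avoid a trailing "_"
    text = text.lower()                  # s:(.*):\L\1:
    text = re.sub(r"\s", "_", text)      # s:[[:space:]]:_:g
    text = PUNCTUATION_RE.sub("", text)  # s:[...]::g
    return text.translate(DIACRITICS)    # y:...:...:
```

Because Python strings are Unicode, the file can be saved as UTF-8 here; the translation table is built from characters rather than bytes, so the length mismatch that sed reports does not arise.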

Hilario

  • Supporting Member
  • Joined in 2017
  • Posts: 32
Re: Maybe someone can help with regular expressions
« Reply #4 on: December 09, 2019, 05:30 AM »
Thanks a lot for all the comments.
I see this is really cumbersome.
Why did I decide to use regex?
Because I thought it was the only option that works with the clipboard. I may be wrong, and there may be others.
Thanks a lot to Ath for your script. All this sed usage is really complex (for me), but it helps me understand some of the ideas and clarifies the syntax.

The main idea was to SELECT a phrase and, via the clipboard, PASTE it transformed into a web-ready form.
Maybe someone has a better idea than what I was asking for.

Thanks again for the suggestions
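
One way to picture the select, transform, paste idea is the following Python sketch. Everything in it is an assumption for illustration: paste_transformed is a made-up name, and the clipboard is represented by two plain functions so that any real clipboard tool could be plugged in later.

```python
from typing import Callable


def paste_transformed(
    read_clipboard: Callable[[], str],
    write_clipboard: Callable[[str], None],
    transform: Callable[[str], str],
) -> str:
    """Hypothetical helper: read the copied text, convert it, put it back.

    read_clipboard and write_clipboard are placeholders for whatever
    clipboard access is available; for instance, tkinter widgets offer
    clipboard_get(), clipboard_clear() and clipboard_append().
    """
    original = read_clipboard()
    converted = transform(original)
    write_clipboard(converted)
    return converted


# Stand-in clipboard for demonstration: a one-item dictionary.
fake_clipboard = {"text": "Hola Mundo"}

paste_transformed(
    read_clipboard=lambda: fake_clipboard["text"],
    write_clipboard=lambda s: fake_clipboard.update(text=s),
    transform=lambda t: t.lower().replace(" ", "_"),
)
```

After the call, fake_clipboard["text"] holds hola_mundo. The transform argument is where a full conversion function would go.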

Stoic Joker

  • Honorary Member
  • Joined in 2008
  • Posts: 6,646
Re: Maybe someone can help with regular expressions
« Reply #5 on: December 09, 2019, 06:43 AM »
I've never used RegEx frequently enough to be good at it, but I have found WildGem handy for getting through it from time to time. It's a freeware graphical RegEx query builder utility written by one of the members here at DC.