Clipboard Help+Spell / Re: Maybe someone can help with regular expressions
« on: December 08, 2019, 11:23 AM »
If you insist on using regexes for the conversions, then sed can do it. When downloading sed for Windows, be sure to get the latest version. GNU sed is required, because BSD sed doesn't support the lowercase conversion \L.
Save this script as an ASCII/ANSI file so the y command works properly. When it is saved as UTF-8, sed will complain that the original and replacement strings are not the same length (most likely because each accented character then takes more than one byte) :huh:
# Save this file as ANSI or the y command will cause an error
# Convert to lowercase
s:(.*):\L\1:
# Convert all spaces to _
s:[[:space:]]:_:g
# Remove special characters (][ must be first in range, - must be last in range!)
s:[][?¿/.>,<;\:'"!¡@#$^&*\(\)+=\{\}|-]::g
# Replace diacritics by non-diacritics (to be completed)
y:äáàâëéèêüúùûïíìîöóòôÿýçñ:aaaaeeeeuuuuiiiiooooyycn:
I tried to include every diacritic I could type on my US-International keyboard layout, but I may have missed some you need. Please add any that are missing, keeping both halves of the y command the same length.
The separator character on all script lines is : (a colon) instead of the usual /.
The command line should be something like:
sed -r -f above_script.txt <input.txt
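For a quick sanity check of the rules before running them on real data, something like the following may help. It is only a sketch: it assumes GNU sed and a POSIX shell, the file name and sample text are made up, and it deliberately uses only the ASCII parts of the script (the ¿ and ¡ characters and the y line are left out, so the encoding issue mentioned above does not come into play).

```shell
# Hypothetical test run: write the ASCII-only rules to a temporary script file
script=$(mktemp)
cat > "$script" <<'EOF'
# Convert to lowercase
s:(.*):\L\1:
# Convert all spaces to _
s:[[:space:]]:_:g
# Remove special characters (ASCII subset of the rule above, ¿ and ¡ omitted)
s:[][?/.>,<;\:'"!@#$^&*\(\)+=\{\}|-]::g
EOF

# Feed one made-up sample line through sed and show the result
result=$(printf 'My Holiday Photos (2019)!.txt\n' | sed -r -f "$script")
echo "$result"   # prints: my_holiday_photos_2019txt

rm -f "$script"
```

Note that the dot is in the list of removed characters, so a file extension such as .txt loses its separator. If the input lines are file names, you may want to take the . out of that rule.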