Topic: WildOpal - hypothetical new idea for a "find and replace" program  (Read 48385 times)

Twinbee

  • Member
  • Joined in 2012
  • Posts: 84
I wanted to do a simple 'find and replace' in Notepad++, and found out that it can only support simple WYSIWYG queries or over-complicated Regex statements which can sometimes feel like using a sledgehammer to crack a nut. After some frustration going down that road, I decided to think about how a "find and replace" feature should work, at least for 99% of purposes. I wanted something elegant and simple, yet clear and relatively powerful. I wanted NOT to worry about escaping special characters, yet I didn't want to lose the power that would usually provide or the efficiency of using the keyboard.

Colour coding, multiple lines, dual/split display, and on-the-fly updating come as standard (as per my other program OpalCalc ;).

This is what I've come up with so far. It's not a working program yet. I wanted to see if there was much interest in developing this further, and whether something similar already exists, as I'm not overly keen on reinventing the wheel.

[Image: WildOpal.png]

The unique unicode-flavoured characters offer numerous advantages over typical Regex/Extended syntax queries which rely on existing symbols:

1: It's clearer to formulate expressions and see what you're searching for, and whether certain characters are supposed to be 'special' or 'not'.

2: It's clearer for someone who has to read the expression and has no idea what you were trying to do.

3: No worry about having to escape characters, so normal text is never misinterpreted as 'special' commands. In comparison, with Regex, you may need to worry about escaping more than ten different symbols across a text file!

4: Expressions can be reused for different texts without worrying about escaping anything in those texts.

5: Each symbol is easier to commit to memory, so the learning curve is greatly shortened.

Function keys are used for the special characters, so no speed is lost if you're only typing.

I've been pretty frustrated with how cumbersome and arcane the syntax can appear when formulating Regex queries. I hope many of you appreciate the advantages such an approach would provide. By all means, I'd be interested to hear any tweaks or additions to the general concept if you can think of anything! What kind of programs come closest?
« Last Edit: May 24, 2015, 03:45 PM by Twinbee »

MilesAhead

  • Supporting Member
  • Joined in 2009
  • Posts: 7,736
Is there a symbol for the found pattern that may be used in the replacement?  For instance if I wanted to change every instance of

for (i=number;i<othernumber;i++)
{

to

for (i=number;i<othernumber;i++)
{
   int x = 0;

It seems that inserting the matched pattern as part of the replacement is fairly rare in Windows editors without resorting to regex.
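
For comparison, this is roughly what reusing the matched text looks like with plain regex. It is only a minimal Python sketch: it assumes the placeholders "number" and "othernumber" stand for integer literals, and the sample `source` string is made up for the example. The capture group and the `\g<1>` backreference are standard features of Python's `re` module.

```python
import re

# Made-up sample input containing one loop in the shape described above.
source = "for (i=0;i<10;i++)\n{\n   total += i;\n}\n"

# Group 1 captures the loop header, the opening brace and both line breaks.
loop_header = re.compile(r"(for \(i=\d+;i<\d+;i\+\+\)\r?\n\{\r?\n)")

# \g<1> re-inserts the matched text; the new statement is appended after it.
result = loop_header.sub(r"\g<1>   int x = 0;\n", source)
print(result)
```

The matched header is kept exactly as found, so the same replacement works for any pair of numbers.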

wraith808

  • Supporting Member
  • Joined in 2006
  • Posts: 11,190
Quote from: Twinbee
I've been pretty frustrated with how cumbersome and arcane the syntax can appear when formulating Regex queries. I hope many of you appreciate the advantages such an approach would provide. By all means, I'd be interested to hear any tweaks or additions to the general concept if you can think of anything! What kind of programs come closest?

Though Regex can seem arcane, it's just because there are a lot of options, and it can do quite a bit.

There are a few programs that help with the formulation of regex queries.

RegexBuddy helps more with learning, while RegexMagic is the alternative if you just want to make them.  Expresso is a free alternative.  It seems like something along those lines might be a good thing to work on: something that teaches regex in a simplified way without all of the additions, sort of like what you have mocked up, rather than reinventing the wheel.

But that's just my take on it.  I was in a similar place to you a while ago, and RegexBuddy helped me turn that around.  RegEx support is everywhere, so learning it opened up a whole world for me.  It helped that I *had* to learn for work, but I'd recommend that in any case.

Ath

  • Supporting Member
  • Joined in 2006
  • Posts: 3,629
+1 plea for regexes.

www.regex101.com is a quick online site that gives a full explanation of the regex and its results, and has a replace feature as well  :up:

Twinbee

  • Member
  • Joined in 2012
  • Posts: 84
MilesAhead: Yes, the "Group" button to allow pseudo-'variables' in the 'Replace' section should handle that in my hypothetical program.
« Last Edit: May 24, 2015, 03:25 PM by Twinbee »

Twinbee

  • Member
  • Joined in 2012
  • Posts: 84
Thanks, everyone, for all the feedback.

Regex has this big problem in that you can't allow a section of text to be treated as 'non-special'. I had some text I wanted to operate on that had lots of symbols. Unfortunately, it would have meant I had to escape each and every one - not my idea of fun, especially since special symbols apparently differ according to the Regex implementation.

I know Regex has got a lot of momentum now, but I can't help feeling there are fundamental advantages to this kind of approach offered by WildOpal (as well as possible small disadvantages I admit, such as extra developer effort). For beginners, or the non-techy minded at least, it should be an improvement I think, but even seasoned users should find an improvement in theory, at least in certain ways.

For example, an expression is easier to read for somebody who didn't create it.
« Last Edit: May 24, 2015, 04:01 PM by Twinbee »

wraith808

  • Supporting Member
  • Joined in 2006
  • Posts: 11,190
It's merely a matter of finding out how to do it in most cases with RegEx, rather than it not having the capability.

Case in point:

Many flavors also support the \Q…\E escape sequence. All the characters between the \Q and the \E are interpreted as literal characters. E.g. \Q*\d+*\E matches the literal text *\d+*. The \E may be omitted at the end of the regex, so \Q*\d+* is the same as \Q*\d+*\E. This syntax is supported by the JGsoft engine, Perl, PCRE, PHP, Delphi, and Java, both inside and outside character classes. Java 4 and 5 have bugs that cause \Q…\E to misbehave, however, so you shouldn't use this syntax with Java.

From http://www.regular-e...info/characters.html
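
For engines that lack \Q…\E there is usually a library-side equivalent. As far as I know, Python's `re` module does not accept \Q…\E, but `re.escape` gives the same effect by backslash-escaping the special characters in a string. A minimal sketch, with a made-up `haystack` string:

```python
import re

literal_text = r"*\d+*"  # the literal text from the quoted example

# re.escape backslash-escapes every character that is special in a pattern,
# so the result matches the text exactly as written.
pattern = re.escape(literal_text)

haystack = r"before *\d+* after, and 123 too"
matches = re.findall(pattern, haystack)
print(matches)  # the digits "123" are not matched, only the literal text
```

The escaped pattern can be concatenated with ordinary regex fragments, which is the usual way to mix literal text with wildcards.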

If it's just for one application, what you're proposing would have less of an uphill battle, I think.  You're proposing not just a software tool, but a standard to be used in searching.  Unless your hypothetical program was an editor also.

Twinbee

  • Member
  • Joined in 2012
  • Posts: 84
wraith808: Shame it isn't fully standard in the spec, but crumbs, you're right - the \Q...\E does indeed work to treat chars as literal, at least in Notepad++. Asking at SuperUser.com produced no response, and it was missing from practically all of the tutorials I saw from searching with Google, so I assumed it was impossible.

It's going to be a lot easier to create this hypothetical program than I anticipated, since I thought of the (now obvious) idea to use Regex as a middleman. My program simply becomes a sugary wrapper for Regex. Things like ".+?" (without the quotes) can be replaced by that inverted star wildcard symbol for example. Other things like the newline symbol can represent "\r", "\r\n", or "\n" (with the option to use those specifically for more power if need be), and "[0-9]+" or "\d+" can be replaced with a single symbol.

I know they're little things, but it all helps, and can potentially reduce a complex expression to at least half its original size. Like a sugar-coated Regex.
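
As an illustration of the newline case mentioned above, here is a minimal Python sketch of one pattern that covers all three line endings. The `mixed` sample string is made up, and the non-capturing `(?:...)` group is my own choice so that `re.split` does not return the separators.

```python
import re

# "\r\n" has to come first, otherwise a Windows line ending is seen as two breaks.
NEWLINE = r"(?:\r\n|\r|\n)"

mixed = "one\r\ntwo\rthree\nfour"  # made-up text with all three line endings

lines = re.split(NEWLINE, mixed)
print(lines)  # ['one', 'two', 'three', 'four']

# With the alternatives in the wrong order, "\r\n" produces an empty extra line.
wrong_order = re.split(r"(?:\r|\n|\r\n)", mixed)
print(wrong_order)  # ['one', '', 'two', 'three', 'four']
```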

Quote from: wraith808
You're proposing not just a software tool, but a standard to be used in searching

Yes, I think it would be neat to see something like this used everywhere, especially for people who don't use Regex very frequently, but obviously the chance of that is virtually zero. Maybe some people may find it useful though, so I'll persevere regardless.
« Last Edit: May 25, 2015, 06:48 AM by Twinbee »

wraith808

  • Supporting Member
  • Joined in 2006
  • Posts: 11,190
Quote from: Twinbee
Shame it isn't fully standard in the spec, but crumbs, you're right - the \Q...\E does indeed work to treat chars as literal, at least in Notepad++. Asking at SuperUser.com produced no response, and it was missing from practically all of the tutorials I saw from searching with Google, so I assumed it was impossible.

I've found that on the StackExchange sites with a high volume, valid questions can get lost in the noise... whether it's from the time posted, how fast things scroll off the front page, or any variety of other reasons :(

Glad I could help, though!  And good luck with your program!

JavaJones

  • Review 2.0 Designer
  • Charter Member
  • Joined in 2005
  • Posts: 2,739
Interesting discussion. As someone who has struggled with RegEx in the past but regularly uses simpler search operators and wildcards (e.g. *.jpg, *DCM_??, etc.) I might find this tool useful. Especially if it interfaces with/generates RegEx. It of course would need to do that if it has no built-in editor capability. If it did, it could get some use especially if it could be invoked easily through e.g. Notepad++ ("Open this file in WildOpal"). Generating RegEx would extend its reach quite a bit more.

Anyway, I find the idea appealing. And since you're already going forward with it, I am curious to see what you come up with.

- Oshyan

Innuendo

  • Charter Member
  • Joined in 2005
  • Posts: 2,266
Quote from: wraith808
Expresso is a free alternative.

I had not heard of Expresso. It looked very interesting until I noticed that the last version was released in 2013. That wouldn't be so bad except it's coded to stop working on January 1, 2016.

wraith808

  • Supporting Member
  • Joined in 2006
  • Posts: 11,190
Quote from: Innuendo
Quote from: wraith808
Expresso is a free alternative.
I had not heard of Expresso. It looked very interesting until I noticed that the last version was released in 2013. That wouldn't be so bad except it's coded to stop working on January 1, 2016.

Is that even after registering?

4wd

  • Supporting Member
  • Joined in 2006
  • Posts: 5,644
Install Expresso 3.0 (Version 3.0.4750 - January 2, 2013*):

...

* If you have not registered, Version 3.0.4750 will expire after 60 days or on January 1, 2016, whichever comes first.

Innuendo

  • Charter Member
  • Joined in 2005
  • Posts: 2,266
Oh, in my haste it appears I misread things. Off to download then....nothing to see here.

TaoPhoenix

  • Supporting Member
  • Joined in 2011
  • Posts: 4,642
Quote from: Twinbee
wraith808: Shame it isn't fully standard in the spec, but crumbs, you're right - the \Q...\E does indeed work to treat chars as literal, at least in Notepad++. Asking at SuperUser.com produced no response, and it was missing from practically all of the tutorials I saw from searching with Google, so I assumed it was impossible.
It's going to be a lot easier to create this hypothetical program than I anticipated, since I thought of the (now obvious) idea to use Regex as a middleman. My program simply becomes a sugary wrapper for Regex.

There's a couple of nice points there.

Asking at one forum (SuperUser) and checking the Tutorials is a "fair effort". So that's far beyond what usually gets derisively replied as "RTFM". Lucky that DC has some smart people!

:Thmbsup:

And next, from what I understand, making backbones and frameworks is hard. But there's a great need for little apps that borrow an existing backbone and make it friendly to people who are only middle-skilled users. Because then your "engine" is there - and you can add features a little easier because they can be explanatory / useful features, rather than trying to re-invent a wheel that isn't a wheel and getting your head wrapped into a pretzel on meta design issues!


Twinbee

  • Member
  • Joined in 2012
  • Posts: 84
Quote from: JavaJones
Especially if it interfaces with/generates RegEx.

You bet! :) I've added an option so that it not just displays the Regex on the fly, but you can edit the Regex if you need more power than WildOpal would usually provide.

A nice feature is even editing the text to work on - it will highlight letters as you go along. So everything updates on the fly automatically whether you're typing in the search (regex, or WildOpal style), replacement text, or the main text that it searches on.

Quote from: JavaJones
If it did, it could get some use especially if it could be invoked easily through e.g. Notepad++

Well, for now, it's easy enough to select text and copy paste it straight into WildOpal, but yes that's worth considering.

Quote from: TaoPhoenix
rather than trying to re-invent a wheel that isn't a wheel and getting your head wrapped into a pretzel on meta design issues!

Yes that road is littered with headaches, mines and nightmares. Implementing even a few basic features was proving to be more than a little tricky. Piggybacking off Regex is a breeze in comparison!

Anyway, it's coming along pretty well so far. Here are the buttons/symbols I've implemented so far and what they convert to:

  • Single character: .
  • Many characters: .+?
  • Repeat character: +
  • Numeric digit: \d
  • Numeric digits: \d+?
  • Newline: (\r\n|\r|\n)
  • Newlines: (\r\n|\r|\n)+?
  • Symbols: [^\s\w]+?
  • Symbol: [^\s\w]
  • Letters: [a-zA-Z]+?
  • Letter: [a-zA-Z]
  • Charset: []
  • Charset (repetitions): []+?
  • Single whitespace char: \s
  • Multiple whitespace chars: \s+?

If any of you can think of other very common Regex snippets (excluding ones using {}<>#| as I've yet to implement those), let me know.
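
The mapping above amounts to a lookup table plus escaping of everything else. Here is a minimal Python sketch of that idea. The symbols ○, ★, ♯, ↵ and ␣ and the `to_regex` helper are hypothetical placeholders (the list does not say which characters the buttons insert), and only a handful of the listed entries are included.

```python
import re

# Hypothetical symbols mapped to regex fragments taken from the list above.
SYMBOL_TO_REGEX = {
    "○": ".",                # single character
    "★": ".+?",              # many characters
    "♯": r"\d+?",            # numeric digits
    "↵": r"(\r\n|\r|\n)",    # newline
    "␣": r"\s",              # single whitespace char
}

def to_regex(expression):
    """Translate known symbols and escape every other character literally."""
    return "".join(SYMBOL_TO_REGEX.get(ch, re.escape(ch)) for ch in expression)

pattern = to_regex("price: $♯ (★)")
match = re.search(pattern, "Total price: $42 (approx.)")
print(pattern)
print(match.group(0))  # price: $42 (approx.)
```

Because ordinary characters are escaped automatically, the "$", "(" and ")" in the expression are matched literally. One thing worth checking when testing such a table: a lazy fragment such as `\d+?` at the very end of a pattern matches as little as possible, so `to_regex("$♯")` finds "$4" rather than "$42" in "$42".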
« Last Edit: May 26, 2015, 05:04 PM by Twinbee »

MilesAhead

  • Supporting Member
  • Joined in 2009
  • Posts: 7,736
Quote from: Twinbee
Yes that road is littered with headaches, mines and nightmares. Implementing even a few basic features was proving to be more than a little tricky. Piggybacking off Regex is a breeze in comparison!

I think it was the OpenVMS Language Sensitive Editor that had its own pattern matching scheme.  The only rule I remember is that a percent sign matched any character.  I wonder if they used regex internally or rolled their own.

Twinbee

  • Member
  • Joined in 2012
  • Posts: 84
Re: WildOpal - hypothetical new idea for a "find and replace" program
« Reply #17 on: June 07, 2015, 08:50 AM »
Well, I've got a beta, but stable and fairly polished feature-packed version complete. Love to hear some feedback from pro regex users and people who don't have a clue about how regex works. Even if you're not interested in the unique "unicode symbol" way of doing things, it's a pretty nice regex editor anyway.

Here's a couple of screenshots:

[Image: wildgem1.png]

[Image: wildgem2.png]

I now know more than I ever expected to know about Regex. Unlike most other find/replacers and regex helpers, "WildGem" (as it is now called) puts the focus on the text you're trying to search in (and the text you're trying to replace). In the end, I kept most of the core functionality that regex provides, but the most common uses are kept at the forefront (e.g: the simple symbol ✪ is the equivalent of ".*?" in regex language and there's a single simple symbol for the messy regex "(?:^|$|\n|\r\n)" which can find the start or end of any line).
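
The trailing "?" in ".*?" is what makes the wildcard stop at the first possible continuation instead of the last. A small Python sketch of the difference, using a made-up sentence:

```python
import re

text = "the very quick and the not so quick fox"  # made-up sample text

# Lazy: expands only as far as needed to reach the first " quick".
lazy = re.search(r"the .*? quick", text).group(0)

# Greedy: expands as far as possible, up to the last " quick".
greedy = re.search(r"the .* quick", text).group(0)

print(lazy)    # the very quick
print(greedy)  # the very quick and the not so quick
```

For find-and-replace work the lazy form is usually the less surprising default, which may be why it was chosen for the most prominent symbol.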

And here it is - a single self-contained 53kb exe. No installation necessary, though it does require .NET 3.5 or later: http://www.skytopia....om/stuff/WildGem.exe
« Last Edit: June 07, 2015, 09:21 AM by Twinbee »

Ath

  • Supporting Member
  • Joined in 2006
  • Posts: 3,629
« Reply #18 on: June 07, 2015, 09:38 AM »
I think I like it, especially for (regex) n00bs it'll be very helpful, as intended. :up:

A few remarks:
  • Resizing the window isn't very polished
     [Image: Screenshot - 7-06-2015 , 16_29_59.png]
  • The window is quite big, it may not fit correctly on smaller screen resolutions (and then the first item comes in)
  • Tooltip time is rather short at its default setting
  • You might want to add the regex code inserted by a button into the tooltip, for the more technical/interested users

Twinbee

  • Member
  • Joined in 2012
  • Posts: 84
« Reply #19 on: June 07, 2015, 10:52 AM »
Thanks and glad your initial impression was favourable!

Quote from: Ath
Resizing the window isn't very polished

It was either that, or no resizing. A workaround may be tricky. I mean, I could put it into a scrollable pane I guess...

Quote from: Ath
The window is quite big, it may not fit correctly on smaller screen resolutions (and then the first item comes in)

The width is 1185 pixels. I doubt many are using less than around 1300 pixels of width these days. Still, if someone on this forum is, I'll eat my hat, and try and condense it.

Quote from: Ath
Tooltip time is rather short at its default setting

Ah, the developers of .NET would like to talk to you. Anyway, yes I agree entirely, and a (somewhat tricky) fix is forthcoming. In the meantime, however, all the info from the tooltips is available in the main help section.

Quote from: Ath
You might want to add the regex code inserted by a button into the tooltip, for the more technical/interested users

Not a bad idea - I'll consider it.

wraith808

  • Supporting Member
  • Joined in 2006
  • Posts: 11,190
« Reply #20 on: June 07, 2015, 11:39 AM »
I must say, I still don't get it.  Those symbols are not regular symbols that one would use on the keyboard, so what is it for?  Is it just a regex builder in the end?

Ath

  • Supporting Member
  • Joined in 2006
  • Posts: 3,629
« Reply #21 on: June 07, 2015, 12:05 PM »
Quote from: Twinbee
Quote from: Ath
Tooltip time is rather short at its default setting
Ah, the developers of .NET would like to talk to you. Anyway, yes I agree entirely, and a (somewhat tricky) fix is forthcoming. In the meantime, however, all the info from the tooltips is available in the main help section.

That is an application-wide setting, but I couldn't find or recall from memory the exact attribute to set :-[, as most of my code isn't on the laptop I was typing my remark on.

Twinbee

  • Member
  • Joined in 2012
  • Posts: 84
« Reply #22 on: June 07, 2015, 01:17 PM »
Quote from: wraith808
I must say, I still don't get it.  Those symbols are not regular symbols that one would use on the keyboard, so what is it for?  Is it just a regex builder in the end?

Thanks for the question! I included the special symbols for a few reasons:

1: Each special WildGem symbol can represent two, three or even 10+ regex symbols. I don't want to use the original symbols, as they already have meaning. Something like "the ✪ quick" is nicer than "the .*? quick" ;P

2: They are easier to distinguish with the eye than the standard regex symbols, as they don't look like 'normal text'. Fewer escapes make the expression clearer still. This helps especially for more complicated expressions.

3: You can paste text directly into the Find expression without having to worry about escaping any of the characters in said text. The WildGem symbols I use are very rare, and hardly anyone would otherwise use them in day to day work.

Anyway, even if you never use the special symbols, and only use the regex filter, I hope you'll find it's a pretty nifty util anyway. Apart from regexr.com, I don't think anything else has quite this layout where everything is accessible, updates on the fly, and highlights all regex matches at once in the main text. I'd be interested to see other programs that can do that if you can find one.
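
Highlighting every match at once comes down to collecting the offsets of all non-overlapping matches. A minimal Python sketch of that step follows; the helper names `match_spans` and `mark_matches` are made up for the example, and the square brackets stand in for whatever colouring a real editor component would apply.

```python
import re

def match_spans(pattern, text):
    """Return (start, end) offsets for every non-overlapping match."""
    return [m.span() for m in re.finditer(pattern, text)]

def mark_matches(pattern, text):
    """Wrap each match in brackets as a plain-text stand-in for highlighting."""
    return re.sub(pattern, lambda m: "[" + m.group(0) + "]", text)

text = "cat hat bat"
spans = match_spans(r"[a-z]at", text)
marked = mark_matches(r"[a-z]at", text)
print(spans)   # [(0, 3), (4, 7), (8, 11)]
print(marked)  # [cat] [hat] [bat]
```

Re-running this on every keystroke is the simplest way to get on-the-fly updates, although it may become slow on very large texts.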

@Ath: Width now reduced to a more compact 1013 pixels instead of 1185 !
« Last Edit: June 07, 2015, 01:53 PM by Twinbee »

cranioscopical

  • Friend of the Site
  • Supporting Member
  • Joined in 2006
  • Posts: 4,776
« Reply #23 on: June 09, 2015, 08:41 AM »
I don't replace many things (just break 'em). This, however, will be useful for me on the few occasions when I have the need, have forgotten the syntax, and can't find my copy of Searching for Dummies :Thmbsup:

Nice, lucid help, as with OpalCalc  :Thmbsup:  :Thmbsup:
 
Thanks for making it available!
 

Twinbee

  • Member
  • Joined in 2012
  • Posts: 84
« Reply #24 on: June 09, 2015, 05:18 PM »
Thanks for the feedback! The non-beta version will be potentially hundreds of times faster (helpful for 100MB or even 1GB+ files!), and will allow the escape syntax in the replacement section too.

The speed increase is due to the Scintilla text component (the one Notepad++ uses!) - so much better than .NET's crappy RichTextBox. Oh HOW I wish I knew about that when I was developing OpalCalc...
« Last Edit: June 09, 2015, 08:35 PM by Twinbee »