Author Topic: Text post processing with KEdit, etc. (Read 5498 times)

helmut85 · « **on:** January 10, 2013, 08:20 AM »

I started this thread as a spin-off from this Smart Edit thread, since KEdit is indeed a very interesting (but not very well-known) thing: https://www.donation...ex.php?topic=33574.0

EDIT: I just found this intro pdf: http://www.wolffinfo...hes%20(DATABASE).pdf - and they ain't called "Russian editors" but "Eastern Orthodox editors"

They don't develop it further and say so - no unicode / UTF8 -, but some weeks ago, a minor update was released (1.61, up from 1.60). It costs 129 bucks (the program, not the update!).

It's the commercial version of the "Russian editor" type (XEdit / Rexx and their emulators), i.e. the one that lets you work on "subsets" of your lines (so you can have the same functionality for free with some competitors, though I never trialled those). It's similar to, but not identical with, a "folding editor", or more precisely, it IS a folding editor, but one which, on top of folding, allows for folding cascades.

I trialled KEdit, was very intrigued by this concept. Let me explain:

- you say you just want all lines with "abc" in them
- then you get a result in which you work as if the whole text consisted of nothing but these lines (so it's different from a search results table as in askSam or in an editor like TSE): it IS a search results table, but you work within the "search results"

- now you say you just want all lines withOUT "xyz" (with the "more" command, cf. the "less" command below)
- so you'll get a SUBSET of the previous thing, i.e. since you didn't revert to "whole text body" before, you're now with the subset "+abc -xyz"

- you can do this in cascading style many times (don't know if there is a limit to this cascade)
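To make the idea concrete, here is a minimal Python sketch of such a cascade. It is not KEdit's implementation and not KEXX; the class name `LineView` and its methods are made up for illustration. Each step narrows the previous view, and an edit made in the view is written through to the full text.

```python
class LineView:
    """A stack of filtered views over one list of lines (hypothetical helper)."""

    def __init__(self, lines):
        self.lines = list(lines)
        # Each stack entry is (label, indices into self.lines).
        self.stack = [("all", list(range(len(self.lines))))]

    def keep(self, text):
        """Narrow the current view to lines containing `text`."""
        current = self.stack[-1][1]
        kept = [i for i in current if text in self.lines[i]]
        self.stack.append(("+" + text, kept))

    def drop(self, text):
        """Narrow the current view to lines NOT containing `text`."""
        current = self.stack[-1][1]
        kept = [i for i in current if text not in self.lines[i]]
        self.stack.append(("-" + text, kept))

    def back(self, steps=1):
        """Undo the last `steps` narrowing steps (never below "all")."""
        for _ in range(steps):
            if len(self.stack) > 1:
                self.stack.pop()

    def describe(self):
        """Return the cascade so far, e.g. '+abc -xyz'."""
        return " ".join(label for label, _ in self.stack[1:]) or "all"

    def visible(self):
        """Return the lines of the current view."""
        return [self.lines[i] for i in self.stack[-1][1]]

    def edit(self, position, new_text):
        """Replace the line at `position` of the current view in the full text."""
        self.lines[self.stack[-1][1][position]] = new_text


view = LineView(["abc one", "abc xyz", "def", "abc two"])
view.keep("abc")
view.drop("xyz")
print(view.describe(), view.visible())
```

Because every step is kept on a stack in this sketch, `describe()` shows where you are in the cascade and `back()` returns to an earlier step. Whether KEdit keeps anything comparable internally is not something this sketch claims.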

Now for the problems (and that's why I didn't buy it), PLEASE CORRECT ME IF I'M WRONG:

1) You can't easily get back in this cascade: first result, "all abc lines", second result, "all abc lines without xyz in them" - now, in order to go back to "all abc lines", you have to revert to "all lines", then you opt anew for "all abc lines", or you do the "less" command.

Of course, in this example, this is not a real problem, but then, if within a cascade of 5 or 6 such subsequent subsets, you want to go back from step 6 to step 4 or step 3 (e.g. for then doing different subsets from that point on): helluva!

2) Similarly, within such a cascade of refining your "search" / "selection", even without going back, you'll get lost, early on, i.e. with KEdit, you always will have a legal pad near your keyboard, on which you'll write down your refining cascade by hand - that's really a pain in the youknowwhere. Ok, some people might be able to memorize their 5 or 6 consecutive steps here; I get lost in step 3, or even in step 2 if my multitasking capabilities had been in demand in-between.

3) You can only do one subset at a time, meaning you cannot enter "(+abc -cde) / (OR) ( (+ ijk) OR (+lmn) )" or such, whereas you can do exactly this in a prog like askSam.

Such a feature would have resolved both the above-mentioned problems 1 and 2, since you'd have your search code within the command line, and you'd make adjustments to your "search" command there: that would be good both for "going back" and for "remembering the current state". Cf. the askSam search line, which works exactly this way.

In fact, you CAN do it by command line, in KEdit, in the style, e.g.

"ALL ~WORD /abc/"

which means "all lines not containing "abc" as a word" (irrespective of "abc" occurring as part of a word). But you CAN'T COMBINE such commands in your command line, and that's a big problem.

4) As with every "folding editor", you only see the lines in question, whilst for many tasks, you'd need the text beneath those lines. In askSam, e.g., you can put the hit table into one window, select hits there, edit the "lines beneath" (= the respective records) within your main window, switch back to your hit table, select another line / record there, etc. (Of course, you can have your hit table beneath your records, or to the right of them, but in this latest variant, buy a large screen in order to "get" the context of these lines, so a second screen for this is perfect for making heavy use of this feature.)

It's MyInfo 5 that had found a really very clever solution to this problem: MI, in version 5 - well, that was 2 years ago... -, had a unique feature that optionally showed two more subsequent lines (regular style) beneath each search result line (bold). Thinking about it, you'll quickly grasp that this not only made MI 5 rather suitable for programming needs, but especially for client db's and such: Knowing you've got such a fine feature at your fingertips, you'll set up the very first 3 lines of your clients' / prospects' records in a special way, so as to search for content within line 1 of these records, and then have specific content right within your hit table, and not only after going from there into specific records.

So this was a (rather hidden) gem of MI 5 that the developer didn't even advertise, whilst it was unrivalled (is there a similar function in other sw? please tell me!) - and from MI 6, it's absent, the developer saying it will be back some day. (So much for using sw from a 1-developer venture for your business.)

MI has got other details that are much better than the corresponding solution (if there is any) in UR, and people have understood this lately: UR's forum is as dead (if you disregard my posts there) as MI's forum was a year ago: With the absence of UR development, and MI going steady (if slow indeed), more and more people go to MI, and I certainly don't blame them. (UR will - AGAIN - be on bits in some days: it's becoming ridiculous: with a little development on their side, they could easily sell UR at full price instead.)

5) Minor problem, but a very ironic one: For its outstanding "subset of lines" feature, i.e. its "USP", KEdit does NOT have any keyboard shortcut, so you must write a macro that activates the menu. A shortcut would have been really easy to implement, and its absence is ridiculous: Yeah, hide your best feature, make it difficult to reach. (Ok, not really difficult for us macro / AHK users, but you see what I mean.)
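To illustrate what problems 1 to 3 are asking for, here is a minimal Python sketch of combinable line filters. This is not KEdit or askSam syntax; the helper names (`has`, `word`, `not_`, `and_`, `or_`, `select`) are invented for the example. The whole selection lives in one expression, so "going back" just means editing that expression.

```python
import re


def has(text):
    """Match lines containing `text` anywhere, even inside a longer word."""
    return lambda line: text in line


def word(text):
    """Match lines containing `text` as a whole word only."""
    pattern = re.compile(r"\b" + re.escape(text) + r"\b")
    return lambda line: pattern.search(line) is not None


def not_(predicate):
    return lambda line: not predicate(line)


def and_(*predicates):
    return lambda line: all(p(line) for p in predicates)


def or_(*predicates):
    return lambda line: any(p(line) for p in predicates)


def select(lines, predicate):
    """Return the subset of `lines` for which `predicate` is true."""
    return [line for line in lines if predicate(line)]


# (+abc -cde) OR (+ijk) OR (+lmn)
query = or_(and_(has("abc"), not_(has("cde"))), has("ijk"), has("lmn"))
print(select(["abc", "abc cde", "ijk cde", "zzz", "lmn"], query))
```

In this sketch, `not_(word("abc"))` plays roughly the role of the "ALL ~WORD /abc/" command quoted above, with the difference that it can be nested inside a larger expression.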

So where's the real advantage of KEdit? It lies in its facility to directly EDIT these "hits", as said above, and this means it's not so much suited for programming and data mining (whilst that's a little bit its false "promise": seeing that feature, your first idea is to use it in this big way, then you realize it's not possible, for the problems listed above), but for text editing: for writers (!!! hence your find!), for translators, for editors of texts written by others, etc.:

In fact, you search for some search term, or even for some synonyms ("abc OR def OR ghi"...), and then, you edit some of them, right within the "hit table" you'll get, no need for switching back and forth between search results and real text, as in programs of the other kind (AS, TSE, UR (Ultra Recall), MI (MyInfo) and many more). At the end of the day, THAT's the real purpose of this program, and in this task it excels.

Again, if there are errors in my description, please correct me, since that would mean that KEdit's range of usefulness would be far greater than described above.

(If you trial KEdit, know that you can filter the lines in the form hitline, hitline, hitline, or in the form hitline, "x lines not displayed", hitline... - the latter is default, and it makes it ugly (and I don't have the slightest idea in what task this feature might be of use) - but you can do without and then have much cleaner results.)
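The two display forms can be sketched in a few lines of Python. This is only an illustration: the function name `render`, the marker wording, and the `context` parameter are assumptions, not KEdit settings. The `context` option mimics the MI 5 idea from point 4, i.e. showing a few following lines beneath each hit.

```python
def render(lines, keep, shadow=True, context=0):
    """Return the rows to display for the lines where keep(line) is true.

    shadow=True inserts an "x line(s) not displayed" row for each hidden run;
    context=N also shows the N lines following each hit.
    """
    shown = set()
    for i, line in enumerate(lines):
        if keep(line):
            shown.update(range(i, min(len(lines), i + context + 1)))

    rows = []
    hidden = 0
    for i, line in enumerate(lines):
        if i in shown:
            if hidden and shadow:
                rows.append(f"--- {hidden} line(s) not displayed ---")
            hidden = 0
            rows.append(line)
        else:
            hidden += 1
    if hidden and shadow:
        rows.append(f"--- {hidden} line(s) not displayed ---")
    return rows


text = ["abc 1", "x", "y", "abc 2", "z"]
for row in render(text, lambda line: "abc" in line):
    print(row)
```

With `shadow=False` the same call returns only the hit lines, which is the cleaner form mentioned above.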

(You can also do subsets by "selection level", but it doesn't seem to be possible to combine this with the filtering above, so I don't think you'll get specific headers together with their text body, as you (partly, and cleverly) got in MI 5 (point 4 above), and as for "selection level" alone, I didn't find a task in which this feature would make sense for me. Theoretically, the "selection level" command is very clever, but I couldn't find a way to assign a certain selection level to a certain "more" command: you do a "more" command, then you could do a "set selection level" command, but not, it seems, just for these "more" hits / lines. The "selection level" setting would then mix up those "new" lines with the old ones, i.e. assign the same selection level to all of them, which is not what we would want. AND the "selection level" is line-specific, not group-specific, meaning you cannot use it to have "+abc" = sl1, "more def" = sl2, and so on. Such a function could come in handy, but doesn't seem to be there, so in the end, I never understood what "sl's" are meant for.)

(And yes, instead of filtering lines, you can have marked them yellow, i.e. see the hits within the global context.)

Can't read John McPhee's New Yorker 1/2013 article on KEdit, since the link is for subscribers only, and buying the issue here in Europe wouldn't cost me 6,99 but 25 euro or something, magazine prices tripling and quadrupling across the Atlantic - but perhaps he explains even better than I did why, for writers / editors, i.e. certainly for specific POST-PROCESSING tasks on long texts (technical as well as belles lettres), KEdit is indeed a hidden gem.

P.S.: A last hint: Such plain text editors make you lose any previous traditional formatting, which certainly is not acceptable. So there are two solutions: Working with mark-up codes from the beginning, or doing the export to html (from Word / .rtf / any formatted text), then working on this intermediate text from then on. "Post-processing", I said. Btw, it's easy to do a macro that will re-code your ugly html codes "back" into something more pleasant to the eye, e.g.

from <b>bold words</b> or such (sorry this line resolved into bold text here, but you know the <> coding)

to |bold words|| or such,

then work on this text, then have another macro do the back-to-html transposition. The same would apply to traditional publishing: from formatted text to html (since that export is often very stable when export to .rtf isn't necessarily, not to speak of the respective neatness of either's output), then macro translation into "intermediate mark-up" (with which you can live visually), then post-processing, then macro translation to the respective codes for PageMaker, InDesign, Framemaker... But this formatting issue is certainly the reason why goodies like KEdit remain exotic and ain't further developed.
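Such a pair of macros could look like the following Python sketch (Python rather than an editor macro language, purely for illustration). It handles only the bold example above; the function names are made up, and it assumes the text contains no literal "|" characters of its own, which would break the round trip.

```python
def html_to_markup(text):
    """Replace <b>...</b> with the lighter |...|| notation."""
    return text.replace("<b>", "|").replace("</b>", "||")


def markup_to_html(text):
    """Reverse html_to_markup.

    The closing code "||" must be replaced first; otherwise each of its
    two bars would be mistaken for an opening "|".
    """
    return text.replace("||", "</b>").replace("|", "<b>")


original = "some <b>bold words</b> here"
working = html_to_markup(original)
print(working)
print(markup_to_html(working) == original)
```

Further tag pairs (italics, headings) would each need their own unambiguous intermediate code, chosen so that no code is a prefix of another unless the replacement order accounts for it, as with "|" and "||" here.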

xtabber · « **Reply #1 on:** January 10, 2013, 10:48 AM »

Kedit, like XEDIT, the IBM editor on which it is modeled, is best thought of as an editable database of lines of text. This has both strengths and weaknesses as compared to the more common type of text editor, in which the contents of a text buffer are a continuous stream of characters. You can make Kedit do most of what you want using scripts written in KEXX (a REXX look-alike). However, there are certain things you cannot do in this type of editor, like searching for targets that straddle line breaks, which are easy to do in a stream-oriented editor. Of course, there are also many things you can do in Kedit that you cannot do in any stream-oriented editor, which is why some of us continue to use it.
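The line-break limitation is easy to show with a small Python sketch. The two function names are hypothetical, and neither models how Kedit or any particular stream editor searches internally; the point is only the difference between testing each line on its own and searching one continuous buffer.

```python
import re


def find_in_lines(lines, phrase):
    """Line-oriented search: return 1-based numbers of lines containing `phrase`."""
    return [n for n, line in enumerate(lines, start=1) if phrase in line]


def find_in_stream(lines, phrase):
    """Stream-oriented search: join the lines into one buffer and let any
    whitespace (including a line break) separate the words of `phrase`.
    Returns the character offsets of the matches in the joined buffer."""
    text = "\n".join(lines)
    pattern = r"\s+".join(re.escape(w) for w in phrase.split())
    return [m.start() for m in re.finditer(pattern, text)]


lines = ["one foo", "bar two"]
print(find_in_lines(lines, "foo bar"))   # no single line contains the phrase
print(find_in_stream(lines, "foo bar"))  # found across the line break
```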

As I noted, I would not consider using Kedit as a word processor. Nor is it my only editor - I have always used at least one other editor for certain purposes -- currently EditPad Pro and 010 Editor, mostly.

One reason McPhee has always used Kedit is that a friend at Princeton wrote a number of programs (probably in KEXX or REXX) that interact with it to structure his research. He is not concerned with formatting his own writing.

If you want to experiment with this kind of editor, there is an open-source freeware editor based on XEDIT and Kedit, called The Hessling Editor. It seems to have most of the functionality of the DOS version of Kedit, with some additional capabilities. I don't believe it has a native Windows GUI.

helmut85 · « **Reply #2 on:** January 10, 2013, 12:45 PM »

The intro to the New Yorker speaks of "macro-driven" or something - of course, if his special version even gets enhancements from within...

Even with other editors - and I should have mentioned this particular problem of all editors - it's all about "wrap lines" vs. "long lines", the latter being the paragraph of your text set to one long line in order to work on that paragraph, as a "line" - problem here is the length of such lines; if you filter long lines by some term, these "hits" then will NOT be in the center of your screen, with equal amounts of "context" before and behind, but there will then be a lot of horizontal scrolling, which is awful. (But this horizontal centering of "lines with hits", around the hit, is what several dedicated search tools do, as well as some specialized translator tools.)
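The horizontal centering mentioned here is usually called a keyword-in-context display. A minimal Python sketch, with a made-up function name `kwic` and a fixed context width as an assumption, could look like this:

```python
def kwic(line, term, width=20):
    """Return the first occurrence of `term` in `line` with up to `width`
    characters of context on each side, or None if `term` is absent.
    The left context is padded so that hits line up in one column."""
    pos = line.find(term)
    if pos < 0:
        return None
    start = max(0, pos - width)
    end = min(len(line), pos + len(term) + width)
    left = line[start:pos].rjust(width)
    right = line[pos + len(term):end]
    return left + term + right


paragraphs = [
    "the quick brown fox jumps over the lazy dog",
    "brown is a colour",
]
for paragraph in paragraphs:
    print(kwic(paragraph, "brown", width=10))
```

Each long "line" is cut down to a window around the hit, so no horizontal scrolling is needed however long the paragraph is. Only the first hit per line is shown in this sketch.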

That's why you "flatten out" your paragraphs within your "word processor", then export, and then, in KEdit or such, you should do ("soft") word wrap, and ONLY THEN filter "lines", which would then not be paragraphs but more or less arbitrary parts of your paragraphs - I hope that KEdit is able to filter such "sub-lines" after doing "soft" word wrap there? (I tried it not for such texts, but for data mining, where it "failed" for me for the above-mentioned reasons.)

THE (= The Hessling Editor) is mentioned by some, but in order to play around with such an editor, KEdit is fine since there is no trial period here; the trial just won't save files bigger than a few lines, so you can load files of any size into KEdit and then play around with them, and even paste the final results back into any other editor / text program; of course, abusing the trial this way systematically, to do real work with it, would be illegal.

As with many other sw, KEdit should have been further developed: All the above-mentioned negative points could and should have been eliminated over the years, incl. the formatting prob. In fact, some time ago, I searched very seriously for an .rtf-capable editor, but didn't find any. It was only afterwards that I understood the interest of html export even when you do not bring your text to the web afterwards, html export being much more macro- and editor-friendly for further processing (and even for further html-"upgrading") than .rtf export, so .rtf is more or less a defunct format: much too complicated in practice, and not stable enough. It wasn't until after I had written complete macros to clean up such .rtf exports that I became aware that it wasn't even stable enough, let alone all the fuss, whilst html is much less chaotic. The chaos is an .rtf problem, whilst the lack of stability could be the fault of my exporting program, of course.

I could write similar things about the only available Warnier-Orr sw, called b-liner: There also, we've got tremendously good ideas, but end of development, with many details never worked out (and, in b-liner's case, bugs that'll remain forever); it's a pity that so much sw that stands out from the crowd ain't developed further: Development stops when the point of no (further financial) "return" for possible further development is reached, and so they never attain real maturity.

But xtabber, if you use KEdit on a regular basis, why not share some tips and tricks? Perhaps KEdit's possibilities go further than I discovered with my playing around with it.

As said, it's a very intriguing concept, and then you realize it can't do all these tasks you thought it could when you first read its description. I'm not calling them liars, it's just that such features trigger some wishful thinking that then is not fulfilled, because the real sophistication of which such features are theoretically capable is then not implemented. (And yes, I know that the last 20/30 p.c. of the realization of good ideas takes as much work as the realization of the previous 70/80 p.c. - but why is it that everywhere we look in sw, we find ourselves with just "promising" instead of outstanding sw? This applies to every field in sw, and also when there's enough money to realize the missing 30, 20 p.c.: Cf. my rant re MindManager for an example, so this is not a 1-developer-house-only phenomenon.)

x16wda · « **Reply #3 on:** January 11, 2013, 02:14 PM »

I would be remiss if I did not mention SPFLite here, which is basically a Windows version of the old IBM ISPF editor, with some enhancements. I used ISPF for years so I'm used to how it works, and it has some capabilities that are hard to match in other editors. I use SPFLite not often but regularly these days.

And now that someone has mentioned KEXX I will have to do some looking... I have used Rexx (Regina specifically) for lots of scripting tasks that plain batch won't handle and I'm curious how Kexx compares.
