Messages - helmut85

11
Years ago, I had a really good idea, together with some necessary legal and practical considerations, none of which constituted a real obstacle then, nor do they today: you just have to be smart in executing the scheme I advocate.

The legal side is quite simple: you don't have the right to let third parties gain knowledge of your firm's internal details (so they would have the "right" to report you, but would they actually do it? I doubt it, and no harm will be done anyway, except if you try to get work really cheap from somebody without a brilliant future of his own - just don't go for desperate people with lots of time; easy is dangerous). So you must select your contractors with care, but then, it's not a crime to delegate mere work to people you pay for it, contrary to academic work you are supposed to have produced all alone - whilst we all know that most rich people's children have their academic work written by ghostwriters paid by their parents, beginning with their homework in the very first year (since you cannot spend three months on legal homework and cruise the Mediterranean on the family yacht at the same time; everybody will understand this), and right up to their doctoral "dissertation". So...

Today, Der Spiegel published a variant of my original idea, so I think I had better present both, publishing mine for the first time, in order also to retain some moral rights to the latter.

I

In this link, http://www.spiegel.de/netzwelt/web/outsourcing-entwickler-liess-chinesen-seinen-job-erledigen-a-877990.html

they report that a highly paid (well into six figures) programmer / IT executive at a U.S. corporation didn't do anything on his PC at work except surf the web all day long, especially watching cat videos. (Can I blame him? Not really.)

At the same time, somebody in China regularly accessed the corporation's highly secured computer network and did all sorts of things there - nothing harmful, though - gaining access through a reader and a special ID chip card that had been issued to the cat-video viewer.

So eventually an external IT security service provider dug up the facts: the employee in question (since fired) had privately outsourced all of his work to a Chinese IT services company, which did all his work for him from China; he paid them a high five-figure sum, so the scheme was still highly profitable for the man, who was considered the very best IT man within the corporation (I don't know whether this was already so before, when he still did his own work, or whether this favorable assessment was the direct result of his contractors doing their work so well).

Now the question is why - especially if he was that good on his own - this man so entirely avoided doing any of his work, i.e. why he was obviously unable to derive the slightest satisfaction from it.

II

Years ago, my idea was not quite that one. My idea was: why wouldn't a young executive (or a programmer, as we have here, but at the time I had executives or young lawyers in mind) outsource parts of his work in order to be considered a very promising young executive? I didn't think of China; in a very conservative way, I thought of brilliant students he'd hire, in law or business administration - the same should be possible in the sciences, etc. - who would work for him, executing parts of the tasks he himself had been assigned. I mused that he might be paid 10k a month, 6.5k net, and why not, instead of wasting his money on travel, furniture and cars, give 1.5k each, net, to two young students working for him on weekends, and so on.

This way, our young executive would be able - not to watch cat videos instead, but - to deliver almost twice what he was expected to produce in his office, and I thought such an investment on his part could be extremely beneficial for his career. I told myself that there are employees with higher IQs who work much faster than their colleagues, and they will "make it". OK, there's also, very importantly, that "way with people", called "emotional intelligence / EQ" today, and for that career aspect some "private collaborators" will not help; but I also thought that with equal EQ, equal IQ, and equal work measured by time (say 40, 45, 50, 55 hours a week), PLUS two student collaborators, our young executive should very quickly have a career that paid back tenfold the 3,000 bucks a month he spent out of his income during the very first years of his career.

I was aware that this could only work if his superiors believed it was he who produced this exceptional workload (with correct results, of course); I was also aware that the "alternative" - simply working more, individually - wouldn't work out, since people smarter than him could, and would, work longer hours too, and thus the lead they had on him would grow even bigger. Secret delegation AND hard work seemed a viable solution, though, especially since there would certainly be some intimidation effect on his peers, and even on colleagues smarter than him, erroneously assuming that his was the superior intelligence.

I was also aware of the risks of such a scheme. First, our man should be smart enough not to appear outright stupid in the office or law firm; it's fine if his peers and superiors are astonished by what he's capable of, but they shouldn't be outright incredulous at what he delivers because it's too inconsistent with what he can do orally.

Then, if our man delegated work to inferior students who wouldn't make it on their own, there was a risk of extortion: "Give me a bigger share, or your superiors will learn who really did the work." Brilliant students, on the other hand, would never do this, since such a move would put their own career at risk. Of course, brilliant students don't have that much time, hence my idea not to rely on just one such student but to take two (or even three - it depends on your risk perception: with 6,500 net each month, you could finance four such students and live yourself on 500 bucks a month, with your own income exploding, after two years, to 500,000 bucks a year; a very risk-averse person would take just one such student, still have 5,000 bucks a month for his own expenses, and raise his income by perhaps just 20 or 25 percent - but could then hire a second private contractor).

I was also aware that our man should delegate with caution and do the really difficult parts himself, without, on the other hand, relegating his private staff to menial tasks only - and he should check their work, perhaps with some cross-checking too: student 1 reviewing tasks executed by student 2, and vice versa.

Also, I thought all this should be organized in a very private way, our man taking work out of the office in order to "work on it at home" and then passing parts of it to his contractors; i.e. I was aware he'd be ill-advised to have any phone or mail contact with them in the office or via his office PC, and as for giving them access to the corporate network, I didn't even consider such an outrage. And of course, to maintain that discretion, our man would have to collect the necessary data himself, within the office, whenever it was available only there, e.g. specialised / too expensive databases available in the office / corporation but not at the university (this situation is much better for students today, who can search for such data at the university itself; in fact, today it's probably the other way round, the students having access to data he himself won't have access to - which in some cases might even raise the problem: "But this is brilliant! Where did you find it?").

In countries with tax secrecy, you could try to deduct these expenses from your own income, the interest here lying in your collaborators' lower tax rate. On the other hand, this will complicate things for them (their parents deducting them from their income, social security, your paying them more because of their tax / social security expenses), so that sometimes it can be preferable to just pay them net, and too bad about your higher tax rate (I know that in some countries this then creates the legal problem of "illegal employment" or the like, but after all, you don't really employ them). See your tax advisor if in doubt, but in countries like Sweden, for example, your superiors could ask you: "Why do you declare an income of just 2,000 bucks when we pay you 6,000 net?!" So my advice is: beware of unnecessary complications, and don't be too stingy-smart-alecky here; the same applies to how you treat your subcontractors.

So this is my idea from some years ago, and I think it still holds up. The core element is: don't tell your superiors you have contractors. It's not your investment of 5,000 bucks out of the 6,500 you get from the corporation that will make them promote you in an exceptional way; only their misconception that you're incredibly gifted will make your fortune.

Later on, they will assign you so many in-house collaborators that nobody will ever discover the little secret of your early years, provided you continue to delegate in a smart way.

12
I

1. Just today, the community received news of its loss, two days ago, of a prominent data-availability activist, Aaron Swartz. Interestingly, the criminal prosecution authorities seem to have been far more motivated to treat him as a major criminal than even his alleged victim, MIT, was. (Edit: Well, that doesn't seem to be entirely true: JStor is said not to have insisted on his being prosecuted, but M.I.T. wanted him to be made to "pay" - so without them, he'd probably still be alive. And it's ironic that M.I.T., itself a VICTIM of this "resell them their own academic papers, at outrageous prices" scheme, made itself Aaron's prosecutor, instead of saying: we're not happy about what he tried to do, but he tried it for all of us. M.I.T. as another Fifth Column representative, see below.) So there is the cloud for paid content, and getting access to it without paying, big style, then perhaps even re-uploading it, gets you 20 or 30 years in jail if the prosecutors have their way, and that's where this is going (cf. Albert Gonzalez).

(Edit: First he made available about 20 percent of an "antecedent rulings" db, absolutely needed for preparing lawsuits in a common-law legal system based on the rulings in previous, similar cases (they charge 8 cents a page, which doesn't seem that much, but expect to download - and pay for - perhaps 3,000 pages in order to get the 20 or 30 that will be of relevance); then he tried to download academic journal articles from JStor, the irony here being that the academic world is paid by the general public (and not by academic publishers or JStor) for writing these articles, and then pays high prices for them via the universities, so again the general public pays at the end of the day (university staff and overall costs being a public service on the Continent anyway, whilst in the U.K. and the U.S. it's the students who finance all this, and who then charge 500 bucks an hour in order to recoup that investment afterwards). The prosecutor asked for 35 years of imprisonment, so Swartz would even have had to be called "lucky" had the sentence stayed under 25 years. (From a competitor of JStor, I was just "offered" a 3-page askSam review from 1992 or so, as a PDF, for 25 euros plus VAT, if I remember correctly.))

(Edit - Sideline: There is not only the immoral aspect of making the general public pay a second time for material it already owns, morally and by having financed it to begin with; there is also a very ironic accessibility problem that is becoming more and more virulent. Whereas universities used to make academic journals available in their reading rooms not only to their staff and students but also to (mildly) paying academics from outside, today's electronic-only papers - instead of now being ubiquitously accessible - are, at most universities, not even available to such third parties anymore, or not even excerpts of those texts can be copied and pasted by them. So in 2013, non-university academics sit in front of screens and are lucky if they can scribble down the most needed excerpts from the screen by hand. The electronic "revolution" thus makes more and more people long for the Seventies' university revolution, the advent of the photocopier - which for new material, in most cases, isn't available anymore. Thanks to the greed of traders like JStor et al., we're back to handwriting, or there is no access at all, or else it's 40 bucks for 3 pages.)

2. On the other hand, there's the cloud as a storage repository, for individuals as for corporations. Now, this is not my personal assertion but common sense here in Europe, i.e. the (mainstream) press here regularly publishes articles agreeing that the U.S. NSA (Edit: here and in what follows it's NSA, not NAS, of course) takes its regular (and, under U.S. law, legal) look into any cloud material stored on U.S. servers - hence the press's warning that European corporations should at least choose European servers for their data, whilst of course most such offerings come from the U.S. (Edit: And yes, you could consider developers of cloud sw and / or storage as a sort of Fifth Column, i.e. people who get us to hand our data over to the enemy, who should be the common enemy.)

3. Then there is encryption, of course (cf. 4), but the experts / journalists agree that most encryption does not constitute any prob for the NSA - very high-level encryption probably would, but it's not regularly used for cloud applications - so they assume that most data finally reaches the NSA in readable form. There are reports - or are they speculations? - that the NSA provides big U.S. companies with data coming from European corporations, in order to help them save on research and development costs. And it seems that even corporations that take care to apply rather good encryption to their data-in-files don't apply the same security standards to their e-mails, so in the end a lot of data is available to the NSA. (Just a few days ago there was another big article on this in Der Spiegel, Europe's biggest news magazine, but it was only one in a long succession of such articles.) (Edit: This time it's the European Parliament (!) that warns: http://www.spiegel.de/netzwelt/netzpolitik/cloud-computing-eu-bericht-warnt-vor-ueberwachung-durch-die-usa-a-876789.html - of course, it's debatable whether anybody should then trust European authorities more, but it's undebatable that U.S. law / jurisdiction grants patents to whoever comes first with the money to patent almost anything, regardless of any previous existence of the - stolen - ideas behind the patent; i.e. even if you can prove you've been using something for years, the patent goes to the idea-stealing corporation that offers the money to the patent office, and henceforth you'll pay for further use of your own ideas and procedures, cf. the Edit under number 1 - this for those who might eagerly assume that "whoever has nothing to hide shouldn't worry".)
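To make point 3 a little more concrete: "very high-level encryption" here simply means encrypting files locally, with a key that never leaves your machine, before anything lands in a synced cloud folder. A minimal sketch, assuming Python with the third-party cryptography package (the file names and folder are made up for illustration):

```python
# Client-side encryption sketch: only the encrypted token ever reaches the cloud folder.
from cryptography.fernet import Fernet

key = Fernet.generate_key()              # keep this key offline - never in the synced folder
cipher = Fernet(key)

with open("quarterly_figures.xlsx", "rb") as fh:
    token = cipher.encrypt(fh.read())    # this is all the sync client ever gets to see

with open(r"C:\CloudSync\quarterly_figures.xlsx.enc", "wb") as fh:
    fh.write(token)

# Reading it back later, on any machine that holds the key:
with open(r"C:\CloudSync\quarterly_figures.xlsx.enc", "rb") as fh:
    plaintext = cipher.decrypt(fh.read())
```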

4. It goes without saying that those who say "if you use such cloud services, at least use European servers" get asked, first, what about European secret services then doing similar scraping, perhaps even for non-European countries (meaning from GB, etc., straight to the U.S. again), and second, in some European countries it's now ILLEGAL to encrypt data, which makes for a wonderful world for such secret services: either they get your data in full, or they can even criminalize you or the responsible staff in your corporation. (Edit: France's legislation seems to have been somewhat lightened instead of being tightened further, as had been intended by 2011. Cf. http://rechten.uvt.nl/koops/cryptolaw/cls2.htm#fr )

5. Then there are accessibility probs, attenuated by multi-storage measures, and the prob of the provider closing down the storage, by going bankrupt or out of plain commercial evil: it seems there are people out there who definitively lost data with some Apple cloud services. (Other Apple users seem to have lost bits of songs bought from Apple, i.e. Apple, after the sale, seems to censor unwanted wording within such songs - I cannot say for sure, but I read some magazine articles about such proceedings of theirs. Of course this has only picturesque value in comparison with "real data", hence the parentheses, but it seems to show that "they" believe themselves to be the masters of your data, big-style AND for the little, irrelevant things - it seems to indicate their philosophy.)

(Edit: Another irony here: our own data is generally deemed worthless, both by "them" and by some users (a fellow here, just weeks ago: "It's the junk collecting hobby of personal data."), whilst anything you want or need access to (even a 20-year-old article on AS), deemed "their data", is considered pure gold, 3 pages for 40 bucks - so not only do they sell the general public its own property instead of just making it available, but on top of that those prices are incredibly inflated.

But here's a real gem. Some of you will have heard of the late French film auteur Eric Rohmer, perhaps in connection with his most prominent film, Pauline at the Beach. In 1987 he made an episodic film, 4 aventures de Reinette et Mirabelle, of which I recommend the fourth and last episode; on YT it's in three parts, in atrocious quality but with English subtitles - just search for "Eric Rohmer Selling the Painting". It's a masterpiece of French comedy, and do not miss the very last line! (For people unwilling to watch even a few minutes of any French film: you would have learned here the perfect relativity of the term "value" - it's all about who owns the object in question at any given moment.) If you like the actor, you might want to meet him again in the 1990 masterpiece La discrète. (There's a Washington Post review in case you might want to countercheck my opinion first. And yes, of course there's some echo of the very first part of Kierkegaard's Enten-Eller to be found here.)...)

II

6. There is the collaboration argument, and there is access-your-data-from-everywhere, without juggling USB sticks, external hard disks and applics like GoodSync Portable; and - I'm trying to be objective - there is a data loss problem, and thus a what-degree-of-encryption-is-needed prob too: your notebook can be lost or stolen, and the same goes for those external storage devices. But let's assume the "finder" / thief here will not be the NSA and, in most cases, not even your competitor, but just some anonymous person who will dump your data at least when it's not immediately readable, i.e. here, except in special cases, even rudimentary encryption will do.

7. I understand both arguments under 6, and I acknowledge that cloud services offer much better solutions for both tasks than you can obtain without them. On the other hand, have a look at Mindjet (ex-MindManager): it seems to me that even within a traditional workgroup, i.e. collaborators physically present in the same office, perhaps in the same room, collaboration is mandatorily done via cloud services and can't be done with the local workgroup means / cables alone - if this is correct (I'm not certain here), it is overly ridiculous or worse, highly manipulative on the part of the supplier.

8. Whenever traditional desktop applications "go cloud", they tend to lose much of their previous functionality in the process (Evercrap is only ONE very prominent example; there are hundreds), with arguments like "we like to keep it simple" and similarly idiotic excuses; and even when there's a highly professional developer, as in Neville's case here, it seems that the programming effort for the cloud functionality at least heavily slows down any traditional "enhancement" programming, or even the mere transposition of the functionality there has been - of course, how much transposition is needed depends on how much cloud there will be in that particular sw's future. As a general rule, though, users of traditional sw going cloud lose a lot of functionality and / or have to wait years for their sw to recover afterwards from this more or less complete stalling of non-cloud functionality. (Hence the "originality" of Adobe's CS "cloud" philosophy, where the complete desktop functionality is preserved, the program continuing to work desktop-based, with only the subscription mechanism (plus, hopefully, some additional collaborative features?) handled cloudwise.)

III

9. Surfulater is one of the two widely known "site-dumping" specialist applics out there, together with the German WebResearch; the latter is reputed to be "better" in the sense that it's even more faithful to the original for many stored pages, i.e. "difficult" pages are rendered better, and in the sense that it's quicker (perhaps especially with such "difficult" pages), whilst the former is reputed to be more user-friendly in the everyday handling of the program: sorting, accessing, searching... whatever. I don't have real experience (i.e. beyond a short trial) with either program, so the term here is "reputation", not "the facts are...". It seems to be common knowledge, though, that both progs do this web page dumping much better than even the heavyweights of the traditional pim world, like Ultra Recall, MyInfo, MyBase, etc.

10. Whilst very few people use WebResearch as a pim (but there are some), many people use Surfulater as a general pim - and even more people complain about regular pims not being as good at web page dumps as these two specialists are. For Surfulater, there's been that extensive discussion on the developer's site mentioned above, and it seems it's especially those people who use the prog as a general pim who are most affected by their pim threatening to go cloud, more or less, since it's they who would suffer most from the losing-functionality-by-going-cloud phenomenon described in number 8. Neville seems to reassure them that data will be available locally as well as via the cloud, which is very much OK. But then, even Surfulater as it is today is missing a lot of functionality that would make it a real competitor within the regular pim range, and you'll be safe in betting on those missing features not being added at high speed any time soon: the second number 8 phenomenon (= stalling, if not losing).

11. So my personal opinion on Surfulater and WebResearch is: why not have a traditional pim, with most web content in streamlined form? That is, no more systematic dumping of web pages into your pim or into these two specialist tools, but selecting the relevant text, together with the URL and a download date/time "stamp", and pasting these into your pim as plain text which you then format according to your needs - meaning that right after pasting, you bold those passages that motivated you to download the text in the first place. This way, instead of having your pim collect an incredible amount of mostly crap data over the years, you build yourself a valid repository of neat data really relevant to your tasks. In other words, you do a first focussing / condensing of the data right on import.

12. If your spontaneous reaction to this suggestion is "but I don't have time to do this", ask yourself whether you've mostly been collecting crap so far: if you don't have 45 or 70 seconds for bolding the passages that make the data relevant to you (the pasting of all this together should take some 3 seconds with an AHK macro if really needed, or better, via an internal function of your pim, which could even present you with a pre-filled entry dialog to properly name your new item), you probably shouldn't dump this content into your pim in the first place. Btw, AHK even allows for dumping pictures (photos, graphics) from the web site to your pim with a single key (i.e. your mouse cursor would be anywhere in the picture, and then it'd be one key, e.g. a key combination assigned to a mouse button), and eventually your pim should do the same. Of course, you should dump as few such pictures into your pim as absolutely necessary, but technically (and I know that in special cases this would be almost indispensable, but in special cases only) it's possible to have another AHK macro for this, and your pim could also easily implement such functionality.
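Just to show how little is needed for that 3-second step: the macro language I have in mind is AHK, but here is a rough sketch of the same idea in Python (an illustration only), assuming the pyperclip and pyautogui packages and a browser where Ctrl+L focuses the address bar; wiring it to a hotkey is left out.

```python
# Sketch: copy the selected passage, grab the page URL, append a timestamp,
# and put the whole block back on the clipboard, ready to paste into the pim.
import datetime
import time

import pyautogui
import pyperclip


def copy_selection_with_source() -> str:
    pyautogui.hotkey("ctrl", "c")        # copy the selected text in the browser
    time.sleep(0.2)                      # give the clipboard a moment
    selection = pyperclip.paste()

    pyautogui.hotkey("ctrl", "l")        # focus the address bar
    pyautogui.hotkey("ctrl", "c")        # copy the URL
    time.sleep(0.2)
    url = pyperclip.paste()
    pyautogui.press("esc")               # leave the address bar again

    stamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M")
    note = f"{selection}\n\nSource: {url}\nSaved: {stamp}"
    pyperclip.copy(note)                 # now Ctrl+V into the pim item, then bold what matters
    return note
```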

13. This resets these two specialists to their specialist role: dumping complete web pages in those rare cases where this might be necessary, e.g. for mathematicians and the like who regularly download web pages with lots of formulas, i.e. text with multiple pictures spread all over it - but then, there are fewer and fewer such web pages today, since for such content most of them now link to PDFs instead, and of course your pim should be able to index PDFs you link to from within it (i.e. it should not force you to "embed" / "import" them to this end). Also, there should be a function (if absent from your pim, then by AHK) that downloads the PDF and then links to it from within your pim, one-key style, sparing you the effort of first downloading the PDF and then doing the linking / indexing within your pim, which is not only an unnecessary step but will also create "orphaned" PDFs that aren't referenced anywhere in your pim.
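Along the same lines, the one-step "download the PDF, then hand your pim the link" idea could look like the sketch below - the storage folder is made up, only the standard library plus pyperclip is assumed, and how the path then becomes a link inside the pim of course depends entirely on the pim in question.

```python
# Sketch: fetch a PDF the page links to, store it in a fixed attachments folder,
# and put the local path on the clipboard so the pim item can reference (and index) it.
import pathlib
import urllib.request

import pyperclip


def fetch_pdf_for_pim(pdf_url: str, store_dir: str = r"C:\pim\attachments") -> str:
    target = pathlib.Path(store_dir) / pdf_url.rsplit("/", 1)[-1]
    target.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(pdf_url, str(target))   # download the file
    pyperclip.copy(str(target))                        # paste this path into the pim as the link
    return str(target)
```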

IV

14. This means we need better pims / personal and small-workgroup IMS (information management systems), but not in the sense of "do better web page import", rather in the sense of "enhance the overall IM functionality, incl. PM and incl. half-automated import of web page CONTENT (incl. PDF processing)". Please note here that while a "better" import of web-pages-as-a-whole is an endless tilting at windmills that blocks the programming capacity of every pim developer to an incredible degree - and worse today than it ever did - such half-automation of content dumping / processing is extremely simple to implement on the technical level. This way, pim developers would no longer be blocked by these never-ending demands (never-ending almost independently of their respective efforts to fulfill them) to reproduce imported web pages better (and to import them faster), but could resume their initial task, which is to conceive and code the very best IMS possible.

15. Thus, all these concerns about Surfulater "going cloud", and then how much, are of the highest academic interest, but of academic interest only. In your everyday life you should a) adopt a better pim than Surfulater is (as a pim), and enhance core data import (and processing) there by AHK, and then b) ask for the integration of such a feature into the pim in question, in order to make this core data processing smoother. Surfulater and WebResearch have their place in some special workflows, but in those only, and for most users it's certainly not a good idea to build up web page collections, be it within their pim, or be it presumably better-shaped collections within specialized web page dumpers like Surfulater and WebResearch, whose role should be confined to special needs.

V

16. And of course, with all these (not propaganda-only, but then, valid) arguments about the cloud for collaboration and easy access (point 6), there's always the aspect that "going cloud" (be it a little bit, be it completely) enables the developer to introduce, and enforce, the subscription scheme he yearns for so much, far better than would ever be possible with any desktop application, whether it offers some sync feature on top of being desktop or not. Have a look at yesterday's and today's Ultra Recall offer on bits: even for normal updates they now have to go the bits way, since there's not enough new functionality even in a major upgrade, so loyal, long-term users of that prog mostly refuse to update at half price (UR Prof 99 bucks, update 49 bucks, of which, after the payment processor's share, about 48.50 bucks should go to the developer), and so at least some of them "upgrade" via bits, meaning a Prof version starts at 39 bucks, of which 19.50 go to bits and 50 cents or so to the payment processor, leaving about 19.00 bucks for the developer (Edit: prices corrected). It's evident that with proper development of the core functionality (and without having to cope with constant complaints about the web page dump quality of his prog), the developer could easily get 50 bucks for major upgrades of his application, instead of dumping them for 19 bucks - and that would mean much more money for the developer, hence much better development quality, meaning more sales / returns, hence even better development...

17. As you see here, it's easy to get into a downward spiral, but it's also easy to create a quality spiral upwards: it's all just a question of properly conceiving your policy. And on the users' side, a rethinking of traditional web page dumps is needed, imo. It then becomes a faux prob to muse about how to integrate a specialist dumper into your workflow: rethink your workflow instead. Three integral dumps a year don't ask for integration, but for a link to an .mht file.

18. And yes, I get the irony of downloading only to then upload again, but I see that it's about preserving the current state, while the dumped page may change its content or even go offline. But this aspect, as against the policy of "just dump the content of real interest, make a first selection of what you'll really need here", could in most conceivable cases only prevail for legal reasons, and in those special cases neither Surfulater nor WebResearch is the tool you'd need.

VI

19. As said, the question of "how Neville will do it", i.e. the distribution between desktop (data, data processing) and cloud (data, data processing again), and all the interaction needed and / or provided, will be of high academic interest, since he's out to do something special and particular. But then, there's a real, big prob: we get non-standardization here, again. Remember DOS printer drivers, just one everyday example of the annoyances of non-standardization? Remember those claims, for this prog and that, "it's HP xyz compliant"? Then came Windows, an incredible relief from all these pains. On the other hand, this buried some fine DOS progs, since soon there were no more drivers for then-current printers, among other probs; just one example is Framework, a sw masterpiece by Robert Carr et al. (the irony here being that Carr is in cloud services today).

20. Now, with the introduction of proprietary cloud functionality, different for each of the many applications going cloud today, we're served pre-Windows incompatibility chaos again, instead of being provided with more and more integration of our different applications (and in a traditional workgroup you at least had common directories for linked and referenced shared files). That's "good" (short-term only, of course) for the respective developers, in view of the subscription advantage for them (point 16), but it's very bad for the speed at which really integrated workflows get put in place, for all of us. In other words, instead of soon being provided with a much better framework for our multiplied and necessarily more and more interwoven tasks (incl. collaboration and access-from-everywhere, but not particular to this applic or that) - for which, from a technical pov, the time is ripe - we have to cope with increasing fragmentation into whatever particular offerings developers in search of a steady flow of income (cf. the counter-example in point 16, and then cf. Mindjet) think are "beneficial" for us.

All the more reason to discard any such proprietary "solution" from your workflow whenever it's not necessarily an integral part of it. Don't let them make you use 5 "collaborative" applications in parallel just because there are 5 developers in need of your subscription fee.

13
I don't want my analysis of what Surfulater and other such offerings will probably bring us to be buried within a thread titled "Surfulator", so I'm opening a new one.

14
rjbull, thanks for the warm re-welcome!

I never bought dtSearch, but it's decidedly the best of those search tools: it's all about finding things or not, in proprietary file formats, and especially with accented characters like ü, é and ù; here dtSearch excels whilst Copernic and X1 are very bad (for standard file formats, X1 seems first-rate, though). (As for Archivarius, I know many people are fond of it; in my special case it didn't work well, then crashed...) - As said here or elsewhere, the problem with external search tools is that you then have to go back to your "db" / "pim" / text program, etc., and do another, now more specific, search in order to get to the real "hit" within your application (it occurs to me at this moment that some search tools might let you "jump" straight to that hit from a mouse click in their hit table when it's a standard prog like Word - but forget this for more exotic file formats).

Both Ultra Recall and MyInfo allow for Boolean search, as does IQ and only SOME other pims: I remember one that had it, but without "NOT", and no hit table either; another had the hit table but no Boolean search, and so on - so it's no wonder that many people use UR or MI in spite of all the respective problems each of them otherwise causes.

I once stumbled upon DT/TextWorks and would be willing to pay 1,200 bucks for a prog that really "has it all", but I discarded it back then because of their "ask us for a trial (instead of just downloading it) and for a quote (instead of just stating the price)" policy - so I never even got to a screenshot of it, let alone a trial. Then again, it's a db, which means there is no tree superposed on that db, as there is in UR, MI, IQ and others; even the later askSam got trees-on-the-fly (by first line, or by field content - a very smart thing, the only prob being that with five-digit record counts this regularly took minutes or even crashed (they dumped their forum because there was really far too much negative feedback from almost everybody)). As I said today in my KEdit thread, lately it's MI that seems to leave UR trailing - not because MI has suddenly become so good, but because there is steady if slow development there, whilst UR doesn't do much about its roadmap ("not much" being a euphemism for "nothing" here).

On the web, we use Boolean search all the time, in Google (let alone on eBay or specialized sites like the Dialog you mention), and most people do it without even knowing: in Google they enter two or three search terms in order to refine their search from the start - a b is a AND b - people do it intuitively there. (It's only for OR that Google asks for some knowledge, since that needs an explicit operator, far from intuitive, but some of us know it; NOT a is -a, etc., and so it's possible to find things.)

In a non-Boolean pim, by contrast, you CAN'T search for a b: entering a b there would search for the literal string "a b", not for records containing both a and b, so these desktop pims are really three steps behind what we use on the web all the time without even paying attention - the little comparison below illustrates the difference.
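A toy illustration (Python, made-up records), nothing more:

```python
# "a b" as implicit AND over a record (web-search style) vs. the literal-string search
# many desktop pims perform when you type two words into their search box.
def boolean_and(records: dict[str, str], query: str) -> list[str]:
    terms = query.lower().split()
    return [rid for rid, text in records.items()
            if all(t in text.lower() for t in terms)]

def literal_search(records: dict[str, str], query: str) -> list[str]:
    return [rid for rid, text in records.items() if query.lower() in text.lower()]

recs = {"r1": "project notes from Berlin",
        "r2": "Berlin trip photos",
        "r3": "project budget 2013"}

print(boolean_and(recs, "berlin project"))     # ['r1']  - both terms, in any position
print(literal_search(recs, "berlin project"))  # []      - only the exact string would match
```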

And there's another thing: many such "basic" desktop pims don't even let you choose between searching "just the tree" / "text only" / both, but invariably search everywhere - yet it's evident that a search "tree only" will perhaps return 5 hits, from which you choose the right one, whilst the same search "everywhere" will get you 200 "hits" among which you'll then have big problems identifying the one you need, without any possibility of refining your search with a second term that must also be in the same record, since in such progs, as said, a b will not work that way - so you're lucky if you happen to remember a second term that might also be in the record you need, but which is still in 120 of the "hits"... So no discussion here: if a pim only allows for "normal search, and then everywhere", it must be qualified as CRAP, whatever its other qualities might be.

As for NoteFrog, if I understand this prog correctly (without ever having trialled it for more than 2 or 3 minutes or so), it relies exclusively on searching, since there is no tree: on the left, it's the hit table!

Since I used askSam for almost 20 years or so, from early DOS on, and it only got its tree-on-the-fly from version 6, I know both worlds: search-only (but in the spectacular AS way), and trees. And I must say I function with trees: they hold related info together and also offer "a dedicated place" for your info, i.e. within a big tree you remember, more or less (depending also on how well your tree is constructed), "whereabouts it must be", and I rely very heavily on this feature, i.e. I "search" for my content by approaching it physically, by opening up headers, then sub-headers: for me, this is an extremely natural way of getting to info.

On the other hand, my memory for real searching often fails me, and even Boolean search doesn't help too much: I remember one search term - hundreds of hits; I suppose another one should also be in those records (but if I'm mistaken here, I'll inadvertently exclude the very record I'm searching for!), and even with the combination I get too many hits, and then I can't really think of a third search term that might have been in there - but perhaps it isn't? And then there are all those more-or-less-synonyms that no current program treats as equal!

So I must say that with searching, even with good searching, I've got some problems, hence my interest in sophisticated trees. But of course, searching is of the highest interest wherever you have put something OUTSIDE its tree-heading-subheading "way": somewhere else! There, with a hit table and Boolean search, it's 100 times better than with "just normal search, and everywhere": in the first case I sometimes have to spend several minutes on such a search, but I find the thing; with only basic search and 100 hits to then be checked one by one...

But my point here is that even Boolean search isn't good enough; it should include "semantic search", i.e. half-automatic synonym provision. Meaning: you search for dog, and before searching, the program would list breeds, "puppy", "cute", "ferocious", whatever, so that you can decide which of these terms should be searched for (and it's even possible to put some of them into different OR groups).

Couple this with an index, and the prog would only present you with those breeds (in our example) that are actually present somewhere in your texts, and not unnecessarily clutter this first "what to search" table with terms taken from a dictionary but which aren't in your texts!

Then, all this for several languages, and for combinations of languages. And finally, the program could give you hit numbers while you work within the "what to search" window, meaning that your adjusting the search terms there would show you, in real time, how many hits you'd then get.
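To make the suggestion of the last three paragraphs concrete, here's a toy sketch (Python, with a stand-in synonym table - a real implementation would of course use a proper thesaurus per language): expand the query with related terms, but offer only those that actually occur in your own texts, together with live hit counts.

```python
# Index-filtered synonym expansion: candidate terms that don't occur anywhere in the
# records are dropped; the rest are offered with their hit counts before you search.
from collections import defaultdict

SYNONYMS = {"dog": ["puppy", "hound", "terrier", "ferocious"]}   # stand-in thesaurus

def build_index(records: dict[str, str]) -> dict[str, set[str]]:
    index = defaultdict(set)
    for rec_id, text in records.items():
        for word in text.lower().split():
            index[word.strip(".,;:!?\"'")].add(rec_id)
    return index

def expansion_candidates(term: str, index: dict[str, set[str]]) -> dict[str, int]:
    return {t: len(index[t])
            for t in [term] + SYNONYMS.get(term, [])
            if t in index}

records = {"r1": "The puppy slept all day.",
           "r2": "A ferocious dog barked at the mailman.",
           "r3": "Cats only, no dogs here."}

print(expansion_candidates("dog", build_index(records)))
# {'dog': 1, 'puppy': 1, 'ferocious': 1} - "hound" and "terrier" are dropped: not in the texts
```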

I know of at least one very early DOS text db which offered some semantic search (though not in the sophisticated way I describe here) - askSam was a best-selling program then and "killed" it, most people simply buying AS instead. One of the big ironies here: after having gained the really comfortable market position it then held for years, AS was NOT able to implement any semantic search functionality. So today we're worse off than we were 20 years ago, except for visuals: of course Windows ( / Mac ) is pleasant to the eye where DOS quickly becomes unbearable, because visually at least we get so much "more" today.

Thank you very much, also for the free DOS progs link. I've encountered another such link, with defunct sw, Windows and DOS combined, like early WordStar versions, 1-2-3 and such, but wasn't really enthusiastic about those. Your link is for defunct progs that are much more special and much more interesting, incl. Inmagic Plus; citation: "These are not trivial products." - right they are. Will have a good look at this site!

Btw, this semantic search is something Google does for you all the time, without even asking you whether you want it. Hence the interest of having such a system at home, but with you controlling what's found and what's discarded from the results. But no, even those specialised tools, incl. dtSearch, don't do semantic search, let alone let you control it. And this, 35 years after the introduction of personal computing. /rant

15
The intro to the New Yorker piece speaks of "macro-driven" or something - of course, if his special version even gets enhancements from within...

Even with other editors - and I should have mentioned this particular problem of all editors - it's all about "wrapped lines" vs. "long lines", the latter meaning that each paragraph of your text is set as one long line so that you can work on that paragraph as a "line". The problem here is the length of such lines: if you filter long lines by some term, those "hits" will NOT be in the center of your screen, with equal amounts of "context" before and after, but there will be a lot of horizontal scrolling, which is awful. (This horizontal centering of "lines with hits" around the hit is, however, what several dedicated search tools do, as well as some specialized translator tools.)

That's why you "flatten out" your paragraphs within your "word processor", then export, and then, in KEdit or the like, you should apply ("soft") word wrap, and only THEN filter "lines" - which would not constitute paragraphs but more or less arbitrary parts of your paragraphs. I hope KEdit is able to filter such "sub-lines" after doing "soft" word wrap there? (I tried it not for such texts, but for data mining, where it "failed" for me for the above-mentioned reasons.)
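In case it helps to see the two steps side by side, here's a tiny Python sketch of "flatten each paragraph to one long line, then keep only the lines containing a term" - an illustration of the workflow only, not a substitute for doing it in the editor.

```python
# Flatten paragraphs (separated by blank lines) into single long lines, then filter them.
def flatten_paragraphs(text: str) -> list[str]:
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    return [" ".join(line.strip() for line in p.splitlines()) for p in paragraphs]

def filter_lines(lines: list[str], term: str) -> list[str]:
    return [ln for ln in lines if term.lower() in ln.lower()]

sample = "First paragraph,\nwrapped over two lines.\n\nSecond paragraph mentions KEdit\nand line filtering."
for hit in filter_lines(flatten_paragraphs(sample), "kedit"):
    print(hit)   # -> Second paragraph mentions KEdit and line filtering.
```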

THE (= The Hessling Editor) is mentioned by some, but in order to play around with such an editor, KEdit is fine, since there is no trial period here, just the inability to save files longer than a few lines; so you can load files of any size into KEdit and play around with them, and even paste the final results back into any other editor / text program - of course, abusing the trial this way in a systematic fashion, to do real work with it, would be illegal.

As with much other sw, KEdit should have been developed further: all the above-mentioned negative points could and should have been eliminated over the years, incl. the formatting prob. In fact, some time ago I searched very seriously for an .rtf-capable editor, but didn't find any. Only afterwards did I understand the value of HTML export even when you don't bring your text to the web afterwards, HTML export being much more macro- and editor-friendly for further processing (and even for further HTML "upgrading") than .rtf export. So .rtf is a more or less defunct format: much too complicated in practice, and not stable enough - it was only after I had written complete macros to clean up such .rtf exports that I became aware it wasn't even stable enough, let alone all the fuss, whilst HTML is much less chaotic. The complication is an .rtf problem, whilst the lack of stability could of course be the fault of my exporting program.

I could write similar things about the only available Warnier-Orr sw, called b-liner: there, too, we've got tremendously good ideas, but development has ended, with many details never worked out (and, in b-liner's case, bugs that will remain forever); it's a pity that so much sw standing out from the crowd isn't developed further: development stops when the point of no (further financial) "return" for possible further development is reached, and so these programs never attain real maturity.

But xtabber, if you use KEdit on a regular basis, why not share some tips and tricks? Perhaps KEdit's possibilities go further than what I discovered in my playing around with it.

As said, it's a very intriguing concept, and then you realize it can't do all the tasks you thought it could when you first read its description. I'm not calling them liars; it's just that such features trigger some wishful thinking that then isn't fulfilled, because the real sophistication such features are theoretically capable of is never implemented. (And yes, I know that the last 20 or 30 percent of realizing a good idea takes as much work as the previous 70 or 80 percent - but why is it that everywhere we look in sw, we find ourselves with merely "promising" instead of outstanding sw? This applies to every field of sw, even where there's enough money to realize the missing 30 or 20 percent: cf. my rant re MindManager for an example, so this is not a one-developer-house-only phenomenon.)
