
Messages - peter.s

1
General Software Discussion / Re: OneNote is now free
« on: February 18, 2015, 09:56 AM »
From outlinerimpostors.com: "Posted by Stephen R. Diamond - Feb 16, 2015 at 02:47 AM - They’re being incredibly stingy with their “free” version: They still don’t allow it to open new notebooks except on OneDrive."

Being on XP (on which ON 2013 does not run), I cannot confirm, but this seems to be VERY important information.

I've got ON 2003, which came together with a pc, and ON 2003 is shit; I uninstalled it almost immediately. On the other hand, more recent versions seem to have been MUCH better, at least in some respects: I've been told that ON's capabilities re audio/video and, more important for most users, re text recognition / ocr, are outstanding, or to be more precise, are comparable only with those of EverNote, with which (to a lesser degree in the case of ON) it shares the incredible "flatness" of possible IM.

Now you know there are dozens of traditional, so-called "2-pane" outliners, which all LACK these functionalities, and as for EN, at least, it's known that there are MILLIONS of paying users, ON being a program that will probably be paid for by 95 or more p.c. of its customers by buying some "Office" package of which ON is a part (whilst Outlook is not, indicating that MS seems to have given up trying to sell OL to the non-corporate clientele, since the price for OL alone is outrageous).

On the other hand, all those traditional 2-pane outliners combined are sold perhaps at some 20,000 or 30,000 pieces (make it 50,000 if you want, nobody knows, I'm extrapolating from the reach of their respective fora (or lack of such)), compared with the millions of sw copies MS and EN have brought to people in need of some "outlining" / basic IM for web snippets and all that.

Also, there is the aspect of some traditional outliners, and then ON, too (? - but not EN?), INDEXING EXTERNAL pdf's, Word files and such, which is obviously a VERY important aspect in all this (but which curiously does not bring THAT many customers to traditional outliner makes that offer that feature).

Now again, from the above, I had been musing in the past already to what degree a REALLY FREE ON would have an impact on the outliner market as a whole: Judging from the success of EN, and from the above-mentioned superiority of both EN and ON in some respects over their more traditional competitors, the WORST is to be feared for traditional outliners, which will simply have been LEFT BEHIND by EN / ON, and the fact that neither of those offers "real hierarchy" (i.e. ON's hierarchical storage is awkward and rather flat, whilst any hierarchy in EN is simply nonexistent (from what I've been told)) has not been a real obstacle to either EN (for which that's a proven fact) or ON (where it's always been fascinating to see that MS "has" (???) to give away ON for free, whilst they NEVER give anything away for free whenever they are able to make some bucks with/out of it). (To TRY to explain this a little better, you could assume that EN is consistent in their abolition of hierarchical storage, whereas ON tries to "serve both worlds", i.e. flat AND "deep" information structuring, and is hence (and obviously, cf. their giving it away) much less convincing.)

All this said, there's another factor playing here that cannot be underestimated: Whilst many people now judge ON sort of a "Trojan horse" or at least don't like the idea they don't have their things on their own hdd anymore, EN's success clearly shows that for any man or woman thinking along these lines, there are 200 or 500 letting go of it, and being quite happy with web-based-only/or-primarily* storage...

* = your working files in the web, with perhaps some backup even on your hdd

... which would better explain WHY MS just SEEMS to repeat their previous mistakes: Not making available ON-free-version, even by option, for exclusive-use-on-hdd, does NOT seem to cut them off from that many possible users indeed, in light of the above. (Note that most traditional outliners do NOT offer web / group functionality, which may have been deeply cutting into their sales figures for quite some time now.)

And yes, it's perfectly possible for MS to PUT AN END, ANYTIME, to ON-web being free. I'm not alleging they would then steal your data; in fact they would never do such a primitive thing. But they will get you into some sort of "subscription", even for ON, when the time is ripe for them to do so, be it "just" by linking ON to some "superior-value" subscription, i.e. ON not being contained within the lesser, "basic" ones - as for your data, it will forever stay available in "reading mode", and will even stay "exportable" (in some quite basic formats you'll be certain not to fall in love with, be assured); thus, ON-free users will take out the needed subscriptions in order for ON to stay "actively" available.

Whatever, MS killed quite SOME superior sw makes, or then simply bought them, too (Visio), and traditional outliners seem to be quite doomed indeed... as I have said some time ago over at outlinerimpostors.com... but with MS' latest moves re ON, their willingness to overtake that outline market, too, has become even more evident than it had been before.

It'll be of quite some interest to see if they try to follow EN's concept of being totally flat (before trying to kill EN, that is, and that goes without saying), or if they offer some more depth than they currently do, and then, in which way they will bring that to us (since they are smart enough to have "checked" that their current, hybrid outlining structure isn't that well received by anybody, and harms their overtaking the market).

(Over there at outlinerimpostors.com, they censor any try to look ahead or look beyond the obvious, thus I dare annoy fellow readers here on DC with such musings, but I promise I'll only do it here and there and in the appropriate threads, not willy-nilly. Also, a classic afterwit, but it's really spot-on, onto those and onto such people, see my new motto over here; no pun intended to fellow readers, or only some of them, speaking of a minimal minority here.)

2
4wd, thank you for this clarification. As for memory M, the problem almost exclusively lies within the FF memory block growing and growing, other applications just slowed down a bit (virtual memory on hdd since no more work memory), but it's FF alone that's frozen.

This being said, the Win version COULD play a role in that, nevertheless.

Wasn't aware even YT doesn't necessarily need Flash anymore, thank you! (But YT's Flash is not the problem after all, don't run YT in the background, and even then it's

dantheman, tremendous find, thank you so much, wasn't aware of such a tool, could probably bring LOTS of relief!

This being said, I must stay in a (at least, semi-) "controlled environment", so I just deinstalled Avira and reinstalled Avast, and as for now (always speaking of FF with de-activated Flash), it SEEMS to go much better, but cannot say for real yet. Again, with Avira at least, I've had those problems even with Flash de-activated (but to a lesser degree than with Flash running everywhere).

As soon as I know more about Avast, I will (and only then) try Firemin, too.

3
superboyac,

Thank you for the confirmation. My first "expensive" (as said, 30 euro instead of the usual 5 or 6, not 200) hub was unreliable, BEFORE my stupid continuing by lightning and thunder, and my second one is unreliable, too

(just today, my mouse stopped working, worked again switched to the comp directly, stopped working when switched to the hub, works fine again since switched to the comp now: Now imagine this with usb sticks or worse, with external hdd's...),

but as said, with some lightning-and-thunder experience of my hardware, so I'm not really in my right to blame hardware for any hardware probs I might have.

As for your Datoptic recommendation, Continental Europe is treated like the third world: I often read reviews, on the usual U.S. tech review sites, of hardware which is simply not available in Europe, and which will never even become available in Europe. Several pieces of hardware of big interest to me (= as said, several such occasions, not just one) had fine reviews years ago and are now defunct, i.e. not even available in the States anymore, without ever having been imported into Europe in-between.

This is true for hardware with exotic functionality, i.e. special laptops, or special usb laptop screens (of which only some less interesting makes had made it to the Continent, before becoming unavailable, too), but especially, this is true for accessories of all sorts, and my personal conclusions from what I've seen in this respect is perhaps a little bit out-of-the-way but not entirely illogical:

In fact, I seriously assume that for accessories and such, it's the LESSER-quality things that get imported to Europe by preference (and no Datoptic hub, "of course"). At first sight, this would seem devoid of sense, but it is not: Importers ain't interested in quality, but in profit margin, and so it's quite natural even, on second thought, that crap "imports" much easier than quality stuff:

You've got some quality stuff which would cost the importer 15 euro (incl. VAT on importation; any further margin incl. about 20 p.c. of additional VAT on that margin (that's why it's called "value ADDED tax" as we all know), too). He resells it to some sellers for 25 (= not enough margin if you ask him); they sell it for 35 (= not enough margin if you ask them); of course, for quality stuff, SOME users (me included) would be willing to pay 50 euro, but it's obvious that at 50 euro, sales numbers would fall to perhaps 30 p.c. (or even less) of what they are at 30 euro.

Now for some crap. Cost for the importer, vat included, 8 euro. We poor "end-users" then pay 30 euro plus postage for that crap, and importer and reseller have got their margins... AND their numbers.
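
The arithmetic above can be put into a few lines of Python, as a rough sketch only: it uses the euro figures from the two examples, ignores postage and the VAT on the margin, and the helper name channel_margin is just made up for illustration.

```python
# Rough sketch of the import example above; the figures are the ones from the post.
# channel_margin is a hypothetical helper name, not taken from any library.

def channel_margin(import_cost, retail_price):
    """Euro per unit left for importer and reseller together (VAT on margin not deducted)."""
    return retail_price - import_cost

# Quality stuff: 15 euro for the importer, 35 euro in the shop.
quality = channel_margin(import_cost=15, retail_price=35)

# Crap: 8 euro for the importer, 30 euro in the shop.
crap = channel_margin(import_cost=8, retail_price=30)

print("quality stuff:", quality, "euro per unit at 35 euro retail")
print("crap:", crap, "euro per unit at 30 euro retail")
```

So the crap leaves slightly more per unit for importer and reseller together, at a lower shelf price, which is the whole point about margins AND numbers.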

And this explains a lot, as for what we have to live with, over here. (The irony being that more than just some of this stuff is not even built in the U.S., but in China... but even then, the same rule applies. Cf. Apple products / iPhone and their respective sale prices in the U.S. and in Europe: As said, we're treated like we liked to treat the Third World in the Fifties...)



wraith808 and mouser,

I'm not into unnecessary fights, and it's probably a big misunderstanding, not only about which user's post Ath's post was. Cf. current FF thread.

First, and because of my style, I've got enemies, over here, perhaps less so, but in particular over there, and some users read and/or post here and there. It's not only one user who I could identify, it's also some more user(s) writing here and there, and which I could NOT yet "identify" (= in the sense of knowing this avatar here is that avatar there), and such a situation triggers paranoid (over-) reaction to some point.

Most readers from there will also read here, whilst only a minority of here's readers will also read over there I suppose, that's why I take the liberty to explain in 1, 2 sentences: World-wide, there is nobody who writes about outliner theory as I do, and far from it; I'm not alleging by that that I'm some unique "outliner thinker" or such, but then, other "outliner theoreticians" have ceased to publish their findings or musings years ago, i.e. in some cases, their respective blogs always exist, but with newest entry in 2010 or even former.

Now instead of doing some inspiring discussing over there, on that ONLY available specialized outliner forum, or then, instead of reading-or-not-reading my posts, some fascist assholes (I say it like it is, since their speciality is to acclaim others when they start the stoning-of-the-month, and that's exactly Middle Ages fascism) over there attack me again and again, on a purely meta-communication basis, i.e. never ever some argument re facts/argumentation, but always along the lines of "forum owner, please silence that asshole for good"; it's very intriguing (or how could I say that better?) that some of the users over there claim to have highest-brow professions, e.g. they pretend to be university professors, a profession I hold in high respect, but at the same time, their "contributions" over there are totally devoid of any intelligence whatsoever, i.e. in 5, 6 or 7 years such self-proclaimed "university professors" did not publish a single smart idea - not one, in so many years - over there ("wsp" being a blatant example, among others with albeit lesser pretensions re their professed background). (The same cannot be said of this forum, where there's a good mix of easy-going things and more valuable insight graciously being shared.)

The culmination point in this fact-free permanent slander has been reached just some 2 weeks ago when some asshole over there dared tell me I had "obviously not thought much about outliners" (citing from memory), whilst just my posts in that forum (except for them being deleted by that forum's owner, but that he didn't do yet) easily belie that person. Thus, not only my developments (borne, as said, from my own outliner in the Nineties, some 70k of code lines) are met with silence (no problem), but people openly declare them non-existent.

So this is the background of my hyper-sensitiveness when it comes to people saying - perhaps even inadvertently - a thread of mine is "OT", i.e. should not even have been published to begin with. (Cf. funny cat pics in a programmers' and sw users' forum - you know I accepted these being OT.) Perhaps even they did NOT say it, but it was just me that READ that INTO it:

As for the "OT", please read again, above:

"^ I was going to say quite off-topic *and* well worth a new thread, but the thread seems to be already fairly off-topic

+1, combined with the rewritten title and small-essay size post made me TL;DR; (again) huh"

As implied above, I'm beyond any acceptance of such "+1" when it comes to requests to silence me or sayings that I should not have mentioned some subject (length criticism being another story). But tomos is a non-native speaker, as I am, and it all was a language problem (again, with the above, totally unbearable outliner forum background and non-knowing who are those "doubly-writers" there and here):

In correct English, it should very probably have read, "but the thread seems to have gone [and not: to be] already fairly off-topic" - the "already" should have told me better, but in light of the above, I overlooked that in spite of re-reading it thrice - I'm very sorry; obviously, from my outliner forum experience and from some people furtively writing there and here, I now see "enemies" where there are none.


Now for the "bringing in traffic".

As said, I had understood you along the lines of "your thread should not have been created in the first place". Then for the absence of advertising. Now let's get serious, please.

The very first element of advertising is traffic: done. Second point to consider: qualified traffic. Do we need to discuss this, for DC? There are not THAT many such fora in the web, just some "general pc help" fora, and then those "tech" sites, often offshoots of pc magazines: No pc experience sharing.

I don't know what percentage of readers here do also write here, to some notable extent, but I bet you've (/ dare I say, we've?) got a LOT of readers, and they certainly come back with some regularity, because of the non-irrelevancy of our posts.

So what? Can't we all grasp that all prerequisites for good advertising revenue are present? Or is general traffic only so-so? But why then is DC very high in google's list for every subject that's treated here? mouser, why not start a thread giving out some statistics? Thus, I admit, you first would need some figures; only then could we start discussing whether it's worthwhile to try to up those numbers and, if so, how to do that. Your next step would be to identify competing ad prices: How much "coverage" would sw developers and such get by advertising here, and at what cost? In other words: Is DC's reach too small, so that advertisers would either face too high ad prices, or that advertising revenues would stay minimal anyway (i.e. by applying prices in accordance with alleged "minimal" reach)?

Then, there are some sites getting money from bringing developers and customers into contact; some such sites do make TOO much money from this, so there should be some possibilities in that, too: There's some room for a site that would take its share if that's a more decent share.

You know, this reminds me of AHK's problem: Just some months ago, AHK_L top developer / admin said, "I've got better things to do than to work for free, upon your making money with AHK" (citing from memory again). Background was, for the xth time, AHK not allowing to scramble its code (just obfuscate it), and worse, AHK_L .exes now showing off the source code openly, without any de-"assembling" needed anymore.

Fact is, it would have been all so simple, and always could be: Have it free for own use and for everything you give away; collect some fee for anything that's sold... but make it as difficult as it possibly gets to steal your code. (I would have been happy to pay for AHK for some years now if I had had the chance to sell some macros; cf. many AHK programlets being given away by some fellow DC contributors: Selling or giving out should be your choice.)

Or this one: We all know developers leave a whopping 50 p.c. fee at bits, from the reduced price they have to grant in order to get impulse buys: original price 80 bucks, bit price 40 bucks, developer gets 20 bucks - well, he'll be happy to get 19 in fact (and more realistically, some 18.30 or something) since his payment processor will get some money, too.

Now, just thinking: Original price 80 bucks, DC price 52 bucks, DC keeps 10 bucks, the payment processor gets the remaining 42 bucks, developer gets more than 40 bucks, which is quite some more money than 18.30 - win win for everybody (except for bits, which would get fewer developers piling up their programs for them).
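
Just to make the comparison explicit, here is a small Python sketch of both deals. It is an illustration under assumptions, not anybody's real fee schedule: the 4 p.c. payment processor fee is a guess chosen to land near the rough figures above, and developer_net is a made-up helper name.

```python
# Sketch of the bits vs. DC comparison above. The 4 p.c. processor fee is an
# assumption for illustration; the post only gives rough end figures.

def developer_net(sale_price, site_cut, processor_fee_rate=0.04):
    """Amount the developer keeps after the site's cut and the processor fee."""
    after_site = sale_price - site_cut
    return after_site * (1 - processor_fee_rate)

# bits: sold at 40 bucks, half of that (20) stays with the site.
bits = developer_net(sale_price=40, site_cut=20)

# Hypothetical DC deal: sold at 52 bucks, a flat 10 bucks stays with DC.
dc = developer_net(sale_price=52, site_cut=10)

print(f"developer gets at bits: {bits:.2f} bucks")
print(f"developer gets at DC:   {dc:.2f} bucks")
```

With that assumed fee, the developer ends up at about 19 bucks at bits and a bit over 40 bucks in the DC scenario, while the buyer still pays 28 bucks less than the original price.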

In a word, DC seems to have real good "coverage" (tremendously good google results for as said almost everything), but as for now, it doesn't do anything about even just trying to "commercialize" this value, the irony being most people in the web being willing to do anything to get just a little fraction of DC's seo value.

Thus, the question should not be, what to do with unwanted traffic that just costs hosting fees, but obviously should be, how to capitalize on that traffic (and optimize it further, and so on).

And now, I'm going to delete my above post.

4
"You can change your FF settings to start with the tabs that were open when FF was last 'closed'. Also works nicely if it dies or needs to be killed."

Yes and no. In theory, you are right, and I tried this both in FF and Chrome. Problem is - and I'm speaking from experience here -, that will reload ALL of your previous tabs in a row, without leaving you the option of loading them one by one (or, let's say, in groups of 5 or 6 or such), and thus, after such a "full reload", my system is as unresponsive as it was before killing the browser (since anywhere in the list, there is/are the 1 or 2 tabs with "100 p.c. cpu"), so there is nothing to be gained from this, in most circumstances. (I even set my Click & Clean accordingly, in order to not kill that list - well, if I knew where it was, I could at least manually process that list then...)
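
What I mean by "in groups of 5 or 6" could look like this in Python, as a sketch only. It assumes the url's have already been saved into a plain list somewhere (it does NOT read FF's own session data), and batches and open_in_batches are made-up names; webbrowser is Python's standard library module, which hands each url to the default browser.

```python
# Sketch only: open saved URLs in small groups instead of all at once.
# batches and open_in_batches are hypothetical helper names.
import webbrowser

def batches(urls, size):
    """Yield consecutive groups of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]

def open_in_batches(urls, size=5, opener=webbrowser.open_new_tab, pause=input):
    """Open one group of URLs, then wait for Enter before going on."""
    for group in batches(urls, size):
        for url in group:
            opener(url)
        pause("Press Enter to go on... ")
```

That way, the 1 or 2 tabs going bonkers would show up within their group of 5, and could be closed before the next group is loaded.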

"Oh, and another, well meant, advise: Get rid of Windows XP. - There are no excuses left of keeping that OS in use for internet research (of all purposes mad), that task can easily be ported to an up to date Linux."

You're speaking of security considerations applying to browsing. I understand that, in fact that's why I gave in and installed AV sw. Problem is, my browsing is too much interwoven with the rest of my doings (= AHK macros for snippets' downloads, etc.), in order to be done by some Linux. Anyway, in Chrome I can kill single tabs (without knowing which one I'm killing, just selecting by its cpu eating), whilst in FF, it's "all or nothing"... and as explained above, after killing the process, "all" again, with the same problem as before.

But I appreciate your opinion/advice.

5
Thank you, Ath, I'm thankful for constructive comments.

Re XP: Yes, yes. I'm not saying you're wrong, but fact is that XP is a perfectly stable OS, and the ONLY reason everybody wants it gone, is that Bill can spread even more of our money to little cute black toddlers in sunny Africa.

This being said, what do I know about Avira free? Nothing. Thus it's perfectly possible that the interaction Avira-FF is doing 2/3 of the harm I currently endure from FF (which anyway is ridiculous in its 36th iteration's memory management). So I'm going to install Avast free again, and will report back in some 8 or 10 days; Avira vs. Avast being the non-"controlled" factor in my setup.

And of course, NoScript would always be an option in order to possibly kill UNWANTED js (but the unwanted variety only, and without any side-effects on regular js), were it not about its features/options being far from evident.

EDIT: Avast free from Avast > cnet = as we all know, to be avoided at all cost. google "avast free download" > filehippo.com (= 5th hit or so), never had any prob with them. And yes, it's the same, current version (I checked for that). That being said, I always download from filehippo wherever possible: for the time being, it's possibly the premier download site overall. (Knock on wood.)

6
tomos and Ath,

I'm very sorry.

I misunderstood your posts as a "why this thread to begin with?!!!"

Just a language problem, mixed up with bad experience from another forum where a bunch of mean idiots perennially howl for the one and only "meat" contributions being discarded, and this mix having made me make a bad judgment. Sorry.

7
First of all, and as you know, my working system is XP, and with "only" 2 giga of memory; I acknowledge that in/for 2015, this is decidedly sub-standard.

But then, as you know, too, there have been several threads here in this forum, and many more elsewhere, which deal with the incredible sub-standardness of FireFox, re its nonexistent memory management.

As said, I'm not into defamation, so I have to admit that in part, my probs could come from Avira free (formerly, I had used Avast free, even more intrusive than Avira free), and also, I have to admit that my probs started with the very latest Adobe Flash update (16), which they offered in order to overcome (again, and for what felt like the 1,000th time) "security" probs.

I had installed that Flash 16, and then, after opening just SOME tabs in FF, I quickly not only ran out of memory, but had my system stalled for good, up to killing the FF "process" by Win Task Manager, and thus losing every tab = all the work I previously had put into searching for url's, links, etc. - it should be obvious for any reader that by opening some 12 or 15 tabs from search and links in previous "hits", you've got some "work done", which is quite awful to lose then.

I've always said, "you get what you pay for", and I've always acknowledged there are just SOME exceptions to that rule, but then, ALL of my experience backs this up: in 99.5 (= not: 99.9) p.c. of all cases, this rule applies perfectly, and FireFox seems to be the perfect example of TOTAL CRAP, delivered by some "volunteers", who like the idea that they are "giving out something valid for free", when in fact, they tell us, hey, dude, I know I cannot sell my shit, but ain't you willing to swallow it for free? Of course, I'm opening this thread not in order to defame FF, but in order to get new ideas about how to do things better, this whole forum being about that, right?

Thus, my very first reaction to FF being stalled* by that infamous Flash update was to deactivate Flash, and to observe things coming from that, for some week or so. Here I've got news for you: Flash, except for YT, is totally unnecessary, AND it's omnipresent (= "ubiquitous"), i.e. on almost ANY web site, however poor in content or modest in scope it might be, there's virtually ALWAYS that line above my FF content window, "Do you allow FF to activate Flash for this site?" (or something like that, DC NOT doing this shit).

*= Of course, I've got plenty of room for "virtual memory M" by Windows, on c: (since my data, as said, is on some external hdd, and "virtual memory is managed by the system") - but notwithstanding, even if I allow a quarter of an hour (!!!) for any command to become effective, I always end up killing the FF "process", after HOURS of waiting. At the same time, all other applications function "quite normally", i.e. they respond to commands, but with that little delay you'd expect from my system's having replaced working memory by multiple hdd accesses, considering FF has eaten all the working memory. It's just FF that doesn't respond at all.

And fact is, in more than a week, I NEVER had to tell FF to activate Flash, in order to get ANY useful info, from any of those several hundred pages all begging for Flash. (It's understood that for JavaScript, the situation is totally different: If you don't allow for JS, almost any web page of today will not work anymore, in any acceptable way. But again, don't mix up JS and Flash; JS having become a literally unavoidable "standard", whilst Flash is a simple nuisance, except for YT, and then, for rare cases in which you want to see some embedded "film" - IS propaganda? No thanks, and all the rest, no thank you either; let alone for indecently idiotic porn.)

Back to FF: My getting rid of Flash did NOT solve my probs. It's invariably "CPU 100 p.c." over hours, with Flash de-activated though, and as soon as I've got opened more than just 10 or 12 FF tabs; I assume these are JS scripts running, but then, even after MANY minutes, FF never tells me, "that JS is running, should we stop it?".

I have to say that I know about the existence of "NoScript for FF", but then, it's not obvious how to run that NS in some smooth way, just in order to intercept too-demanding scripts whenever they dare run, but leaving alone any menu "scripting" anywhere; do you

I wish to confirm again that I'm NOT speaking of porn or other crap sites, but that I'm just "surfing" among the most innocuous web sites you could imagine.

As for Flash, before deactivating Flash for good, I had tried Chrome, and I had the very unpleasant experience that with Chrome, and that incredible shit of Flash 16, all was as incredibly awful as with FF and that incredible shit of Flash 16 (sic), if not worse (!), so it's obvious that Flash 16 is even worse than FF 36 (or was it 35? it's the current version all the same), but then, Chrome will allow your killing ONE tab running, whilst in FF, it's "all or nothing", i.e. if you decide to kill the FF "process", you will lose all your "search" work, too (since FF stalls your FF process (i.e. not your system as a whole, so it's obvious it's all a matter of FF's memory management), so it's not even possible to switch from one tab to another one in order to retrieve the respective url's, even manually).

Btw., WITH that incredible Flash 16, simple Flash sites (which in fact would not have even needed Flash to begin with, see above) brought FF to 1,200 meg, then 1,500, then 2,000, 3,500 meg... in fact, Flash's memory demands are simply unlimited, and that's confirmed not currently (I admit), but from Flash users' experience back in August, 2014, i.e. some few Flash versions ago, and who say Flash of summer 2014 asked for unlimited memory, 6 giga, 8 giga, 10 giga... they were on systems of 8 or 16 giga of working memory, and they thought it was unbearable...

The only reason I cling to FF is the fact that "YouTube Video and Audio Downloader" is available for FF only (i.e. not Chrome), and that it's the ONLY YT downloader to my knowledge which lets you select best AUDIO quality, too (and not only best video quality, as its competitors do, at best) - but in the end, you can perfectly use FF for this YT downloading, whilst using Chrome for anything else, so that's "no reason".

Hence :

- Except for very limited usage (YT), Flash is totally useless and, short of viruses, the utmost nuisance on pc (or Mac) (and as usual, Jobs was first to identify this problem, AND to resolve it, for much of the systems he's been marketing)
- ( Similar things could be said about the really ridiculous and useless Adobe pdf viewer, but that's another story. )
- FF is to be considered liquid, stinking, green, morbid shit: If not even in iteration 36, software meets most basic standards, it will probably not meet them in iteration 100 either
- Chrome is "free", too, but we all know you pay with all your data... BUT: At least there, you KNOW WHAT price you pay for their "free" service, whilst FF "do it all benevolently", and obviously serve you perfect crap (whatever the reasons of FF being totally stuck, with 2 giga of work memory, and plenty of "virtual memory", your only alternative is to kill FF throughout if ever you want to get rid of some "CPU 100 p.c." over many, many minutes, with no end, instead of killing JUST SOME tabs going bonkers, is kindergarten)
- And yes, Avira free could be "in it" to some degree, too (= I had fewer problems, even with FF, when I "surfed" without any "protection") (but Avast free was really "unbearable", by their pop-ups (i.e. at least, I thought so, before my current problems with FF)... but perhaps, function-wise, they would always be preferable to Avira free, which is less intrusive re pop-ups, but doesn't work as well with FF, then?)
- Any insight into NoScript for FF? Is there a chance to get it to stop JS scripts running amok but letting go of any "regular" JS script anywhere?

Your opinion/advice/experience is highly welcome.

EDIT:

Sorry, my mistake above, I just read:

"Allow www.donationcoder.com to run "Adobe Flash"?" - Should we not enter some overdue discussion re "Are site developers trying to do Flash even in pure-text pages utterly nuts?", right now?

8
Eleman and Innuendo, thank you for replying, and I think you're right.

Also, I particularly appreciate, "or apple account (if you have problems storing large amounts of money, so you have to dispose of it quickly), or microsoft account (if you are into crippled ecosystems)" - very funny (and 100 p.c. what I think but didn't express in so delightful a way).

As said, sw was IMEI-bound and very expensive; when applics cost 2-10 dollars, even IMEI-coupling would have been "acceptable". But I'm quite astonished I would need a Google account? Say I've got a monthly contract with a typical mobile telephone provider, I always thought I bought the applics for the physical device, like in the old days: coupling to some additional Google (or Yahoo or whatever?? Or then, if it's Android, it's Google - period?!) account is very new to me! Don't people think that's ultimately intrusive on their part? (Well, I suppose they very closely monitor what your apps do? Well. That's why our governments have got other phones, costing real money.)

Well, I had mentioned two other details:

- Don't count on the availability of ("original" or even "third-party") batteries forever
- This implying, don't buy "exotic" devices (or only if they come with batteries compatible with the most common devices there are)
- And implying, don't buy any "used" or "classic" device (bec/of its alleged superior quality or something): You'll run out of batteries in too short a time

- And: do common current smartphones work, connected to the mains, but WITHOUT a battery in them (as notebooks do)?

Just for the record, and since as said, the Nokia 9210/9210i does NOT work then, I took out the battery from my Nokia 6300 and connected it to the mains (battery charger / mains adaptor), and it did not work then either, and finally I took the battery out of my Nokia 3300 (= very old, very robust!), connected it to the battery charger, itself connected to the mains, and it did NOT work.

So this could be a Nokia specific: No (?) Nokia works without a (working) battery in it; or else phones and smartphones in general don't work in that situation, whilst all (?) notebooks do (well, several Toshiba, Sony, IBM/whataretheycallednow do fine)?

And yes, you're right, in theory, the physical keyboards are superior, but in real life, they are so bad (perhaps not so for a child's hand) that hardly any virtual keyboard can be worse, given a minimum size of the screen.

9
Very interesting insight and speaking from experience, thank you so much again!

As said, Paragon never let me down, but then, my 3 pc's are XP (cf. MilesAhead's post immediately preceding). EaseUS "let me down", but that was a hardware problem (see below). (And yes, I was speaking of home systems (without clarifying), but particularly appreciate the insight into some more elaborate scenarios.)

Above, the "I suggest always backing up two images from two different programs." (Steven Avery) - I think that's a tremendously worthwhile idea, and one which is absolutely to be followed. (The idea of using two different backup hdd's is quite widespread, for both variants, concurrently and alternately, but using several programs for it, on top of that, is as original as it seems to be a real step further.)

My main system got unusable some months ago, and I spent about 60 hours (! I more or less counted them) in order to re-install it all, together with all the settings and all that, so better backup strategies do interest me now.

I lost my (EaseUS) back-up (thus needing this complete re-install) due to a defective (and quite new) external hdd, and/or to many defective sectors on it, right within the backup, which by this became unusable IN FULL, whilst file-by-file backups on the same hdd were mostly fine (and the rest of them was fine on my internal hdd).

Why? Well, notebooks are short of usb jacks, so in those times (= would never do that again) I connected the external hdd to a usb hub. I can't say/remember if I had been crazy enough to do the last backup that way, with the hdd connected to the hub; and even the previous backup, some months (!) before, on the same hdd (! ("!" standing for "don't do this!")), was just as unusable. I think for backups, I had connected the hdd "directly", not via the hub, but why not imagine those sectors had been damaged on other occasions, "over" the hub?

I'm positive about hubs being able to damage external data: The same hub (quite expensive, 30-dollar-range, not 5 bucks) had also damaged a usb stick, with other data, to the point of me being forced to try almost 20 different data recovery tools.

HDD Recovery Pro did NOT do it, for example. Little, free/cheap tools like Recuva et al. (Puran, Pandora, Glary, EaseUS, NTFS Undelete and many more) do NOT do it; they are just handy for retrieving (not-yet-overwritten) files that ain't any more in your paper basket. TestDisk is pure hype, as far as I'm concerned: It's free, it's much "advertised", so I spent about 10 hours with it, but it did nothing for me, and there is a "forum", with no help when it comes to real problems. Stellar Phoenix (preposterous name if there is one) did nothing for me (not even Prof. for 99$), but I have to admit that at least, those trial versions show you BEFORE buying that they're not able to do it - it's just that for 100 bucks, I would have thought they WOULD do it. (Recuva is free, after all, so where's the price difference to be found again in delivery?)

Now for that data on my usb stick, how did I get the data back? (I had my data on that usb stick, and, most of it broken, on that backup hdd (in file-by-file synch backup), so I HAD to get to that stick data!)

- Ontrack Easy Recovery (as said above in another context, very successful image transfer: prof. data recovery service, and hence totally overpriced (it's just for a year!) do-it-yourself recovery sw)... but which delivers (very probably; I could of course not really try without buying, I just saw it rebuilding the directories)

- RecoverMyFiles (as the tool before, this tool as well showed it would do it, and the price is absolutely acceptable, NB: RMF is not UndeleteMyFiles which did not do it)

- Get Data Back (similar to RecoverMyFiles) - I finally did it with GDB, which made an image of the "broken" stick on my hdd (no, not the broken one), so that I could keep the broken stick (which is still broken; GDB could very probably have repaired it) in order to test any other data recovery tool in similar/identical conditions.

So for similar situations, I highly recommend Get Data Back, and also RecoverMyFiles (whilst the Ontrack tool is simply not necessary and will possibly fail whenever the two other tools fail? Of course it may have some hidden capabilities not needed in my particular case?)

After this diversion, back to usb hubs: So my ("active") hub obviously did damage the usb stick (the web is "full" of similar experiences) and very probably my hdd, too, and of course, NO backup whatsoever should be made via a hub. (Well, there are hubs in the 300 dollar range, but I had thought that spending 30 Euro on a hub would get me "quality"... and to say it all, it even was my second such 30 Euro hub, the first losing connection again and again, by this damaging not the sticks and hdds but the data to be written; in those situations, I had been notified at least...)

From the above, you can deduce some other advice:

- Don't assume your data is safe if you have a usb stick as working repository and a (single) hdd as backup: If you're unlucky, the stick gets "broken" (and there is no guarantee the next stick "broken" in that way will be readable by the above-mentioned three programs), and then your backup (be it file-by-file or in just-one-big-file form: from which many of the above-mentioned backup programs, in their respective paid versions, can indeed extract single files) IS BROKEN, TOO.

- Whenever such a situation arises, working file broken, backup medium/file broken, too, you will be happy to have got a second backup, but which for most people (= for the very few people who have GOT such a second backup, to begin with) will be a more or less ANCIENT version, i.e. in regular backup scenarios, it's very difficult to assure you will have got TWO RECENT file backups; bear in mind, in this respect, that even better synch tools that do "versioning", will normally put less recent versions into neighboring folders of the "most recent version", which means, if the hardware in question "breaks", you will have to be "happy" with a quite old version somewhere else, if there is any. In other words, it's necessary to do file (= not: c: system) backups CONCURRENTLY and on TWO different DEVICES: I'll have a look if

- The problem here is, you would not really want to have hdd's running all day (which also would expose them to any danger the pc itself is possibly exposed to) just for 80 sec. of backup in the evening, and turning them on just for 80 sec. will wear them out in a similar way as if they had been running all day; this would suggest doing your "multiple daily" file backups on two different sticks, and having two concurrent weekly backups, at (almost) the same time, onto two different hdd's you connect just for this.

- If there is a space problem, well, it just occurs to me that you will want, in case, VERY RECENT backup files, but not necessarily within all their file system sub-structures: You could rebuild that from your weekly, "fully-comparative" file backup. What you really need is a daily replication of all NEW and CHANGED files, and why not into some "DUMP" folder, considering that in 9 years out of 10, you will never need those dumped files again?

At this very moment, I have got my work files on another external hdd (c: being for system files / applications), and I do a weekly backup onto another external hdd, but a daily backup is blatantly absent in my current workflow: Why not "synch new files, and files changed today, to a stick (with automatic rename in case of two eponymous files)"? That should be perfectly possible to automate (since it's not done otherwise) with Syncovery (with which I'm happy indeed: as explained, it was my wrong strategy that caused disaster), and even a quite approximate daily backup that you will really do is so much better than a more elaborate scenario that is not regularly done, right?

- I've also got to admit that in thunderstorms, IF the lightning was not just over me, I had a tendency not to cut off my pc, and over many years, this did not cause me any problem, but who knows? It's unanimously discouraged to do so, and I consider myself a fool to have taken those risks, since after all, that careless behavior could have been the culprit, too, or then, perhaps the stick was damaged by the hub, and the hdd by it being connected to the pc in a situation where not even the pc should have been running anymore?! (If you call this idiotic, you're right.)
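
The daily "DUMP" idea above (copy only new and changed files into one flat folder, renaming on name clashes) can be sketched in a few lines of Python. This is only an illustration under assumptions, not what Syncovery does: the function name, the 24-hour window and the "_1", "_2" rename scheme are all made up for the example, and the dump folder is assumed to live outside the source tree (e.g. on the stick).

```python
import shutil
import time
from pathlib import Path


def dump_changed_files(source: Path, dump: Path, max_age_hours: float = 24.0) -> list:
    """Copy files modified within the last max_age_hours into one flat dump folder."""
    cutoff = time.time() - max_age_hours * 3600
    dump.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source.rglob("*")):
        # Skip directories and anything not changed recently.
        if not path.is_file() or path.stat().st_mtime < cutoff:
            continue
        target = dump / path.name
        counter = 1
        # Same file name from another sub-folder (or an earlier run):
        # rename the copy instead of overwriting what is already there.
        while target.exists():
            target = dump / f"{path.stem}_{counter}{path.suffix}"
            counter += 1
        shutil.copy2(path, target)  # copy2 also preserves the modification time
        copied.append(target)
    return copied
```

Called once an evening with the work folder and a folder on the stick, this gives the "quite approximate" daily backup; running it twice on the same day simply produces renamed duplicates, which is wasteful but never destructive. The folder structure is deliberately not kept, since it could be rebuilt from the weekly full backup.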

10
Call me conservative; up to very recently I used two Nokia 9210i - why?

I

Two reasons, not at all related to each other, but equally important:

- I want a physical keyboard (ok, the Nokia kb is really bad, so this criterion is highly debatable), so the only other current alternatives would have been either other old smartphones (used ones), or that RIM stuff (changed their name but you know what I mean)

- I bought lots of expensive sw for those phones, and most readers will know that, it's smartphone sw developers who very early succeeded in forcing hardware linking (or what is it called?) to users: any mobile phone has got an IMEI number, and almost any (from my experience, 99 p.c. or more) sw for smartphones traditionally has been coupled to the IMEI in question: No (legal) chance even to de-install sw from phone 1 and THEN only install it to another phone: When your phone breaks, your expensive sw is dead.

I suppose this is also true for iPhones and Android (in fact I don't know), but the big difference is, there's a plethora of (also quite prof.) sw for both systems, costing between 2 and 15 bucks, when really useful smartphones-of-the-old-days sw came with prices much higher, and even into the 3 figures.

This being said, for sw developers, smartphones of the old days were a dream come true; it's just MS who today insist upon your sw licence being broken, together with your hardware, whilst decent sw-for-pc developers all allow for re-install when you change your hardware.

II

Now for batteries. As you will have guessed, I cannot use my (virtually "unbreakable": good old quality from the ancient times) Nokia phones anymore since I naïvely thought batteries would not become a problem, those "Communicators" having been sold by "millions", in very high numbers at the very least.

Well, I was wrong: Currently, they sell USED "Communicator" batteries for 3 figures, and my own little stock had come to an end BEFORE I had figured out I should buy some additional supplies (and then, you cannot store "batteries" / cells (rechargeable or not) forever).

Ok, they now sell big batteries (and with quintupled capacity), with various adapters, even for those "Communicators", but buyer beware: Even if you're willing to use a smartphone connected with some crazy cable to some heavy battery in your pocket (well, in the old days a simple mobile phone was about 10 or 12 kg), this is not a solution since all (???) of these (= judging from their respective advertising, not one will have the needed additional functionality) will only work if you have got a healthy regular battery in your smartphone, too; in other words, the external battery can spice up your internal one, not replace it. Why do I know, or think I know? (Perhaps I'm even mistaken???)

Now for the difference with many (all???) notebooks: I never had the slightest problem connecting my (over the years, multiple) notebooks to the mains and having them work fine, as long as the respective mains/power adapter was working correctly, long after the internal battery had stopped working and/or being available.

The same does not seem to be true with smartphones in general (???); at the very least, it's not true for my "Communicators":

It makes no difference if I have got a worn-out battery in the Nokia, or if I leave it out: Just connecting it to the power adapter (which in turn is connected to the mains of course, I'm not that much of a lunatic) will NOT do anything towards my being able to start the phone; it remains just dead, and the same is true if I put the phone into its (equally expensive) "desk stand" (which in turn is connected to the power adapter). And since I've got two Nokias, several (worn-out) batteries, several power adapters, several desk stands, and know about permutations, I'm positive that my problems don't come from some broken smartphone.

In other words, my Nokias need a working internal battery in order to be able to take advantage of any external power supply, and from their respective ads, I suppose those external batteries will not make any difference; my question is, is this behavior typical for smartphones, or is it just typical for the dumbness of Nokia staff? (As we all know, Nokia is gone.)

If it's typical for mobile phones and / or smartphones in general, beware of investing too much into (even a well-sold) smartphone: Once you won't get any more batteries for that, all your investments in that phone will have been flushed.

III

So what do I do for the time being? I went back to a combi of Nokia 6300 (har, har, batteries available as of now) and my old sub-notebook (with an internal umts card, reverting to "sleep state" in-between, and for as long as the third-party cell stays alive), which I hadn't really used any more for a long time:

Since those sub-notebooks are total crap: A regularly-sized notebook is difficult enough to type on (with 10 fingers, not just 2 or 3) when, in the office, you do it right and use some decent, regular keyboard. So it's obviously a very smart idea to buy some lightweight notebook for the road, but one which has got a KB OF REGULAR SIZE (if not shape) - and don't forget the oh-so-useful (both for digit entering and for macroing!) dedicated keypad, and trust me about that; any sub-notebook (incl. those immensely pretty Sony sub-sub-notebooks that weren't continued though and now are available, used, for quadruple their price new) will be a constant and real pain-in-the-you-know-where: It's weight, not size that counts*, believe me, I'm judging from enough unpleasant first-hand experience.

IV

I just read, "Nikon kills third-party battery support", i.e. they probably put some additional electronics into their reflex cameras, preventing third-party battery makers from creating compatible cells: Another "highly interesting" (for the consumer: very bad) "development".


Your respective experiences / solutions would be very welcome.


*= this rule does not also apply in inter-human intimacy

11
Thanks for your points of view.

I

My point was not that MR is bad in any way; my point was, it's generally considered somewhat "superior", and I don't really know why it should be considered "superior" in comparison with its contenders. Also, my point was, marketing-wise, they do lots of things-good-for-them, but that doesn't translate into any advantage for the user.

II

As for speed, I'm positive, Macrium is NOT faster for creating full backups (since I did this consecutively and paid attention to that), and it even was (only) slightly slower; can't say for recreation of the backup since, as said, was not able to reinstall it.

As for the "as advertised" aspect: well, had I known the backup (from external hdd = second device anyway: I learned from bad experience) needed a special boot device (possibly even the backup itself, or then, with lots of fuss getting to the backup, on a third device, incl. the comp itself), I would not have installed Macrium to begin with, and I learned this very important detail neither from their site (it could be "hidden" somewhere over there, though) nor from the quite numerous recommendations of Macrium, where it gets high (but quite unspecific) praise in web articles like "what is the best free backup-and-restore sw". Worse, Macrium did NOT tell me my backup would then, afterwards, be useless: It wrote the backup, but in the whole process of WRITING the backup, I did not get ANY info that I would also have to create a boot device - they just told me so when I then wanted to use the backup, and this is unacceptable - thank god I wasn't in real need of that backup since it was only a trial of mine, also, as said, because of bad experience beforehand.

III

GParted mentioned by 40hz is freeware; I may consider replacing my EaseUS with it, since EaseUS works faultlessly (did some real work with it) but is a little bit on the bloatware side - in fact, for Backup and Restore, I went back from EaseUS to Paragon (as said, both free versions) for that reason.

Also, Paragon doesn't nag you to buy paid versions, which I think is really kind of them, whilst both EaseUS and Macrium DID nag me; on the other hand, it's clear as day they are entitled to some nagging after all since they make available really USEFUL sw for free.

IV

Re both speed and possible ease of use once it's all "running": Fact is, I only know the free versions, where Macrium is inferior, not superior, by my standards (explained above, and so far without any new info why I could perhaps be wrong about that). The irony in this is, if I KNEW the paid versions (again, Macrium is more expensive, also compare prices for more than one pc), I would very probably be willing to pay (perhaps even for Macrium), since it's in incremental / differential backup where possibly big differences between these contenders would appear, which could perhaps even justify a price difference, and at the very least could ease your choice between them. But it seems there is no valid comparison between those paid versions yet.

And even if Macrium-paid is superior, which is far from being established, saying Macrium-free is superior (and all the more so against the above evidence), would be called successful image transfer (Mercedes Benz S vs. A for a better known example), but smart people as we are should not fall for that.

12
Thank you, xtabber, for your very informative post!

So 010 seems to be somewhat superior indeed for people who know exactly what they do (which is difficult when it comes to binary files, since e.g. if you search the web for something like "how to edit binary text files in a hex editor without breaking them", you will not get any relevant hits). (From my experience, when you try to replace n characters with n plus x or n minus x chars, it's already broken, even if x is quite small.) The header thing seems interesting, and the running processes thing, too. As for the easy right-click opening-as-binary of (not-running) files, emEditor can do that, too, but it's true, emEditor's "life" licence has become somewhat "expensive"; the double quotes are there bec/of 150 bucks being the regular price for some other editors just for their current version (which in some cases is not even developed further anymore), so this puts that price into a more favorable perspective, whilst on the other side, most current editors are not priced in that range anymore.
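
A likely reason for the "n plus x / n minus x breaks it" experience is that many binary formats store offsets and lengths, so anything that shifts the bytes after the edit invalidates them. A minimal sketch of a same-length replacement, which avoids the shifting, could look like the following; the function name is hypothetical, NUL padding is an assumption that only suits some formats, and a file with an internal checksum may still be rejected afterwards.

```python
def patch_same_length(data: bytes, old: bytes, new: bytes, pad: bytes = b"\x00") -> bytes:
    """Replace the first occurrence of old with new without changing the total length.

    pad must be a single byte; it fills the gap when new is shorter than old.
    """
    if len(new) > len(old):
        # A longer replacement would push all following bytes to new offsets.
        raise ValueError("replacement is longer than the original text")
    offset = data.find(old)
    if offset < 0:
        raise ValueError("pattern not found")
    padded = new + pad * (len(old) - len(new))
    return data[:offset] + padded + data[offset + len(old):]


# Usage: everything after the patched text stays at its original offset.
blob = b"HDR\x10Hello World\x00TAIL"
patched = patch_same_length(blob, b"Hello World", b"Hi")
```

Always work on a copy of the file; whether the padded string is read correctly depends entirely on how the program in question parses it.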

On a personal note, I own some quite expensive, "programmable"/scriptable editors and did quite some stuff on texts within them, incl. multiple buffers (i.e. files just in memory, not on screen, too), and it was LOTS OF scripting then, bec/I stupidly avoided regex then. Since I've become quite "fluent" with AHK scripting, I finally delved into regex, and I very quickly discovered that for elaborate text processing, 1) regex is the tool of your choice (in AHK: regexreplace is as important as is regexmatch, and this remark applies to any other regex implementation, too), 2) it's available directly both in many editors and programming languages, too, and 3) by the latter, there is no need to shift text bodies from your scripts into an editor to run scripts of the editor's own, then re-export the results, but you can do it all within your scripting or programming language, within both clipboard and multiple variables (instead of the above-mentioned editor-created buffers), onto which you run (mostly) regex commands... and finally 4) that scripting making use of regex will ease and shorten up your necessary scripting to an incredible degree. Oh, and yes, there is a 5) : Special tools, applying regex internally, and presenting some text manipulation gui to the user (TextPipe et al., and then PowerGrep, as the premier representatives of this kind of sw), seem to add nothing to what your own scripting could do in a much more easy way: on the contrary, just as your try (= mine, some time ago) to include some text editor processing into your automated workflow, they just add unnecessary complications (and ain't that cheap, but that's no consideration here). 
To complete this OT: There is one good idea though that can be retrieved both from TextPipe and PowerGrep: Don't try to complicate your regexes beyond all measure, in order to prove to yourself how smart you are, but be humble and just do 3 or 4 regexes in a row for what you know some expert could have done in just 1 of them: The result is as good as with the 1-regex alternative, with both writing and debugging time minimized.

Ok, enough said OT about the rather limited utility of text editors for text processing (= not: text / code creation); it seems 010 (over at bits or full price) is the buy of choice for people needing to work upon binary files (and knowing more about them than I do). ( Well, my original idea was, if you tamper with running processes, you'll very probably get a blue screen - well, let's say you should probably not try to work upon running core Win processes. ;-) )
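
The "3 or 4 simple regexes in a row" approach can be sketched as follows, here in Python rather than AHK (the same idea carries over to AHK's regexreplace). The three cleanup rules are made-up examples, not taken from TextPipe or PowerGrep.

```python
import re

# Each step is deliberately trivial; the order of the steps matters.
STEPS = [
    (re.compile(r"[ \t]+$", re.MULTILINE), ""),  # 1) strip trailing spaces/tabs on every line
    (re.compile(r"\n{3,}"), "\n\n"),              # 2) collapse runs of blank lines into one
    (re.compile(r"[ \t]{2,}"), " "),              # 3) squeeze repeated spaces/tabs into one space
]


def clean(text: str) -> str:
    """Apply the simple substitutions one after another."""
    for pattern, replacement in STEPS:
        text = pattern.sub(replacement, text)
    return text


print(repr(clean("a  b   \n\n\n\nc\t\td  \n")))  # prints 'a b\n\nc d\n'
```

Each rule can be tested and debugged on its own, and inserting or removing a step does not require touching the others, which is exactly where the single monster regex costs time.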

13
...are not that superior either?

This is a spin-off from http://www.donationc...ex.php?topic=40074.0 discussing MR update from 5 to 6.

"I'm a big fan of macrium reflect.  Very fast, very stable, no bloat."

MR seems to be the premier backup-and-recovery sw on the market as far as the paid version is concerned (which is discussed above).

As for their free version, though, I only can encourage possible users to refrain from it, not because it was really bad (in fact, I never knew and don't know), but because it does not seem to offer any functionality going beyond what less-renowned competitors offer, in their respective free versions, or more precisely, it does offer even less than they do.

In fact, I went back to Paragon Backup and Recovery Free, where I can start the reinstall of my backup from within running Windows (which for that is then ended, then Linux is loaded for the rewrite of c: (or whatever), and then Windows is loaded again) - why should I fiddle around with doing lots of things manually with MR (Free) if I can have this repeated os swapping done automatically, by Paragon or EaseUS (and perhaps by others)?

MR (Free), on the other hand, did the backup (onto my hdd), and when I tried to reinstall that backup (after some bad experiences, I do such tries immediately after the original backup now, not weeks or months afterwards and hoping for the best in-between), it told me I didn't have an external reinstall device (or whatever they call it) from which to run the backup.

After this quite negative experience with MR (Free), I'm musing, of course, why MR (paid) is touted the way it is, since from the moment on you're willing to pay, you'll get incremental/differential backup/restore from their competitors, too (Paragon, EaseUS and also Acronis: this latter I never touched, having read about very bad experiences from other users, allegedly having lost data with Acronis, and with several versions that is).

Also, MR did not seem anything "fast" to me, not faster than Paragon or EaseUS anyway, and at least for Paragon, I can say it's perfectly stable (I once lost data with their partition tool, but that was my fault, triggered by quite awful, quite ambiguous visuals in the respective Paragon program: So today I use Paragon for backup and EaseUS for partitioning).

And as an aside, MR even has got its own wikipedia entry, with which the wikipedia staff is far from being happy (and they say so), and which contains some direct links to the MR site where you would have expected links to less seller-specific info.

And to say it all, MR, on their homepage, currently advises you to update from 4 to 5, whilst above, it's said that 6 is imminent (?), and that updating from 5 to 6 is NOT free for v. 5 owners.

All this makes me think that perhaps MR do some very good pr and are able to create some hype, whilst at the end of the day, it's just a very regular, decent product which succeeded in realizing higher prices than their competitors are able to realize, by that hype.

If MR (paid) really has some usp(s), please name them; their free version at least is a lesser thing than their contenders' free products.

14
"While not my working text editor (I still use Kedit and EditPad Pro for that, depending on what I am doing), this remains my everyday go-to tool for quickly examining and/or editing ANY kind of file from an explorer right click."

I don't understand the idea behind it; could you please comment about it?

Background:
- I have got (several, I think, but at least one) editor doing hex (and regex, too): emEditor, which does the same (?) upon hex files as 010 does, the premier prob being that you can NOT edit text in hex files and then assume the file isn't broken; or more clearly, if you try with 010, you'll break your files, just as you would with any other regular text editor doing hex additionally.
- So what's the special interest in using 010 for hex files (let alone for text files)? In other words, if there is some trick 010 DOES do, I'd be very willing to buy it for that, but currently, I don't see this usp of a special-hex editor in general and/or of 010 in particular.

15
"a really great feature for [sic, not: of] rightnote or any similar tree hierarchy is [not: would be] the option to display in masonry-style, all the subnotes"

That was correctly worded, albeit I immediately feared the misunderstanding that didn't take long to appear indeed.

In fact, I cannot mention any sw that does it exactly in that visual rendering, but it's clear as day that's a functionality that is blatantly absent almost everywhere, whilst being of tremendous practical use. Hence:

- Many outliners offer a similar functionality by export / print; of course that's not the same thing at all; it should be done within the program, and those bits should be editable (i.e. your edits therein should be replicated within the original sub-items)

- I know 2-3 outliners or similar which do something LIKE that, but none (?) of them does offer the above-mentioned functionality of your editings-done-within-that-spread-out-view being replicated into the original items:

-- Citavi (of course it's debatable if you could use that program for general IM, and the answer is probably no)

-- Treeline ( http://sourceforge.n...t/projects/treeline/ , free and very interesting, but from an "experimental" pov; it's not really useful for heavy use)

-- UltraRecall has a feature similar to Citavi, but, incredibly, UR then does clip=hide all the intermediate sub-titles, i.e. the titles of the items spread-out. This is completely nuts, since in 99.9 p.c. of possible use cases, you will be in deep need of those titles (and be it only for knowing where the previous original item began, and where the next starts; for the 0.1 p.c. of use cases where these item titles are unwanted, you could have implemented an option that is), and I mentioned this it's-not-a-bug-but-an-incredibly-dumb-feature to the developer YEARS AGO, without the slightest reaction; here again, no replication of your possible edits (if they are possible that is, I don't remember, and the same remark applies to the other applics mentioned here) back to the original items

- To SOME (i.e. very poor) degree, 3-pane instead of 2-pane could "help"

- As mentioned, as soon as you think "print/export", your current possibilities widen considerably, so in some use cases, with a macro, perhaps, there could be some (very limited) means.

As you all know, development of desktop outliners is more or less stalled, for the time being, and this absence of real hope for getting such functionality in an acceptable way may be sufficient reason to mention the lesser, semi-solutions above.

P.S. This problem is similar to the flattening out of the file system's directories... where some file managers help out a lot.

16
General Software Discussion / Re: The impossible thing software
« on: January 30, 2015, 09:20 AM »
Is there any site which publishes the most outrageous obscenities, created inadvertently by their authors' lack of English? If there is, I would happily suggest my top 1 find of many years for their home page.

"You want to automate the presents to your mother.

You must [sic] control the tastes of your mother in a database.
And the limit you may waste. And so on."

My formatting. And my kudos. And let's face it, it's not a language prob to 100 p.c. Simply wonderful. And I very much like the gif idea. Cabaret.

17
General Software Discussion / Re: Scraper too expensive at 20 bucks
« on: January 18, 2015, 12:55 PM »
"I could create a project that only analyzes pages related to gedney and downloads all files used/linked from them."
We assumed that was not what we wanted to do, in so general a way.

"After that, one could, if one so chooses, either:
1) start to limit further, so it does not downloaded linked pages/content outside wanted areas ."
Areas? Link groups, I presume.

"2) e.g. limit it to only download and keep images."
Again, that would often download much too much, and I prefer downloading hygiene to lots of discard later on. Ultimate irony: That discarding unwanted pics could then possibly be done within a file manager allowing for regex selection.

"URL / link normalization is always done before any filter is tested, so all that is zero problem. General rule: As long as a browser can understand something in a website, so can the crawler smiley (AJAX being an exception.)"
Brilliant! (But in some contradiction to your next paragraph?)

"Replacing content/links/URLs inside page content is also a far larger system than regular expressions."
As explained above, not necessarily so. I originally started with a (dumb) misunderstanding of mine: That regex in your scrapers was for choosing the parts of the text the user wanted to preserve; obviously, that is not their use in your software, but it's for the users' determining the links to be followed. Now, from a regex match to a regex replace there (and with input, as explained above, from the user, and which the user will have identified from manually loading pages and looking into their respective url pattern, and/or from looking into the respective page sources), it's not that big a step, and...

"It is essentially a huge multi-threaded engine that is behind it all with a ton of queues, lists and what not to ensure everything is done correctly and optimal as possible."
I don't know about traditional multi-threading, but in downloading, the term multi-threading is often used for multiple, concurrent downloading; I don't know about that either. But: It's evident, from my explanations above, that all this "do different things at different download levels" is easy, as soon as you accept that in fact, doing different (and specific) things at different (= at the specific) download levels, is the natural way of doing things, and that "download whole webpages" is very Late Nineties, even if today it's possible to exclude, by real hard work, and totally insufficiently, only SOME unwanted meanders (and all the advertising, socialising and whatever).

"and what not"
This is a formula regularly used for deprecating, or more precisely, it shows that the writer in question has not yet got the importance of the detail(s) he's hiding behind such an all-englobing "and what not"; btw, another sophist's formula of choice is "questions", when no questions have been asked (yet) (see your top post): in fact, many of us will agree that when corporations don't deliver the service you asked, and paid, for, and you then dare utter some requirements, these will, by corporations of that (= not-so-much customer-centered) kind, invariably be met with "your questions". (end of ot)

"to ensure everything is done correctly and optimal as possible"
Well, that's why I suggested adding a second "mode", for stringent downloading, instead of trying to mingle it all together into the existing, traditional mode, this second "mode" "only" going down some 5, 6 levels, but from the

"with a ton of queues, lists"
Of course, you'd be free to use lists instead of arrays, and even though that would imply some multiplication of storage elements indeed, I wouldn't expect a ton of additional elements from this choice.

"It is of course a never ending process of optimizations"
I mentioned this above: Where a custom-made script is straightforward, the intermediate step of providing, in a commercial tool, dialogs and such in order to build, within some necessary confines, some variants of procedures, in order to run something that more or less mimics, for standard tasks, such a custom-made script, or at least a lot of what the latter only would have provided in the last resort, complicates things for the developer of such a tool. But then, identify standard tasks (e.g. from what you hear / listen to (?) in your forum), then write additional "modes", as the one described above, and I'm even more precise in my description: Do one big list field, with multiple indentation: Let users build up a tree (which then you process by nested loops). This will make available bifurcations even, i.e. down level 3, with 2 (or more) link kinds to follow, each of them having their own process commandments further down, i.e., in this example, 1 level 3, but 2 different levels 4, 1 with level 5 and 6, the other one with just level 1, or even it ends at level 4 for that line of scraping. The next programming step would of course be to integrate your multi-threading into this alternative mode, and then, a prof. version could even allow for user's indication how often these specific lines are to be updated (re-check for changes in that site's contents). It's not "endless", but it can be done in sensible installments, according to real-life scenarios and how to best handle them, and following the (in most cases, tree) structure of a site in a perfectly coordinated way, without gathering rubbish lying on the way, should be a core task (of which the coding will take one week since "everything around" is already there), which then could perhaps be refined in some details from the observations of your forum posters.
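To make the "tree with bifurcations" idea a bit more tangible, here is a minimal sketch in Python. It is only an illustration, not anything A1 does: the rule format (a regex plus child rules), the function name scrape and the little in-memory SITE dictionary standing in for real page fetching are all assumptions of mine, and I use recursion where the text above speaks of nested loops.

```python
import re

def scrape(url, rules, fetch_links, found):
    """Walk a rule tree. Each rule is (regex, child_rules).

    A rule with child rules means "follow matching links and apply the
    children there"; a rule without children is a leaf: collect the link.
    """
    links = fetch_links(url)
    for pattern, children in rules:
        rx = re.compile(pattern)
        for link in links:
            if not rx.search(link):
                continue  # not a link kind this rule is about
            if children:
                scrape(link, children, fetch_links, found)
            else:
                found.append(link)
    return found

# Hypothetical rule tree: one branch goes two levels further down,
# the other branch ends on the portfolio page itself.
RULES = [
    (r"^/portfolio/", [
        (r"^/item/", [(r"^/lrg/", [])]),
        (r"^/notes/.*\.txt$", []),
    ]),
]

# Hypothetical site, used instead of real downloads.
SITE = {
    "/": ["/portfolio/1964", "/ads/"],
    "/portfolio/1964": ["/item/KY0001/", "/notes/1964.txt", "/login"],
    "/item/KY0001/": ["/lrg/KY0001.jpg", "/mid/KY0001.jpg"],
}

found = scrape("/", RULES, lambda u: SITE.get(u, []), [])
```

Nothing outside the rule tree is ever requested, so the ads and login links in this toy site are never touched.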

"which is why writing such a product can end up taking lots of time :)"
One week, as said, and including debugging - since most of the design work has been delivered free of charge.

Thomas, it's clear as day I don't need such a crawler anymore, but not everybody who would like to do some neat scraping has got the scripting routine I've got over the years now. It's not that anybody's asking any particular effort of you in one way or any other, it's just that you've got some weeks' advance over your competitors; some weeks I say because DC's got quite tons of hits from professionals (I had been astonished you hadn't become aware of DC yet before my linking here?)

;-)

18
General Software Discussion / Re: Scraper too expensive at 20 bucks
« on: January 17, 2015, 10:14 AM »
Hi Thomas,

I did not try myself, but I think you're right, and I realize I was too much focused upon the level1-level2 thing (which A1 (and presumably all the others, too) obviously doesn't do (yet)), but which is not really necessary either; in fact, the core functionality that is needed, is "follow ONLY links that meet (a) certain regex pattern(s)", AND it's necessary to have several such regexes if needed (as in this scenario where the level 1 regex would be different from the level 2 regex).
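As a minimal sketch of that core functionality (my own illustration in Python, not A1's filter syntax): keep a link only if at least one of several regexes matches it. The function name select_links and the sample links are assumptions; only the /digitalcollections/gedney_KY0001/ path is taken from the Duke pages discussed in this thread.

```python
import re

def select_links(links, patterns):
    """Return only the links matched by at least one of the regex patterns."""
    compiled = [re.compile(p) for p in patterns]
    return [link for link in links if any(rx.search(link) for rx in compiled)]

# Sample links; all but the first are made up for illustration.
links = [
    "/digitalcollections/gedney_KY0001/",
    "/about/contact/",
    "/photos/lrg/KY0001.jpg",
    "/ads/banner.html",
]

# One pattern per "kind" of link; both are applied to every page alike.
patterns = [
    r"/digitalcollections/gedney_[A-Z]{2}\d{4}/$",
    r"/lrg/[A-Z]{2}\d{4}\.jpg$",
]

selected = select_links(links, patterns)
```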

Then, most tasks will be realizable this way, understood that BOTH (i.e. all) regexes will invariably be applied to ANY such level, which only in very rare cases could be a problem - in my script I applied to that site, I had differentiated the regex for level 1 and level 2, building a subroutine for that lower level, but we see here that I could have simplified the routine, according to my example as described above.

Unfortunately, and I only discover this now, putting these things together again, it was a "bad" example, i.e. not as straightforward as described by me yesterday. In fact:

1)

level one = thumbs page, code line for the very first thumb is:

      <li class="grid_3 alpha clearBoth"><a href="/digitalcollections/gedney_KY0001/"><img src="http://library.duke....edney/thm/KY0001.jpg" alt=""/><br/>Man with no fingers on right hand lighting a cigarette; view from interior of ...</a></li>

and

http://library.duke....edney/thm/KY0001.jpg

will just bring a single thumb again!,

whilst the intermediate-quality page, displayed by a click on the thumb, has the url

http://library.duke....tions/gedney_KY0001/

Such a direct link is nowhere on the source (= multiple thumbs) page, but compare with the

<a href="/digitalcollections/gedney_KY0001/">

part of the above line; this means you can identify the core info by a regex fetching that line, then you need to build a variable taking this core info

/digitalcollections/gedney_KY0001/

and putting the necessary part

http://library.duke.edu

before that element fetched from the source page.

It goes without saying that I make abstraction here from the specific detail of these pages that all photos are just numbered (with leading zeroes though), so that this part of the script can be greatly simplified; I've seen other such pages where there was some sort of a "numbering", but in a perfectly aleatoric way, some hundreds of numbers only, but from a range of 100,000, so a simple "compound url" function, with just numbering 1...n would NOT be sufficient in many instances, and I very much fear A1 (and "all" the "others") do NOT have such a "compose the target url from a variable, and a static component" function yet?

In other words, you would not only need a "match regex" functionality, but a "regex replace" functionality, either for a copy of the original page source, and before the reading-the-source-for-link-following is done, or, simpler, as described, for building an intermediate variable to be processed as link to follow then.

Also, and this is not new from yesterday, there are MANY such (to-be-compounded-first) links to follow, not just one, and such a scraper (here: A1) should be able to do the necessary processing for all of these. In other words, it would be best if internally, an array would be built up for being processed then url by url.
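A minimal Python sketch of this step 1), under stated assumptions: the regex and the function name compound_links are mine, the sample source is abbreviated (the link text is replaced by "..."), and the KY0002 and /about/ lines are made up so that the array has more than one entry and something to reject. The static part http://library.duke.edu and the /digitalcollections/gedney_KY0001/ path are the ones given above.

```python
import re
from urllib.parse import urljoin

BASE = "http://library.duke.edu"

# Captures only the relative path of the "intermediate" pages.
HREF_RX = re.compile(r'<a href="(/digitalcollections/gedney_[A-Z]{2}\d{4}/)">')

def compound_links(page_source, base=BASE):
    """Build the array of full target URLs from the relative hrefs found."""
    return [urljoin(base, path) for path in HREF_RX.findall(page_source)]

# Abbreviated thumbs-page source; second and third lines are hypothetical.
sample = (
    '<li class="grid_3 alpha clearBoth"><a href="/digitalcollections/gedney_KY0001/">...</a></li>\n'
    '<li class="grid_3"><a href="/digitalcollections/gedney_KY0002/">...</a></li>\n'
    '<li><a href="/about/">About</a></li>\n'
)

queue = compound_links(sample)
```

The resulting list is then what gets processed url by url; the /about/ link never enters it.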

2)

Now, having followed the compound link, you are on page

http://library.duke....tions/gedney_KY0001/

with a button "All Sizes", and with some more hampering, you'll get to the highest-quality page, with the url

http://library.duke....edney/lrg/KY0001.jpg

Here again, it's evident that by knowing the "architecture" of these links, you simply could run a script with those "KY0001" counting up from 1 to n, but as said, it's not always as easy as Duke Univ. makes things for us; thus we fetch the target url from the "intermediate" 's page source:

http://library.duke....edney/lrg/KY0001.jpg

This link is present in the page source, but if it wasn't, there are several other such links, with "embed", "mid" and "thm", so here again, some regex REPLACE should be possible in cases the direct link is not to be found within the page source.
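Such a regex REPLACE could look like the following Python sketch. Since the real urls above are shown abbreviated, I use a made-up example.org address; that the "thm", "mid" and "embed" variants differ from the "lrg" one only in that single path segment is an assumption of mine, to be checked against the actual page source.

```python
import re

# Matches the size segment directly in front of the final file name.
SIZE_RX = re.compile(r"/(thm|mid|embed)/(?=[^/]+\.jpg$)")

def to_large(url):
    """Rewrite a thumb / mid / embed picture URL into its 'lrg' counterpart."""
    return SIZE_RX.sub("/lrg/", url, count=1)

large = to_large("http://example.org/photos/thm/KY0001.jpg")
```

A url that already points to the "lrg" variant is returned unchanged.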

Whilst in our example, on this second level, there is only 1 link to follow, there are many cases where even on level 2, there are several links to follow - or then, it's the other way round, and on level 1, there is just 1 link, but then, on level 2, there are several (similar).

In fact, my current example is for Gedney 1964, but there is also Gedney 1972 (and others, of a lot less interest in this "artsy" context), i.e. I left out level 1 (= multiple links to multiple portfolios), and also left out level 2 (= multiple thumbs page per portfolio), so that the level 1 in my example is in fact already level 3 = one of those multiple-thumbs pages, of several, within one of several portfolios, of one portfolio (of several).

This means you have to provide functionality for multiple similar links (= not just one) on several levels in a row, in order to meet realistic scenarios, and this means you should not provide simple variables, but arrays, for the very first 4 levels, or perhaps even for 5 levels in a row.

3)

In all this, I assume that all links in one array are "similar", i.e. are to be treated in a similar way "on arrival" on the target page, or more precisely, that on every target page of the links of some such array, the subsequent array building for links will be done in the same way.

It's obvious, though, that for many such scrape jobs, there will be dis-similar links to be followed, but I also think that in order to not over-complicate such a tool, you could ask the user to follow the same "page 0" (= source page), with different such tasks, in several installments.

As long as link following will be done just for "links of one kind" (i.e. not: single links), no chaos whatsoever will ensue.

Also, from the above, I conceptually deduce that there should be two different "modes" (at least):
- follow every link meeting the regex pattern(s), and
- follow only links of a certain kind;

in this second case, different regex patterns should be possible for different levels (you see, I cling to this idea: it would make things so much easier for the user (clarity!), and it would not represent any programming difficulty to achieve; also, it would make the program run "faster", neater: to build up the intermediate link arrays for the levels deeper down, no search for unnecessary regex matches (= not occurring at those levels anyway) would be forced upon the respective source texts (and the machine; and less risk for accidental unwanted matches there)).
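A minimal Python sketch of that second mode, with one regex per level and one link array per level. Everything here is hypothetical illustration: the function name crawl, the pattern list, and the in-memory SITE dictionary that replaces real downloading.

```python
import re

def crawl(start, fetch_links, level_patterns):
    """Follow links level by level; level_patterns[i] applies to level i pages only."""
    current = [start]
    for pattern in level_patterns:
        rx = re.compile(pattern)
        next_level = []
        for url in current:
            for link in fetch_links(url):
                # Only this level's pattern is tried; duplicates are skipped.
                if rx.search(link) and link not in next_level:
                    next_level.append(link)
        current = next_level
    return current

# Hypothetical site: a thumbs page, two item pages, and assorted clutter.
SITE = {
    "/thumbs/1": ["/item/KY0001/", "/item/KY0002/", "/about/"],
    "/item/KY0001/": ["/lrg/KY0001.jpg", "/mid/KY0001.jpg", "/thumbs/1"],
    "/item/KY0002/": ["/lrg/KY0002.jpg", "/share/"],
}

result = crawl("/thumbs/1", lambda u: SITE.get(u, []), [r"^/item/", r"^/lrg/.*\.jpg$"])
```

Since the level 2 pattern is never applied to the thumbs page (and vice versa), no accidental matches on the wrong level can occur.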

From the above, I do not think A1 would have been able to execute my given example (= otherwise than by just counting from 1 to n, that is), as there will probably not be the possibility to construct compound target urls yet, all the less so for similar groups? In other words, and as explained above, I suppose today it's "follow every link matching one or several regex patterns (and not ANY link, as in the previous generation of scrapers)", but it would not be possible to build up such links if they are not found already, in full, within the respective page source?

Btw, with today's more and more Ajax and similar pages, I suppose that functionality to build up links (up from information the user will have to put into the program, by manually following links and then checking for the elements of the respective target url "upon arrival") will become more and more important in order for such a tool to not fail on many pages it could not handle otherwise?

Well, I love to come up with complications! ;-)

19
General Software Discussion / Re: Scraper too expensive at 20 bucks
« on: January 16, 2015, 01:19 PM »
My post above was not so much about any A1 or other, but meant in general, A1 bits offer just being the trigger for my considerations, but of course I gave the link to DC over there, which Thomas then promptly followed... ;-) Here's my answer from over there, there "over there" meaning here, and so on... ;-) :

Thomas,

Just to clarify, it was not my intention to denigrate A1, and I very much hope the title I gave the thread cited above appears as perfectly ironic as it was intended.

I should have clarified above - I do it here though - that I consider A1 as a perfectly worthy example of a "standard scraper", and more so, possibly "perfect" or at least very good, for almost any "un-professional", i.e. amateur scraping task (and from your comments, I see those imperfections of A1 I found described in the rare web comments of it, have been dealt with in the meantime).

Also, there seems to be a slight misunderstanding, automatisms are good, but the possibility of deselecting and tweaking automatisms is even better, since very often, scrapers follow links too fervently, instead of following only some kinds/groups of links (and I mentioned the programming problems in order to make such choices available): it's not about ".jpg only", or even of "pics within a given size range only" and such; also, standard "this page and its children down to 1/2/3 levels" is not really helpful, since (even for "amateurs"), it often would be necessary to follow links of some kind rather deep, whilst not following links of other kinds.

As for the "heavy scraping problem", there is also a legal problem, which consists of new-kind "authors' rights", in most European countries, to "data bases", even if those db's only consist of third-party advertising / good offerings, with no content contribution whatsoever from the owner of the target site (but who, e.g. for vacancy notices, gets often paid 800 euro, some 1,000 bucks for publishing that ad for a mere 4 weeks, AND holds "authors' rights" to that same ad as part of his db); this being said, it's clear as day that such considerations are perfectly irrelevant within the context of a "consumer product" like A1, and this denomination of mine is certainly not meant in order to tear down A1 either.

But there clearly is a schism between professional use, for which highly elaborate custom scripting is necessary (and, as explained, not even sufficient), and "consumer use", and in this latter category, the above-mentioned tweaking possibilities for "which groups of links to follow then how, respectively", could certainly make the necessary distinction among "consumer scrapers".

Or let's get as precise as it gets: Years ago, I trialled several such "consumer scrapers", in order to get all of William Gedney's Kentucky 1964 and 1972 photos (i.e. not even photo scraping is all about porn, but sometimes it's about art), from the Foundation's web site, but in the best resolution available there, and that was not possible with those scrapers since there was an intermediate quality level, between thumbs and the quality I was after - perhaps I did it wrong at the time; anyway, I succeeded by writing my own download script.

Just for fun, I checked that page again:

http://library.duke....ect/Cornett%20Family

and verified the current state of things:

pages of kind a: some 50 pages with thumbs (for more than 900 photos),

then target pages (= down 1 level, for intermediate photo quality),

and there, links to the third level, with the full quality, but also many, many links to other things:

Whilst from level 1 to level 2, it's "easy", it's obvious that for level 2 pages, highly-selective-only link following (i.e. just follow the link to the pic-in-full and nothing else) would be asked for, but probably is not possible with most consumer scrapers even today: Would it be possible to tweak A1's link following that way? Again, we're speaking not of general specifics, but of specifics applying to level-2 pages only.

Well, whilst my first DC post was rather theoretical, here we've got a real-life example. It's clear as day that if a tool for 20 or 40 bucks did this, with some easy tweaking, I'd call such a tool "near perfect": it's all about discrete selectivity of link following. ;-)

(Or then, I'd have to continue to do my own scripts for every such task I encounter...)



P.S. If by this scheme of mine, Gedney's work will get some more followers, that wouldn't be a bad thing either. ;-)

20
General Software Discussion / Scraper too expensive at 20 bucks
« on: January 16, 2015, 06:34 AM »
(Original post at bits and referred to here was "$19 + seems an awful lot of money for software you can get the same type of thing for nothing. (...)".)

The problem lies elsewhere. A price of 20 bucks is certainly not a deal breaker, neither would be 40 bucks (original price), and there are competitors that cost several hundred bucks, and which are not necessarily better or much better.

First,

if you search for "download manager", the web (and the people who constitute it by their respective contributions) mix up web scrapers (like A1) and tools for downloading files specified beforehand by the user, but the download of which will then be done within multiple threads, instead of just one, by this using your possible fast internet connection to its fullest; of course, most of the scrapers will include such accelerating functionality, too. Thus, the lacking discriminating effort in what commentators see as a "download manager" does not facilitate the discussion to begin with; you should perhaps use the terms "scrapers", and "download accelerators", for a start, but there is also some "middle thing", pseudo-scrapers who just download the current page, but without following its links.

Second,

the big problem for scrapers nowadays is Ajax and database techniques, i.e. many of today's web pages are not static anymore, but are built up from multiple elements coming from various sources, and you do not even see those scripts in full; scripts you can read by "see page source" refer back to scripts on their servers, and almost anything that is done behind these scenes, cannot be replicated by ANY scraper (i.e. not even by guessing parts of it, and from building up some alternative functionality from those guesses), so the remark that A1's pages from scraped Ajax pages do not "work" is meaningless.

The only other remark re A1 I found in the web was, you will get "the whole page", instead of just the photos, in case you would like to download just the photos of a web page; IF that is right, that was a weakness of A1 indeed, since these "choosing selected content only" questions are the core functionality today's scrapers could and should have, in the above-described general framework in which "original web page functionality" can not be replicated anymore, for many pages (which often are the ones which are of most interest = with the most money behind = with both the "best" content, and lots of money for ace programming).

Thus, "taking up" with server-side programming has become almost impossible for developers anyway, so they should revert to optimization of choosing selected content, and of making that content available, at least in a static way, and it goes without saying that multiple different degrees of optimization of that functionality are imaginable: built-in "macros" could replicate at least some standard connections between screen/data elements "on your side", and of which the original triggers are lost, by downloading, but this would involve lots of user-sided decisions to be made, and hence lots of dialogs the scraper would offer the user to begin with ("click on an element you want as a trigger, then select data (in a table e.g.) that would be made available from that trigger", or then, big data tables, which then you would hierarchically "sort" in groups, in order to make that data meaningful again).

It's clear as day that the better the guesses of the scraper in such scenarios, the easier such partial re-constitution of the original data would often become, and also, that programming such guesses-and-services-offered-from-those would both be very "expensive" in programming, and be a never-ending task, all this because today's web technologies succeed in hiding what's done on the server side.

In other words, from even very complicated but static, and even pseudo-dynamic (i.e. get it all out of databases, but in a stringent, easily-to-be-replicated way) web pages yesterday, to today's dynamic web pages, it has been a step beyond what scrapers sensibly would have been able to handle.

But it's obvious also that scrapers should at least perfectly handle "what they've got", and the above-mentioned example (as said, found in the web) of "just downloading the pics of a page", whilst being totally realistic, is far from being sufficient as a feature request:

In so many instances, the pics of the current page are either just thumbs, or then, just pics in some intermediate resolution, and the link to the full-resolution pic is not available but from the dedicated page of that middle-resolution pic, and the situation is further complicated by the fact that often, the first or second resolution is available, but the third resolution is not, and that within the same start page, i.e. for the same task at arrival, for some pics, the scraper / script would have to follow two or three links, and for other pics linked to at the same page, it would have to follow just one or two.

This being said, of course, such "get the best available resolution for the pics on current page" should be standard functionality for a scraper.
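Here is a minimal Python sketch of such a "best available resolution" fallback. The size names, the url scheme /size/id.jpg and the AVAILABLE set (which stands in for actually checking the server) are all made up for illustration.

```python
SIZES = ["lrg", "mid", "thm"]  # best first

def best_available(pic_id, exists, sizes=SIZES):
    """Return the URL of the best resolution that exists, or None."""
    for size in sizes:
        url = f"/{size}/{pic_id}.jpg"
        if exists(url):
            return url
    return None

# Hypothetical server contents: not every picture has every resolution.
AVAILABLE = {
    "/lrg/KY0001.jpg", "/mid/KY0001.jpg", "/thm/KY0001.jpg",
    "/mid/KY0002.jpg", "/thm/KY0002.jpg",
    "/thm/KY0003.jpg",
}

picks = {
    pic: best_available(pic, AVAILABLE.__contains__)
    for pic in ["KY0001", "KY0002", "KY0003", "KY0004"]
}
```

In a real scraper, the exists check would be the result of following (or failing to follow) one more link, which is exactly why the number of links to follow differs from pic to pic.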

But, all this being said, it also appears as quite evident to me that for tasks beyond such "elaborate standard tasks" (and which could be made available by the scraper "guessing" possibly relevant links, then have the user choose from the intermediate results, and then the scraper building up the necessary "rule(s)" for the site in question), scraper programming comes with the additional problem that such "specific rule building" would be split into a) what the scraper would make available and b) what the user could make out of these pre-fetched instruments, whilst in fact, the better, easier, and ultimately far more powerful solution (because the limitations of the intermediate step would be done away, together with that intermediate step) would be to do scripting, but ideally having some library of standards at your disposal.

(Readers here in DC will remember my - unanswered - question here how to immediately get to "page x" (e.g. 50) of an "endless" Ajax page (of perhaps 300 such partial "pages" (or whatever you like to name those additions)), instead of "endlessly" scrolling down to it.)

Anyway, precise selection of what the user wants to scrape, and of "what not", should be possible in detail, and not only for links to follow on start page, but also for links further down, at the very least for links "on page 2", i.e. on several kinds (!) of pages which only have in common the fact that all of them are one level "down" from the respective "start page" (I assume there are multiple but similar such "start pages", all of them to be treated in a similar (but not identical, see above) way).

Third,

so many scrapers (and download accelerators, too) tout their respective accelerating power, but few, if any, mention the biggest problem of them all: More and more server programs quickly throw your IP(s!) and even your PC out of their access scheme, should you dare scrape big content and/or, repeatedly, updated content, and again, as above, the more elaborate the content and their server-side page-build-up programming, the higher the chances are that they have sophisticated scraper detection, too.

What most people do not know, when they choose their tunnel provider, is the fact that in such "heavy-scraping" scenarios, it's quite "risky" to get a full-year contract (let alone something beyond a year), and that there are special tunnel providers where you rent multiple IPs at the same time instead - which comes at a price.

With these multiple addresses, many scraping guys think they are on the safe side - well, what's multiple addresses "abroad" (from the server's pov), and when in country x no such provider can provide you any, or more than just a handful of "national" IPs?

And it does not end there. How "visually good" is your script, from the server's pov again? Don't you think they cannot "put it all together again" when your scraping follows detectable rules? To begin with, your scraping is probably mutually exclusive, which is obviously a big mistake, but which facilitates combining the parts on your side, right? He, he...

And you're spacing your requests, of course, in order for the server not to detect it's a machine fetching the data? He, he, again, just spacing the requests in time does not mean the server will think it detects some real person, looking for the data in a way some bona fide prospect would look for that data.

Not to speak of bona fide prospects looking in certain standard ways, but which never are the same though, and that they don't do just sequential downloading ("sequential" does not mean, follow link 1, then 2, then 3, but link 35, 482, 47, whatever, but download, download, download!), but revert to some page before, press F5 here or there (but not systematically of course), and so on, and in endless ways: As soon as there is a possible script to be detected, those servers send a signal to a real person on their side, who will then look into things, relying on their scripts-for-further-pattern-detection: time of the day for such a "session", amount of data downloaded, number of elements downloaded, order in which (sub-) elements are downloaded (patterns, too similar and/or not "real-life" enough).

Then, even if you quite perfect all this, by having your machines replicating real-life behavior of different real persons, even most real-life prospects will not remain interested in the same or similar data over the years, and most of them, not even over months in a row!

And all this with the concurrent problem of the geographic distribution of your IPs again: Almost all of their bona fide prospects would sit in some specific country, or even in some specific region of that country, and so solutions to all of the above problems, even if perfect (and this necessarily includes lots of overlaps if you want your global scheme to remain "realistic"), will be only partial and not work for long if you cannot resolve the problem of how to fake IPs and their geography, instead of just renting some.

My 2 cents to put into perspective some naïve, "$19 + seems an awful lot of money for software you can get the same type of thing for nothing.", and I certainly left out additional aspects I didn't think of on the fly.

21
General Software Discussion / Re: Desktop search; NTFS file numbers
« on: January 11, 2015, 12:42 PM »
Hi jity2,

When I spoke of sharing being a step beyond finding out, I definitely didn't have you in mind, and I apologize for not having made my pov clear enough that I very much appreciate your sharing findings, in fact it was your details that motivated me to put some more details in. ;-)

Since you mention my mentioning path/file name lengths, let me cite from techrepublic:

http://www.techrepub...n-an-ntfs-directory/

"Something else to look out for
TonytheTiger 8 years ago
is the length limits. Filenames can be up to 255 characters, but path and file combined can only be 260 characters (you can get around the path part by using Subst or 'net use' and setting a drive letter farther down in the directory structure)."
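For checking a collection against these limits before indexing, a minimal Python sketch follows. It simply uses the two figures from the quote (255 and 260) as constants; the function name check_lengths and the sample paths are made up. I use ntpath so that backslash paths are split the Windows way on any platform.

```python
import ntpath

MAX_NAME = 255  # file name limit, as quoted above
MAX_PATH = 260  # path + file name limit, as quoted above

def check_lengths(path):
    """Return a list naming which of the two limits the given path exceeds."""
    problems = []
    if len(ntpath.basename(path)) > MAX_NAME:
        problems.append("name")
    if len(path) > MAX_PATH:
        problems.append("path")
    return problems

ok = check_lengths("C:\\docs\\report.txt")
deep = check_lengths("C:\\" + "d" * 250 + "\\" + "f" * 20 + ".txt")
long_name = check_lengths("C:\\" + "n" * 300)
```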

Well, I would try file and folder naming hygiene first, and in most cases, that should do it.

And since you mention html and all those innumerable, worthless "service files", that reminds me of the importance of some "smarter" search tool hopefully being able to leave all these out of its indexing, but NOT by suffix name only, but by a combi of suffix and vicinity within the file structure, and possibly, if needed, also content: In fact, you separate your files/folders into an application part, and then a contents part (i.e., most of us do so), but the html format and similar formats blur this distinction again, shuffling servicing code into the "contents" area of your data, so it's obvious a smarter search tool should de-clutter this intrusion again.

Also, above, I forgot to mention Armando's

"Why do I mix Archivarius and DtSearch ? Simply because their algorithms for dealing with space and dashes are different and lead to different results. But if I had to choose one (but I woudn't...), I'd probably go with DtSearch : indexing is fairly quick and there are more search options to get what you want. Archivarius is fast too, but its search syntax isn't as sophisticated. Both could have better interface.

I use everything for filename/foldername search as it's so quick and its search syntax is very flexible and powerful (e.g. Regex can be used). (...) [Edit: about X1 : used to be my favorite, many years ago, but had to drop it because of performance reason and inaccuracy : it wouldn't index bigger documents well enough. See my comments earlier in the thread. To me, accuracy and precision are of absolute importance. If I'm looking for something and can't get to it when I know it's there... and then I'm forced to search "by hand"... There's a BIG problem.]"

The second passage bolded by me is both very important, in case, and subject to questioning since they clearly did some work upon their tool - question is, how deep that work would possibly have been: In short: Problem resolved, or then, not?

The first bolded passage is of the very highest interest, and should certainly not be buried in some page 32 of somewhere and somewhat, but should be investigated further, all the more so since my observation re French accents applies here, analogously: Many parallel wordings for the same phrase, with or without hyphens (let alone dashes), or then, "written together", i.e. in one word (or even in abbrevs), and further complicated when the phrase contains more than just two elements: a space between the first two elements, but then a hyphen between the second and third one, or the other way round...
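Just to illustrate what such an "equal treatment" of variants could mean at indexing time, here is a crude Python sketch; it is my own assumption of one possible approach, not how dtSearch or Archivarius do it. It strips accents, then removes spaces, hyphens and dashes, so that all the parallel wordings collapse into one key; the price is that word boundaries are lost, so a real tool would need something finer.

```python
import re
import unicodedata

def search_key(phrase):
    """Collapse accent, hyphen/dash and spacing variants into one comparison key."""
    decomposed = unicodedata.normalize("NFKD", phrase)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Remove whitespace, ASCII hyphens and the Unicode hyphen/dash range.
    return re.sub(r"[\s\-\u2010-\u2015]+", "", no_accents).casefold()

variants = ["two-pane outliner", "two pane outliner", "twopane outliner", "Two\u2013pane  Outliner"]
keys = {search_key(v) for v in variants}
```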

Which makes me wonder which of these tools might be able to correctly treat as equal English and American English, but without doing so by "fuzzy searching" which would bring masses of unwanted pseudo-hits...

(History's irony: askSam, by its overwhelming success in those ancient times, "killed" another, similar "full text db" program, but which HAD semantic search, whilst AS, even 30 years later, never got to that (and has now been moribund for some 5, 6 or 8 years)... and cannot be found yet in any of those 2- and little-3-figure desktop search tools (but in 4- and 5-figure corporate tools it seems... and all this is about market considerations, not about technology: technology-wise, not speaking of (possible) AI, it all would be some additional concordance tables, especially when indexing, and less so when search time comes).)

And no, I'm not trying to talk you into running dtSearch indexing for days: It would just put unnecessary strain on your hardware, and from your findings and what we think we know, we can rather safely assume it would be somewhere between 6 and 8 full days of indexing, when X1 needs 10, and Copernic needs 15. (Even though I'm musing about possible surprises, and then, you ran your stuff for 25 consecutive days now, so some 5 days more, percentage-wise... ;-) ) Let's just say, that would have been utterly instructive. ;-)


EDIT:

The 8.3 problem/solution is often mentioned; in

http://stackoverflow...-lots-of-small-files it is explained best:

"NTFS actually will perform fine with many more than 10,000 files in a directory as long as you tell it to stop creating alternative file names compatible with 16 bit Windows platforms. By default NTFS automatically creates an '8 dot 3' file name for every file that is created. This becomes a problem when there are many files in a directory because Windows looks at the files in the directory to make sure the name they are creating isn't already in use. You can disable '8 dot 3' naming by setting the NtfsDisable8dot3NameCreation registry value to 1. The value is found in the HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\FileSystem registry path. It is safe to make this change as '8 dot 3' name files are only required by programs written for very old versions of Windows.

A reboot is required before this setting will take effect.
Dan Finucane (community wiki, edited Oct 24 '08 at 23:06)"
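Before changing anything, it may help to check what the value is currently set to. The following is a read-only Python sketch using the standard `winreg` module; the helper name `read_8dot3_setting` is my own, and the key path and value name are simply the ones given in the quoted answer. It does not write to the registry.

```python
import sys

# Key path and value name as given in the quoted answer.
KEY_PATH = r"System\CurrentControlSet\Control\FileSystem"
VALUE_NAME = "NtfsDisable8dot3NameCreation"

def read_8dot3_setting():
    """Return the current DWORD value, or None if not on Windows or not readable."""
    if sys.platform != "win32":
        return None  # winreg only exists on Windows
    import winreg
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
            return value
    except OSError:
        return None  # key or value missing, or access denied

if __name__ == "__main__":
    print(VALUE_NAME, "=", read_8dot3_setting())
```

Per the quoted answer, a result of 1 would mean that 8.3 name creation is already disabled.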

ANOTHER EDIT:

Since Copernic is (again) on bits today, here's another element being relevant for our subject (from over there):

"Jaap Aap Hi, is my understanding from the fine print correct that this includes the upgrade to v5? And will sorting by relevance be included at that point?"

I would have worded this "SOME sorting by relevance", since there are innumerable ways of implementing sorting by relevance into a search tool, but it's clear as day this functionality, while being of the highest possible importance, has not been treated by developers (of the "consumer products" discussed here at least) with due attention, up to now, if I'm not mistaken? (This being said, it's obvious that a badly-implemented "display-by-relevance" would need to come with the option to disable it.)
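Just to illustrate what "SOME sorting by relevance" could mean at its very simplest, here is a Python sketch that ranks hits by plain term frequency and keeps the option to switch the ranking off. It is one naive scheme among the innumerable possible ones, not what Copernic or any other product does; all names are hypothetical.

```python
import re
from collections import Counter

def relevance_score(query: str, document: str) -> float:
    """Share of the document's words that are query terms (naive term frequency)."""
    terms = re.findall(r"\w+", query.lower())
    words = re.findall(r"\w+", document.lower())
    if not terms or not words:
        return 0.0
    counts = Counter(words)
    return sum(counts[t] for t in terms) / len(words)

def sort_hits(query: str, documents: dict, by_relevance: bool = True) -> list:
    """Return names of documents containing any query term, optionally ranked."""
    hits = [(name, text) for name, text in documents.items()
            if relevance_score(query, text) > 0]
    if by_relevance:
        # Highest score first; with by_relevance=False the original order is kept.
        hits.sort(key=lambda item: relevance_score(query, item[1]), reverse=True)
    return [name for name, _ in hits]
```

The `by_relevance` switch corresponds to the point made above: a ranking this crude needs an off switch.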

22
General Software Discussion / Desktop search; NTFS file numbers
« on: January 11, 2015, 07:55 AM »
This is a spin-off of page 32 (!) of this thread http://www.donationc...x.php?topic=2434.775 ,

since I don't think real info should be buried within page 32 or 33 of a someday gross-page-long thread of which readers will perhaps read page 1, and then the very last (pages) only; on the other hand, even buried on some page 32, wrong and/or incomplete "info" should not be left unattended.
____________________

Re searching:

Read my posts in http://www.outliners...om/topics/viewt/5593

(re searching, and re tagging, the latter coming with the 260 chars for path plus filename limitations of course if you wanna do it within the file name... another possibly good reason to "encode" tags, in some form of .oac (Organisation(al things) - Assurances - Cars), instead of "writing them out")

Among other things, I say over there that you are probably well advised to use different tools for different search situations, according to the specific strengths of those tools; this is in accordance with what users say over here in the above DC thread.

Also, note that just searching within subsets of data is not only a very good idea for performance reasons (File Locator et al.), but also for getting (much) less irrelevant results: If you get 700 "hits", in many instances, it's not really a good idea to try to narrow down by adding further "AND" search terms, since that would probably exclude quite some relevant hits; narrowing down to specific directories would probably be the far better ("search in search") strategy; btw, another argument for tagging, especially for additional, specific tagging of everything that is in the subfolder into which it "naturally" belongs, but which belongs into alternative contexts, too (ultimately, a better file system should do this trick).

(Citations from the above page 32:)

Armando: "That said, I always find it weird when Everything is listed side by side with other software like X1, DTSearch or Archivarius. It's not the same thing at all! Yes, most so called "Desktop search" software will be able to search file names (although not foldernames), but software like Everything won't be able to search file content." - Well said, I run into this irresponsible stew again and again; let's say that with "Everything" (and with Listary, which just integrates ET for this functionality), the file NAME search problem has definitely been resolved, but that does not resolve our full text search issues. Btw, I'm sure ET has been mentioned on pages 1 to 31 of that thread over and over again, and it's in the nature of such overlong threads to treat the same issues again and again, again and again giving the same "answers" to those identical problems, but of course, this will not stop posters who just try to maximize their post count, instead of trying to shut up whenever they cannot add something new to the object of discussion. (I have said this before: Traditional forum sw is not the best solution for technical fora (or then, any forum), some tree-shaped sw (integrating a prominent subtree "new things", and other "favorites" sub-trees) would have been a thousand times better, and yes, such a system would obviously expose such overly-redundant, just-stealing-your-time posts.) (At 40hz: Note I never said 100 p.c. of your posts are crap, I just say 95 or more p.c. of them are... well, sometimes they are quite funny at least, e.g. when a bachelor tries to tell fathers of 3 or 4 how to raise children: It's just that some people know-it-all, but really everything, for every thing in this life and this world, they are the ultimate expert - boys of 4 excel in this, too.)

Innuendo on Copernic: Stupid bugs, leaves out hits that should be there. I can confirm both observations, so I discarded this crap years before, and there is no sign things would have evolved in the right direction over there in the meantime, all to the contrary (v3>v4, OMG).

X1: See jity2's instructive link: http://forums.x1.com....php?f=68&t=9638 . My comment, though: X1's special option which then finds any (? did you try capitals, too, and "weird" non-German/French accented chars?) accented char, by just entering the respective base char, is quite ingenious (and new info for me, thank you!), and I think it can be of tremendous help IF it works "over" all possible file formats (but I so much doubt this!), and without fault; just compare with File Locator's "handling" (i.e. in fact mis-treating) accented chars even in simple .rtf files (explained in the outliner thread). Thus, if X1 found (sic, I don't dare say "finds") all these hits, by simply entering "relevement", for finding "relèvement" (which could, please note, have been wrongly written "rélèvement" in some third-party source text within your "database" / file-system-based data repository, a detail which would keep you from finding it by entering the correct wording), this would be a very strong argument for using X1, and you clearly should not undervalue this feature, especially since you're a Continental and thus will probably have stored an enormous amount of text bodies containing accented chars, which rather often will have accent errors within those original texts.
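How X1 implements this internally I do not know; the general technique, though, can be sketched in a few lines of Python with the standard `unicodedata` module: decompose each character, drop the combining accent marks, and compare the folded forms. The function names are hypothetical.

```python
import unicodedata

def fold_accents(text: str) -> str:
    """Strip combining accent marks and case, so 'è', 'é' and 'E' all become 'e'."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

def accent_insensitive_find(needle: str, haystack: str) -> bool:
    """True if needle occurs in haystack once both are accent- and case-folded."""
    return fold_accents(needle) in fold_accents(haystack)
```

With this, entering "relevement" matches both the correct "relèvement" and the wrongly accented "rélèvement", capitals included. Characters that do not decompose into base char plus accent (e.g. "ø" or "æ") pass through unchanged, which is exactly the kind of "weird" char worth testing in any real tool.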

X1 again, a traditional problem of X1 not treated here: What about its handling of OL (Outlook) data? Not only did ancient X1 versions not treat such data well, but far worse, X1 was deemed, by some commentators, to damage OL files, which of course would be perfectly unacceptable. What about this? I can't trial (nor buy, which I would have done, otherwise) the current X1 version, with my XP Win version, and it might be this obvious X1-vs.-OL problem has been resolved in the meantime (but even then, the question would remain which OL versions would possibly still be affected: X1-current vs. OL-current possibly ok, but X1-current vs. OL-ancient-versions =?!). I understand that few people would be sufficiently motivated to trial this upon their real data, but then, better trial this with, let's say, a replication of your current data, put onto an alternative pc, instead of running the risk that even X1-current will damage any OL data on your running system, don't you think so? (And then, thankfully, share your hopeful all-clear signal, or then, your warnings, in case - which would of course be a step further, not necessarily included within your first step of verifying...)

Innuendo on X1 vs. the rest, and in particular dtSearch:

"X1 - Far from perfect, but the absolute best if you use the criteria above as a guideline. Sadly, it seems they are very aware of being the best and have priced their product accordingly. Very expensive...just expensive enough to put it over the line of insulting. If you want the best, you and your wallet will be oh so painfully aware that you are paying for the best."

"dtSearch - This is a solution geared towards corporations and the cold UI and barely there acceptable list of features make this an unappetizing choice for home users. I would wager they make their bones by providing lucrative support plans and willingness to accept company purchase orders. There are more capable, less expensive, more efficient options available."

This cannot stay uncommented since it's obviously wrong in some respects, from my own trialling both; of course, if X1 has got some advantages (beyond the GUI, which indeed is much better, but then, some macroing for dtSearch could probably prevent some premature decision like jity2's one: "In fact after watching some videos about it, I won't try it because I don't use regex for searching keywords, and because the interface seems not very enough user friendly (I don't want to click many times just to do a keyword search !)."), please tell us!

First of all, I can confirm that both developers have (competent) staff (i.e. no comparison with the usual "either it's the developer himself, or some incompetent (since not trained, not informed, not even half-way correctly paid) "Indian"") that is really and VERY helpful in giving information and in discussing features, or even lack of features. Both X1 and dtSearch people are professional and congenial, and if I say dtSearch staff is even "better" than X1 staff, this, while being true, is not to denigrate X1 staff: we're discussing just different degrees of excellence here. (Now compare with Copernic.)

This being said, X1 seems to be visually-brilliant sw for standard applics, whilst dtSearch FINDS IT ALL. In fact, when trialling, I did not encounter any exotic file format from which I wasn't able to get the relevant hits, whilst in X1, if it was not in their (quite standard file format) list, it was not indexed, and thus was not found: It's as simple as that. (Remember the forensic objectives of dtSearch, but it's exactly this additional purpose of it that makes it capable of searching lots of even quite widespread file formats where most other (index-based) desktop search tools fail.)

Also, allow for a brief divagation into askSam country: The reason some people cling to it is the rarity of full-text "db's" able to find numerics. Okay, okay, any search tool can find "386", be it as part of a "string", or even as a "word" (i.e. as a number, or as part of a number), but what about "between 350 and 400"? Okay, okay, you can try (and even succeed, in part), with regex (= again, dtSearch instead of X1). But askSam does this, and similar, with "pseudo-fields", whereas normally you need "real" db's for such tasks, and as we all know, for most text-heavy data, people prefer text-based sw, instead of putting it all into relational db's. As you also know, there are some SQLite/other-db-based 2-pane outliners / basic IMS' that have got additional "columns" in order to get numeric data into, but that's not the same (and even within there, searching for numeric data RANGES is far from evident).

Now that's for numeric ranges in db's, and now look into dtSearch's possibilities of identifying numeric ranges in pseudo-fields in "full text", similar to askSam, and you will see the incredible (and obviously, again, regex-driven) power of dtSearch.
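For readers who want to see what "between 350 and 400" means in plain full text, here is a minimal Python sketch: a regex pulls out the stand-alone numbers, and an ordinary comparison does the range test. This is only an illustration of the principle, not dtSearch's or askSam's actual mechanism; the names are hypothetical.

```python
import re

# Stand-alone integers or decimals; the word boundaries keep "i386" from matching.
NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")

def find_in_range(text: str, low: float, high: float) -> list:
    """Return (offset, matched text) for every number with low <= value <= high."""
    hits = []
    for match in NUMBER.finditer(text):
        value = float(match.group())
        if low <= value <= high:
            hits.append((match.start(), match.group()))
    return hits
```

Note the limits of such a sketch: thousands separators ("1,200") would be read as two numbers, and there is no notion of a "field" telling a price from a model number, which is exactly what the pseudo-fields add.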

Thus, dear Innuendo, your X1 being "the absolute best" is perfectly unsustainable, but it's in order to inform you better that I post this, and not at all in order to insinuate you had known better whilst writing the above.

____________________

Re ntfs file numbers:

jity2 in the above DC thread: "With CDS V3.6 size of the index was 85 Go with about 2,000,000 files indexed (Note: In one hdd drive I even hit the NTFS limit : too much files to handle !) . It took about 15 days to complete 24/24 7/7." Note: the last info is good to know... ;-(

It's evident that 2 million (!) files cannot reach any "NTFS limit" unless you do lots of things completely wrong; and even if you had persistently left out 3 zeros, the limit would have been 8.6 (or, with the XP number, 4.3), but nothing near 2.0:

eVista on

https://social.techn...forum=itprovistaapps :

"In short, the absolute limit on the number of files per NTFS volume seems to be 2 at the 32nd power minus 1*, but this would require 512 byte sectors and a maximum file size limit of one file per sector. Therefore, in practice, one has to calculate a realistic average file size and then apply these principles to that file size."

Note: That would be a little less than 4.3 (i.e. 2power32-1) billion files (for Continentals: 4,3 Milliarden/milliards/etc.), for XP, whilst it's 2power64-1 for Vista on, i.e. slightly less than 8.6 billion files.

EDIT: OF COURSE THAT IS NOT TRUE: The number you get everywhere is 2power32 = slightly less than 4.3 billion files, and I read that's for XP, whilst from Vista on, it would be double that, which would make it a little less than 8.6 indeed (I cannot confirm this of course), and that would then be 2power33, not 64 (I obviously got led astray by Win32/64 (which probably is behind that doubling though)).
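The arithmetic is easy to check; a few lines of Python show the powers of two in question. This only computes the numbers; which of them actually applies to which Windows version is not something the code can confirm.

```python
# Powers of two mentioned above, computed exactly.
limit_2_32 = 2**32 - 1      # the figure usually quoted as the NTFS file-count limit
doubled = 2 * 2**32         # "double that" is 2**33, not 2**64
limit_2_64 = 2**64 - 1      # for comparison: vastly more than 8.6 billion

print(f"2^32 - 1 = {limit_2_32:,}")   # 4,294,967,295
print(f"2^33     = {doubled:,}")      # 8,589,934,592
print(f"2^64 - 1 = {limit_2_64:,}")   # 18,446,744,073,709,551,615

# jity2's 2,000,000 files as a share of the 2^32 - 1 figure
share = 2_000_000 / limit_2_32
print(f"2,000,000 files = {share:.4%} of 2^32 - 1")
```

So 2 million files are well under a tenth of a percent of even the smaller figure.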

No need to list all the google finds, just let me say that with "ntfs file number" you'll get the results you need, incl. wikipedia, MS...

But then, special mention to http://stackoverflow...iles-and-directories

with an absolutely brilliant "best answer", and then also lots of valuable details further down that page.

I think this last link will give you plenty of ideas how to better organize your stuff, but anyway, no search tool whatsoever should choke by some "2,000,000 limit", ntfs or otherwise.

23
Spin off "Desktop search; NTFS file numbers" here:

http://www.donationc...?topic=39992.new#new

24
General Software Discussion / And IT Man of the Year 2014 Is...
« on: December 24, 2014, 08:05 AM »
Diego Garcia.

(who of course stands for a rare, high-brow collaborative programming effort). Here's why (in French, but a century ago, that was the lingua franca for the educated people of this world anyway, so some google translation effort should not be out of your reach):

http://www.parismatc...vol-MH370-2-2-675084

(this is "part 2", but which includes part 1 - as you all know, the French do have a reputation of being a little unorganized).

You will learn that Boeing have their own, official patent for their ways to remote control their own aircraft, which comes in handy e.g. whenever they decide, for whatever reason, that such an engine should be brought down immediately.

It's a secret for no one that ace technology can almost exclusively be found in weaponry, and for programming, that's similar - whilst e.g. if you want to hear the most elaborate lies there are, both government authorities and air carriers (and their paid or free yappies) are prime addressees.

Of course, they ain't bearable as long as you consider their output fun, especially since if you don't take it all on second degree, your intellectual prerequisite should be that you think logic is a new iApp (just one example: yes, in order to bring down an aircraft onto some military base, in order to destroy it, yes, first they will let you do that, instead of intercepting you, and second, it's a brilliant idea to drill traditional landing on their runway: but let's not forget most people will swallow anything that comes from their thinking delegates - if that reminds you of stories of spit in the North Korean camps).

And now for the reasons of all this, well, don't trust "science" and her lies either, just "trust" the bad connections within your own, poor brain (or do not even do that, if you don't want to fall for self-deceit every other second):

Here's a wonderful specimen of why, for example, even corporations like MS ain't able to output decent software (just two lesser-known examples: yes, Word has got cross-links, but have a look at the way they are implemented; yes, there is Active Directory, but look at its incredibly bad permissions M), and it also explains why even very smart people's output, in most cases, is abysmal: the smarter you are, the sooner in your life you will have internalized that total self-censorship is in your primal interest: They call this "survival instinct" (Darwin's "best fit");  it's not but in very few industries where "anything goes" that ironically you are entitled to set your thinking free (and the more perverted and / or strange the better).

But back to the regular way of collaboration and which assures that nothing outstanding will be created, even if the combined I.Q. of 3 people amounts to 500, and how they "sell" you their propaganda (note the lovely pic which I'd call the "shut-up nigga" - well, I'm just the messenger, and of course that pic reminded me of Uncle Tom, and of the mythological three apes; also note the perfect white-collar clothing of the shut-up nigga - so please identify to him if you're white, too: if even bogeyman can be hand-tame, you can be be a "good dog!", too!):

http://www.ozy.com/a...&utm_campaign=pp

Well, if you wonder how a "normal person" can title

"How to Succeed at Work? Censor Yourself",

here's why,

"After a childhood of jumping from country to country, Nathan is used to feeling like a tourist everywhere he goes."

Yes, that's the fate of many a diplomat's child: Lifelong deracination to the point of believing in the salvationary nature of any Ebola saliva they feed you, instead of just gulping it in order to survive for some more days. (Of course, I don't even mention possible insurgency against your gaolers: That's as out of the question for the lifelong inmates of Western oligarchies as it is for Pyongyang's slaves.)

If up to now, you only felt that it was oh so queer that even very smart people a) "believe" and / or b) produce quite underwhelming output, even in big corporations where there's plenty of resources, well, face it:

The human brain's interconnections ain't done that brilliantly yet... which might end up quite soon in some new ai mainframes even queerer than man himself, and that could be another end (except for the mythical cockroaches, and then, in some more million years from now...).

In the meantime, "Merry Christmas!" and similar are sort of an obscenity, don't you think so?

The above "How to Succeed at Work? Censor Yourself" is one side of the coin, the findings of the Milgram experiment (1963) being the other side, the "coin" being Man's Perverted Nature.

Hence, no hope for any decent MS software ever! ;-) And sorry for possibly having impeded your Christmas illusion, but at least smart people should revert to thinking mode now and then, at the very least, and perhaps Christmas' contemplative mood could lower your traditional, human resistance to home truths.

(Notes: a: Man's worth in this society being determined by his standing, it's consequential that the smartest coders go to MS et al., instead of doing their own thing, since most own things in coding don't generate high 6- to low 7-digit incomes p.a., and that's why even from "independent developers", you don't get real goodies in most occurrences either; b: Don't blame me for not having read the original Cornell Univ article from http://digitalcommon...ll.edu/articles/910/ "Creativity from Constraint? How Political Correctness Influences Creativity in Mixed-Sex Work Groups", I reminded you of the reasons for this just days ago; don't blame me for not developing the bias between Nathan's article and the mixed-sex work group setting we're referring to - believe us: mixed-sex group thinking has very little to do with mental auto-crippling, fascism just being another word for group-dynamics, and vice-versa, and leafing through any newspaper of your choice does show the effects of this very unhealthy miswiring of most of our brains, page after page.)

25
General Software Discussion / Re: Good news for any InfoSelect users
« on: December 21, 2014, 12:52 PM »
1

Your infatuation with IS is established, and I also am acquainted with the ways some people here kick up long-buried threads from the dust again, with titles having become hilarious in time, so I didn't really fall for the obvious trap "Good news", for revival of a more-than-5-years old thread. In other words, I have not been astonished at all by the absence of any good news. On the other hand, you never know, there might be a chance there is, so you have to look it up, and that way, they get you anyway - it's disrespect for readers NOT interested in lies but in news, but as long as foxes do protect the hens...

2

Thus, since I have been lured into this anyway, let me say that some, "But what IS has never addressed is the Cloud and all the features that, as in OneNote, allow one seamlessly to store and search things that aren't straight text, including audio." is of sociological interest indeed, but that in fact, illiterates discuss linguistics.

Any discussion about how to group and make available files and info therein in multiple, optimized ways, is a modern one, and an expedient one, cf. http://www.outliners...m/topics/viewt/5593/ , and I do not deny that cloud access has become worthwhile to discuss, too; any whining about not all your stuff being replicated within some proprietary db repository is just completely unsexy, vieux jeu blatant amateur blah blah (not to be confused with possible enthusiasts' rectifiable beginners' mistakes).

You simply have to conceive that smart sw designers - Mr. Lewis here - might have realized that the "put it in the pim" concept has had its day, and that they act accordingly* (and oh, that, "The response to my second e-mail was no response at all. I don't understand why MicLog refuses even to confirm that they are (a) working on an upgrade; (b) not working on an upgrade. If it's the latter, then there's nothing to lose by saying so." - that was a good one! cf. the never-ending dying of askSam) - and get some other clothes, please, your old ones are unbearable to look at. (Oh, it isn't even you? You a replica? That fits then.)

* = Look at it from another angle: One smart developer at least doesn't degrade himself to taking any more money from idiot users willing to throw their money again and again into the wrong direction. It's just that you ain't accustomed to honest merchants anymore, Slatty.
