Messages - helmut85

Ath, I don't understand this symbol. All I can say is that Paul Keith is my friend. His way of presenting things is terrible, but very rare are those unique people who try to THINK instead of repeating common "truths" / unison. I'll continue this thread in "They're still standing."

"I created this analogy so that you would understood too why full web clipping has a unique future over partial web clipping." Paul, again, most of the time, I clip the whole text of a web page but then quickly bold important passages, or do it later on. It's not so much about cutting out legit material there (but about cutting out all the "crap" around it), but about FACILITATING FURTHER ACCESS to that same material: You read a text, you form some thoughts about it or simply find some passages more important than others, and so you bold them or whatever, the point being, your second reading, months later perhaps, will not start from scratch, but will concentrate on your "highlighted" passages, and also, these are quickly recognizable - now for simply downloaded web pages: you'll start from zero, and will even have some crap around your text. I think it RIDICULOUS that these pim's do their own, sub-standard "browsers" (e.g. in MI, in UR...), but don't think about processing these "original" downloaded pages, nor about clearly distinguishing bits within them. All this is so poor here, and in direct comparison, the pdf "system" is much more practical indeed. This being said, I hate the pdf format, too, for its numerous limitations. But downloaded "original" web pages are worse than anything - and totally useless; as said, in years, I never "lost" anything by my way of doing it.

An example among thousands: You download rulings. You quickly bold the passages appearing important for your current context. Then you clip, from this whole text body, some passages - probably, you'll do this after some days, i.e. after having downloaded another 80 or 120 rulings in your case, i.e. you won't do this before really knowing what's decisive here, hence your need to read, and to "highlight passages within", many more such rulings. So what I'm doing here then, I trigger my clipping macros on some passages within these bold text blocks (or even between them if in the meantime it occurred to me that my initial emphasis was partly misplaced), and I paste them, together with the origin, url, name of item and such, into the target texts.
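For what it's worth, the clipping step described above could be sketched roughly like this in Python. This is only an illustration under assumptions, not the actual macros: it assumes the bolded passages are marked as `**...**` in a plain-text body (the post doesn't say how the bold is stored), and the names `Clip`, `highlighted_passages`, `make_clips` and `format_clip` are all made up for the example.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumption: bold passages are marked as **like this** in the stored text.
BOLD = re.compile(r"\*\*(.+?)\*\*", re.DOTALL)


@dataclass
class Clip:
    """One clipped passage plus where it came from (hypothetical structure)."""
    text: str
    source_name: str
    source_url: str


def highlighted_passages(body: str) -> List[str]:
    """Return the bolded passages of a text body, in reading order."""
    return [m.group(1).strip() for m in BOLD.finditer(body)]


def make_clips(body: str, source_name: str, source_url: str) -> List[Clip]:
    """Turn every bolded passage into a clip that carries its origin."""
    return [Clip(p, source_name, source_url) for p in highlighted_passages(body)]


def format_clip(clip: Clip) -> str:
    """Render a clip for pasting into a target text, origin included."""
    return f"{clip.text}\n[from: {clip.source_name}, {clip.source_url}]"
```

You'd call `make_clips(ruling_text, "Ruling 1", "https://example.org/ruling-1")` once per ruling and paste the `format_clip(...)` output into the target text; the point is only that the second pass works from the highlighted bits, and that every clip keeps its origin with it.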

What do YOU do (rhetorical question here), with your "downloaded original web pages" - you all must read them a second time, before doing any clipping. That's what I'm after here: The original web page becomes a hindrance as soon as you have quickly read it once: After this first viewing, it should become something more accessible than the original web page is. Pdf is much better here, and my system of downloading plain text, then reformatting it to my needs, is best... if you have just a few pictures, tables, formulas that is - hence the ubiquity of pdf in sciences, and rightly so.

But mass downloading of web pages in their original format is like collecting music tunes (incl. even those you don't even like). It's archaic, it's puerile, it's not thought-out, and it's highly time-consuming if done for real work. And that's why I'm not so hot about progs that do this downloading "better" than other pim's do, but which ain't among the best pim's out there.

It's all about choices - but today's consumers make lots of wrong choices, that's for sure. Hence my "educational stance" I get on so many people's nerves with. But then, somebody explain to me why it would be in anybody's interest to have a replica as faithful as possible to the original within your Ultra Recall browser (! and which probably is much better in your Surfulater browser), months after downloading that web page, the real prob being that both force you to begin at zero with its respective content then.

This way of doing things is simply, totally crazy, and 95 p.c. of people doing it this way is no reason to imitate their folly. Oh yeah, such pim's sometimes present a "commentary" field, for entering your thoughts about that web page or such. Ridiculous, as said, far worse even than so-called "end notes".

Do you realize that in the end, it's again about the "accessibility of information"? Let's say you read all these things, and stored all these things. Then, in order to really have them available, in their bits, for your work, I have to browse 100 rulings for these important passages (remember this was done by first reading, so perhaps my reading time there was about 120 p.c. of yours at the same time) - whilst you will read all these 100 rulings again, which makes more than double reading time, since your second reading will be slowed down by your fear to not have "got" all the important passages here (when, in my work flow, it's probably time for underlining sub-passages now), and then, as I do, you'll export your clips.

Ok, in one single situation your way of doing things would appear acceptable: When you download pages without even reading them first in the slightest possible way. But then, is it sensible to do this?

There's some irony here, too: There's a pim user fraction who says, I'm happy with plain text. I'm reverting web pages to plain text, whilst many people probably fear loss of information when they don't "get" the formatted text of the web page in its original form. And then, I need formatting capabilities in my pim or whatever. So, "do your own formatting". Once more, most of the time, I get the text in whole, then "highlight" by formatting means. It's rare that I just clip a paragraph or so, because indeed I fear my changing clipping considerations. And indeed, it's very probable you need some ruling for a certain aspect now, but for another aspect in the future, so it's sensible to download it in full, then clip parts from this whole text body - but even then, it's of utmost utility to have important passages "highlighted", from which you'll clip again and again.

And in general, please bear in mind that you choose at any given moment. Ok, it's the whole page you download, but from a site containing perhaps 150 pages or more, i.e. even when you try to NOT "choose beforehand" in order to preserve the material in its entirety, your choice will have been made as to which pages you download in full, and which you don't download.

So, put a minimum of faith into your own discernment: When choosing web pages to download, AND when choosing the relevant part of these web pages you'll download.

And Paul, it's of course about bloating the db or not, as you rightly said: It's about response times and such (whilst some db's get much bigger now, cf. Treepad Enterprise, the higher-priced version of it). But the real point is, try to not bloat your own data warehouse, with irrelevant material, even when technically, you're able to cope with it: Your mind, too, will have to cope with it, and if you have too much "dead data" within your material, it'll outgrow the available processing time of your mind.

And as I said elsewhere, finding data after months is greatly helped by tree organization, in which data has got some "attributed position", from a visual pov. Trees are so much more practical than relational db's only are, for non-standardized data. Just have 50,000 items in an UR db, and then imagine the tree was missing but you had to exclusively rely upon the prog's (very good) search function.

All the worse then that there'll never be a REALLY good pim, i.e. that UR and all its competitors will remain with their innumerable faults, omissions and flaws. And don't count on Neville to change this - I'd be the happiest person alive if that happened, but Neville won't do it, it's as simple as that.

EDIT : It just occurred to me that I never tried to download web pages into a pim, but when you do, you will probably never get them out and into another pim, so even when downloading them, having them in a special format or special application, then just linking to them, seems to be preferable independently of the number you tend to download... And that would be .mht for just some pages, in my case, or WebResearch for pages in numbers, in your case - that'd be my advice at least if you cannot leave this web page collecting habit behind you. Stay flexible. Don't join the crowd "I've got some data within this prog and that I otherwise don't touch anymore" - I read lots of such admissions, and it's evident these people did something wrong, at some point.

Paul, I know I haven't answered some points you made, and it's my lack of time these days (will be better in Feb when I recoup some of these, promised - it's just I have to read you 5 times VERY SLOWLY before having a chance to get some points, in a minimum of order, and I lack this time).

"There's still SO MUCH lacking in PIMs it's crazy.", you say. Oh yes, and that's about CHOICES. So back to Neville for once.

Neville, you made your choices, which is your right, but to be frank, I'm unhappy with your choices. Fact is, from the price of your editor, and from the number you claim to have sold your editor, anybody can do the maths, and even over the years, it's evident you got millions for your work in your editor. This is very fine, and I'm happy for you.

Problem is, you didn't invest much of this money, time-wise, in the perfection of your editor, where I would have done exactly this. I know a little bit about editors, as I know a little bit about pim's, and as far as I can say - I trialled your editor -, you stopped development of it at a rather early stage, i.e. I know lots of functionality available in other editors, even for a much lower price, that's missing in yours. So it's a FACT when I state your editor is overpriced, and I'm too "dumb" to see where the "real" quality of your editor might hide. Stability? Lots of good editors are stable. Ease of use? Not so much. So I don't get it but acknowledge your numbers... and then I ask myself, with such numbers to back further and farther development, why stop development?

Because, I assume, you've got marketing considerations within your way: Instead of developing the perfect editor - which yours is rather far away from, in spite of its price -, you saw (I assume) that traditional editors are "dead", so further and farther development would have "cost" you big time, without procuring any substantial returns, especially when comparing them to those you got already with your editor as-it-is.

And then, Surfulater. Some very good ideas, some real brilliance, and then, instead of developing a really good pim which, on top of being a really good pim, is the best pim-like web page downloader (well, one by one, manually, let's not start dreaming too much here!) - instead of developing such a beast, for marketing reasons, you stop development, and you do it from scratch again, more or less web-based, and in that proprietary fashion I wrote about above. All this is up to you, it's your product line, and for your purse, I don't have the slightest doubt that your decisions to stop development on your editor in order to make gains from Surfulater, and then again, to stop development on desktop Surfulater in order to bring out a new product, are highly beneficial.

But then, it's developers like you, Neville, brilliant developers but whose eyes are too much on their purse instead of the excellence of their product, who are responsible for what Paul says: "There's still SO MUCH lacking in PIMs it's crazy."

There will never be a brilliant editor, never be a brilliant pim, never be anything really outstanding in any general sw category - because those brilliant developers who could do it, at a given moment stop and then do something else from scratch instead because of the financial gains they see lying there, and they want them. There are sociologists who say, everything above 5,000 euro / 6,500 bucks will not make you any happier than those 5,000 euro, so there should be lots of room for successful developers to develop a product further than economic "reason" will tell them. But it's simply not done, nowhere.

And don't say, "hold it slick, people want it slick" - people want to have easy access to elaborate functionality, they want intuitive ways to do their work, and the simpler this gets, the more work the developer has to do. But no, they don't do their work, they have too much to do in their striving for the dollar. (It's similar for file commanders and many other sw categories: Nothing REALLY good out there, they all stop at that "further work wouldn't be enough return for me" point.)

And that's why 35 years after the intro of the pc, and the pc "dying" now, pc sw never has reached a mature state - not even Word which doesn't become a good text processor but by applying lots of internal scripting to its core functionality (but which at least allows for such internal scripting - most pim's do not).

And then again: Where's Surfulater's functionality to smoothly PROCESS what you got from the web by it, and that's my point. Developers do have the right to stop development early on, but I then have the right to not be happy about what I see, and let developers know (not that this made any difference on them: that indeed I've seen a long time ago).


"There has been some debate here about political and related topics. And many at DC (including our host) feel this is not really an appropriate venue for it."

Thank you, 40hz, that explains a lot. On the other hand, this way, the owners (the owner and his "men") of this forum have to ask themselves, at what side of the table do we place ourselves by this stance?

Just today, there's press coverage of an adjacent subject I missed covering in my "essay" above, which is package identification and paying for some packages, payment by the sender (here Google) for the "infrastructure" of the web provider (here: Orange, in France), in order for the customer (= you and me) to receive the content in question.

It's obvious that these Google vids with their lots of traffic constitute a prob for those "providers", but then, in my country, you pay them 50 bucks a month for a "flatrate", and some of these "providers" don't offer a real "flatrate", but impose a limit of 50 GB / giga per month.

So the real problems here are, soon there will be a time where with one provider you'll get "everything", whilst with another, you'll get "anything but Google vids", and then, there is, "anything but a, b, c...z" in the end, and you cannot change your provider each month, you have minimum contract terms, and periods of notice. That's prob 1.

Prob 2 is, more and more it will become accepted to have inspected these packages, and eventually, they could even refuse to transport encrypted packages on the pretext these could contain not even illegal content, but simply non-contractual content.

Thus, I thought that DC was a "users'" forum, and does not represent the "industry".

( Ironic here, on the "Die Zeit" site, today, they speak about a possible "perfume" or such that will enhance your natural body odour, and somebody leaves the commentary, well, this is new indeed, for the first time, they will sell you something you've already got by nature! Somebody else: a good work-out could enhance this natural body odour as well (the point in the article being that females would like to smell this odour in order to feel attracted (or not), a case of "biological matching", by "matching genetic material"). Why I'm speaking about this? Because above, I said the "industry" sells academic papers, with horrible prices on top of that, to the general public who's already the owner of these academic findings, having financed them all to begin with. )

I acknowledge I perhaps shouldn't have posted these "political" things here, in "sw", but in the "general" part of the forum, but then, I also wanted to explain the mutual reverberations between scraping sw (Surfulater, WebResearch) and pim's, AND then the web in general and content in general - at the end of the day, we're speaking of external content here, and even when we speak about simple pim's here, we're speaking of their ability to handle content originally belonging to third parties, so it's all some mix-up where everything I'm discussing belongs to something else within this lot.


"I think the "Of course Surfulater can also grab entire webpages was what lead to helmut85 saying it was for web collectors."

Thank you, Paul, that was my point in this respect. In fact, whenever you clip bits only, any such pim will be more or less apt (and certainly will with some external macro boosting), whilst those two "specialist" offerings are there in order to render whole web pages (much?) better than the task is executed by your ordinary pim. On the other hand, if it's not about whole web pages, I don't see the interest of these "specialists", since as pim's, both ain't as good as the best pims out there are.

This is addressed to nevf = Neville, the developer, and I perfectly understand that you defend your product, but then, there have been lots of customers (or, in my case, prospects) who eagerly awaited better pim functionality in your product but which never came, and fact today is, as a pim, it's not within the premier league, and that's why I call it a specialist for special cases, but I don't see many of these special cases, because for downloading web pages for legal reasons - I said this elsewhere -, neither your product nor your competitor, WebResearch, is able to serve this special purpose either.

You've made a choice, Neville, which is, have the best scraper functionality in pim's, together with WebResearch - it seems Surfulater is not as good as WR here, but then, as a pim-like, it seems to be much better than WR, so it might be the best compromise for people wanting to download lots of web pages in full, but as said, then you have two probs, not enough good pim functionality here (since it was your choice to not develop this range of functionality to the fullest), and - I repeat my claim here, having asked for info about possible mistakes in what I say, but not having received such info yet -, for annotating these downloaded web pages, what would there be? (Just give me key words for me searching your help file for these, and the url of that help file, and hopefully there are some screenshots, too.)

As soon as you do clips either from web pages on the web, or from downloaded web pages, there's much very different functionality needed, and where some pim's are much better than others, and where any pim isn't that good in the end, but where you can add some functionality with external macros, especially when your pim offers links to items (which Surfulater does if I remember well, so it's not my claim that Surfulater can't be used for such a task, my claim being, lots of other pims are equally usable here, and they offer more pim functionality on top of this).

Paul, as for pains with pdf's, you should know that most sciences today have lots of their stuff in pdf format, and certainly more than 90 p.c. of their "web-available" / "available by electronic means" stuff in this format, hence the interest of pim's able to index pdf's, hence the plethora of alternative pdf "editors" and other pdf-handling sw, allowing for annotating, bookmarking, etc., so your claim (if I understand well) that pdf is a receding format, is not only totally unfounded, but the opposite is true.

Neville, this brings me to the idea that any "specialised sw", specialised in the very best possible rendering of web pages as-they-are (since, as said, it's uneconomical to download lots of web pages, just because, with your sw, it's "possible"), should go one step further and also do pdf M, by this blurring the discrimination between downloaded web pages, and downloaded pdf's - but then, it should also offer lots of, and easy = half-automated web pages annotation / bookmarking features, too.

Paul, with lots of your writings, I have a recurrent problem, and please believe me that my stance isn't a denigrating one, nor a condescending one: I mix up lots of aspects, but then try to have a minimum of discernment there, by numbering / grouping. In your texts, every idea stays mixed up with everything else, and so, most of the time, for perhaps about 80 p.c. of your text bodies, I just simply don't get what you try to express, and as said, this is a recurrent problem I have with these texts of yours. I'm not a native speaker, as we all know, but then, I get your English words, but I don't get the possible meaning behind them, and very often, I have the impression (as a non-native speaker, as said) that your sentence construction is in the way, so perhaps, after posting, could you re-read your texts, and then partially revise (as I do, and be it just for typos, in my case)? I repeat myself here: My "criticising" your texts has the only objective to "get" better texts from you that I'd then better and more fully understand, since I suppose up to now I don't get many good ideas buried in them, and staying buried even when I try to read you, and that's a pity.

You must see that when Ath does apply condescension to us both, giving us advice to write in blogs instead, i.e. telling us to be silent here, it's, for one, that most people in "chat rooms" prefer short postings between which they can then jump as a bee or other insect would between many different flowers, and also because they don't want to think a lot: Here, it's spare time, it's not meant for getting new insights except when they come very easy - but it's also because the effort of reading some people doesn't seem rewarding enough - a question of formatting texts, of inserting sub-headers, of trying to offer "one-bit-at-a-time", and so on. And when, in your case, there's also a debatable sentence structure and ideas not developed one after another but thrown together, and then perhaps discussions by fractions of them, these discussions thrown together again, and introducing new sub-ideas, then "re-opened" many lines below... well, we can't blame people refusing to read us when we wouldn't like to read ourselves in the end, can we?


Some other off-topic theme that has got some connections, though:


I haven't had a television set for ages: I couldn't bear them stealing my time anymore. I always thought - it's different with good films where you dive into the atmosphere of the film in question, instead of wanting it to hurry up, but there ain't many good films left in European television's programming these days - that they slow down your time on purpose. They do some news, which costs you 10 minutes. Instead, they could have done it by presenting you a "magazine article" or something in which you could have read the same info in 3 or 4 minutes if not in 2, very often. Much worse even, anything that is "entertainment" there: They always slow down what's going on there, it's absolutely terrible, and at the same time, you might be interested in what will follow, so they force you to do "parallel thinking": You try to not spend these moments exclusively on the crap you're seeing, but at the same time, that very crap there is interrupting any other thinking you're doing, at any moment (since it IS continuing, but at a pace that virtually kills you).

Hence my thinking that tv is meant for stealing time, for making people not think, for filling up the spare time of people in a way that their thinking processes are slowed down to a max - they call this "suspense". Of course, you can remind me of tv "being for everybody", so it "has" to be somewhat "slow" - but to such a degree? Just a little bit slower yet, and our domestic animals could probably follow! This is intellectual terrorism.

Where's the connection? Well, my topic is fractionizing and then re-presentation of information / content, and this "tv way" of doing it, needing 1 minute for presentation of a fact that should need 8 or 13 sec., at the opposite of what Paul seems to be doing, i.e. mixing up 5 different things in 5 sentences, then mixing 3 of them up in the following one, then mixing up again 2 from the first and 1 from the second with another 2 ones, is another apotheosis in information rape.


The French legislator has postponed the subject of a proposed law on these "data expeditor having to pay, too" issues ad infinitum, meaning they want to see first how it all goes wrong in every which way, then perhaps they'll do something about it. Bear in mind that authorities, and especially the French ones, have historically been highly interested in data content, so they certainly rejoice at this move by Orange / France Télécom (or told them to do this move in the first place: acceptance is everything, so they have to play it cool, first).

Paul, I fully understood your very last post after 5 or 6 times reading now. As for the preceding one, I'm always trying. My prob here being, I didn't ever have similar probs with posts of somebody else, not here, not elsewhere. So it should partly be a prob in your writing, as in my writing conception, there's certainly some flaws, too.

I'm very sorry you didn't find any idea applicable to your own workflow here, and indeed I was just a little bit disappointed that Aaron Swartz' premature death didn't give rise to any obituary here, before mine, and which didn't trigger any thought about that guy and his mission expressed here. As for my wearing out the servers, I'm not so sure that some text takes so much more web space than lots of unnecessary and often rather silly graphics adult men regularly post here just for fun, so I hope that I will not attract too much wrath on my head too early, by trying to share some ideas in a constructive way. Thank you.
