Topic: Partial (corrupted) downloads from a server (Not a valid win32 application etc)

jaden

  • Supporting Member
  • Joined in 2008
  • **
  • Posts: 145
Quote:
"Usually that happens after mouser and Gothi[c] do some config changes to the/a server, they wait for a little time to see if 'that helped', and if it did they happily report it fixed here"

That was my first thought too, but I wasn't daring enough to move into full-on wishful thinking mode  ;)

Stoic Joker

  • Honorary Member
  • Joined in 2008
  • **
  • Posts: 6,649
I may have just got lucky, but IE9 grabbed that file and all the others at the link f0dder posted just fine on the first shot.

I tried it again and it downloaded completely in all 3 browsers, as well as wget.

Great, we've confirmed the problem is intermittent.  :wallbash:
 :D

Carol Haynes

  • Waffles for England (patent pending)
  • Global Moderator
  • Joined in 2005
  • *****
  • Posts: 8,069
Just tried http://carrolld.dcme...craft/2011-04-07.png in IE8, Firefox 3.6.16, Chrome, Opera and Safari and it loaded in each with no problem.

I have to say I have come across this problem before with client machines that won't complete downloads but I have often found other machines download them fine - even on the same WiFi network.

It's strange and I have never found an acceptable answer to the problem.

Could it be something to do with faulty caching at ISPs when it is a more general problem?

f0dder

  • Charter Honorary Member
  • Joined in 2005
  • ***
  • Posts: 9,153
  • [Well, THAT escalated quickly!]
Works here now too, Firefox as well as wget from the Linux box. Seems to have a relatively slow start, but I hit 330kb/s for one of the larger images :Thmbsup:. Hope it's Gothic that's been sprinkling magic dust on the servers :)
- carpe noctem

JavaJones

  • Review 2.0 Designer
  • Charter Member
  • Joined in 2005
  • ***
  • Posts: 2,739
So far this is what we have seen in this thread:

Initially, at least 3 people (f0dder, jaden, and myself) were able to confirm the issue with downloading those PNGs.
At least 2 confirmed that wget had the same problem during that time period.
Subsequently, multiple people confirmed that the issue is no longer present with either wget or the browser.

What this says to me is A: it's not browser-specific (wget had the same problem previously while the issue was occurring) and B: it's intermittent.

Hopefully all this is moot and they've found a fix already. :D

- Oshyan

mouser

  • First Author
  • Administrator
  • Joined in 2005
  • *****
  • Posts: 40,914
Quote from: JavaJones
"Hopefully all this is moot and they've found a fix already"

Nope. Let me clarify this -- We are at a loss as to how to fix this and what's causing it -- that's why I posted.

It's not a new phenomenon but it seems to have gotten worse since the server move in January.  It's obviously not just our server suffering from this, as a web search reveals.

But this is not something easy -- as you have noticed it's very intermittent and random and hard to reproduce.

Ath

  • Supporting Member
  • Joined in 2006
  • **
  • Posts: 3,629
And it isn't related to the datacenter, or its connection to the backbone, where the servers are hosted?

mouser

  • First Author
  • Administrator
  • Joined in 2005
  • *****
  • Posts: 40,914
it might be.. if someone could explain how or what could be wrong that could cause this..

i'll repeat the point i keep trying to emphasize -- the internet is a giant heap of malfunctioning junk powered by an infrastructure of software that is insanely retarded.. but even so, when downloading a file if you have connection problems, your browser knows that the download has not finished and waits there until the connection recovers, or else aborts and knows that it has failed with an incomplete file.  and the communication protocols know when there is packet loss and request packets to be resent, etc.

the salient point about this mystery is that something is truly malfunctioning in this normal process which causes downloads to stop in the middle, and the browser is getting completely confused into thinking that the download has completed successfully.

the question is how can this happen -- in what specific way are things getting so fouled up as to result in the browser thinking it has successfully downloaded a complete file without error when it hasn't, bypassing all of the normal processes that normally detect and recover from bad transfers?
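
To make the question concrete: an HTTP client normally knows a body is short because the response declares its size in a Content-Length header. The sketch below is only an illustration (the helper name `classify_body` and the sample responses are made up, not captured from the DC server). It shows that comparison, and also the one case where a cut-off transfer cannot be told apart from a finished one: a response with no declared length whose body simply ends when the connection closes. Whether that case applies here is an assumption, not something established in this thread.

```python
def classify_body(raw: bytes) -> str:
    """Classify a raw HTTP/1.x response as 'complete', 'truncated' or 'unverifiable'.

    Only Content-Length framing is handled; chunked encoding is left out
    to keep the sketch short.
    """
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        return "truncated"  # connection ended before the headers finished

    declared = None
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            declared = int(value.strip())

    if declared is None:
        # Body is delimited by the connection closing: a killed server and
        # a finished transfer look identical from the client side.
        return "unverifiable"
    return "complete" if len(body) >= declared else "truncated"


# Hand-written sample responses (hypothetical, not real traffic).
full = b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
short = b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhel"
no_length = b"HTTP/1.0 200 OK\r\nContent-Type: image/png\r\n\r\nhel"

for name, raw in (("full", full), ("short", short), ("no_length", no_length)):
    print(name, "->", classify_body(raw))
```

If the broken downloads turn out to arrive with a correct Content-Length and a short body, the client should have been able to notice; if they arrive without one, it could not have.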

Stoic Joker

  • Honorary Member
  • Joined in 2008
  • **
  • Posts: 6,649
Can you temporarily disable download resuming to see what (if any) effect it has on the behavior?

Even if the server had a shaky connection to the backbone, I just don't see a router munging a packet so that it perfectly resembles a completed transfer.

Nobody jumped at the script timeout angle, and it doesn't appear that any are being used.

There are too many client side variations experiencing the same issue for the problem to exist on that end.

(Bear with me I'm just thinking out loud here. :))

What are the chances of a file system glitch making the server think the file is smaller than it really is?

mouser

  • First Author
  • Administrator
  • Joined in 2005
  • *****
  • Posts: 40,914
Keep up the great ideas, suggestions, clues, etc.  Definitely some good ideas on this thread and I know gothic (dc server admin) will be trying some of this stuff to see if it has any effect or triggers any epiphanies.

Quote from: Stoic Joker
"What are the chances of a file system glitch making the server think the file is smaller than it really is?"
Seems unlikely to me -- if that was happening, i would think the server would be experiencing much graver catastrophic crashes in all kinds of operations.
« Last Edit: April 13, 2011, 08:15 AM by mouser »

nosh

  • Supporting Member
  • Joined in 2007
  • **
  • Posts: 1,441
mouser,

Just mentioning this in case it's related. I'm still having a problem attaching images. I just tried uploading a 230 KB image right now to confirm. "The connection to the server was reset while the page was loading." Got the error in under 10 seconds.

I don't have this problem with other sites so I don't think it has to do with my connection. Maybe it _is_ my connection and other sites are more tolerant about timeouts or something?

BTW, I loaded the above png fine, as expected. I haven't had any issues with pages timing out while surfing the forum.

JavaJones

  • Review 2.0 Designer
  • Charter Member
  • Joined in 2005
  • ***
  • Posts: 2,739
Given that this was reproduced with direct file access to those PNGs, it doesn't seem like a "script timeout" issue as far as I can imagine.

Mouser reiterates the critical issue: it's not necessarily surprising that the transfers should break or whatever, it's the fact that when they do, the browser doesn't realize it and subsequent attempts to download the same file (from the same location) don't trigger a re-download but instead just use a cache.

This makes me wonder whether it may be some kind of in-between ISP caching. I know someone else mentioned that. Very hard to say indeed.

It's also important to note we may be looking at 2 separate but related issues. On one side certainly there are occasional/intermittent transfer issues, as seen with the PNG issue pointed out earlier in this thread. These would also be likely at the root of, or at least involved in, the reason the installer download problems are occurring. So that issue certainly seems to need investigation and resolution if possible, and it seems that can only be either server-side or at some link in-between since it was reproduced simultaneously on multiple distant systems with (presumably) different Internet providers. It would be interesting to test packet loss to the server from a machine that is currently experiencing the slow download issue. It's also important to note it *seems* to be isolated to particular files on the server when it occurs, which is an additional twist which really makes it confusing, but at least suggests that it is more specific to the server than an in-between link (maybe...).

And then on the other side is the issue of the browser not realizing - or not being properly told - that a download is actually not complete when the user requests to download it again. This seems to occur cross-browsers so it does not suggest an issue in a particular browser's caching mechanism. While there may be such an issue common to many browsers, it seems somewhat dubious given they all work somewhat differently (for example Firefox uses temp files in the same directory as the download, while IE uses the system's temp folder and then copies the finished file to the destination folder when complete). Another possibility is some kind of caching in-between, likely at the ISP level (I don't know if the higher-level interlink providers like Level 3 do caching though). This would explain why it happens cross-browser. However my own experience with this issue suggests in fact that is not always so consistent. Sometimes using another browser *will* help. So it's all very confusing.
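
One cheap way to probe the in-between caching theory is to request the same file under a URL that no cache has seen before, by tacking on a throwaway query parameter. This is a minimal sketch, assuming the server ignores unknown query parameters for static files and that any intermediate cache keys on the full URL; the parameter name `nocache` and the example.com address are placeholders.

```python
import uuid
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_busted(url: str, param: str = "nocache") -> str:
    """Return url with a unique throwaway query parameter appended."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, uuid.uuid4().hex))  # random value, different every call
    return urlunsplit(parts._replace(query=urlencode(query)))


# example.com is a placeholder, not a real download location.
print(cache_busted("http://example.com/files/setup.exe"))
print(cache_busted("http://example.com/files/setup.exe?v=2"))
```

If a download that keeps coming back truncated suddenly completes under the altered URL, that points toward a cache (browser or in-between) rather than the transfer itself. It proves nothing if the altered URL also fails.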

Bottom line I think the intermittent transfer speed issues from the server are a clear problem and need to be resolved. If they can be fixed it will at least minimize the chance of the "browser download confusion" that is the 2nd part of this issue.

- Oshyan

40hz

  • Supporting Member
  • Joined in 2007
  • **
  • Posts: 11,859
Quote from: JavaJones
"This makes me wonder whether it may be some kind of in-between ISP caching. I know someone else mentioned that. Very hard to say indeed.

It's also important to note we may be looking at 2 separate but related issues. On one side certainly there are occasional/intermittent transfer issues, as seen with the PNG issue pointed out earlier in this thread. These would also be likely at the root of, or at least involved in, the reason the installer download problems are occurring. So that issue certainly seems to need investigation and resolution if possible, and it seems that can only be either server-side or at some link in-between since it was reproduced simultaneously on multiple distant systems with (presumably) different Internet providers. [...]"

Was this happening before the servers went over to VMs?

Stoic Joker

  • Honorary Member
  • Joined in 2008
  • **
  • Posts: 6,649
Quote from: mouser
"Keep up the great ideas, suggestions, clues, etc.  Definitely some good ideas on this thread and I know gothic (dc server admin) will be trying some of this stuff to see if it has any effect or triggers any epiphanies.

Quote from: Stoic Joker
'What are the chances of a file system glitch making the server think the file is smaller than it really is?'
Seems unlikely to me -- if that was happening, i would think the server would be experiencing much graver catastrophic crashes in all kinds of operations."

Not if the corruption is isolated to an area where the OS isn't. Not to mention (if the servers are virtualized) the issue could also exist at the parent machine level.

I had a virtualized mail server that turned into a zero byte file after the power went out once too often. I had recovered it several times before, and typically ran chkdsk on the virtual machine, and then on the parent machine ... But I (got in a hurry) "Forgot" that time and ran it on the parent first. (sh)I.T. happens... :)

ewemoa

  • Honorary Member
  • Joined in 2008
  • **
  • Posts: 2,922
If it happens again, perhaps folks who can reproduce the problem can capture some of the sessions using something like Wireshark...
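
Alongside a packet capture, it may help to have a simple log of which attempts came back short. The following is a rough sketch rather than a finished tool: `probe` and `odd_ones_out` are made-up names, and the download itself is passed in as a function so the example runs without touching the network. It uses canned byte strings in place of real downloads.

```python
import hashlib
from typing import Callable, List, Tuple

Result = Tuple[int, int, str]  # (attempt number, size in bytes, sha256 prefix)


def probe(fetch: Callable[[], bytes], attempts: int) -> List[Result]:
    """Call fetch() repeatedly and record the size and hash of each result."""
    results = []
    for attempt in range(1, attempts + 1):
        data = fetch()
        digest = hashlib.sha256(data).hexdigest()[:12]
        results.append((attempt, len(data), digest))
    return results


def odd_ones_out(results: List[Result]) -> List[Result]:
    """Return the attempts whose size differs from the largest size seen."""
    biggest = max(size for _, size, _ in results)
    return [row for row in results if row[1] != biggest]


# Canned payloads stand in for three downloads of the same file,
# the second one cut short.
payloads = iter([b"x" * 100, b"x" * 60, b"x" * 100])
results = probe(lambda: next(payloads), attempts=3)

for row in results:
    print(row)
print("suspect attempts:", odd_ones_out(results))
```

For a real run, `fetch` could be something like `lambda: urllib.request.urlopen(url).read()`, left running while Wireshark records, so the timestamps of the short attempts can be matched against the capture.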

mouser

  • First Author
  • Administrator
  • Joined in 2005
  • *****
  • Posts: 40,914
Quote from: 40hz
"Was this happening before the servers went over to VMs?"

not nearly as often, but yes.. in fact it was.  which i think is a real clue.
and it's also what makes me think it's not hardware, since this server is on totally different hardware.

mouser

  • First Author
  • Administrator
  • Joined in 2005
  • *****
  • Posts: 40,914
This looks like a very promising clue to me, about something that happens when an apache web server thread is killed.

http://www.gossamer-.../apache/users/398126

If a server is killed (SIGKILL) during a "large" static file transfer, then the client is not notified by his browser that file has not been completely downloaded. On Win it just says it is not a valid Win32 application..

So my thought is that something might be causing apache processes that are in the middle of serving a file to terminate, causing this problem.

Maybe related: http://osdir.com/ml/...003-07/msg00055.html
« Last Edit: April 28, 2011, 04:07 PM by mouser »

worstje

  • Honorary Member
  • Joined in 2009
  • **
  • Posts: 588
  • The Gent with the White Hat
Why are you sending SIGKILL to the server anyway? And if it is some sort of bug, it should probably be a segmentation fault. Either way, I doubt your link to the fastcgi stuff since that message is about 8 years old and I really think that bug has been fixed by now.

steeladept

  • Supporting Member
  • Joined in 2006
  • **
  • Posts: 1,061
I don't know much about Apache, but just thinking out loud here.  Isn't there a way to tell the server to stop sending checks so that it always forces a download?  It isn't an ideal solution, but if it is something like mouser's SIGKILL find, that wouldn't matter if it always proceeded with a new download.

Alternatively, couldn't a check file be sent at the end of each download, so that when a new download starts it looks for that file (something like a pad file)?  If the file doesn't match some criteria (like being present), then it proceeds with a new download.  A new file signature would then need to be added to the end of every new version of a software download, but that is okay... isn't it?

Of course I could be talking far too high level here to be useful, but it was an idea for a workaround until the culprit could be caught.  At the very least, if the problem continues you will know it is at the network level and not at the server level.
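
A common variant of the "check file" idea is to publish a checksum next to each download and compare it after the transfer. The sketch below only illustrates that comparison. It assumes a SHA-256 digest is published somewhere alongside the installer (nothing in this thread says DC does that), and the names `sha256_of` and `is_intact` are invented for the example.

```python
import hashlib
import os
import tempfile


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large installers need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: str, published_digest: str) -> bool:
    """True if the file on disk matches the digest published by the site."""
    return sha256_of(path) == published_digest.strip().lower()


# Demo: temporary files stand in for a good and a half-finished download.
payload = b"pretend installer bytes" * 1000
published = hashlib.sha256(payload).hexdigest()

with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.bin")
    bad = os.path.join(tmp, "bad.bin")
    with open(good, "wb") as fh:
        fh.write(payload)
    with open(bad, "wb") as fh:
        fh.write(payload[: len(payload) // 2])  # simulate a truncated transfer
    good_ok = is_intact(good, published)
    bad_ok = is_intact(bad, published)

print("good download intact:", good_ok)
print("truncated download intact:", bad_ok)
```

This does not stop the partial transfers, but it turns "not a valid Win32 application" into an explicit "download incomplete, try again" that a downloader or the user can act on.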

mouser

  • First Author
  • Administrator
  • Joined in 2005
  • *****
  • Posts: 40,914
Quote from: worstje
"Why are you sending SIGKILL to the server anyway?"

sorry if my post was misleading.. what i was suggesting is that it sounded like the bug might be caused when an apache process randomly died or was killed for SOME UNKNOWN REASON.

and then the next logical question would be -- are apache processes being killed or dying in mid-delivery?

but gothic has thrown cold water on this idea, saying that apache processes are not being killed.

mouser

  • First Author
  • Administrator
  • Joined in 2005
  • *****
  • Posts: 40,914
trying a new setting now.. let's see if it does any good.

f0dder

  • Charter Honorary Member
  • Joined in 2005
  • ***
  • Posts: 9,153
  • [Well, THAT escalated quickly!]
Quote from: worstje
"Why are you sending SIGKILL to the server anyway?"
Might be related to Linux OOM handling?
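
If the kernel's out-of-memory killer were taking out apache processes, it would normally leave a trace in the kernel log. Here is a small sketch of scanning log lines for that, with loud caveats: the exact message wording varies between kernel versions, so the pattern is an assumption, and the sample lines are made up for the demo rather than taken from the DC server.

```python
import re

# Assumed wording of the kernel's "process was killed" message; adjust the
# pattern to whatever the server's kernel actually logs.
OOM_PATTERN = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def oom_victims(log_lines):
    """Yield (pid, process_name) for each line that looks like an OOM kill."""
    for line in log_lines:
        match = OOM_PATTERN.search(line)
        if match:
            yield int(match.group(1)), match.group(2)


# Made-up sample lines, for illustration only.
sample = [
    "kernel: Out of memory: Kill process 2603 (apache2) score 761 or sacrifice child",
    "kernel: Killed process 2603 (apache2) total-vm:498324kB, anon-rss:120044kB",
    "kernel: eth0: link up",
]

victims = list(oom_victims(sample))
print(victims)
```

An empty result on the real logs over a period when partial downloads occurred would argue against the OOM theory; matching timestamps would argue for it.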
- carpe noctem

mouser

  • First Author
  • Administrator
  • Joined in 2005
  • *****
  • Posts: 40,914
Trying a new setting now -- those of you who were able to recreate the server delivering a partial file error, can you try again and let us know if it's still happening?

jaden

  • Supporting Member
  • Joined in 2008
  • **
  • Posts: 145
I downloaded the file without errors, but it also downloaded successfully before the new setting was in place, so it doesn't say much. ;)