

"Delayed Write Failed" — on FIVE computers at a time?


IainB:
@yksyks: Well, I'm stumped. I can only think of a few questions, and I suspect they have already been covered by your good self. Reading through this, and to summarise and make a few assumptions (so please correct me if I am wrong), it seems that:

1. The error symptoms are:

* After a long idle period (usually overnight), the system crashes with several "Delayed write failed..." error messages. The files involved are usually $Mft and C:\Windows\system32\config, but may include seemingly random others as well.
* In all instances where this happens, the USB ports and the network card stop working.
* Replacement and stress testing of the devices affected indicates no evidence of actual hardware errors.
___________________________________
Question: Where any affected hardware has been removed and found to be OK, has it been subsequently reinstalled to the same computer and had its performance monitored/observed, and has it subsequently been affected by the same or a different error?


2. Common/differentiating factors in the population of computers affected:

* All of the affected computers use Windows XP SP3 (but with differing language versions), and they all use BitDefender Free Edition. There are no other substantial/significant similarities.
* The affected computers do not share the same AC power supply. (Is this true?)
* The affected computers do not share the same DC power supply. (Is this true?)
* LogMeIn may/may not be a common factor for the affected computers. (Needs verification.)

3. Population of computers affected:
The error symptoms have spread amongst what appears to be a gradually widening population of computers, some of which share a network and some of which are discrete systems - i.e., not interconnected or intercommunicating. (Is this true?)
___________________________________
Observation: This sort of thing could look like the spread of a virus or bug introduced at a common point to all computers affected - e.g., maybe (say) at the point of a system/software update.
___________________________________
Questions:

* Have the update (change management) logs for all computers - including those computers affected and not affected - been compared to establish what specific updates were done and when, in the run-up to any manifestations of the errors?
* Were any system updates derived from the same media or update data file(s)?
* Does a checksum comparison of all of the "same" update files (and any other files on the media used) - for all computers affected and not affected - show any differences in the files? (There should be no difference.)
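As a rough illustration of the checksum comparison suggested above, here is a minimal Python sketch. It assumes the update files from each machine (or medium) have been copied into separate folders; the folder names in the usage comment and the function names are hypothetical, and SHA-256 is simply one reasonable choice of checksum.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_trees(reference, other):
    """Report files that are missing, extra, or different in `other`."""
    ref, oth = hash_tree(reference), hash_tree(other)
    return {
        "missing": sorted(set(ref) - set(oth)),
        "extra": sorted(set(oth) - set(ref)),
        "changed": sorted(n for n in set(ref) & set(oth) if ref[n] != oth[n]),
    }


# Hypothetical usage: compare a known-good copy against an affected PC's copy.
# print(compare_trees("updates_reference", "updates_pc3"))
```

If the update files really are the same everywhere, all three lists should come back empty for every machine compared against the reference copy.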

4. Conclusions thus far:

* After a great deal of investigation and analysis so far, none of the errors has been reproducible on demand, and so the cause(s) remain unknown.
* The causal problem seems very likely to be software-related, not hardware.

Shades:
Great summarization, IainB.

Delayed write failure errors only occur when a device isn't responding in the allotted time to signal the file system that the write action can take place.
There are many reasons for this to happen, and they have already been discussed. However, these are practically always hardware related (in my personal/anecdotal experience at least).

@yksyks:
Are the systems also properly cooled? All the time? Are you sure?
The reason I ask is that here in Paraguay there can be very high ambient temperatures. Because of that I need hard disk coolers, which are screwed onto the bottom of the drive and have two fans each (one to blow air onto the device, the other to suck the hot air away). Without these, I can run these disks for only a few hours (in spring and summer) and then the operating system/file system quite simply "loses" the drive.

Running hard disks at high temperature seriously shortens their life span and damages them in the meantime.

If you do work with hard disk coolers, are these functioning properly? A fan that is stuck or not moving smoothly draws much more power than you expect and in essence becomes a heating element...residing under your hard disk, making the drive hotter more quickly.

Do you use solutions that make any of your PC fans slow down after a while? Are these coolers perhaps connected to chassis fan connectors? (I have such crap here when people bring me computers to repair.)

Software can do crazy things if the hardware supplies a '1' when the software expects a '0'. When that happens on a slightly bigger scale, a cascading effect occurs that makes your computer behave erratically. With the densely packed (magnetically/electrically) hard drives of today, there isn't much margin for error anymore on the hardware side, and in combination with an intertwined operating system such as Windows those errors can easily wreak havoc.

You have a vague problem and it is good of IainB to ask/confirm the details of your setup. Without a good description our guesses are as good as yours :P

IainB:
...Delayed write failure errors only occur when a device isn't responding in the allotted time to signal the file system that the write action can take place.
And there are many reasons for this to happen and have been already discussed. However, these are practically always hardware related (in my personal/anecdotal experience at least). ...
_______________________________
-Shades (September 08, 2014, 10:57 PM)
--- End quote ---

Of course I looked already into the event viewer. The only suspect messages I found there, related to HD, were repeated warnings from PerfDisk: Unable to read the disk performance information from the system (Event ID: 2001). After a series of these follow assorted crashes of random apps and disk-related errors.
These errors occur solely after some hours of inactivity. So far they never appeared while working on the machines. Even the machine left idle for a couple of days doesn't crash every night, though. So far I was unable to track down any pattern.
_______________________________
-yksyks (September 06, 2014, 06:51 AM)
--- End quote ---

These comments do not seem easy to reconcile. Reading of "disk performance information" above reminded me of the following, though I am not sure whether/how it could be relevant:
EDIT 2012-09-17:
Hooray! This seems to be an effective fix to the episodic real-time performance monitoring issue:
(for more info., refer HDS FAQ page http://www.hdsentinel.com/faq.php)

The real time performance monitoring worked per the Registry settings workaround (see earlier edit below), but after some time (for example after connecting/removing external hard disk, pendrive or similar storage device) it stopped working and I periodically had to reset the Registry settings - i.e., the Registry settings change did not "stick". This was apparently caused by a function in HDS which provides for performance monitoring when a new device - e.g., an external hard disk - is connected/detected. When this happens, Hard Disk Sentinel has a function that clears the performance object cache and re-detects the performance objects. On some systems (regardless of hardware configuration) this function apparently causes the Windows performance monitoring settings in the Registry to be disabled.

If this happens, you can disable this HDS function as follows:

* 1. click "start" (Windows) button and to the search field enter REGEDIT
* 2. open REGEDIT
* 3. navigate to HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\HD Sentinel (or HKEY_LOCAL_MACHINE\SOFTWARE\HD Sentinel under 32 bit Windows), where you will see a lot of keys.
* 4. create a new STRING key named DisablePerfCacheClear and specify a value of 1 for that.
Then restart HDS, which now will not issue this special function to clear the performance object cache when it detects the change of configuration, so the performance counters will continue working normally - once reset in the Registry. Those Registry settings should now "stick" and not need to be reset again.

--- End quote ---
-IainB (February 02, 2012, 08:31 PM)
--- End quote ---
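For anyone who would rather not click through REGEDIT by hand, the quoted steps could be sketched as a small .reg file generated in Python. This is only an illustration under assumptions: the helper name build_reg_file is hypothetical, the key path and value name are taken from the quoted steps above, and the resulting file would still have to be saved and imported manually on the affected machine (use the non-Wow6432Node path on 32-bit Windows).

```python
def build_reg_file(key_path, value_name, value_data):
    """Return the text of a .reg file that sets one string (REG_SZ) value."""

    def esc(text):
        # Backslashes and double quotes must be escaped inside quoted strings.
        return text.replace("\\", "\\\\").replace('"', '\\"')

    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key_path}]",
        f'"{esc(value_name)}"="{esc(value_data)}"',
        "",
    ]
    return "\r\n".join(lines)


# Key path and value name as given in the quoted HDS steps (64-bit Windows).
reg_text = build_reg_file(
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\HD Sentinel",
    "DisablePerfCacheClear",
    "1",
)
print(reg_text)
```

The printed text can be saved as a .reg file and reviewed before importing; HDS would then need a restart, as described in the quoted steps.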

4wd:
However, these are practically always hardware related (in my personal/anecdotal experience at least).-Shades (September 08, 2014, 10:57 PM)
--- End quote ---

Just to add to this, I've had the same experience here with Delayed Write Failure and in almost every case it's always come down to one piece of hardware - the SATA or USB data cables.

For SATA - those cheap, non-locking ones that were so prevalent a few years ago. Low contact pressure combined with the heating/cooling cycles within a PC case doesn't make for a happy combination.

For USB - the constant plug/unplug or the mechanics of the plug/socket sometimes result in a marginal electrical contact. (E.g., on my netbook there's one USB port where if I plug a cable all the way in, the device isn't recognised - pull it back 1 mm or 2 and it's fine. Thus any decent vibration will result in the plugged-in device suddenly disappearing from the system.)

yksyks:
Thanks to all.

Let me summarize in brief:

I suppose any hardware cause is ruled out now. Temperatures are okay, cable replaced, the disks repeatedly tested on other machines, some replaced too (with cloned contents).

As for my mention of LogMeIn, please forget it. I meant TeamViewer; sorry, I mixed up the two.

And as for the PerfDisk warning: it appeared on only one machine. On the others different problems were reported, like an unreachable paging file, and so on.

As for HD Sentinel, I don't have anything like that anywhere in the registry.

Now, some news: On one PC I replaced Bitdefender with a trial version of Malwarebytes. The behavior changed dramatically: now it doesn't report anything, it just crashes to a hard reboot. The reboot always fails, as no HD is found, even in the BIOS. A reset doesn't help; only after switching off the power and restarting does the system start normally. The other change is that it now happens every couple of minutes, regardless of whether the machine is idle or not.

So, my question is: Could there be a virus so powerful that it would be capable of disabling the disk controller? I still suspect some software conflict, but...

I could be cursed as well. Believe it or not, but since yesterday the only working machine in the house (a notebook with Vista) shows just a pitch black screen. It's working normally, I can reach it using TeamViewer, just its display died. Funny.
