[arch-general] Bad Magic Number in Superblock - Any trick for Arch or for new kernels?

David C. Rankin drankinatty at suddenlinkmail.com
Thu Jun 10 12:06:36 EDT 2010


On 06/10/2010 08:46 AM, Mauro Santos wrote:
> From my experience only, I find it quite hard to know when a disk is
> about to fail. Currently I am trying to figure out if a hard disk in a
> machine I manage is about to fail or not (3.5" drive). smart says it is,
> but badblocks can't find anything wrong with the drive (even after 2 full
> write passes). One of the smart attributes, the one marked
> failing_now, increases by one with each full read cycle, yet the smart
> attributes do not report any reallocated sectors. This is a new drive (6 months
> old, give or take) and the other drives assembled in the machine have
> exactly the same usage and do not show any signs of trouble (the serial
> numbers of the drives are all very close, almost sequential, all from
> the same manufacturer).

Mauro,

	Your experience sounds exactly like mine over the past year. I have had 4
Seagate drives supposedly "go bad" after 13-14 months of use (1-2 months after
the warranty runs out). The problem is always the same: smart says there is a
badblock problem and logs the time/date of the error. Subsequent passes with
smartctl -t long show no additional problems and the drives always 'PASS'.
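
	To see which attribute smart is actually flagging, the attribute table
printed by smartctl -A can be filtered on its WHEN_FAILED column. The following
is only a rough sketch: the function name failing_attributes and the SAMPLE text
are made up for illustration (the sample values are not from any real drive),
and it assumes the usual ten-column ATA attribute table layout.

```python
def failing_attributes(smartctl_output):
    """Return (id, name, when_failed) for every attribute whose
    WHEN_FAILED column is something other than '-'."""
    flagged = []
    in_table = False
    for line in smartctl_output.splitlines():
        fields = line.split()
        # The attribute table starts right after this header row.
        if fields[:2] == ["ID#", "ATTRIBUTE_NAME"]:
            in_table = True
            continue
        # Skip anything that is not a full attribute row.
        if not in_table or len(fields) < 10 or not fields[0].isdigit():
            continue
        when_failed = fields[8]  # 9th column: WHEN_FAILED
        if when_failed != "-":
            flagged.append((int(fields[0]), fields[1], when_failed))
    return flagged


# Hypothetical sample in the layout of `smartctl -A`; values are invented.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   006    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   030   030   030    Pre-fail  Always   FAILING_NOW 12345
"""

if __name__ == "__main__":
    for attr_id, name, state in failing_attributes(SAMPLE):
        print(attr_id, name, state)
```

Comparing the flagged attribute's raw value before and after a full read pass
would show whether it really climbs with each cycle.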

	Where this behavior between badblock/smart/Seagate drives is killing me is that
most of my drives run in raid1 sets with either dmraid or mdraid. The dmraid
installs seem to be the most sensitive to this problem. I know that the hardware
ought to provide badblock remapping on a per-drive basis on the fly, but I still
don't have a good feel for how dmraid handles this internally.

	Regardless, when I split an array where one drive is showing badblock issues
and then use that drive as a single drive, I don't have any more problems
with it. So, from what I'm seeing, there is a problem in the way
smart/badblocks/dmraid play together. I don't have a clue what it is, but I've
been through that scenario 4 times in the past 12 months.

	This failure is different. Here the drive was stand-alone to begin with and,
unlike the earlier badblock/dmraid drives, it can no longer be read
with any power supply. (When I work on drives out of the machine, they have a
dedicated power source provided by the usb connection kit.) I think the only way
I will ever get an answer on this drive is if I find my dump of the CHS
partition info for the drive and then manually re-create the partitions to tell
the drive where to start looking.
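
	For reference, turning saved CHS values back into absolute start sectors
is simple arithmetic. The sketch below uses the standard CHS-to-LBA formula; the
function name chs_to_lba is just for illustration, and the defaults of 255 heads
and 63 sectors per track are an assumption (the common translation geometry).
The real values have to come from the saved dump for the drive.

```python
def chs_to_lba(cylinder, head, sector,
               heads_per_cylinder=255, sectors_per_track=63):
    """Convert a cylinder/head/sector address to a logical block address.

    Sectors are numbered from 1; cylinders and heads from 0.
    The default geometry (255/63) is an assumption, not read from any drive.
    """
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector must be between 1 and sectors_per_track")
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head must be between 0 and heads_per_cylinder - 1")
    if cylinder < 0:
        raise ValueError("cylinder must not be negative")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


if __name__ == "__main__":
    # A partition recorded as starting at C/H/S 0/1/1 under 255/63 geometry.
    print(chs_to_lba(0, 1, 1))
```

The resulting sector number is what a partitioning tool would need as the
start of the re-created partition, provided the geometry matches the one in
effect when the dump was taken.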

	C'est la vie... I'll provide a follow-up if I manage to uncover any more on
the reason for the failure. Thanks for your help.

-- 
David C. Rankin, J.D.,P.E.
Rankin Law Firm, PLLC
510 Ochiltree Street
Nacogdoches, Texas 75961
Telephone: (936) 715-9333
Facsimile: (936) 715-9339
www.rankinlawfirm.com

