D-Link Forums
The Graveyard - Products No Longer Supported => D-Link Storage => DNS-323 => Topic started by: 68catcar on November 17, 2009, 08:20:33 PM
-
There appears to be a software issue with RAID1 recovery under both firmware versions 1.06 and 1.07. I recently purchased this unit and ran my own QA before putting it in service so that I understood the recovery process.
I found that with two Seagate ST3750640NS drives installed and formatted as a RAID1 pair, the DNS-323 functions normally and has decent network access.
If either drive is failed, re-inserted and the unit restarted, the system recognizes that it needs to re-sync. I found that the re-sync process never completes and the minutes-remaining counter shows seemingly random values. It may show 200 minutes, and a screen refresh will bump this to a higher value. I have seen it as high as 1000 minutes remaining. Another symptom is that access to the share is intermittently very slow: opening a folder can take up to a minute, while at other times it appears normal. I believe that this is because the processor is bound by the re-sync process. The drive activity lights are also erratic: they may appear to be copying for a few minutes, then stay solid for another few minutes.
I thought that there might be an issue with the DNS-323 reading the drive parameter block or some other existing partition information. I used another machine to delete all the partition information on the failed drive and re-inserted it. This time I got a message to reformat the drive, which I thought was hopeful. After the drive was reformatted, the re-sync process started with the same behavior as before and never completed.
So far I have found that the only way I can get a mirrored pair back is to delete the partitions on both drives and start fresh. Needless to say, this is not a very good recovery mechanism if I had a drive full of data.
Going through the painful process of tech support, only to reach a product specialist who spent more time questioning why I had put a static IP address on the device rather than using DHCP than addressing the real issue, really frustrated me. So I thought I would try the forum to look for similar experiences or to get the attention of D-Link through this avenue.
Regards
-
Personally - my DNS-323 has never had issues resyncing drives after a simulated failure - however, you appear to be using the unit during the resync, which I have not been doing - based on experience with other RAID devices, this will cause the resync to take longer - much longer.
I also want to point out that reintroducing the failed drive as the replacement can give unpredictable results, and I would suggest you avoid doing it.
-
If you have not reset the unit to factory defaults since changing firmware versions, do so and see whether the resync now completes.
-
I appreciate the responses.
ECF - Yes, I did a factory reset after upgrading to 1.07. As a fail-safe I did so again and will allow a day for the re-sync.
Fordem - I am not using the unit after a simulated failure, other than to access the status page. I do realize how long a re-sync can take. The unit has sat idle for as long as four days without the sync completing.
-
I saw that once when I first formatted in RAID-1, but I've done several simulated failures since and it's always completed the sync. The failure was with older firmware, but now I can't recall if it was 1.06 or 1.07.
-
It is not my intent to argue, however ...
"Another symptom is that access to the share will be randomly very slow as in up to a minute opening a folder and other times appears normal. I believe that this is because the processor is bound by the re-sync process. The drive activity lights show activity randomly where it may appear that they are copying for a few minutes then stay solid for another few minutes."
-
Unless I misread the OP's post, I get the impression that one of the HDDs was removed and then re-inserted without first deleting the partition. A proper "fail" simulation test would involve using a different, "partitionless" HDD, as that is what would normally happen in a real replacement.
This does raise the issue, however, that the DNS-323 should automatically remove any/all partitions present on the newly inserted HDD. Confirmation from D-Link Engineering that this is the case would be appreciated.
In this case, since it's the same HDD, the question that needs answering is how the DNS-323 behaves when an existing HDD is removed and reinserted. I suspect the DNS-323 may be flagging the HDD as problematic since the HDD "disappeared" and "re-appeared", which may be causing an unexpected/unhandled condition. IMHO, the DNS-323 should treat the HDD as "new", remove the partitions, establish the mirror, and resync. If the sync should fail, the DNS-323 should flag the HDD as "failed".
-
I think most would agree that the NAS shouldn't choke if you do this! It should either recognize the disk or do as you suggest and treat it as a new disk. Requiring a clean disk is a burden that shouldn't be imposed.
-
I agree.
IMHO, this is the type of information that should appear in the documentation (or FAQ). The current documentation lacks this type of information, which leads to problems: the intended usage is unclear and left completely open to speculation.
-
The first couple of times I ran the simulated failure, I simply pulled the drive, reset, re-inserted the drive and restarted again. I thought that I should see a message asking to re-format the drive but never did. I suspected that the information in the DPM block was not being reset, so I removed the drive and used another machine to delete the partition. This time when the drive was inserted I did get a dialogue asking to format the drive. I thought I was on my way; however, after the drive was formatted the re-sync process started but never completed, even after days of running. As I stated in my first message, the only way I can get the drives to sync is to delete the partitions off of both drives and start fresh. The drives will then format and sync in about 30 minutes.
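For what it's worth, here is a rough sanity check on the numbers in this thread, written as a small Python sketch. It only does arithmetic: how long a full copy of one 750 GB drive would take at a steady rate, and what rate a 30-minute full copy would imply. The 20 MB/s rebuild rate is purely an assumed figure for illustration, not something measured on the DNS-323, and the function names are made up for this example.

```python
def resync_minutes(capacity_bytes, rate_bytes_per_s):
    """Minutes needed to copy capacity_bytes at a steady rate."""
    return capacity_bytes / rate_bytes_per_s / 60


def implied_rate_mb_s(capacity_bytes, minutes):
    """Average MB/s required to copy capacity_bytes in the given minutes."""
    return capacity_bytes / (minutes * 60) / 1e6


CAPACITY = 750 * 10**9      # ST3750640NS nominal capacity (750 GB, decimal)
ASSUMED_RATE = 20 * 10**6   # hypothetical 20 MB/s rebuild rate, NOT measured

if __name__ == "__main__":
    full = resync_minutes(CAPACITY, ASSUMED_RATE)
    print(f"Full copy at the assumed rate: {full:.0f} minutes")
    rate = implied_rate_mb_s(CAPACITY, 30)
    print(f"Rate implied by a 30-minute full copy: {rate:.0f} MB/s")
```

Under that assumed rate, a full mirror copy would take several hundred minutes, which is the same order of magnitude as the 200 to 1000 minute values the counter showed. A 30-minute full copy, on the other hand, would need an average rate in the hundreds of MB/s, far more than a single hard drive of that generation sustains. That suggests, but does not prove, that the fresh format-and-sync does not copy the whole disk.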