
Thread: /dev/sdb failure on mjollnir

  1. #1
    Simetrical's Avatar Former Chief Technician
    Join Date
    Nov 2004
    Location
    θ = π/0.6293, φ = π/1.293, ρ = 6,360 km
    Posts
    20,299

    Default /dev/sdb failure on mjollnir

    The second of mjollnir's disks, /dev/sdb, just started hitting 90% utilization with basically no usage, like this:
    Code:
    Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
    sda               0.30   102.20    2.30   73.00     0.09     0.68    20.93     0.30    3.97   2.03  15.30
    sdb               1.10    89.30    2.10   55.70     0.19     0.56    26.62     2.41   40.88  16.12  93.20
    sdc               0.70    88.70    4.10   33.30     0.14     0.48    33.88     0.16    4.20   2.22   8.30
    sdd               0.20   101.10    6.10   50.10     0.09     0.59    24.83     0.23    4.13   1.83  10.30
    Notice how we had 0.19 MB read per second and 0.56 written in this ten-second sample, but it was still 93.20% utilized. The site started slowing down a lot and I was getting occasional 503s and load spikes. It looks like pretty much all disk activity halted. I removed sdb from the array so the site would work again, and started running a SMART test. No errors seem to be logged anywhere. This disk has seen several thousand hours of usage with no problems, so this failure comes out of the blue.
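The pattern described here (very high %util with almost no data moving) is simple enough to check for mechanically. Below is a minimal Python sketch, assuming the older `iostat -x` column layout shown in the sample above. The function names and both thresholds (90% utilization, 5 MB/s combined throughput) are made up for illustration and are not anything the thread's authors used.

```python
def parse_iostat(text):
    """Parse the device rows of `iostat -x` output into {device: {column: value}}."""
    header = None
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Device:":
            # Column names follow the "Device:" label on the header line.
            header = fields[1:]
            continue
        if header and len(fields) == len(header) + 1:
            try:
                values = [float(f) for f in fields[1:]]
            except ValueError:
                continue  # not a device row (e.g. a timestamp line)
            rows[fields[0]] = dict(zip(header, values))
    return rows


def suspicious_disks(rows, util_threshold=90.0, mb_threshold=5.0):
    """Return devices that are nearly saturated while moving very little data.

    Both thresholds are illustrative assumptions, not tuned values.
    """
    flagged = []
    for dev, cols in rows.items():
        throughput = cols["rMB/s"] + cols["wMB/s"]
        if cols["%util"] >= util_threshold and throughput < mb_threshold:
            flagged.append(dev)
    return sorted(flagged)


# The ten-second sample quoted in the post above.
SAMPLE = """\
Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.30   102.20    2.30   73.00     0.09     0.68    20.93     0.30    3.97   2.03  15.30
sdb               1.10    89.30    2.10   55.70     0.19     0.56    26.62     2.41   40.88  16.12  93.20
sdc               0.70    88.70    4.10   33.30     0.14     0.48    33.88     0.16    4.20   2.22   8.30
sdd               0.20   101.10    6.10   50.10     0.09     0.59    24.83     0.23    4.13   1.83  10.30
"""

if __name__ == "__main__":
    print(suspicious_disks(parse_iostat(SAMPLE)))  # prints ['sdb']
```

Newer sysstat releases use different column names, so the parser would need adjusting there.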

    If the SMART test comes out okay, I'll try re-adding the disk. If we get a SMART test failure, or the disk acts up again when re-added, we're going to have to replace it too. I really hate these things.

    For now, mjollnir is running degraded. sdc will handle sdb's read load now; if sdc fails, mjollnir will die, but I hardly find that likely. Of course, we then still have thor to switch to, so I'm not worried at all.
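For anyone wanting to confirm the degraded state from a script, Linux md reports it in `/proc/mdstat` as a member count such as `[4/3]` plus a map such as `[U_UU]`. The sketch below is one way to read that, under assumptions: the sample text is invented (the thread never shows mjollnir's actual array name, RAID level, or partition layout), and `degraded_arrays` is a hypothetical helper name.

```python
import re

# "[4/3] [U_UU]" means 4 members expected, 3 active, second slot missing.
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")
ARRAY_RE = re.compile(r"^(md\d+)\s*:")


def degraded_arrays(mdstat_text):
    """Return {array: (expected, active, slot_map)} for arrays missing members."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        array = ARRAY_RE.match(line)
        if array:
            current = array.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected:
                result[current] = (expected, active, status.group(3))
            current = None
    return result


# Invented sample: NOT taken from mjollnir; device and array names are assumptions.
SAMPLE_MDSTAT = """\
Personalities : [raid10]
md0 : active raid10 sdd1[3] sdc1[2] sda1[0]
      586067968 blocks 64K chunks 2 near-copies [4/3] [U_UU]

unused devices: <none>
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE_MDSTAT))
```

On a real machine the text would come from reading `/proc/mdstat` rather than from a hard-coded string.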

    I'm going to e-mail Garb and GED about what our budget is like. If we can afford to get another disk or two for backup in case of something like this, we should really consider it.
    MediaWiki developer, TWC Chief Technician
    NetHack player (nao info)


    Risen from Prey

  2. #2
    Squid's Avatar Opifex
    Join Date
    Feb 2007
    Location
    Frozen waste lands of the north
    Posts
    15,698

    Default Re: /dev/sdb failure on mjollnir

    Are there any alternatives to these raptors that seem to cause no end of problems for us?


    Under the patronage of Roman_Man#3, Patron of Ishan
    Click for my tools and tutorials
    "Two things are infinite: the universe and human stupidity; and I'm not sure about the universe." -----Albert Einstein

  3. #3
    Simetrical's Avatar Former Chief Technician
    Join Date
    Nov 2004
    Location
    θ = π/0.6293, φ = π/1.293, ρ = 6,360 km
    Posts
    20,299

    Default Re: /dev/sdb failure on mjollnir

    There doesn't seem to be any other disk in the same price and performance range. Given that the motherboards are SATA, the only non-cheapo drives that seem to be out there are VelociRaptors and SSDs, and the latter are way out of our price range. If anyone has any ideas, though, I'd be happy to hear them.
    MediaWiki developer, TWC Chief Technician
    NetHack player (nao info)


    Risen from Prey

  4. #4
    Simetrical's Avatar Former Chief Technician
    Join Date
    Nov 2004
    Location
    θ = π/0.6293, φ = π/1.293, ρ = 6,360 km
    Posts
    20,299

    Default Re: /dev/sdb failure on mjollnir

    The SMART test came out okay. I'm running badblocks -wsve1 right now, and it's gone through three full passes with no problems yet. I've noticed from iostat that it's doing 100+ MB/s serial reads and writes during badblocks. So once badblocks finishes, if there are no errors, I'll try re-adding it: maybe it was some transient glitch.
    MediaWiki developer, TWC Chief Technician
    NetHack player (nao info)


    Risen from Prey

  5. #5
    GrnEyedDvl's Avatar Barackolypse Now
    Join Date
    Jan 2007
    Location
    Denver CO
    Posts
    20,990

    Default Re: /dev/sdb failure on mjollnir

    Maybe, but we got one for thor today too.

  6. #6
    GrnEyedDvl's Avatar Barackolypse Now
    Join Date
    Jan 2007
    Location
    Denver CO
    Posts
    20,990

    Default Re: /dev/sdb failure on mjollnir

    Quote Originally Posted by Simetrical View Post
    There doesn't seem to be any other disk in the same price and performance range.
    Maybe we should switch models. I have been staying with the same model just for consistency, but NewEgg is now showing another 300 gig Raptor for $169, which is actually less than we paid for the original Raptors ($199 I think). Maybe it's just that model that sucks; if you notice, they also have a TON of recertified Raptors of the model we currently use listed for $114. It let me add 174 of them to my cart, so I am guessing they have 174 that people returned for issues similar to ours: problems with the drive despite passing the SMART test.

    I am willing to drop an extra $20 per drive to get a different model, but not too fond of the idea of just buying more of the same.
    http://www.newegg.com/Product/Produc...82E16822136802

  7. #7
    Squid's Avatar Opifex
    Join Date
    Feb 2007
    Location
    Frozen waste lands of the north
    Posts
    15,698

    Default Re: /dev/sdb failure on mjollnir

    Isn't that $20 more but also 200G less?


    Under the patronage of Roman_Man#3, Patron of Ishan
    Click for my tools and tutorials
    "Two things are infinite: the universe and human stupidity; and I'm not sure about the universe." -----Albert Einstein

  8. #8
    Simetrical's Avatar Former Chief Technician
    Join Date
    Nov 2004
    Location
    θ = π/0.6293, φ = π/1.293, ρ = 6,360 km
    Posts
    20,299

    Default Re: /dev/sdb failure on mjollnir

    sdb passed badblocks as well as SMART, so I've tried re-adding it. We'll see if it causes problems.
    Quote Originally Posted by GrnEyedDvl View Post
    Maybe, but we got one for thor today too.
    No, that was deliberate. I was testing something while moving stuff off the slow disks on thor. md doesn't distinguish between a disk that's actually failed, and one that the administrator deliberately removes. That way the failure code-paths are better tested, but it means deliberate removal results in spurious alerts.
    Quote Originally Posted by GrnEyedDvl View Post
    Maybe we should switch models. I have been staying with the same model just for consistency, but NewEgg is now showing another 300 gig Raptor for $169, which is actually less than we paid for the original Raptors ($199 I think). Maybe it's just that model that sucks; if you notice, they also have a TON of recertified Raptors of the model we currently use listed for $114. It let me add 174 of them to my cart, so I am guessing they have 174 that people returned for issues similar to ours: problems with the drive despite passing the SMART test.

    I am willing to drop an extra $20 per drive to get a different model, but not too fond of the idea of just buying more of the same.
    http://www.newegg.com/Product/Produc...82E16822136802
    We could try it, I guess.
    Quote Originally Posted by Squid View Post
    Isn't that $20 more but also 200G less?
    No, it's 300G. That's the same size as our current Raptors. It's the old slow disks on thor that are 500G.

    That said, there are now 600G Raptors available for $250. We don't have a use for them now, but it's worth keeping in mind. When we originally bought them, they only went up to 300G in size.
    MediaWiki developer, TWC Chief Technician
    NetHack player (nao info)


    Risen from Prey

  9. #9
    Simetrical's Avatar Former Chief Technician
    Join Date
    Nov 2004
    Location
    θ = π/0.6293, φ = π/1.293, ρ = 6,360 km
    Posts
    20,299

    Default Re: /dev/sdb failure on mjollnir

    It's now resyncing. That should be done by sometime tomorrow, and then we'll see how it goes.
    MediaWiki developer, TWC Chief Technician
    NetHack player (nao info)


    Risen from Prey

  10. #10
    Simetrical's Avatar Former Chief Technician
    Join Date
    Nov 2004
    Location
    θ = π/0.6293, φ = π/1.293, ρ = 6,360 km
    Posts
    20,299

    Default Re: /dev/sdb failure on mjollnir

    It happened again:
    Code:
    03/18/2011 12:58:45 PM
    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
               9.29    0.00    0.56    3.41    0.00   86.74
    
    Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
    sda               0.10     0.00    0.50    0.20     0.03     0.00    74.57     0.00    4.29   4.29   0.30
    sdb               0.00     0.00    0.20    0.20     0.01     0.00    54.50     2.01 5115.00 2500.00 100.00
    sdc               0.00     0.00    0.40    0.20     0.01     0.00    23.00     0.00   18.33   5.00   0.30
    sdd               0.00     0.00    0.00    0.20     0.00     0.00     5.00     0.00   15.00  15.00   0.30
    
    03/18/2011 12:58:55 PM
    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
               0.09    0.00    0.17   12.03    0.00   87.71
    
    Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
    sda               0.00     0.00    0.10    0.70     0.00     0.00     7.25     0.01   12.50   7.50   0.60
    sdb               0.00     0.00    0.00    0.60     0.00     0.00     7.00     3.01 2650.00 1665.00  99.90
    sdc               0.00     0.00    0.00    0.70     0.00     0.00     7.14     0.01   14.29   7.14   0.50
    sdd               0.00     0.00    0.00    0.70     0.00     0.00     7.14     0.01   12.86   7.14   0.50
    I removed sdb from the RAID again, and I'm not re-adding it. I'll e-mail GED about getting some extra disks.
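The numbers in these samples hang together in a way that makes the problem easier to see: %util is roughly the request rate times the per-request service time. A small Python sketch of that arithmetic follows, using only figures quoted in this thread. As far as I understand it, older sysstat versions derive `svctm` from utilization and request rate in the first place, so treat this as a consistency check on the output rather than an independent measurement. `util_percent` is a name chosen here for illustration.

```python
def util_percent(r_per_s, w_per_s, svctm_ms):
    """Approximate %util as requests/s times service time per request.

    iops * svctm_ms gives milliseconds busy per second of wall time;
    dividing by 1000 and multiplying by 100 (i.e. dividing by 10) gives percent.
    """
    iops = r_per_s + w_per_s
    return iops * svctm_ms / 10.0


# (r/s, w/s, svctm in ms) for sdb, copied from the iostat samples in this thread.
SDB_SAMPLES = {
    "first report": (2.10, 55.70, 16.12),
    "12:58:45": (0.20, 0.20, 2500.00),
    "12:58:55": (0.00, 0.60, 1665.00),
}

if __name__ == "__main__":
    for label, (r, w, svctm) in SDB_SAMPLES.items():
        print(f"{label}: {util_percent(r, w, svctm):.1f}% busy")
```

To one decimal place this reproduces the reported 93.2, 100.0, and 99.9. In the later samples sdb handles well under one request per second, each taking 1.6 to 2.5 seconds, which is why it shows as fully busy while transferring almost nothing.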
    MediaWiki developer, TWC Chief Technician
    NetHack player (nao info)


    Risen from Prey

  11. #11
    GrnEyedDvl's Avatar Barackolypse Now
    Join Date
    Jan 2007
    Location
    Denver CO
    Posts
    20,990

    Default Re: /dev/sdb failure on mjollnir

    Quote Originally Posted by GrnEyedDvl View Post
    I am willing to drop an extra $20 per drive to get a different model, but not too fond of the idea of just buying more of the same.
    http://www.newegg.com/Product/Produc...82E16822136802
    This is ridiculous. I went to order 2 of these today and now they show as out of stock. If they don't have some by Mon or Tues (Amazon shows out of stock as well), I will order the 450 gig drives for $199. I am not ordering the same model that keeps dying on us.

  12. #12
    Simetrical's Avatar Former Chief Technician
    Join Date
    Nov 2004
    Location
    θ = π/0.6293, φ = π/1.293, ρ = 6,360 km
    Posts
    20,299

    Default Re: /dev/sdb failure on mjollnir

    Did you order the new drives yet?
    MediaWiki developer, TWC Chief Technician
    NetHack player (nao info)


    Risen from Prey

  13. #13
    GrnEyedDvl's Avatar Barackolypse Now
    Join Date
    Jan 2007
    Location
    Denver CO
    Posts
    20,990

    Default Re: /dev/sdb failure on mjollnir

    Yes I did. We have 2 more 300 gig drives and a null modem adapter on the way. I wasn't sure if we needed that or not, since I didn't see you post anywhere about testing serial capability.

  14. #14
    Simetrical's Avatar Former Chief Technician
    Join Date
    Nov 2004
    Location
    θ = π/0.6293, φ = π/1.293, ρ = 6,360 km
    Posts
    20,299

    Default Re: /dev/sdb failure on mjollnir

    Oh, right, I forgot about that. I have very little time right now, don't know when I'll be able to.
    MediaWiki developer, TWC Chief Technician
    NetHack player (nao info)


    Risen from Prey

  15. #15
    GrnEyedDvl's Avatar Barackolypse Now
    Join Date
    Jan 2007
    Location
    Denver CO
    Posts
    20,990

    Default Re: /dev/sdb failure on mjollnir

    I have the drives. I will plan on Weds for install. I will replace sdb and leave the other new one installed in slot 5 as a spare. I will also install the null modem adapter unless I hear from you that we don't need it.

  16. #16
    Simetrical's Avatar Former Chief Technician
    Join Date
    Nov 2004
    Location
    θ = π/0.6293, φ = π/1.293, ρ = 6,360 km
    Posts
    20,299

    Default Re: /dev/sdb failure on mjollnir

    I see sde and sdf in mjollnir. I'm running smartctl -t long and badblocks -wsve1 on both of them right now. Will add them when the tests finish.
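For reference, the two burn-in commands mentioned here expand as follows: `smartctl -t long` starts the drive's extended self-test, and `badblocks -wsve1` is `-w` (destructive write-mode test), `-s` (show progress), `-v` (verbose), and `-e 1` (abort after the first bad block). The Python sketch below only builds and prints the argument lists without running anything. `burn_in_commands` is a hypothetical helper, not a script from this thread.

```python
import shlex


def burn_in_commands(device):
    """Build (but do not run) the burn-in commands for one block device.

    WARNING: badblocks -w overwrites the entire device, so this is only
    appropriate for a disk that is not part of any array or filesystem.
    """
    if not device.startswith("/dev/"):
        raise ValueError(f"expected a /dev/ path, got {device!r}")
    return [
        # Extended SMART self-test; runs on the drive itself in the background.
        ["smartctl", "-t", "long", device],
        # Same flags as the combined form -wsve1 used in the post.
        ["badblocks", "-w", "-s", "-v", "-e", "1", device],
    ]


if __name__ == "__main__":
    for dev in ("/dev/sde", "/dev/sdf"):
        for argv in burn_in_commands(dev):
            print(shlex.join(argv))
```

Because the SMART self-test runs in the background on the drive, its result has to be read back afterwards, for example with `smartctl -l selftest`.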
    MediaWiki developer, TWC Chief Technician
    NetHack player (nao info)


    Risen from Prey

  17. #17
    Simetrical's Avatar Former Chief Technician
    Join Date
    Nov 2004
    Location
    θ = π/0.6293, φ = π/1.293, ρ = 6,360 km
    Posts
    20,299

    Default Re: /dev/sdb failure on mjollnir

    Disks checked out clean. I'm adding both of them. /dev/sde will take over as the fourth disk, and /dev/sdf is a hot spare that will automatically take over if one of the four used disks fails.
    MediaWiki developer, TWC Chief Technician
    NetHack player (nao info)


    Risen from Prey

  18. #18
    Simetrical's Avatar Former Chief Technician
    Join Date
    Nov 2004
    Location
    θ = π/0.6293, φ = π/1.293, ρ = 6,360 km
    Posts
    20,299

    Default Re: /dev/sdb failure on mjollnir

    New disks running fine so far.
    MediaWiki developer, TWC Chief Technician
    NetHack player (nao info)


    Risen from Prey

  19. #19
    GrnEyedDvl's Avatar Barackolypse Now
    Join Date
    Jan 2007
    Location
    Denver CO
    Posts
    20,990

    Default Re: /dev/sdb failure on mjollnir

    What do we need to do with sdb?

  20. #20
    Simetrical's Avatar Former Chief Technician
    Join Date
    Nov 2004
    Location
    θ = π/0.6293, φ = π/1.293, ρ = 6,360 km
    Posts
    20,299

    Default Re: /dev/sdb failure on mjollnir

    We're not going to be using it, right? So doesn't matter. It's not being used for anything, you can pull it out if you like.
    MediaWiki developer, TWC Chief Technician
    NetHack player (nao info)


    Risen from Prey
