Apr
28
LSI 9211-8i / m1015 SAS Controller
Filed Under Storage Server
My storage server has been stuck at 8 hard drives for a while, as my current motherboard has only 8 SATA II ports on board. I picked up this board specifically for those 8 ports so it would last as long as possible, but it is time to start looking for expansion solutions.
In my search I found that SAS / SATA controllers are more expensive than I would have hoped, so I went looking for more affordable solutions. LSI boards tend to be highly recommended as they have excellent support in both Linux and Windows.
I came across the IBM m1015, which is a rebadged LSI 9211-8i (technically labeled as an LSI 9220-8i) and comes highly reviewed. I picked one up about two weeks ago, but it has been sitting in a box since then as I needed some 8087 to SATA breakout cables.
These cards can be picked up for ridiculously cheap ($65 – $80) on eBay, compared to the LSI 9211-8i which retails for nearly $250. In fact, you can even flash the m1015 directly with the LSI 9211-IT firmware, which is recommended if you want to use the card in Initiator-Target (IT) mode.
About this Post: April 28, 2012
Apr
26
ESXi: “A general system error occurred”
Filed Under Administration, VMware ESXi, Work
Recently, I had a virtual machine that DRS was trying to vMotion from HOST_A to HOST_B on an NFS datastore. Once every couple of hours vCenter would attempt to start the vMotion, and each time we were greeted with a nasty error message:
“A general system error occurred: Source detected that destination failed to resume”
This only occurred on vMotions from or to HOST_A. Hosts B, C, D, and E could all migrate successfully back and forth.
Checking the vCenter Server logs showed a short message indicating it was a vMotion failure related to the datastore:
[MIGRATE] VMotion failed: vmodl.fault.SystemError
After double- and triple-checking the configuration in vCenter, I could not find any differences between the hosts. Eventually, we went directly to the ESX hosts to view the NFS datastore information:
user@host_a~ # esxcfg-nas -l
NFS-01 is /nfs-01 from 192.168.1.101 mounted
NFS-02 is /nfs-02 from 192.168.1.102 mounted
NFS-03 is /nfs-03 from 192.168.1.103 mounted

user@host_b~ # esxcfg-nas -l
NFS-01 is /nfs-01/ from 192.168.1.101 mounted
NFS-02 is /nfs-02/ from 192.168.1.102 mounted
NFS-03 is /nfs-03/ from 192.168.1.103 mounted
Apparently HOST_A was missing the trailing forward slash on each share path, which was causing discrepancies between the two hosts. What's aggravating is that this was not visible through the vSphere client when viewing the properties of the NFS datastore. On top of that, we had "cloned" the configuration from HOST_A to HOST_B, but the clone script adds the trailing / at the end.
After removing the three datastores on HOST_A and using the script to re-clone them with HOST_B as the template, everything was functioning normally.
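Since the vSphere client hid the difference, a mismatch like this is easier to catch by comparing the raw `esxcfg-nas -l` output from each host. The following is only a rough sketch in Python, assuming the output has already been captured as text and that every line follows the "NAME is SHARE from HOST mounted" layout shown above. The function names `parse_nas_list` and `diff_mounts` are made up for illustration.

```python
import re

# Matches lines such as: "NFS-01 is /nfs-01 from 192.168.1.101 mounted"
NAS_LINE = re.compile(
    r"^(?P<name>\S+) is (?P<share>\S+) from (?P<host>\S+) (?P<state>.+)$"
)


def parse_nas_list(output):
    """Map each datastore name to its (server, share path) pair."""
    mounts = {}
    for line in output.splitlines():
        match = NAS_LINE.match(line.strip())
        if match:  # shell prompts and blank lines are skipped
            mounts[match.group("name")] = (
                match.group("host"),
                match.group("share"),
            )
    return mounts


def diff_mounts(a, b):
    """Return (name, value_on_a, value_on_b) for every datastore that differs."""
    diffs = []
    for name in sorted(set(a) | set(b)):
        if a.get(name) != b.get(name):
            diffs.append((name, a.get(name), b.get(name)))
    return diffs


# Output copied from the two hosts above.
HOST_A = """NFS-01 is /nfs-01 from 192.168.1.101 mounted
NFS-02 is /nfs-02 from 192.168.1.102 mounted
NFS-03 is /nfs-03 from 192.168.1.103 mounted"""

HOST_B = """NFS-01 is /nfs-01/ from 192.168.1.101 mounted
NFS-02 is /nfs-02/ from 192.168.1.102 mounted
NFS-03 is /nfs-03/ from 192.168.1.103 mounted"""

for name, on_a, on_b in diff_mounts(parse_nas_list(HOST_A), parse_nas_list(HOST_B)):
    print(f"{name}: HOST_A={on_a} HOST_B={on_b}")
```

The comparison is deliberately an exact string match, so "/nfs-01" and "/nfs-01/" are reported as different, which is the distinction that mattered here.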
About this Post: April 26, 2012
Apr
25
After the mini-blunder with OpenIndiana picking up the Intel Quad-Port card out of order the other day, I figured everything should be smooth sailing from here on out, as I've already set up link aggregation once before.
The Plan: Remove the current aggr0 on bnx0/bnx1 and create a new aggr0 on igb0/igb1, potentially adding in igb2/igb3 later, or possibly setting up igb2/igb3 as dedicated iSCSI ports instead (TBD).
Tearing down the link aggregation was easy:
root@storage:~# dladm delete-aggr aggr0
I temporarily brought up each device igb0, igb1, igb2, igb3 to test each port. Everything was good.
I then went and created a new aggr0 with the igb0/igb1 ports, and set a new static IP:
root@storage:~# dladm create-aggr -l igb0 -l igb1 aggr0
root@storage:~# ipadm create-addr -T static -a 192.168.1.100/24 aggr0/v4
Recycled the system, and everything came up as expected, but I received a nasty error message at the console on boot:
igb0: DL_BIND_REQ failed: DL_SYSERR (errno 16)
igb0: DL_UNBIND_REQ failed: DL_OUTSTATE
igb1: DL_BIND_REQ failed: DL_SYSERR (errno 16)
igb1: DL_UNBIND_REQ failed: DL_OUTSTATE
Failed to plumb IPv4 interface(s): igb0 igb1
Well, after a little digging, I found that when I tested the network connections I had created two files (/etc/hostname.igb0 and /etc/hostname.igb1) that each contained an IP address. The error above (errno 16 is EBUSY, "device busy") was simply letting me know that the system was trying to plumb igb0 and igb1 on their own while they were already in use by aggr0.
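A conflict like this can be spotted before rebooting by checking whether any /etc/hostname.<link> file names a link that is already an aggregation member. Here is a minimal Python sketch of that check. The function name and the sample directory listing are hypothetical, and on a real system the listing would come from something like `os.listdir("/etc")`.

```python
def conflicting_hostname_files(etc_entries, aggr_members):
    """Return hostname.<link> entries whose link already belongs to an aggregation."""
    prefix = "hostname."
    conflicts = []
    for entry in sorted(etc_entries):
        if entry.startswith(prefix) and entry[len(prefix):] in aggr_members:
            conflicts.append(entry)
    return conflicts


# Hypothetical listing of /etc, trimmed to a few entries for illustration.
etc_entries = ["hosts", "hostname.igb0", "hostname.igb1", "resolv.conf"]

# Members of aggr0 as created above.
aggr_members = {"igb0", "igb1"}

print(conflicting_hostname_files(etc_entries, aggr_members))
```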
Deleted the two hostname.igb* files, rebooted, and again, everything came up as expected…
Well… maybe.
I got back to my desk to do a little testing, but I was seeing some very strange behavior. Of my two test Linux boxes, one could connect via NFS and one couldn't. In fact, the one that couldn't had no access at all; even ping failed.
Rebooted, and both systems worked fine. Except now a third system was unable to connect. At this point it was obvious something was incorrectly configured.
After a little searching, I found this article that made me think there might be a driver problem somewhere. With a little more digging and a quick netstat -p, I found that several of my connections were trying to use ports that weren’t connected.
Apparently I had left igb2 and igb3 "configured" even though neither had an Ethernet cable plugged in. For some reason, OpenIndiana was still trying to use these ports for some connections when aggr0 was busy, even though they had no physical link.
Ugh. Removed the final two /etc/hostname.igb* files, rebooted, and everything came back up as expected.
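Leftover configuration on unplugged ports can be found by pairing the link state with the hostname files. The sketch below is Python and assumes the colon-separated output of `dladm show-phys -p -o link,state` has been captured as text. The sample output and both function names are hypothetical, made up to mirror the situation described above.

```python
def links_without_carrier(show_phys_output):
    """Return the links whose reported state is anything other than 'up'."""
    down = []
    for line in show_phys_output.splitlines():
        link, _, state = line.strip().partition(":")
        if link and state != "up":
            down.append(link)
    return down


def stale_hostname_files(etc_entries, down_links):
    """Return hostname.<link> entries that configure a link with no carrier."""
    return [f"hostname.{link}" for link in down_links
            if f"hostname.{link}" in etc_entries]


# Hypothetical parseable output: igb2 and igb3 have no cable attached.
SAMPLE = "igb0:up\nigb1:up\nigb2:down\nigb3:down\n"

# Hypothetical listing of /etc after the first cleanup.
etc_entries = {"hosts", "hostname.igb2", "hostname.igb3"}

print(stale_hostname_files(etc_entries, links_without_carrier(SAMPLE)))
```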
Really for real this time.
About this Post: April 25, 2012
