…another slave to the machine

Windows 2008 default Virtual Memory Config blows goats

April 22nd, 2009 Posted in Rants | No Comments »

This is merely a rant about how fucking screwy MS had to be to release a product as DUMB as Windows 2008. For those of you who don’t know, the default Virtual Memory configuration of Windows 2008 is to let the system manage the page file. Well, when you have a 72GB system disk and 32GB of RAM, that means Windows 2008 will let the page file grow to 64GB and won’t stop when the disk runs low on space, essentially fucking your server directly in the ass. Basically, MS thought it was a smarter idea to crash your server than to have it run slow. OK, yeah, I am sure that the one day I need to dump all of my physical memory into the page file so the 64GB file can be examined by an MS support guy (good luck with that), I’ll be upset initially, but since I will have won the lottery and hell will have frozen over, I won’t be too concerned.
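
To put numbers on that, here is a back-of-the-envelope sketch in Python. The 72GB disk, 32GB of RAM and 64GB page file come from the box described above; the 20GB figure for the OS and installed software is purely an assumption for illustration, and the function name is made up.

```python
def free_space_after_pagefile(disk_gb, pagefile_gb, used_gb):
    """Return the GB left on the system disk once the page file has grown to pagefile_gb."""
    return disk_gb - pagefile_gb - used_gb


DISK_GB = 72        # system disk size from the post
RAM_GB = 32         # physical memory from the post
PAGEFILE_GB = 64    # page file size from the post (twice the RAM)
ASSUMED_USED_GB = 20  # ASSUMPTION: space taken by the OS and software

remaining = free_space_after_pagefile(DISK_GB, PAGEFILE_GB, ASSUMED_USED_GB)
print(f"Page file is {PAGEFILE_GB / RAM_GB:.0f}x RAM")
print(f"Space left before other files: {DISK_GB - PAGEFILE_GB} GB")
print(f"Space left after OS and software: {remaining} GB")
```

A negative result on the last line means the page file cannot actually reach that size without filling the system disk first, which is the crash scenario above.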

VDS.EXE Memory Exhaustion on DPM Server on Windows 2008

April 22nd, 2009 Posted in Windows 2008, Knowledge Base | No Comments »

If you are wondering why your DPM server keeps bombing out, it is most likely the VDS.EXE service consuming all available memory, thanks to some buggy code. My DPM server has 16GB of RAM, and the VDS service usually hovers around 7-8GB because of this problem. MS has released a hotfix that should fix the issue for you:

http://support.microsoft.com/kb/958387

SCVMM, Shared ISOs, and the shitshow that ensues…

April 8th, 2009 Posted in System Center Virtual Machine Manager 2008 | 2 Comments »

Like everyone else in the virtualization world, I enjoy a good management interface. SCVMM promised to be my new favorite toy; however, the stupid POS has a seriously BAD hangup: it doesn’t like Shared ISOs.

For those of you who are unaware, SCVMM gives the admin the ability to “deploy” shared ISOs from the library to specific virtual machines, which in essence works like dropping a DVD into the drive of the virtual machine. This is all fine and good, so long as it works. It provides two ways of doing this: the first is copying the ENTIRE ISO to the Virtual Host and running it locally, and the second (the smart method) is accessing the ISO over the network using the library share. Sounds simple enough, right? Wrong. MS fucked up the implementation of this handy feature, so it causes more headaches than doing everything manually.

The shit thing is, this works PERFECTLY from the Hyper-V management console and only fails from the SCVMM interface.

Here is the error you get when you try and connect an ISO using the Shared ISO option in the SCVMM interface:

Error (12700)
VMM cannot complete the Hyper-V operation on the scvmmhostname.domainname.com server because of the error: 'vmhostname' failed to add device 'Microsoft Virtual CD/DVD Disk'. (Virtual machine ID CC66A9DC-3E63-444D-8FB5-B93908F6DEDB)
'vmhostname': The file '\\scvmmhostname.domainname.com\MSSCVMMLibrary\ISOs\cdimagename.iso' does not have the required security settings. Error: 'General access denied error' (0x80070005). To fix the security settings, remove the device associated with this file from the virtual machine and then add it again. (Virtual machine CC66A9DC-3E63-444D-8FB5-B93908F6DEDB)
(Unknown error (0x8001))
Recommended Action
Resolve the issue in Hyper-V and then try the operation again

So you follow the instructions and nothing happens. Why? Because you forgot to recycle the documentation. This is simply a permissions issue and some googling will find you the answer; however, no one ever really seems to describe all of the steps you need to take to rectify the issue, so I am stepping in. (Apparently steps 2-8 are only required if your Hyper-V server and SCVMM server are two different machines, which is pretty much ALWAYS the case.)

  1. Ensure that Authenticated Users has READ permissions (Share & NTFS) on the SCVMM Library Share.
  2. Go into ADUC, right click on each of your Hyper-V Servers and select the Delegation tab
  3. Select “Trust this computer for delegation to specified services only”
  4. Select “Use any authentication protocol”
  5. Click the “Add” button
  6. Click the “Users and Computers” button
  7. Type the name of your SCVMM server and click OK
  8. Select “cifs” and click OK
  9. Wait 20-30 minutes… Why? Because Active Directory is annoying that way

If you don’t give the cifs delegation enough time to fornicate/replicate to all of your DCs, you will still get the error and will be scratching your ass/head and cursing at me. 

Also if you try and give the virtual machine a Shared ISO during creation, SCVMM will not allow you to put that machine on a Hyper-V host. WTF? It will bomb out with the following error:

Virtualization platform on host hypervhostname.domainname.com does not support shared DVD ISO images.

Piece of shit. Yes it does and I know it does. The only way around this is to build the virtual machine without assigning a shared ISO to the virtual machine and once it is finished creating, go in and assign the shared ISO.

Ciao.

DPM 2007 on Windows 2008 protecting Exchange 2007 SP1 CCR on Windows 2008

October 2nd, 2008 Posted in Windows 2008, Exchange 2007 | 4 Comments »

I can honestly say without a doubt in my mind that Data Protection Manager 2007 is quite possibly the best and worst product Microsoft has brought to the market.

Why is it the best? Because I can now recover my Exchange mailboxes in 15 minute increments for the last 2 months (at the cost of 22 TB of disk).

Why is it the worst? Because no one, not even the guys who designed it, knows the proper way to install it, configure it, use it, or fix it when something goes wrong. Every MS document out there conflicts with every Technet blog by an expert who set it up. And every blog conflicts with someone else’s blog. Add Windows 2008 to the mix and now even the errors reported by the program don’t know what to say. I had a DPM rig fail 300 times in 4 hours with “Unknown Internal Error” while trying to back up Exchange 2007. Are you kidding me?

Add to that the fact that there is no “update” method, only a link to a KB which now points to a feature pack, and then another link that points to a “hotfix rollup”… Whatever happened to “Click here to update this product using Microsoft Update”?

Enough bitching, here is what I did to install DPM 2007 on a Windows 2008 Server and protect an Exchange 2007 SP1 CCR Cluster on Windows 2008 servers.

Other specifics:

  • This is x64 everything
  • Update Rollup 3 is on the Exchange nodes

On the DPM Server:

  1. Add Windows Powershell feature
  2. Install IIS role (Add the required dependencies and add ALL of the Role Services, yes, EVERYTHING)
  3. Install SIS (type “ocsetup.exe SIS-Limited” in the command prompt)
  4. Reboot the bitch
  5. Install the DPM 2007 Software
  6. Reboot the bitch
  7. Install the Feature Pack (http://www.microsoft.com/downloads/details.aspx?familyid=AD5CD1A2-9B87-4A2C-90A2-9DBAF1024310&displaylang=en)
  8. Install the Hotfix Rollup 2 (http://www.microsoft.com/downloads/details.aspx?familyid=8EEFDE76-1A94-4096-BA3A-829EB954E422&displaylang=en)
  9. Reboot the bitch
  10. Add a custom incoming firewall rule allowing ALL programs, ports, etc… from the remote IPs of your Exchange 2007 nodes, the Cluster IP and the CMS (Exchange Virtual) IP.
  11. Copy the ESE.DLL and ESEUTIL.EXE files from one of your Cluster nodes to C:\Program Files\Microsoft DPM\DPM\bin
  12. Start the IIS Manager and navigate to Report$MS$DPM2007$
  13. Open Handler Mappings and click Edit Feature Permissions
  14. Make sure “Script” is checked

On the Exchange cluster nodes

  1. Add a custom incoming firewall rule allowing ALL programs, ports, etc… from the DPM server

Now, go into DPM and use the DPM Console to install the Agents remotely on your Exchange nodes. Then add a protection group, select your Virtual Server, select your Storage Groups, and you are off to the races.

Ciao

MS Network Load Balancing - Not always the solution…

July 31st, 2008 Posted in Knowledge Base | No Comments »

Allow me to set the stage…

You have a few big bore terminal servers, handling about 400 users per day. All 3 of those servers are configured in unicast mode according to MS best practices. All 3 of those servers also sit on the same switch in your network infrastructure. You decide to move a couple of the nodes to another datacentre (on your giant flat network, in the same subnet) and all of a sudden, a portion of your clients are getting errors and are unable to connect to the cluster name. Connecting to the dedicated IP or backend adapter works just fine.

 I love the words “Unicast mode works with all routers and switches” because it puts this massive false sense of security in your head.

 Allow me to correct that statement… “Unicast DOES NOT mean your NLB cluster will work with your infrastructure” and here is the example from my environment.

So we now have the following configuration:

RDPCLUSTER (192.168.0.201) - Unicast Mode - MAC 02:BF:xx:xx:xx:xx

NODE NAME   DED IP ADDRESS   SWITCH    PORT   OUTBOUND MAC
NODE1       192.168.0.198    SWITCH1   1      02:01:xx:xx:xx:xx
NODE2       192.168.0.199    SWITCH2   1      02:02:xx:xx:xx:xx
NODE3       192.168.0.200    SWITCH2   2      02:03:xx:xx:xx:xx

All of the nodes are on the same subnet and VLAN. The clients accessing the nodes are coming from different VLANS.

So in order to prevent the switches from learning the MAC of the cluster, the nodes send outbound packets with their own custom MAC addresses. When you get on the edge switches and look at the ARP table, the cluster MAC address does not exist. This is a good thing. The switch actually learns the custom MAC for each node.

However, when a client makes its way in and wants to access the gateway, it sends the ARP to locate the MAC of RDPCLUSTER, and it hits a router. The router has no flippin clue where it is and then floods out an ARP. The problem is, each of the nodes in the cluster gets this and 3 replies come back from the nodes. The router then caches the location of RDPCLUSTER. This is a BAD thing. Here is an example:

So the ARP hits all the nodes and NODE2 fires back with the reply. The router then caches the ARP saying that RDPCLUSTER is on SWITCH2. So now, another new client comes in and does the same thing, looking for RDPCLUSTER. This time, NODE1 gets the duty of responding. However, the router wants to toss all the packets for RDPCLUSTER at SWITCH2, not SWITCH1, because of the initial cache. You can see this by creating a network trace and watching the TCP SYN packets leaving the clients, but no TCP SYN/ACK coming back from the cluster, because the SYN never got to NODE1. And since it only happens when connections get load balanced to NODE1, you won’t always see the problem; it will only pop its head up sporadically.
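
For anyone who prefers to see it spelled out, here is a toy Python model of that failure. The node names and switch placement come from the table above; the `Router` class and its "first reply wins" caching are a simplified assumption made for illustration, not a model of any real router firmware, and cache ageing is not modelled at all.

```python
# Which switch each node hangs off, taken from the table in this post.
NODE_SWITCH = {"NODE1": "SWITCH1", "NODE2": "SWITCH2", "NODE3": "SWITCH2"}


class Router:
    """Hypothetical router that caches the location of the first ARP reply
    it sees for the cluster IP and sends everything there afterwards."""

    def __init__(self):
        self.cached_switch = None

    def learn(self, replying_node):
        # Only the first reply populates the cache in this simplified model.
        if self.cached_switch is None:
            self.cached_switch = NODE_SWITCH[replying_node]

    def deliver(self, target_node):
        """Return True if a packet for the cluster reaches target_node."""
        return NODE_SWITCH[target_node] == self.cached_switch


router = Router()
router.learn("NODE2")  # NODE2's reply wins, so the cache points at SWITCH2

# Try a connection that NLB has load balanced to each node in turn.
results = {node: router.deliver(node) for node in NODE_SWITCH}
for node, delivered in results.items():
    print(node, "SYN delivered" if delivered else "SYN lost, no SYN/ACK comes back")
```

In this sketch only connections that land on NODE1 fail, which matches the sporadic behaviour described above: two out of three nodes sit behind the cached switch and keep working.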

If you are using the “Router on a Stick” topology, then this is never a problem because all of your hosts are theoretically in the same “location”, so the router can cache its little heart away.

This also has a lot to do with your actual network equipment and topology, and in my instance, we use Foundry hardware. Foundry equipment caches the PHYSICAL port number in the router, not the Virtual Interface, which is why people with Cisco equipment won’t see this behavior. This basically prevents us from putting NLB servers on geographically separate physical segments, because we use a mesh/web routing design for high availability. Maybe this can help someone else, and maybe not.

Re-Installing SCCM 2007 on Windows Server 2008

July 10th, 2008 Posted in Knowledge Base | No Comments »

If you un-install SCCM 2007 from a Windows 2008 Server, you will run into some problems when you go to re-install it. You need to perform the following to get it working again.

To remove the “NetworkModel” namespace:

  1. Launch “wbemtest” from start/run or command line.
  2. Connect to the “root” namespace.
  3. Press the “Enum Classes” button, select “Recursive” radio button and press “OK”.
  4. In the Query Result window, locate and double click on the “__NAMESPACE (__SystemClass)” class
  5. In the “Object editor for __NAMESPACE” window, push the “Instances” button.
  6. In the Query Results window, locate and delete the “NetworkModel” instance.
  7. Close out all of the open wbemtest windows.

http://forums.microsoft.com/TechNet/ShowPost.aspx?PostID=3466852&SiteID=17

SCOM 2007 Discovery / Agent Installation - Ports to open

July 9th, 2008 Posted in Knowledge Base | No Comments »

Since Windows 2008 is locked down tighter than my chastity belt, there are some ports you need to open to allow the SCOM 2007 Discovery process to locate and install the agent:

  1. Windows Management Instrumentation (WMI-In)
  2. File and Printer Sharing (SMB-In) - 445
  3. File and Printer Sharing (NB-Session-In) - 139
  4. COM+ Network Access (DCOM-In) - 135
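
If you want to confirm the rules took effect before kicking off Discovery, a quick TCP connect test from the management server is one way to do it. The sketch below is a minimal Python example, not part of SCOM: the function name is made up, the ports are assumed to be TCP, and WMI is left out because the list above gives no port number for it.

```python
import socket

# Port numbers from the list above, mapped to the firewall rule names.
DISCOVERY_PORTS = {445: "SMB-In", 139: "NB-Session-In", 135: "DCOM-In"}


def check_ports(host, ports=DISCOVERY_PORTS, timeout=2.0):
    """Return {port: True/False} depending on whether a TCP connect succeeds."""
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError on refusal, timeout or DNS failure.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results

# Example (hostname is a placeholder):
#   check_ports("agent-target.example.com")
```

A `False` here only tells you the connect failed from where you ran it; it cannot tell a firewall block apart from a service that simply is not listening.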

Error “Failed to prepare storage for testing on node hostname.domain.com status 234”

July 3rd, 2008 Posted in Windows 2008 | 1 Comment »

So I was building a Windows Server 2008 Cluster for my SQL 2005 SP2 installation. I was re-using the hardware from an old Exchange Cluster and an older EMC CX300 SAN.

When you build the Windows 2008 Cluster, it says that in order for Microsoft to support you, your cluster nodes must pass all of the prerequisite checks using the Cluster Validation tool. So I ran it, thinking nothing was wrong with the hardware, and I am not a noob when it comes to clusters. Sure enough it failed with the error:

Failed to prepare storage for testing on node hostname.domain.com status 234

Baffled, I thought maybe I missed something, and kept trying it. Nada.

Turns out, there is a problem with the validation tool. If you have Windows-based software mirroring on the system disk, the validation tool does not work. If you break the mirror, it works just fine. So, break the mirror, validate the cluster, and re-mirror the system disk.

Here is a Technet forum posting from someone else with the problem, and I am curious to see Microsoft’s response.

http://forums.technet.microsoft.com/en/winserverClustering/thread/e26d2004-8da2-4546-9142-b36795466b0d/

Removing an Exchange 2003 Active/Passive Cluster

June 25th, 2008 Posted in Knowledge Base | No Comments »

This is more of a tech note for myself, as every time I go to do this, I forget the order:

  1. Move all of the mailboxes to another server and use LDP.EXE to confirm there are no stagnant mailboxes left on the Exchange Virtual server
  2. Move all replicas of Public Folders to another server 
  3. On the Passive node, run Add/Remove programs and remove Microsoft Exchange
  4. When asked if this is the last node in the cluster, click NO
  5. Once completed, on the Active Node open Cluster Administrator
  6. Right click the Exchange Virtual Server and select REMOVE EXCHANGE VIRTUAL SERVER
  7. Once completed, run Add/Remove programs and remove Microsoft Exchange
  8. When asked if this is the last node, click YES

Roaming profiles are not saved if you log off a Microsoft Windows Vista-based computer

June 13th, 2008 Posted in Knowledge Base | No Comments »

So I joined a factory-installed Dell Latitude D531 laptop running Vista Business to my Windows 2008 domain and tried to get roaming profiles to work. My other Vista machines were behaving properly, but this one would load the profile from the profile server and NEVER write it back on log off. Oh yeah, and there were NO event log entries saying there was a problem.

A long time ago, this problem reared its head with XP/2000 and some shady NVIDIA drivers, so going that route, I disabled the factory-loaded ATI driver service that was in the Services portion of MSCONFIG (the service was called “ATI External Event blah blah”). Lo and behold, roaming profiles began working again.

 I remember there being a story that ATI hired a whole pile of NVIDIA driver guys after NVIDIA was killing ATI in the performance display adapter market. Perhaps those terrible programmers decided to work their magic into the latest series of ATI drivers.

Either way, now you know, so you can at least fix it yourself.