I now have an environment of Microsoft Data Protection Manager 2012 R2 set up as a replacement for Backup Exec 2010.
Despite the lack of some features, it has been performing quite well. However, I recently started receiving email notifications of errors relating to my secondary DPM server.
The primary DPM server sits in the head office and backs up the Hyper-V VMs from my main cluster. The secondary DPM server sits in a branch office 300 km away and backs up the primary DPM server.
I started receiving errors like the following from the secondary server for individual resources on the primary:
Synchronization for replica of \Online\servername(servername.clustername) on PrimaryDPM failed because the replica is not in a valid state or is in an inactive state. (ID 30300 Details: VssError:The writer experienced a non-transient error. If the backup process is retried, the error is likely to reoccur. (0x800423F4))
Every time I tried to perform a consistency check on these resources, it would begin and then end within 30 seconds.
To be honest I didn’t have a lot of time to troubleshoot this one. I tried restarting both DPM servers as well as the Hyper-V host and VM itself, and none of that seemed to have an impact.
At some point I noticed that the resources giving the errors on the Secondary server hadn’t had a recovery point on the Primary server in quite some time.
I forced an Express Full Backup of the VMs on the Primary server and allowed it to complete (successfully). I then initiated a consistency check on the Secondary server protected resources, and it too completed successfully!
Where I’m still confused is why didn’t I receive alerts from my Primary DPM server that recovery points were being missed?
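Until I work out why the alerts never arrived, a small staleness check of my own would at least catch the problem sooner. This is only a rough sketch: the data source names, the timestamps and the two-day threshold are all made up for illustration, and it assumes the last recovery point times have already been exported from DPM by some other means.

```python
from datetime import datetime, timedelta


def find_stale_datasources(last_recovery_points, now, max_age=timedelta(days=2)):
    """Return the names of data sources whose newest recovery point is
    missing or older than max_age, sorted alphabetically."""
    stale = []
    for name, last_point in last_recovery_points.items():
        # None means no recovery point has ever been recorded.
        if last_point is None or now - last_point > max_age:
            stale.append(name)
    return sorted(stale)


# Hypothetical sample data, not real output from DPM.
now = datetime(2014, 1, 10, 8, 0)
sample = {
    "vm-app01": datetime(2014, 1, 9, 22, 0),     # backed up last night
    "vm-file01": datetime(2013, 12, 20, 22, 0),  # weeks without a recovery point
    "vm-sql01": None,                            # never backed up
}

stale = find_stale_datasources(sample, now)
print(stale)
```

Anything this returns would be a candidate for a forced Express Full Backup on the primary before the secondary gets a chance to complain about it.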
I’m currently setting up a controlled WiFi network to adhere to my parent company’s standards. We’re using the HP MSM760 controller with MSM460 access points.
I had everything set up and tested within my head office environment; however, I ran into an issue when I moved the APs to a branch office on a different subnet.
Every AP that I moved registered with the MSM controller in Australia rather than Canada where I am. After some reading of the manual I determined it did this because the discovery of the controller works in this order:
- UDP broadcast
- DHCP options
- DNS lookup (to cnsrv1)
Because the Australian controller predates mine, they had already set up and used the DNS name “cnsrv1”. Since my APs could no longer find a controller through the UDP broadcast on the new subnet, they fell through to the DNS lookup, resolved that name and registered with the Australian controller instead.
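The fall-through behaviour is easier to see as a toy model. This is not HP's implementation, just a sketch of "try each method in order and take the first answer". The three probe functions are stubs I made up, and the address is a documentation-range placeholder standing in for whatever cnsrv1 resolves to.

```python
def discover_controller(methods):
    """Try each discovery method in order and return (method name, address)
    for the first one that finds a controller, or None if none do."""
    for name, probe in methods:
        address = probe()
        if address is not None:
            return name, address
    return None


# Stub probes modelling an AP that has been moved to a new subnet.
def udp_broadcast():
    return None  # broadcasts do not cross the subnet boundary


def dhcp_options():
    return None  # no controller option configured on the DHCP scope


def dns_lookup():
    return "203.0.113.10"  # placeholder for whatever "cnsrv1" resolves to


methods = [
    ("udp-broadcast", udp_broadcast),
    ("dhcp-options", dhcp_options),
    ("dns-cnsrv1", dns_lookup),
]

result = discover_controller(methods)
print(result)
```

With the first two probes coming back empty, the DNS answer wins, which is exactly how my APs ended up on the wrong controller.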
To move my APs back to my controller I had to do the following:
From the Australian controller, I changed the AP to Autonomous mode.
Then I checked my DHCP server for the current IP of the AP, because it changed after switching to autonomous mode.
Following that, I logged onto the web interface of the AP.
Then I used Maintenance > System > Provision to enter the static provisioning settings.
I enabled discovery, enabled discovery by IP, entered my Canada controller IP and clicked Save.
Then, from the left side of the screen, I clicked Restart to confirm the static provision.
When the AP came back up, it registered on my Canada controller and all is good!
I’m in the process of configuring DPM to back up my Hyper-V environment, which resides on an EqualLogic PS6500ES SAN.
While doing this, I ran into an issue where the DPM consistency check for a 3 TB VM locked up every other VM on my cluster because of high write latencies. During this period I couldn’t even get useful stats out of the EqualLogic, because SANHQ wouldn’t communicate with it and the Group Manager live sessions would fail to initialize.
After some investigation, I did the following on all my Hyper-V hosts and my DPM host:
– Disabled “Large Send Offload” for every NIC
– Set “Receive Side Scaling” queues to 8
– Disabled the Nagle algorithm for iSCSI NICs (http://social.technet.microsoft.com/wiki/contents/articles/7636.iscsi-and-the-nagle-algorithm.aspx)
– Updated Broadcom firmware and drivers
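For anyone unsure what disabling Nagle actually changes: the algorithm holds small writes back so they can be coalesced into fewer packets, which adds latency to chatty traffic like iSCSI. I made the change at the OS level for the iSCSI NICs, following the linked article. The snippet below is only the per-socket equivalent an application can request for itself, shown here to illustrate the idea rather than as what I did on the hosts.

```python
import socket

# Create an ordinary TCP socket; nothing is connected or sent.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# TCP_NODELAY turns Nagle off for this one socket, so small writes
# are sent immediately instead of being held back and coalesced.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Read the option back to confirm it took effect.
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()

print(nodelay_enabled)
```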
Following these changes, I still see very high write latency on my backup datastore volume, but the other volumes operate perfectly.
I’m currently in the process of upgrading a standalone Server 2012 machine running Hyper-V to Server 2012 R2.
Due to resource constraints, I’m performing an in-place upgrade, despite this server residing 800 km away from me. Thank goodness for iDRAC Enterprise.
However, during the “Getting Devices Ready” stage of the upgrade, I received a Blue Screen of Death with the error message:
After it hit this BSOD twice, the upgrade failed and reverted to Server 2012. I was unable to find a log file that described what had happened in any more detail, and was worried that I would be stuck on Server 2012.
Thankfully, I discovered a log file on the iDRAC with the following message:
A bus fatal error was detected on a component at slot 1.
This triggered my memory, and I recalled that we have a USB3 PCI-E card installed for pre-seeding an external drive with backup info.
I used the BIOS setup (Integrated Devices > Slot Disablement) to disable Slot 1, and then retried the upgrade with fingers crossed.
I have never attended Microsoft TechEd, but I REALLY want to, especially after seeing the sheer number of sessions that are relevant to me in 2013.
I’ve been keeping an eye on the TechEd website since the event closed, watching for the recorded sessions to be posted, and now they are up, all 60-something pages of them.
I’ve gone through the list of every session, and collected a link list of 59 of them I want to watch, with some standouts like:
- Windows PowerShell unplugged
- How many coffees can you drink while your PC starts
- DFSR 2012 R2 enhancements
- Storage Efficiency with Dell and Storage Spaces
- Mark Russinovich on Cloud Computing
- Upgrade your IT skills and Infrastructure: Cloud Computing
- Performance Optimize your ASP.NET
- O365 Identity Management
- What’s new in Windows 8.1 Deployment
- Hyper-V Recovery Manager
- Backup Strategy for Private Cloud
And so much more…