Technology is awesome (and how I can’t afford it all)

Lately I have been doing some planning for budget season, and thinking about the medium-term future and where I’d like to take my infrastructure.

A big part of this is storage, and my company is in a bit of an odd place in that we’re growing so fast we need to add to our MD3220i SAN, but the MD3220i itself has an expiring warranty in December 2015. I feel like it would be a waste of money to add a disk shelf in 2014 to just have it go unused by 2015.

To address this I began with my Dell team, and had a product specialist in the office today to go over their mid-size and enterprise storage products: EqualLogic and Compellent. He did an excellent job of making clear the advantages of a ‘frameless’ storage infrastructure over a ‘framed’ one like the one we’re on now.

Since then (only a few hours ago really) my mind has just been buzzing with all the possibilities and projects that this meeting has kickstarted.

In the form of one long run-on sentence:

If we upgrade our storage next year to an EqualLogic we can utilize the storage tiering to reduce rack space and power use while maintaining performance and increasing capacity, while at the same time decommissioning old hardware (our MD3000) and re-using our slightly old hardware (MD3220i) for purposes such as backup and disaster recovery, which we’re looking at something like AppAssure or Veeam or Unitrends to handle as long as we have the appropriate disk space, which needs to be shared with Hyper-V Replica for DR purposes, because I’m severely lacking in that area right now which is dangerous but can be solved with a multi-tier backup and DR plan of having storage on the LAN AND offsite with replication of the backup database and Hyper-V Replica but this requires a cluster upgrade to Server 2012 R2, which would be nice anyways because then I can do live VHDX expansion to avoid having to disrupt my file server because the less off-hours maintenance I have to do the better so that I can use my time doing things like analyzing performance benefits and presenting to the Executive why we need to do all this stuff RIGHT NOW.

 

Strange Sonicwall network issue

Since May I’ve been struggling with a very odd issue with the Sonicwall NSA 2400 in my head office. It was first discovered when our VPNs kept going down without warning, multiple times per day.

After some internal investigation, my team noticed a pattern: one of us was trying to configure SSL-VPN for the first time, and every time they made a change to the settings, our X2 interface went down.

Only X2 went down though; we have X1 connected to an entirely different ISP, and it never had any issue. Unfortunately X2 was the interface providing connectivity for all our site-to-site VPNs, as well as our external client-facing services.

I narrowed down how to replicate the issue, and discovered that any change to a NAT policy caused it, as did some other seemingly random settings changes. However, changes to firewall access rules did not impact X2 connectivity.

I could verify the issue by pinging my X2 gateway from the Sonicwall. Before enabling/disabling a NAT policy, the ping was successful. However, as soon as I made a change, the ping timed out.

Connectivity was automatically restored after 5-6 minutes; there was nothing I could do to force traffic to resume.
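For anyone wanting to put a number on an outage like this, here is a minimal sketch in Python, not what I actually ran. It assumes a Linux-style `ping` (`-c`/`-W` flags) and a monitoring machine whose traffic reaches the X2 gateway; the function names are mine, and the address in the usage note is only a placeholder.

```python
import subprocess
import time


def ping_once(host, timeout_s=1):
    """Send one ICMP echo with the system ping (Linux-style flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def outage_windows(samples):
    """Collapse (timestamp, ok) samples into (start, end) outage windows.

    An outage starts at the first failed sample and ends at the next
    successful one. If sampling stops while still down, end is None.
    """
    windows = []
    start = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts
        elif ok and start is not None:
            windows.append((start, ts))
            start = None
    if start is not None:
        windows.append((start, None))
    return windows


def monitor(host, interval_s=5.0, count=120, probe=ping_once):
    """Probe the host `count` times and return the outage windows seen."""
    samples = []
    for _ in range(count):
        samples.append((time.time(), probe(host)))
        time.sleep(interval_s)
    return outage_windows(samples)
```

Calling something like `monitor("203.0.113.1")` (a placeholder gateway address) while toggling a NAT policy should return a single window of roughly 5-6 minutes if the behaviour matches what I saw.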

I got in touch with my ISP but they confirmed that it wasn’t a problem on their network.

I had a ticket open with Sonicwall for quite some time, and diligently followed their directions, including wiping the Sonicwall and starting from factory defaults (that didn’t work).

Next they asked me to reconfigure the link on X5 to replace X2, but that didn’t work either.

After a few delays in troubleshooting, it was recommended to do a hard reset: boot into safe mode, upgrade to 5.9 firmware and then reset to factory defaults. Apparently the first reset to defaults was considered a ‘soft reset’ and isn’t as effective. To be honest, I don’t understand how a hard reset could resolve an issue like this, but I was willing to give it a shot.

After planning a 2 hour maintenance window, I began the hard reset procedure. When the Sonicwall came back up in Safe Mode, I upgraded to 5.9 firmware and booted to factory defaults. Then I reconfigured the LAN and WAN interfaces, and tested my original issue. Success! X2 didn’t go down.

I was really hoping to avoid a full reconfigure from scratch, so after my successful test I imported my most recent config backup and crossed my fingers that the problem wouldn’t return. After the reboot I disabled a NAT policy, and determined that X2 stayed up the entire time. Success again!

Overall, I was very pleased with Sonicwall support. Despite the fact that they couldn’t pinpoint the problem to a resolvable issue, they were always quick to respond and understanding that I needed to schedule maintenance windows for any work on the device. Sonicwall gets a bad reputation in some IT circles but I will have no hesitation in purchasing additional units and recommending them to others.

 

Web development is a lot like building LEGO

I am by no means a “web developer”, however I have spent a significant amount of my time this year building web applications for my company.

I am by no means a “LEGO designer”, however I have spent a significant amount of my time this year building whatever my son asked me to build.

Through these two experiences I have learned that they are both very similar.

 

It starts with an idea; something that catches my attention, something useful or productive, or just fun.

From there I begin building, but rarely is the building linear. A piece here, a piece there, a section at a time, the building begins.

Building takes time, and during that time I always come up with enhancements and ways to make it better. The scope changes and grows but it is usually for the better.

Of course, even when the project is incomplete, I start thinking about how it looks. I smooth out the rough edges, inspect the symmetry, and make sure it moves the way I want it to.

Eventually I begin looking at security; making sure the project won’t break, or be broken into. I test it again and again, and I get other people to look at it.

 

I take a lot of joy in knowing that I’m creating something. That’s a little different from a typical System Administrator responsibility, and it is something I’m thankful I have the opportunity for. Whether it’s a line-of-business application that will be used by my entire company, or a 2-foot robot my 5-year-old will play with for 30 minutes before destroying, it is a lot of fun.

I always forget the basics

I had a strange issue with one of my branch offices, where they would lose access to local resources and external Internet sites whenever our Site-to-Site VPN with the head office went down.

I spent around 3 hours troubleshooting this issue, desperately looking for a logical cause. It wasn’t until I paid closer attention to the DNS settings being handed out by the DHCP server that I noticed the primary DNS nameserver was a legacy domain controller within the branch office that no longer existed, and the secondary DNS was a domain controller in our head office, across the VPN.

When the VPN link went down, the clients had no reachable DNS servers, and thus couldn’t access anything except by direct IP.
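A check like the following would have caught this in minutes. It is a rough sketch using only the Python standard library, not a tool I had at the time: it sends a minimal DNS A-record query over UDP to each server a client was handed and reports which ones answer. The function names and the test hostname are my own choices.

```python
import socket
import struct


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query packet for an A record."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed with its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def dns_server_answers(server, name="example.com", timeout_s=2.0):
    """Return True if the server sends back any reply matching our query ID."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout_s)
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # includes timeouts and unreachable networks
            return False
    return len(reply) >= 2 and reply[:2] == query[:2]


def usable_servers(servers, probe=dns_server_answers):
    """Filter the DHCP-provided DNS servers down to the ones that respond."""
    return [server for server in servers if probe(server)]
```

Running `usable_servers` against the two addresses from the DHCP scope, once with the VPN up and once with it down, would have shown the primary never answering and the secondary answering only while the tunnel was alive.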

When I discovered this, it was a quick fix that brought services back online promptly.

Unfortunately it is all too often that I dive into a problem looking for a cause that is complex without seeing the simple issue right in front of me. I need to learn to be a little more methodical in my problem solving, and start at Layer 1.

Hyper-V 2012 R2

If the release date of Windows 8.1 is any indication, Server 2012 R2 is nearing RTM and I’m super excited, despite the fact that there are only one or two features that I’d likely be using.

Most of my interest comes from Hyper-V improvements, especially VHDX online expand. I’ve been slowly converting my VHDs to VHDX during maintenance windows, and I’m sure glad I’ve been spending the time. Being able to expand the size of my VM disks without downtime is a huge benefit.

Unfortunately it looks like I’m going to have to rebuild my cluster again since you can’t have dissimilar host OS within a cluster. That really sucks, but I take solace in the fact that I can do an upgrade of Server 2012 rather than a complete bare-metal reinstall.

 

There are still lots of improvements to be made in my Hyper-V environment, starting with backup and disaster recovery. My backup plan from 2012 never really got off the ground due to a variety of issues, but that is going to be picked back up right away. My preliminary thought (before spending time researching) is to get a second SC847 disk chassis, and set up one inside the LAN for backup using something like AppAssure, Veeam, Unitrends or Altaro. Then replicate that backup repository offsite to the second disk chassis over whatever link I have available. This way my primary backup is done over gigabit, and then the replication can take advantage of deduplication and other replication technologies. Then I’ll add Hyper-V Replica to the mix for disaster recovery plans.

So far in my environment I haven’t had to scale up to a 3 node cluster, but I’m budgeting for it this coming fiscal year anyways because it’s going to happen, and I’m excited for that too. It will give me more RAM headroom per host when doing server maintenance, and offer improvements in performance for some of our heavier VMs.
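The RAM headroom argument comes down to simple arithmetic. As a rough sketch: if every host is identical and the cluster has to keep all VMs running with one host drained for maintenance, each host can only be loaded to (N-1)/N of its RAM on average. The 128 GB figure below is a made-up example, not my actual hardware, and the calculation ignores the RAM the hypervisor itself needs.

```python
def usable_ram_per_host(host_ram_gb, hosts, hosts_down=1):
    """Average RAM per host that can be committed to VMs while still
    fitting every VM on the remaining hosts when `hosts_down` are offline.

    Assumes identical hosts and ignores hypervisor overhead.
    """
    if hosts <= hosts_down:
        raise ValueError("need more hosts than the number taken offline")
    return host_ram_gb * (hosts - hosts_down) / hosts


# Hypothetical 128 GB hosts: compare a 2-node and a 3-node cluster.
two_node = usable_ram_per_host(128, 2)
three_node = usable_ram_per_host(128, 3)
print(f"2 nodes: {two_node:.1f} GB usable per host")
print(f"3 nodes: {three_node:.1f} GB usable per host")
```

Going from two nodes to three moves the safe load per host from half of its RAM to two thirds, which is the headroom I’m after.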