Storage Spaces and Parity – Slow write speeds

I’ve recently been playing around with Windows Storage Spaces on Microsoft Windows Server 2012 R2. They are fantastic, and ReFS brings many benefits over NTFS.

But it seems only half complete.

I originally created a parity volume, as I assumed this would be quite similar to RAID 6. You have the option of adding a write array or write cache using SSDs, but I haven’t done this at this stage. I’m currently using 6 x 6TB Western Digital 7200RPM drives.

After creating the very large volume, I started copying some data. I was copying the data over a 1Gbit network interface, so I was expecting to see 100MB/s, or close to it.

At first, I did get 100MB/s, for a minute or so anyway. Then I saw the speed slowly drop to around 30-45MB/s. I thought this was rather strange.

I upgraded all the drivers on the server, mainly the network drivers, as I saw the network speed drop to around that level at the same time as well. However, this made no difference.

I then started to do some research to figure out what was going on.

What I saw was the following: memory usage increased to a certain, pre-defined point and then stopped. This indicated that the copy was being buffered in memory (a write cache). I assume this happens because I used the default options when creating a parity drive without an SSD array. As far as I can tell, this creates a 2GB buffer in memory, which you can clearly see here.

[Screenshot: memory usage]

Once the memory buffer (the write cache) is full, you can see the speed drop as the buffered data starts being written to disk.

[Screenshot: memory usage]

Annoying, huh? One way to fix this is to use a cache array of SSDs, but there is another fix.

In PowerShell, you can tell the storage pool that it has battery backup. This is like having battery backup on a RAID card. First you need to get the friendly name of your storage pool.

The command is

Get-StoragePool

You will get something similar to the following:

[Screenshot: PowerShell output]
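On a server with unused disks, the list also includes the built-in primordial pool, which is not the one you want. A small sketch that hides it and shows only the columns relevant here (the property names are the ones I'd expect on the storage pool object; your output will differ):

```powershell
# List only user-created pools, skipping the built-in primordial pool
Get-StoragePool -IsPrimordial $false |
    Select-Object FriendlyName, HealthStatus, IsPowerProtected
```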

Now set the power protected mode of the pool as follows:

Set-StoragePool -FriendlyName Backup -IsPowerProtected $true

Replace Backup with the name of your storage pool.
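To confirm the change took effect, or to put things back later, something along these lines should work. It is a sketch that assumes the pool is called Backup as in this post; the revert line is commented out so it doesn't run by accident.

```powershell
# "Backup" is the pool name used in this post; substitute your own.
$poolName = 'Backup'

# Check the current value of the flag
Get-StoragePool -FriendlyName $poolName |
    Select-Object FriendlyName, IsPowerProtected

# To revert to the safer default, uncomment and run:
# Set-StoragePool -FriendlyName $poolName -IsPowerProtected $false
```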

Here it is set to $false:

[Screenshot: copy speed]

Here it is set to $true:

[Screenshot: copy speed]

Quite a difference.

**** I should warn you though that if your server crashes, or has a power failure, your storage space may become corrupt. Make sure you have a UPS in place ****

Like I said earlier, this can also be improved with an SSD cache array.
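I haven't tested the SSD route on this server, so treat the following as a rough sketch rather than a recipe. It assumes the pool is named Backup and already contains SSDs that Windows detects as such; the virtual disk name ParityData and the 8GB cache size are placeholder values, not recommendations.

```powershell
# Check which disks in the pool are detected as SSDs
Get-StoragePool -FriendlyName 'Backup' | Get-PhysicalDisk |
    Select-Object FriendlyName, MediaType, Size

# Create a new parity space with an explicit write-back cache size.
# 'ParityData' and 8GB are placeholders for illustration only.
New-VirtualDisk -StoragePoolFriendlyName 'Backup' -FriendlyName 'ParityData' `
    -ResiliencySettingName Parity -UseMaximumSize -WriteCacheSize 8GB
```

As far as I know, the write cache size is fixed when the virtual disk is created, so this applies to a new space rather than an existing one.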

Hopefully this helps someone out there.

*** UPDATED 15/12/2015 ***

I highly recommend you view the Fujitsu white paper on Storage Spaces here.

Exchange 2013 DAG BSOD

We’ve recently been having issues with an Exchange 2013 DAG running CU2v2.

The server would reboot randomly with a BSOD, and afterwards the Exchange services would not start. I took the dump file, ran it through WinDbg and got the following output:

*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

CRITICAL_PROCESS_DIED (ef)
        A critical system process died
Arguments:
Arg1: fffffa800b415080, Process object or thread object
Arg2: 0000000000000000, If this is 0, a process died. If this is 1, a thread died.
Arg3: 0000000000000000
Arg4: 0000000000000000

Debugging Details:
------------------

----- ETW minidump data unavailable-----

PROCESS_OBJECT: fffffa800b415080

IMAGE_NAME:  wininit.exe

DEBUG_FLR_IMAGE_TIMESTAMP:  0

MODULE_NAME: wininit

FAULTING_MODULE: 0000000000000000 

PROCESS_NAME:  MSExchangeHMWo

BUGCHECK_STR:  0xEF_MSExchangeHMWo

CUSTOMER_CRASH_COUNT:  1

DEFAULT_BUCKET_ID:  WIN8_DRIVER_FAULT_SERVER

CURRENT_IRQL:  0

ANALYSIS_VERSION: 6.3.9600.17029 (debuggers(dbg).140219-1702) amd64fre

LAST_CONTROL_TRANSFER:  from fffff80230e04795 to fffff802308db440

STACK_TEXT:  
fffff880`083419a8 fffff802`30e04795 : 00000000`000000ef fffffa80`0b415080 00000000`00000000 00000000`00000000 : nt!KeBugCheckEx
fffff880`083419b0 fffff802`30d9be2e : fffffa80`0b415080 00000000`144d2c41 00000000`00000000 fffff802`30a57794 : nt!PspCatchCriticalBreak+0xad
fffff880`083419f0 fffff802`30d12a01 : fffffa80`0b415080 00000000`144d2c41 fffffa80`0b415080 00000000`00000000 : nt! ?? ::NNGAKEGL::`string'+0x4a25a
fffff880`08341a50 fffff802`30d1880e : ffffffff`ffffffff fffffa80`0c889380 fffffa80`0b415080 00000000`00000001 : nt!PspTerminateProcess+0x6d
fffff880`08341a90 fffff802`308da453 : fffffa80`0b415080 fffffa80`0ce7c080 fffff880`08341b80 00000000`ffffffff : nt!NtTerminateProcess+0x9e
fffff880`08341b00 000007ff`5f312eaa : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x13
00000000`237fdeb8 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x000007ff`5f312eaa


STACK_COMMAND:  kb

FOLLOWUP_NAME:  MachineOwner

IMAGE_VERSION:  

FAILURE_BUCKET_ID:  0xEF_MSExchangeHMWo_IMAGE_wininit.exe

BUCKET_ID:  0xEF_MSExchangeHMWo_IMAGE_wininit.exe

ANALYSIS_SOURCE:  KM

FAILURE_ID_HASH_STRING:  km:0xef_msexchangehmwo_image_wininit.exe

FAILURE_ID_HASH:  {3f0b292e-4bc2-5c6e-99a7-f9a74df1101c}

Followup: MachineOwner

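The failed process is MSExchangeHMWo. The name is truncated in the dump; it is MSExchangeHMWorker.exe, the worker process of the Microsoft Exchange Health Manager service. The stack shows wininit.exe being terminated from that process, so it looks like the reboot is forced deliberately by Exchange's own health monitoring rather than caused by a faulty driver.

Reading more into this, it seems to happen when the server has an issue such as low memory or slow IO. It also seems to be a bug in CU2 and CU3.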

A good thread on the issue can be found here.

You can run the following PowerShell command in the Exchange Management Shell on a CU2 server to disable the automatic reboots (BSOD).

Add-GlobalMonitoringOverride -Identity Exchange\ActiveDirectoryConnectivityConfigDCServerReboot -ItemType Responder -PropertyName Enabled -PropertyValue 0 -ApplyVersion "15.0.712.24"

If you have CU3 or another version, change the ApplyVersion value to the exact build number of your Exchange installation. The versions can be found here.
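If you're not sure which build you are running, or want to check that the override was accepted, the following sketch may help. Run it in the Exchange Management Shell. The removal line is commented out and uses a placeholder for the identity, which should match whatever you passed when adding the override.

```powershell
# Show the build of each server, e.g. "Version 15.0 (Build 712.24)",
# which corresponds to an ApplyVersion of "15.0.712.24"
Get-ExchangeServer | Format-List Name, AdminDisplayVersion

# List the global monitoring overrides currently in place
Get-GlobalMonitoringOverride

# Once you are on a fixed build, remove the override again:
# Remove-GlobalMonitoringOverride -Identity <same identity as used when adding> -ItemType Responder -PropertyName Enabled
```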