
Analyzing Exchange Transaction Log Generation Statistics


Update 11/5/2013: added a section on firewall rules to try.

Overview

When designing a site resilient Exchange Server solution, one of the required planning tasks is to determine how many transaction logs are generated on an hourly basis. This helps figure out how much bandwidth will be required when replicating database copies between sites, and what the effects will be of adding additional database copies to the solution. If designing an Exchange solution using the Exchange Server Role Requirements Calculator, the percent of logs generated per hour is an optional input field.

Previously, the most common method of collecting this data involved capturing the contents of each log directory on a scheduled basis (using dir, Get-ChildItem, or CollectLogs.vbs). Although the log generation number could be extracted from the log file names, there was a lot of manual work involved in finding the highest log generation in each capture and removing duplicate entries. Once cleaned up, the data still had to be analyzed manually using a spreadsheet or a calculator. Trying to gather data across multiple servers and databases further complicated matters.

To improve upon this situation, I decided to write an all-in-one script, GetTransactionLogStats.ps1, that can both collect transaction log statistics and analyze them after collection. The script has two modes: Gather and Analyze. Gather mode is designed to be run on an hourly basis, at the top of the hour. When run, it takes a single set of snapshots of the current log generation number for all configured databases. These snapshots, along with the time they were taken, are written to an output file, LogStats.csv. Each subsequent time the script is run in Gather mode, another set of snapshots is appended to the file. Analyze mode processes the snapshots taken in Gather mode, and should be run after a sufficient number of snapshots have been collected (at least two weeks of data is recommended). When run, it compares the log generation number in each snapshot to the previous snapshot to determine how many logs were created during that period.
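Because Gather mode needs to fire at the top of every hour, the simplest approach is a scheduled task. A minimal sketch using schtasks.exe (the task name and paths below are placeholders; adjust them to your environment):

PS C:\> schtasks.exe /Create /TN "GatherExchangeLogStats" /SC HOURLY /ST 00:00 /TR "powershell.exe -NoProfile -File C:\Scripts\GetTransactionLogStats.ps1 -Gather -WorkingDirectory C:\Scripts"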

Script Features

Less Data to Collect

Instead of looking at the files within log directories, the script uses Perfmon to get the current log file generation number for a specific database or storage group. This number, along with the time it was obtained, is the only information kept in the output log file, LogStats.csv. The performance counters that are used are as follows:

Exchange 2013:

MSExchangeIS HA Active Database\Current Log Generation Number

Exchange 2007/2010:

MSExchange Database ==> Instances\Log File Current Generation

Note: The counter used for Exchange 2013 only contains the active databases on that server. The counter used for Exchange 2007/2010 contains all databases on that server, including passive copies. To only get data from active databases on an Exchange 2007/2010 server, make sure to manually specify the databases for that server in the TargetServers.txt file.
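To verify that a server's counters are reachable before scheduling the script, you can query them manually with Get-Counter (the (*) instance wildcard below is illustrative and returns all instances):

PS C:\> Get-Counter -ComputerName server1 -Counter "\MSExchangeIS HA Active Database(*)\Current Log Generation Number"

PS C:\> Get-Counter -ComputerName server2 -Counter "\MSExchange Database ==> Instances(*)\Log File Current Generation"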

Multi Server/Database Support

The script takes a simple input file, TargetServers.txt, where each line specifies a server, or a server and the databases to process. If you want statistics for all databases on a server, only the server name is necessary. If you only want a subset of the databases on a server (for instance, to omit passive copies on an Exchange 2007 or 2010 server), specify the server name followed by each database you want to process.

Built In Analysis Capability

The script has the ability to analyze the output log file, LogStats.csv, created when running in Gather mode. It performs a number of common calculations for you, but also leaves the original data in place in case any other calculations need to be done. Output from running in Analyze mode is sent to multiple .CSV files: one file per database, plus one more file containing the average statistics for all analyzed databases. The following columns are added to the CSV files (a worked example follows the list):

  • Hour: The hour that log stats are being gathered for. Ranges from 0 to 23.
  • TotalLogsCreated: The total number of logs created during that hour for all days present in LogStats.csv.
  • TotalSampleIntervalSeconds: The total number of seconds between each valid pair of samples for that hour. Because the script gathers Perfmon data over the network, the sample interval may not always be exactly one hour.
  • NumberOfSamples: The number of times that the log generation was sampled for the given hour.
  • AverageSample: The average number of logs generated for that hour, regardless of sample interval size. Formula: TotalLogsCreated / NumberOfSamples.
  • PercentDailyUsage: The percentage of a full day's worth of logs that the AverageSample value for that hour accounts for (AverageNumberOfLogsPer24Hours is the average total number of logs generated per day). Formula: (AverageSample / AverageNumberOfLogsPer24Hours) * 100.
  • AverageSamplePer60Minutes: Similar to AverageSample, but normalizes the value as if each sample were taken exactly 60 minutes apart. Formula: (TotalLogsCreated / TotalSampleIntervalSeconds) * 3600.
  • PercentDailyUsagePer60Minutes: Similar to PercentDailyUsage, but normalizes the value as if each sample were taken exactly 60 minutes apart. Formula: (AverageSamplePer60Minutes / AverageNumberOfLogsPer24Hours) * 100.
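As a worked example (all numbers invented for illustration): suppose that for Hour 9, LogStats.csv contains TotalLogsCreated = 3000, NumberOfSamples = 10, and TotalSampleIntervalSeconds = 37800 (the samples averaged 63 minutes apart). Then AverageSample = 3000 / 10 = 300, and AverageSamplePer60Minutes = (3000 / 37800) * 3600 ≈ 285.7, slightly lower because the samples were further apart than 60 minutes. If AverageNumberOfLogsPer24Hours works out to 4000, then PercentDailyUsage = (300 / 4000) * 100 = 7.5, and PercentDailyUsagePer60Minutes = (285.7 / 4000) * 100 ≈ 7.1.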

Parameters

The script has the following parameters:

  • -Gather: Switch specifying we want to capture current log generations. If this switch is omitted, the -Analyze switch must be used.
  • -Analyze: Switch specifying we want to analyze already captured data. If this switch is omitted, the -Gather switch must be used.
  • -ResetStats: Switch indicating that the output file, LogStats.csv, should be cleared and reset. Only works if combined with -Gather.
  • -WorkingDirectory: The directory containing TargetServers.txt and LogStats.csv. If omitted, the working directory will be the current working directory of PowerShell (not necessarily the directory the script is in).
  • -LogDirectoryOut: The directory to send the output log files from running in Analyze mode to. If omitted, logs will be sent to WorkingDirectory.
  • -MaxSampleIntervalVariance: The maximum number of minutes that the duration between two samples can vary from 60. Samples whose interval falls outside this range are discarded. Defaults to a value of 10.
  • -MaxMinutesPastTheHour: How many minutes past the top of the hour a sample can be taken. Samples past this amount will be discarded. Defaults to a value of 15.
  • -MonitoringExchange2013: Whether there are Exchange 2013 servers configured in TargetServers.txt. Defaults to $true. If there are no 2013 servers being monitored, set this to $false to increase performance.

Usage

Run the script in Gather mode, taking a single snapshot of the current log generation of all configured databases:

PS C:\> .\GetTransactionLogStats.ps1 -Gather

Run the script in Gather mode, indicating that no Exchange 2013 servers are configured in TargetServers.txt:

PS C:\> .\GetTransactionLogStats.ps1 -Gather -MonitoringExchange2013 $false

Run the script in Gather mode, changing the directory where TargetServers.txt is located and where LogStats.csv will be written, and resetting any existing statistics in LogStats.csv:

PS C:\> .\GetTransactionLogStats.ps1 -Gather -WorkingDirectory "C:\GetTransactionLogStats" -ResetStats

Run the script in Analyze mode:

PS C:\> .\GetTransactionLogStats.ps1 -Analyze

Run the script in Analyze mode, sending the analysis output files to a different directory, treating only sample durations between 55 and 65 minutes as valid, and discarding samples taken more than 10 minutes past the top of the hour:

PS C:\> .\GetTransactionLogStats.ps1 -Analyze -LogDirectoryOut "C:\GetTransactionLogStats\LogsOut" -MaxSampleIntervalVariance 5 -MaxMinutesPastTheHour 10

Example TargetServers.txt

The following example shows what the TargetServers.txt input file should look like. For the server1 and server3 lines, no databases are specified, which means that all databases on the server will be sampled. For the server2 and server4 lines, we will only sample the specified databases on those servers. Note that no quotes are necessary for databases with spaces in their names.

[Image: example TargetServers.txt input file]
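Since the screenshot is not reproduced here, the following sketch shows the general shape of the file based on the description above (the server and database names are invented, and the comma delimiter is an assumption):

server1
server2,Database1,Database Two With Spaces
server3
server4,Database3,Database4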

Output File After Running in Gather Mode

When run in Gather mode, the log generation snapshots that are taken are sent to LogStats.csv. The following shows what this file looks like:

[Image: LogStats.csv contents after several Gather runs]
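As a rough illustration only (the exact column names and layout are assumptions), each row ties a database to the time a snapshot was taken and the log generation number observed at that moment:

Server,Database,TimeCollected,LogGeneration
server1,DB1,11/5/2013 10:00:03 AM,51200
server1,DB1,11/5/2013 11:00:02 AM,51730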

Output File After Running in Analyze Mode

The following shows the analysis for a single database after running the script in Analyze mode:

[Image: Analyze mode output for a single database]
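To give a feel for the layout (the values are invented, but the columns are the ones documented above):

Hour,TotalLogsCreated,TotalSampleIntervalSeconds,NumberOfSamples,AverageSample,PercentDailyUsage,AverageSamplePer60Minutes,PercentDailyUsagePer60Minutes
8,2400,37740,10,240,6.0,228.9,5.7
9,3000,37800,10,300,7.5,285.7,7.1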

Notes

By default, the Windows Firewall on an Exchange 2013 server running on Windows Server 2012 does not allow remote Perfmon access. I suspect this is also the case with Exchange 2013 running on Windows Server 2008 R2, but I haven't tested it. If either of the below errors is logged, you may need to open the Windows Firewall on these servers to allow access from the computer running the script.

ERROR: Failed to read perfmon counter from server SERVERNAME

ERROR: Failed to get perfmon counters from server SERVERNAME

Update:

After noticing that multiple people were having issues getting this to work through the Windows Firewall, I tried enabling different combinations of built-in firewall rules until I could figure out which ones were required. I only tested on an Exchange 2013 server running on Windows Server 2012, but this should apply to other Windows versions as well. The rules I had to enable were:

File and Printer Sharing (NB-Datagram-In)
File and Printer Sharing (NB-Name-In)
File and Printer Sharing (NB-Session-In)
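On Windows Server 2012 and later, these can be enabled from an elevated PowerShell prompt (a sketch; rule display names can vary by locale, so confirm them with Get-NetFirewallRule first):

PS C:\> Enable-NetFirewallRule -DisplayName "File and Printer Sharing (NB-Datagram-In)","File and Printer Sharing (NB-Name-In)","File and Printer Sharing (NB-Session-In)"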

Mike Hendrickson

