Pular para o conteúdo principal

Entender os contadores de performance (Performance Counters)

Segue uma descrição dos principais contadores que muitas vezes recorremos para medir a performance de um sistema

Understanding Performance Counters data while troubleshooting Performance issues
To troubleshoot overall Database system performance issue, analyzing performance counters is the best way to start. By collecting performance counters during busy period for few days consistently and analyzing those data would give a better idea about overall system problems regarding Memory, CPU, and/or Disk I/O. Please note, for troubleshooting a particular SQL problem such as a stored procedure or a piece of T-SQL, it is better to look at the query execution plan and SQL Trace data and identify the need of redesigning a query or table indexes.

Some key performance counters to use while assessing a performance issues on SQL Server.
Memory:
Memory and Disk I/O complements each other. Memory issues on the system could affect disk I/O and vice versa. It is very critical to carefully observe the trend of performance counters data over a long period of time to identify the real problem.

· Memory\Available Mbytes

Threshold: A consistent value of less than 20 to 25 percent of installed RAM is an indication of insufficient memory.
Significance: This indicates the amount of physical memory available to processes running on the computer. Note that this counter displays the last observed value only. It is not an average.

· Memory\Page Reads/sec

Threshold: Sustained values of more than five indicate a large number of page faults for read requests.
Significance: This counter indicates that the working set of your process is too large for the physical memory and that it is paging to disk. It shows the number of read operations, without regard to the number of pages retrieved in each operation. Higher values indicate a memory bottleneck.

If a low rate of page-read operations coincides with high values for Physical Disk\% Disk Time and Physical Disk\Avg. Disk Queue Length, there could be a disk bottleneck. If an increase in queue length is not accompanied by a decrease in the pages-read rate, a memory shortage exists.

· Memory\Page Input/Sec:

Threshold: The value should not exceed 15. Higher the value poor the performance.

Significance: Pages Input/sec is the rate at which pages are read from disk to resolve hard page faults. Hard page faults occur when a process refers to a page in virtual memory that is not in its working set or elsewhere in physical memory, and must be retrieved from disk.

· Memory\Pages/sec

Threshold: Sustained values higher than five indicate a bottleneck.
Significance: This counter indicates the rate at which pages are read from or written to disk to resolve hard page faults. To determine the impact of excessive paging on disk activity, multiply the values of the Physical Disk\ Avg. Disk sec/Transfer and Memory\ Pages/sec counters. If the product of these counters exceeds 0.1, paging is taking more than 10 percent of disk access time, which indicates that you need more RAM. If this occurs over a long period, you probably need more memory.

A high value of Pages/sec indicates that your application does not have sufficient memory. The average of Pages Input/sec divided by average of Page Reads/sec gives the number of pages per disk read. This value should not generally exceed five pages per second. A value greater than five pages per second indicates that the system is spending too much time paging and requires more memory (assuming that the application has been optimized).

Processor
To measure processor utilization and context switching, you can use the following counters:

· Processor\% Processor Time

Threshold: The general figure for the threshold limit for processors is 65 percent.
Significance: This counter is the primary indicator of processor activity. High values many not necessarily be bad. However, if the other processor-related counters are increasing linearly such as % Privileged Time or Processor Queue Length, high CPU utilization may be worth investigating.

· Processor\% Privileged Time

Threshold: A figure that is consistently over 75 percent indicates a bottleneck.
Significance: This counter indicates the percentage of time a thread runs in privileged mode. When your application calls operating system functions (for example to perform file or network I/O or to allocate memory), these operating system functions are executed in privileged mode.


· System\Context Switches/sec

Threshold: As a general rule, context switching rates of less than 5,000 per second per processor are not worth worrying about. If context switching rates exceed 15,000 per second per processor, then there is a constraint.
Significance: Context switching happens when a higher priority thread preempts a lower priority thread that is currently running or when a high priority thread blocks. High levels of context switching can occur when many threads share the same priority level. This often indicates that there are too many threads competing for the processors on the system. If you do not see much processor utilization and you see very low levels of context switching, it could indicate that threads are blocked.

Disk I/O
To measure disk I/O activity, you can use the following counters:

· PhysicalDisk\Avg. Disk Queue Length

Threshold: Should not be higher than the number of spindles plus two.
Significance: This counter indicates the average number of both read and writes requests that were queued for the selected disk during the sample interval.

· PhysicalDisk\Avg. Disk Read Queue Length

Threshold: Should be less than two.
Significance: This counter indicates the average number of read requests that were queued for the selected disk during the sample interval.

· PhysicalDisk\Avg. Disk Write Queue Length

Threshold: Should be less than two.
Significance: This counter indicates the average number of write requests that were queued for the selected disk during the sample interval.

· PhysicalDisk\Avg. Disk sec/Read

Threshold: No specific value.

Less than 10 ms – very good

Between 10-20 ms – okay

Between 20-50 ms – slow, needs attention

Greater than 50 ms – Serious I/O bottleneck


Significance: This counter indicates the average time, in seconds, of a read of data from the disk.

· PhysicalDisk\Avg. Disk sec/Transfer

Threshold: Should not be more than 18 milliseconds.
Significance: This counter indicates the time, in seconds, of the average disk transfer. This may indicate a large amount of disk fragmentation, slow disks, or disk failures. Multiply the values of the Physical Disk\Avg. Disk sec/Transfer and Memory\Pages/sec counters. If the product of these counters exceeds 0.1, paging is taking more than 10 percent of disk access time, so you need more RAM.

· PhysicalDisk\Disk Writes/sec

Threshold: Depends on manufacturer's specification.
Significance: This counter indicates the rate of write operations on the disk.

· Physical Disk: %Disk Time

Threshold: Greater than 50 percent, it represents an I/O bottleneck

Significance: Represents the percentage of elapsed time that the selected disk drive was busy servicing read or write requests.

· Physical Disk\Avg. Disk Reads/Sec and Physical Disk\Avg. Disk Writes/Sec

Threshold: It should be less than 85% of the disk capacity

Significance: It represents the rate of read operations on the disk.

When using above counters, you may need to adjust the values for RAID configurations using the following formulas.

Raid 0 -- I/Os per disk = (reads + writes) / number of disks

Raid 1 -- I/Os per disk = [reads + (2 * writes)] / 2

Raid 5 -- I/Os per disk = [reads + (4 * writes)] / number of disks

Raid 10 -- I/Os per disk = [reads + (2 * writes)] / number of disks

For example, you have a RAID-1 system with two physical disks with the following values of the counters.

Disk Reads/sec 80

Disk Writes/sec 70

Avg. Disk Queue Length 5

In that case, you are encountering (80 + (2 * 70))/2 = 110 I/Os per disk and your disk queue length = 5/2 = 2.5 which indicates a border line I/O bottleneck.

Disk I/O issues can be minimized by having larger and sequential I/O activities. To achieve than you should consider partitioning. Partitioning can be done in many ways – Database partitioning, Tempdb partitioning, Table partitioning, index partitioning. At a minimum, it is highly recommended to partition your database in such a way that clustered index, non-clustered index, and transaction logs are each on a separate physical drive and spread across multiple logical filegroups/files within that drive.

Comentários

Postagens mais visitadas deste blog

Improve Windows Security By Closing Open Ports

Improve Windows Security By Closing Open Ports : " A standard Windows operating system has a number of ports open after installation. Some of these ports are needed for the system to function properly while others might not. These ports can pose a security risk as every open port on a system might be an entry point for a malicious user. A port basically allows communication to or from the device. Characteristics are a port number, an IP address and a protocol type. This article will give you the tools at hand to identify and evaluate the open ports on your Windows system to make a decision in the end whether they can or should be closed or left open. Software programs and tools that we will use: CurrPorts : Available for 32-bit and 64-bit editions of Windows. It is a port monitor that displays all open ports on a computer system. We will use it to identify the ports and the programs that are using them. Windows Task Manager: Also used to identify the programs and link some p

Diagnosing a Blue Screen of Death Error in Windows

Diagnosing a Blue Screen of Death Error in Windows : For many years now the famous Blue Screen of Death (BSoD) has been the ultimate indication that something disastrous has happened to make your computer die, but how useful is the information in the BSoD and the respective crash dump file that Windows produces? The best article I ever found explaining the BSoD in depth is here on the Microsoft website, however it’s quite technical and doesn’t discuss how to actually troubleshoot a problem. The crash dump file is just technical details of what was being held in the computer’s memory at the time of the crash, and this will include details on every driver and service that was loaded, and every piece of software that was running. The most useful pieces of information are to be found on the BSoD itself and are highlighted on the screenshot below. These are the BSoD error name, the stop error code and the name of the driver or service that has failed (this last one might not always appea

Use BGInfo to Build a Database of System Information of Your Network Computers

Use BGInfo to Build a Database of System Information of Your Network Computers : " One of the more popular tools of the Sysinternals suite among system administrators is BGInfo which tacks real-time system information to your desktop wallpaper when you first login. For obvious reasons, having information such as system memory, available hard drive space and system up time (among others) right in front of you is very convenient when you are managing several systems. A little known feature about this handy utility is the ability to have system information automatically saved to a SQL database or some other data file. With a few minutes of setup work you can easily configure BGInfo to record system information of all your network computers in a centralized storage location. You can then use this data to monitor or report on these systems however you see fit. BGInfo Setup If you are familiar with BGInfo, you can skip this section. However, if you have never used this tool, it takes ju