A general question for a SQL Server DBA while monitoring counters in performance monitor is 'What is the threshold for each of the counters that can be considered as alarm bells for further investigation?'
The below table summarizes the counters and their ideal values. It was given to me by a Microsoft Engineer who works at our client place. All the values have been put together by his team at Microsoft.
The below should be a very good starting point for a DBA to either zoom in or rule out OS as the culprit.
PerformanceObject | Counter | Observed value | Threshold |
Memory | Available MBytes | | <10mb> |
Memory | Cache Bytes | | |
Memory | Commit Limit | | |
Memory | Committed Bytes | | > Physical Bytes is bad |
Memory | Free System Page Table Entries | | <5,000>very bad; <7,000> |
Memory | Page Faults/sec | | <> |
Memory | Pages/sec | | >2500/s bad for modern servers; less for older servers |
Memory | Pool NonPaged Bytes | | 32-bit: 100MB-260MB; 64-bit: 1GB-128GB (depends on total physical RAM; is it close to being consumed) |
Memory | Pool Paged Bytes | | 32-bit: 120MB-480MB; 64-bit: 1GB-128GB (depends on total physical RAM; is it close to being consumed) |
Memory | System Cache Resident Bytes | | |
Network Interface | Bytes total/sec | * | look for descending numbers to indicate network problems (kk) |
Paging File | % Usage | * | >70% is a cause for concern |
Paging File | % | * | |
PhysicalDisk/LogicalDisk | % Disk Time | * | >85% (consider spindle count) |
PhysicalDisk/LogicalDisk | % Free Space | * | 10-30% |
PhysicalDisk/LogicalDisk | % Idle Time | * | |
PhysicalDisk/LogicalDisk | Avg. Disk Bytes/Read | * | |
PhysicalDisk/LogicalDisk | Avg. Disk Bytes/Transfer | * | Avg. I/O size; theoretical ~90-100MB/s sequential; reality significantly less |
PhysicalDisk/LogicalDisk | Avg. Disk Bytes/Write | * | |
PhysicalDisk/LogicalDisk | Avg. Disk Queue Length | * | 1.5 x # of spindles |
PhysicalDisk/LogicalDisk | Avg. Disk Read Queue Length | * | 1.5 x # of spindles |
PhysicalDisk/LogicalDisk | Avg. Disk sec/Read | * | Disk Latency > 50ms POOR |
PhysicalDisk/LogicalDisk | Avg. Disk sec/Transfer | * | Disk Latency > 50ms POOR |
PhysicalDisk/LogicalDisk | Avg. Disk sec/Write | * | Disk Latency > 50ms POOR |
PhysicalDisk/LogicalDisk | Avg. Disk Queue/sec | * | Disk Latency |
PhysicalDisk/LogicalDisk | Avg. Disk Write Queue Length | * | 1.5 x # of spindles |
PhysicalDisk/LogicalDisk | Current Disk Queue Length | * | 1.5 x # of spindles |
PhysicalDisk/LogicalDisk | Disk Bytes/sec | * | |
PhysicalDisk/LogicalDisk | Disk Read Bytes/sec | * | |
PhysicalDisk/LogicalDisk | Disk Write Bytes/sec | * | |
PhysicalDisk/LogicalDisk | Disk Reads/sec | * | }~150 (for random) |
PhysicalDisk/LogicalDisk | Disk Transfers/sec | * | } |
PhysicalDisk/LogicalDisk | Disk Writes/sec | * | } |
PhysicalDisk/LogicalDisk | | * | Fewer the better |
Process | % Processor Time | _Total, | |
Process | Page Faults/sec | _Total, | <> |
Process | Handle Count | _Total, | |
Process | IO Data Operations/sec | _Total, | |
Process | IO Other Operations/sec | _Total, | |
Process | Private Bytes | _Total, | |
Process | Virtual Bytes | _Total, | |
Process | Working Set | _Total, | |
Processor | % Idle | | |
Processor | % Privileged Time | | |
Processor | % Processor Time | | |
Processor | % User | | User+Priv>80% CPU bottleneck |
Processor | Interrupts/sec | | |
System | Context Switches/sec | | 5,000/s x # of CPUs; somewhat flexible if other CPU-related counters are good--including low SQL compiles/s |
System | Processor Queue Length | | 2 x # of CPUs; threads waiting to be executed; >2 means CPU bottleneck |
Access Methods | Full Scans/sec | | (Index Searches/sec)/(Full Scans/sec) > 1000 |
Access Methods | Index Searches/sec | | (Index Searches/sec)/(Full Scans/sec) > 1000 |
Access Methods | Mixed Page Allocations/sec | | baseline |
Access Methods | Page Splits/sec | | baseline or <> |
Access Methods | Mixed Page Allocations/sec | | baseline |
Access Methods | Workfiles Created/sec | | baseline |
Access Methods | Worktables Created/sec | | baseline |
Access Methods | Worktables from Cache Ratio | | baseline |
General Statistics | Temp Tables Creation Rate | | baseline |
General Statistics | Temp Tables for Destruction | | baseline |
Buffer Manager | Buffer Cache Hit Ratio | | 98% OK, 99% better, 99.8% desirable |
Buffer Manager | Checkpoint pages/sec | | baseline |
Buffer Manager | Database Pages | | <> |
Buffer Manager | Lazy writes/sec | | baseline |
Buffer Manager | Page Life Expectancy | | <300s>1,000 or >10,000s much better |
Buffer Manager | Procedure Cache Pages | | (2000) prefer stable values, otherwise cache thrash may be occurring |
Plan Cache | Cache Pages | | (2005) prefer stable values, otherwise cache thrash may be occurring |
Buffer Manager | Total pages | | baseline |
Cache Manager | Cache Hit Ratio | | <99.9>98 not optimal |
Databases | Data File(s) Size (kb) | | |
Databases | Log Flushes/sec | | baseline |
Databases | Percent Log Used | | |
Databases | Transactions/sec | | baseline |
General Statistics | Logins/sec & Logouts/sec | | baseline; >2/sec suggests app's implementation of connection pooling may be flawed |
General Statistics | User Connections | | ~ |
Latches | Average Latch Wait Time (ms) | | ~ |
Latches | Latch Waits/sec | | (Total Latch Wait Time) / (Latch Waits/Sec) <> |
Latches | Total Latch Wait Time (ms) | | (Total Latch Wait Time) / (Latch Waits/Sec) <> |
Locks | Avg. Wait Time (ms) | | ~ |
Locks | Lock Timeouts/sec | | > 0 |
Locks | Lock Wait Time (ms) | | |
Locks | Lock Waits/sec | | > 0 |
Locks | Number of Deadlocks/sec | | > 0 |
Memory Manager | Granted Workspace Memory (KB) | | baseline |
Memory Manager | Memory Grants Pending | | >0 is cause for concern |
Memory Manager | Memory Grants Outstanding | | |
Memory Manager | Target Server Memory | | high or rising value indicates insufficient memory |
Memory Manager | Total Server Memory | | ~=Physical Ram |
SQL Statistics | Batch Requests/sec | | OLTP workloads: server should support ~2,500/s/core |
SQL Statistics | SQL Compilations/sec | | if compiles > 10% of Batch Requests, then app may not be caching effectively |
SQL Statistics | SQL Recompilations/sec | | Recompiles > 10% of Compilations should be investigated |
SQL Server:Buffer Manager | Free list stalls/sec | | <> |
SQL Server:Buffer Manager | Free pages | | > 640 |
SQL Server:Buffer Manager | Lazy Writes/Sec | | <> |
SQLServer:Buffer Manager | Page lookups/sec | | (Page lookups/sec) / (Batch Requests/sec) <> |
SQL Server:Buffer Manager | Page reads/sec | | <> |
SQL Server:Buffer Manager | Page writes/sec | | <> |
No comments:
Post a Comment