SharePoint DB Environment issues
VM environment, 2 servers:
Web front end: Windows 2008 R2, SharePoint 2010
Database back end: Windows 2008 R2, MS SQL Server 2008 x64
Going over the SQL Server logs, we have seen frequent occurrences of errors 823 and 824. They also show up in the Windows Application event log.
This is the general description we have been getting:
823) The operating system returned error incorrect checksum (expected: 0xb016ce52; actual: 0xb016ce52) to SQL Server during a read at offset 0x00000000fc6000 in file 'E:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\tempdb.mdf'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
824) SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0x686d8353; actual: 0x8ab05a4c). It occurred during a read of page (1:204495) in database ID 2 at offset 0x00000063d9e000 in file 'E:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\tempdb.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
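As a quick sanity check on the offsets in these messages, here is a minimal Python sketch that converts a byte offset into a page number. It assumes SQL Server's 8 KB (8192-byte) data page size; the function name `offset_to_page` is just an illustrative helper, not anything from SQL Server itself.

```python
# Sketch: map the byte offsets from the 823/824 messages to page numbers.
# Assumes SQL Server's fixed 8 KB (8192-byte) data page size.
PAGE_SIZE = 8192


def offset_to_page(offset: int) -> int:
    """Return the number of the page that contains the given byte offset."""
    return offset // PAGE_SIZE


# Offsets copied from the two error messages above.
offset_823 = 0x00000000fc6000
offset_824 = 0x00000063d9e000

print(offset_to_page(offset_823))  # 2019
print(offset_to_page(offset_824))  # 204495, matching page (1:204495) in the 824 message
```

In these two samples the errors land on different pages (2019 and 204495) of the same tempdb.mdf file, so at least here it does not look like a single page failing repeatedly.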
We restarted the SQL Server service so that tempdb would be recreated. We also ran DBCC CHECKDB on all databases and found no issues.
We have also used SQLIOSim to stress test the database server. The stress test gives an error of:
Error: 0x80070467
Error Text: While accessing the hard disk, a disk operation failed even after retries.
Description: Buffer validation failed on C:\sqliosim.mdx Page: 398593, offset 0x8
We also get the following warning:
Error: 0x00000000
Error Text:
Description: 296 IO requests are outstanding for more than 15 sec.
Our system administrator ran diagnostics on both the hard drives and the memory of the environment. No issues were found.
Our question is: why would this happen only to tempdb.mdf? What are the chances that other databases could start getting checksum errors? Any suggestions on what we could test next?