Saturday, November 26, 2011

Thoughts on Improving DBA Productivity

Database administrators (DBAs) need to move away from reactive, repetitive tasks. Use the ideas presented here to become more productive.

Problem Resolution

DBAs should standardize problem resolution methods and the process of root cause analysis.

Most large IT organizations have multiple teams with which DBAs must collaborate. So, we need to consider how the extended team works together. If teams are in react mode, they tend to reactively approach problems; they concentrate on eliciting symptoms and applying quick fixes rather than searching for underlying causes. This leads to failure; or, at best, inefficient and overworked teams.

The most common reasons DBAs fail are the following.

Unclear problem definition: DBAs tend to see the symptoms and consider that the problem is now completely defined. For example, a long-running SQL statement may be diagnosed as "bad" and needing to be "fixed." Work then proceeds on tuning the SQL statement without considering possible causes such as poorly organized data, invalid or unavailable data distribution statistics, data locking delays, incorrect lock granularity (row rather than page), disk synchronous I/O delays, or overall system load. Teams must begin by recognizing the difference between symptoms and their possible underlying causes. One way to accomplish this is to ask why a symptom or problem appears.

Making unfounded assumptions: Many experienced DBAs rely on their instincts or gut feelings only to discover that issues surface because their assumptions were false. Consider a distributed Java application that seems to be "locking" the mainframe DB2 data that it accesses, preventing other applications from reading the data. Without knowing the specific connection method and package bind parameters, the DBA may assume the fix will involve appending "WITH UR" (uncommitted read) to the SQL statements from the application. In some cases, non-standard parameters such a s ACQUIRE (ALLOCATE), ISOLATION(RR) or CURRENTDATA(YES) may be the real culprit.

No structured problem-solving process: Using a standard approach can help technical teams.

Stopping with the first good idea: When some teams encounter a problem, one technician will choose a solution and stop. It’s as if the purpose of the process was to generate a single answer. Rarely does the team consider multiple good ideas, which may then be combined or executed in parallel. Teams also need to consider and implement ideas for detecting the symptom or problem if it recurs, preventing the problem, or processes that will automatically detect and correct the problem.

Continual Improvement

Document, centralize, standardize. Assign quality measures to documents and processes. Productivity for the IT professional means doing the right work while minimizing wasted effort and redundancy. One of the original sources of continuous improvement comes from the Software Engineering Institute at Carnegie Mellon University. Their Capability Maturity Model (CMM) imposes an organizational structure on software development processes and, by extension, infrastructure support processes such as database administration. The model describes processes and practices in terms of a maturity level that can be used as a guide for gauging current maturity and for planning improvements. Continuous improvement is where you work on tasks that increase process and practice maturity.

Consider a process for database change control. Assume the process is poorly documented, uses non-standard scripts, and has no quality measures. Continuous improvement would involve raising the maturity level from repeatable to defined, which might involve improving documentation, standardizing processes, and adding process quality measures.

When implementing a continuous improvement process, follow these guidelines:

•       Make the processes repeatable so earlier successes aren’t wasted. This means standardized JCL and scripts that can be executed by anyone without modification, and instructions for execution and for follow-up on failure. Ensure the entire team is aware that processes exist and you have procedures for controlling changes to documentation and processes.

•       Standardize. Processes and documentation are all well-written in a standard format. Complete information is available in a centralized area, indexed for ease-of-use. The entire team should be familiar with approved ways to access and update information.

•       Manage. Implement initial quality measures. Processes and documentation should be regularly reviewed for possible updates.

Metrics and Dashboards

Be able to answer, "What have you done for us lately?" with numbers and dollars. Many DBAs fail to realize how much IT management depends on numbers when measuring activity or productivity. In today’s climate, the extended IT enterprise will be forced to increase productivity and reduce costs. How can this happen if management can’t measure what the DBA does?

Measures such as the number of tables, indexes, or columns moved to production, or the number of hours spent on a project concentrate on single, one-dimensional tasks; they fail to consider the DBA and the supported DBMS and infrastructure as an integrated system. Consider the performance of an online application from key perspectives:

•       The user dimension includes transaction rates, data throughput, and perceived transaction elapsed times.

•       The system dimension includes resource usage (CPU, DASD, etc.), availability of high-performance access paths (e.g., existence of indexes), and data clustering.

•       The DBA dimension includes scheduling of utilities such as reorg and copy, along with some subjective items such as how well the DBA knows the tables, the SQL, and the application.

To give IT management the numbers they need, you need to find measures of the application that correlate to its performance. You also need the same thing for yourself. What things correlate to your performance?

Measurements typically summarize resources spent on tasks, or measure the state of documentation and processes. The latter measurements include documentation and process quality (readability, repeatability, etc.). Many shops use a time-tracking system to gather resource use.

Without quality measures, it’s difficult to prioritize improvements. Documentation quality measures go beyond good grammar. Teams must decide what quality measures are important; some of these might be:

•       Applicability: Does this documentation apply to a large set of general conditions or only a few specific cases?

•       Verifiability: Can the documentation be cross-referenced to other documentation or publications such as product manuals? Are there simple means of verifying that the documentation is correct, current and complete?

•       Currency: Is the documentation up-to-date? When will the information expire or become useless? Must it be updated at specific times or at the time of certain events?

•       Reference- or task-centric: Is the documentation mostly reference material, or will it be used to complete some process? Does it include checklists, flowcharts, or other means of process management?

Start with these and use simple measures such as a one-to five-point scale. It isn’t necessary to be 100 percent accurate; the important point is to use the results to prioritize improvements.

As you fix problems, document the fixes along with your analysis. Compare the quality of the fix document with that of the process and related documentation. If the fixes are higher quality, then your documentation and processes need to be reviewed and updated. If the documentation and process have higher quality, then you need to improve either your problem-solving skills or your follow- up in documentation of repeatable processes.

Summary

The notions presented here can help the DBA move away from reactive tasks, initiate quality measures, and offload basic, repetitive tasks. Hopefully, you can use this material as you work to become more productive. Remember to quantify and document your results, then advertise your value.

Author: Lockwood Lyon

Sunday, November 20, 2011

DataGridView in Windows Forms – Tips and Tricks

DataGridView in Windows Forms – Tips and Tricks
 
DataGridView control is a Windows Forms control that gives you the ability to customize and edit tabular data. It gives you number of properties, methods and events to customize its appearance and behavior. In this article, we will discuss some frequently asked questions and their solutions. These questions have been collected from a variety of sources including some newsgroups, MSDN site and a few, answered by me at the MSDN forums.
Tip 1 – Populating a DataGridView
 
In this short snippet, we will populate a DataGridView using the LoadData() method. This method uses the SqlDataAdapter to populate a DataSet. The table ‘Orders’ in the DataSet is then bound to the BindingSource component which gives us the flexibility to choose/modify the data location.
C#
public partial class Form1 : Form
    {
        private SqlDataAdapter da;
        private SqlConnection conn;
        BindingSource bsource = new BindingSource();
        DataSet ds = null;
        string sql;
 
        public Form1()
        {
            InitializeComponent();
        }
 
        private void btnLoad_Click(object sender, EventArgs e)
        {
            LoadData();
        }
 
        private void LoadData()
        {
string connectionString = "Data Source=localhost;Initial Catalog=Northwind;" + "Integrated Security=SSPI;";
            conn = new SqlConnection(connectionString);
sql = "SELECT OrderID, CustomerID, EmployeeID, OrderDate, Freight," + "ShipName, ShipCountry FROM Orders";
 
            da = new SqlDataAdapter(sql, conn);
            conn.Open();
            ds = new DataSet();
            SqlCommandBuilder commandBuilder = new SqlCommandBuilder(da);          
            da.Fill(ds, "Orders");
            bsource.DataSource = ds.Tables["Orders"];
            dgv.DataSource = bsource;          
        }
    }
VB.NET
Public Partial Class Form1
      Inherits Form
            Private da As SqlDataAdapter
            Private conn As SqlConnection
            Private bsource As BindingSource = New BindingSource()
            Private ds As DataSet = Nothing
            Private sql As String
 
            Public Sub New()
                  InitializeComponent()
            End Sub
 
Private Sub btnLoad_Click(ByVal sender As Object, ByVal e As EventArgs)
                  LoadData()
            End Sub
 
            Private Sub LoadData()
Dim connectionString As String = "Data Source=localhost;Initial Catalog=Northwind;" & "Integrated Security=SSPI;"
                  conn = New SqlConnection(connectionString)
sql = "SELECT OrderID, CustomerID, EmployeeID, OrderDate, Freight," & "ShipName, ShipCountry FROM Orders"
 
                  da = New SqlDataAdapter(sql, conn)
                  conn.Open()
                  ds = New DataSet()
Dim commandBuilder As SqlCommandBuilder = New SqlCommandBuilder(da)
                  da.Fill(ds, "Orders")
                  bsource.DataSource = ds.Tables("Orders")
                  dgv.DataSource = bsource
            End Sub
End Class
 
Tip 2 – Update the data in the DataGridView and save changes in the database
 
After editing the data in the cells, if you would like to update the changes permanently in the database, use the following code:
C#
        private void btnUpdate_Click(object sender, EventArgs e)
        {
            DataTable dt = ds.Tables["Orders"];
           this.dgv.BindingContext[dt].EndCurrentEdit();
            this.da.Update(dt);
        }
VB.NET
Private Sub btnUpdate_Click(ByVal sender As Object, ByVal e As EventArgs)
                  Dim dt As DataTable = ds.Tables("Orders")
                  Me.dgv.BindingContext(dt).EndCurrentEdit()
                  Me.da.Update(dt)
      End Sub
Tip 3 – Display a confirmation box before deleting a row in the DataGridView
 
Handle the UserDeletingRow event to display a confirmation box to the user. If the user confirms the deletion, delete the row. If the user clicks cancel, set e.cancel = true which cancels the row deletion.
C#
private void dgv_UserDeletingRow(object sender, DataGridViewRowCancelEventArgs e)
        {
            if (!e.Row.IsNewRow)
            {
                DialogResult res = MessageBox.Show("Are you sure you want to delete this row?", "Delete confirmation",
                         MessageBoxButtons.YesNo, MessageBoxIcon.Question);
                if (res == DialogResult.No)
                    e.Cancel = true;
            }
        }
VB.NET
Private Sub dgv_UserDeletingRow(ByVal sender As Object, ByVal e As DataGridViewRowCancelEventArgs)
                  If (Not e.Row.IsNewRow) Then
                        Dim res As DialogResult = MessageBox.Show("Are you sure you want to delete this row?", "Delete confirmation", MessageBoxButtons.YesNo, MessageBoxIcon.Question)
                        If res = DialogResult.No Then
                              e.Cancel = True
                        End If
                  End If
End Sub
Tip 4 – How to autoresize column width in the DataGridView
 
The snippet shown below, first auto-resizes the columns to fit its content. Then the AutoSizeColumnsMode is set to the ‘DataGridViewAutoSizeColumnsMode.AllCells’ enumeration value which automatically adjust the widths of the columns when the data changes.
C#
private void btnResize_Click(object sender, EventArgs e)
        {
            dgv.AutoResizeColumns();
            dgv.AutoSizeColumnsMode = DataGridViewAutoSizeColumnsMode.AllCells;
 
        }
VB.NET
Private Sub btnResize_Click(ByVal sender As Object, ByVal e As EventArgs)
                  dgv.AutoResizeColumns()
                  dgv.AutoSizeColumnsMode = DataGridViewAutoSizeColumnsMode.AllCells
 
End Sub
 
Tip 5 - Select and Highlight an entire row in DataGridView
 
C#
int rowToBeSelected = 3; // third row
if (dgv.Rows.Count >= rowToBeSelected)
{             
       // Since index is zero based, you have to subtract 1
        dgv.Rows[rowToBeSelected - 1].Selected = true;
}
VB.NET
Dim rowToBeSelected As Integer = 3 ' third row
If dgv.Rows.Count >= rowToBeSelected Then
         ' Since index is zero based, you have to subtract 1
            dgv.Rows(rowToBeSelected - 1).Selected = True
End If
Tip 6 - How to scroll programmatically to a row in the DataGridView
 
The DataGridView has a property called FirstDisplayedScrollingRowIndex that can be used in order to scroll to a row programmatically.
C#
int jumpToRow = 20;
if (dgv.Rows.Count >= jumpToRow && jumpToRow >= 1)
{             
        dgv.FirstDisplayedScrollingRowIndex = jumpToRow;
        dgv.Rows[jumpToRow].Selected = true;
}
 
VB.NET
Dim jumpToRow As Integer = 20
If dgv.Rows.Count >= jumpToRow AndAlso jumpToRow >= 1 Then
            dgv.FirstDisplayedScrollingRowIndex = jumpToRow
            dgv.Rows(jumpToRow).Selected = True
End If
 
Tip 7 - Calculate a column total in the DataGridView and display in a textbox
 
A common requirement is to calculate the total of a currency field and display it in a textbox. In the snippet below, we will be calculating the total of the ‘Freight’ field. We will then display the data in a textbox by formatting the result (observe the ToString("c")) while displaying the data, which displays the culture-specific currency.
C#
private void btnTotal_Click(object sender, EventArgs e)
        {
            if(dgv.Rows.Count > 0)
             txtTotal.Text = Total().ToString("c");
        }
 
        private double Total()
        {
            double tot = 0;
            int i = 0;
            for (i = 0; i < dgv.Rows.Count; i++)
            {
                tot = tot + Convert.ToDouble(dgv.Rows[i].Cells["Freight"].Value);
            }
            return tot;
        }
VB.NET
Private Sub btnTotal_Click(ByVal sender As Object, ByVal e As EventArgs)
                  If dgv.Rows.Count > 0 Then
                   txtTotal.Text = Total().ToString("c")
                  End If
End Sub
 
Private Function Total() As Double
                  Dim tot As Double = 0
                  Dim i As Integer = 0
                  For i = 0 To dgv.Rows.Count - 1
                        tot = tot + Convert.ToDouble(dgv.Rows(i).Cells("Freight").Value)
                  Next i
                  Return tot
End Function
Tip 8 - Change the Header Names in the DataGridView
 
If the columns being retrieved from the database do not have meaningful names, we always have the option of changing the header names as shown in this snippet:
C#
private void btnChange_Click(object sender, EventArgs e)
        {
            dgv.Columns[0].HeaderText = "MyHeader1";
            dgv.Columns[1].HeaderText = "MyHeader2";
        }
 
VB.NET
Private Sub btnChange_Click(ByVal sender As Object, ByVal e As EventArgs)
                  dgv.Columns(0).HeaderText = "MyHeader1"
                  dgv.Columns(1).HeaderText = "MyHeader2"
End Sub
 
Tip 9 - Change the Color of Cells, Rows and Border in the DataGridView
 
C#
private void btnCellRow_Click(object sender, EventArgs e)
        {
            // Change ForeColor of each Cell
            this.dgv.DefaultCellStyle.ForeColor = Color.Coral;
            // Change back color of each row
            this.dgv.RowsDefaultCellStyle.BackColor = Color.AliceBlue;
            // Change GridLine Color
            this.dgv.GridColor = Color.Blue;
            // Change Grid Border Style
            this.dgv.BorderStyle = BorderStyle.Fixed3D;
        }
VB.NET
Private Sub btnCellRow_Click(ByVal sender As Object, ByVal e As EventArgs)
                  ' Change ForeColor of each Cell
                  Me.dgv.DefaultCellStyle.ForeColor = Color.Coral
                  ' Change back color of each row
                  Me.dgv.RowsDefaultCellStyle.BackColor = Color.AliceBlue
                  ' Change GridLine Color
                  Me.dgv.GridColor = Color.Blue
                  ' Change Grid Border Style
                  Me.dgv.BorderStyle = BorderStyle.Fixed3D
End Sub
Tip 10 - Hide a Column in the DataGridView
 
If you would like to hide a column based on a certain condition, here’s a snippet for that.
C#
private void btnHide_Click(object sender, EventArgs e)
        {
            this.dgv.Columns["EmployeeID"].Visible = false;
        }
VB.NET
Private Sub btnHide_Click(ByVal sender As Object, ByVal e As EventArgs)
                  Me.dgv.Columns("EmployeeID").Visible = False
End Sub
 
Tip 11 - Handle SelectedIndexChanged of a ComboBox in the DataGridView
 
To handle the SelectedIndexChanged event of a DataGridViewComboBox, you need to use the DataGridView.EditingControlShowing event as shown below. You can then retrieve the selected index or the selected text of the combobox.
C#
private void dataGridView1_EditingControlShowing(object sender, DataGridViewEditingControlShowingEventArgs e)
        {
            ComboBox editingComboBox = (ComboBox)e.Control;
            if(editingComboBox != null)
                editingComboBox.SelectedIndexChanged += new System.EventHandler(this.editingComboBox_SelectedIndexChanged);
        }
private void editingComboBox_SelectedIndexChanged(object sender, System.EventArgs e)
        {
            ComboBox comboBox1 = (ComboBox)sender;
            // Display index
            MessageBox.Show(comboBox1.SelectedIndex.ToString());
            // Display value
            MessageBox.Show(comboBox1.Text);
        }
VB.NET
Private Sub dataGridView1_EditingControlShowing(ByVal sender As Object, ByVal e As DataGridViewEditingControlShowingEventArgs)
                  Dim editingComboBox As ComboBox = CType(e.Control, ComboBox)
                  If Not editingComboBox Is Nothing Then
                        AddHandler editingComboBox.SelectedIndexChanged, AddressOf editingComboBox_SelectedIndexChanged
                  End If
End Sub
 
Private Sub editingComboBox_SelectedIndexChanged(ByVal sender As Object, ByVal e As System.EventArgs)
                  Dim comboBox1 As ComboBox = CType(sender, ComboBox)
                  ' Display index
                  MessageBox.Show(comboBox1.SelectedIndex.ToString())
                  ' Display value
                  MessageBox.Show(comboBox1.Text)
End Sub
Tip 12 - Change Color of Alternate Rows in the DataGridView
 
C#
private void btnAlternate_Click(object sender, EventArgs e)
        {
            this.dgv.RowsDefaultCellStyle.BackColor = Color.White;
            this.dgv.AlternatingRowsDefaultCellStyle.BackColor = Color.Aquamarine;
        }
VB.NET
Private Sub btnAlternate_Click(ByVal sender As Object, ByVal e As EventArgs)
                  Me.dgv.RowsDefaultCellStyle.BackColor = Color.White
                  Me.dgv.AlternatingRowsDefaultCellStyle.BackColor = Color.Aquamarine
End Sub
 
Tip 13 - Formatting Data in the DataGridView
 
The DataGridView exposes properties that enable you to format data such as displaying a currency column in the culture specific currency or displaying nulls in a desired format and so on.
C#
private void btnFormat_Click(object sender, EventArgs e)
        {
            // display currency in culture-specific currency for
            this.dgv.Columns["Freight"].DefaultCellStyle.Format = "c";
            // display nulls as 'NA'
            this.dgv.DefaultCellStyle.NullValue = "NA";
        }
VB.NET
Private Sub btnFormat_Click(ByVal sender As Object, ByVal e As EventArgs)
                  ' display currency in culture-specific currency for
                  Me.dgv.Columns("Freight").DefaultCellStyle.Format = "c"
                  ' display nulls as 'NA'
                  Me.dgv.DefaultCellStyle.NullValue = "NA"
End Sub
 
Tip 14 – Change the order of columns in the DataGridView
 
In order to change the order of columns, just set the DisplayIndex property of the DataGridView to the desired value. Remember that the index is zero based.
C#
private void btnReorder_Click(object sender, EventArgs e)
        {
             dgv.Columns["CustomerID"].DisplayIndex = 5;
             dgv.Columns["OrderID"].DisplayIndex = 3;
             dgv.Columns["EmployeeID"].DisplayIndex = 1;
             dgv.Columns["OrderDate"].DisplayIndex = 2;
             dgv.Columns["Freight"].DisplayIndex = 6;
             dgv.Columns["ShipCountry"].DisplayIndex = 0;
             dgv.Columns["ShipName"].DisplayIndex = 4;
        }
VB.NET
Private Sub btnReorder_Click(ByVal sender As Object, ByVal e As EventArgs)
                   dgv.Columns("CustomerID").DisplayIndex = 5
                   dgv.Columns("OrderID").DisplayIndex = 3
                   dgv.Columns("EmployeeID").DisplayIndex = 1
                   dgv.Columns("OrderDate").DisplayIndex = 2
                   dgv.Columns("Freight").DisplayIndex = 6
                   dgv.Columns("ShipCountry").DisplayIndex = 0
                   dgv.Columns("ShipName").DisplayIndex = 4
End Sub
 

Saturday, November 19, 2011

Field Level Auditing using Change Data Capture – Part 2

SQL Server 2008 introduced a new auditing tool called Change Data Capture (CDC). Using CDC, it is easy and straight forward to track field level changes.

In the first part of this series, we looked at the basics of Change Data Capture and enabled CDC on a sample database and table. We then looked at some of the CDC functions and stored procedures that can be used to configure and query the change table data. One job created with CDC is a cleanup job to delete data from the change table after a retention period, usually a few days. So unless you disable this cleanup job, we need to move this change table data somewhere more permanent. In this article we will build on some of the stored procedures and functions used in Part 1 to create a simple data warehouse to mart the change data, and then create a report using SSRS to view that data.

Create the warehouse

Start by creating the data warehouse database:

USE [master]
GOCREATE DATABASE [CDCDW] ON  PRIMARY
( NAME = N'CDCDW', FILENAME = N'C:\CDCDW.mdf' , SIZE = 2048KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )
 LOG ON
( NAME = N'CDCDW_log', FILENAME = N'C:\CDCDW_log.ldf' , SIZE = 1024KB , MAXSIZE = 2048GB , FILEGROWTH = 10%)
GO
 
Next create the tables to hold the warehouse data. Include bits flags to indicate which fields changed for each row and fields to hold the timestamp from the LSN and the user name that changed the record based on last_change_id field in the source table. There will be two tables, stagingCDCContract and factCDCContract. The staging table will be truncated and populated with each SSIS run’s data, and the fact table holds all change table data accumulated over time. It is the fact table that will be used for the SSRS report:

USE [CDCDW]
GOCREATE TABLE [dbo].[stagingCDCContract](
      [__$start_lsn] [binary](10) NOT NULL,    
      [__$seqval] [binary](10) NOT NULL,
      [__$operation] [int] NOT NULL,
      [__$update_mask] [varbinary](128) NULL,
      [id] [int] NOT NULL,
      [name] [varchar](50) NULL,
      [year] [int] NULL,
      [length] [int] NULL,
      [type] [int] NULL,
      [amount] [decimal](18, 2) NULL,
      [last_change_id] [int] NULL,
      [last_change_date] [datetime] NULL,
      [last_change_name] [varchar](100) NULL,  
      [IsNameUpdated] bit null default (0),    
      [IsYearUpdated] bit null default (0),
      [IsLengthUpdated] bit null default (0),
      [IsTypeUpdated] bit null default (0),
      [IsAmountUpdated] bit null default (0),
      [IsLastChangeIdUpdated] bit null default (0)   ) ON [PRIMARY]
GOCREATE TABLE [dbo].[factCDCContract](    
      [__$start_lsn] [binary](10) NOT NULL,    
      [__$seqval] [binary](10) NOT NULL,
      [__$operation] [int] NOT NULL,
      [__$update_mask] [varbinary](128) NULL,
      [id] [int] NOT NULL,
      [name] [varchar](50) NULL,
      [year] [int] NULL,
      [length] [int] NULL,
      [type] [int] NULL,
      [amount] [decimal](18, 2) NULL,
      [last_change_id] [int] NULL,
      [last_change_date] [datetime] NULL,
      [last_change_name] [varchar](100) NULL,  
      [IsNameUpdated] bit null default (0),    
      [IsYearUpdated] bit null default (0),
      [IsLengthUpdated] bit null default (0),
      [IsTypeUpdated] bit null default (0),
      [IsAmountUpdated] bit null default (0),
      [IsLastChangeIdUpdated] bit null default (0)
      CONSTRAINT [PK_factCDCContract] PRIMARY KEY CLUSTERED
(
      [__$start_lsn] ASC,    
      [__$seqval] ASC,
      [__$operation] ASC,    
      [id]
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
 
In this example, all the staging fields map directly to the fact table fields. In real life there could be additional mapping to other dimension tables, especially if this data needs to be loaded into a SSAS cube.

We will also create a history table to store the last run time for each change table and initialize it to today’s date:

CREATE TABLE [dbo].[cdcHistory](
      [cdc_table] [varchar](50) NULL,
      [cdc_date] [datetime] NULL
) ON [PRIMARY]
GOinsert into [cdcHistory]
  values ('dbo_contract', 9/1/2011')

 

Add the Stored Procedures

Next we need to create a stored procedure that can be used to pull the change table source data. It uses the same basic syntax as the example in Part 1 that maps the update mask to individual field changes. It also converts the __$start_lsn to a change date and joins to a Users table to get the name of the user who changed the data. This data will be needed for the report.

USE [CDCDemo]
goCREATE PROCEDURE [dbo].[spgGetCDCContract]
 @FromLSNDate as datetime,
 @ToLSNDate as datetime
AS
begin
SET NOCOUNT ON
SET ROWCOUNT 0
declare @from_lsn binary(10)
declare @to_lsn binary(10)
SET @from_lsn = sys.fn_cdc_map_time_to_lsn('largest less than', @FromLSNDate);
SET @to_lsn = sys.fn_cdc_map_time_to_lsn('largest less than or equal', @ToLSNDate);
if (@from_lsn is not null and @to_lsn is not null and @from_lsn < @to_lsn)
 begin

  DECLARE @name_ordinal int
  DECLARE @year_ordinal int
  DECLARE @length_ordinal int
  DECLARE @type_ordinal int
  DECLARE @amount_ordinal int
  DECLARE @last_change_id_ordinal int

  SET @name_ordinal = sys.fn_cdc_get_column_ordinal('dbo_contract','name')
  SET @year_ordinal = sys.fn_cdc_get_column_ordinal('dbo_contract','year')
  SET @length_ordinal = sys.fn_cdc_get_column_ordinal('dbo_contract','length')
  SET @type_ordinal = sys.fn_cdc_get_column_ordinal('dbo_contract','type')
  SET @amount_ordinal = sys.fn_cdc_get_column_ordinal('dbo_contract','amount')
  SET @last_change_id_ordinal = sys.fn_cdc_get_column_ordinal('dbo_contract','last_change_id')

  SELECT
    cdc.fn_cdc_get_all_changes_dbo_contract.*,
    sys.fn_cdc_map_lsn_to_time(__$start_lsn) as 'last_change_date',
    cast(isnull(user_name, '') as varchar(100)) as 'last_change_name',
    sys.fn_cdc_is_bit_set(@name_ordinal,__$update_mask) as 'IsNameUpdated',
    sys.fn_cdc_is_bit_set(@year_ordinal,__$update_mask) as 'IsYearUpdated',
    sys.fn_cdc_is_bit_set(@length_ordinal,__$update_mask) as 'IsLengthUpdated',
    sys.fn_cdc_is_bit_set(@type_ordinal,__$update_mask) as 'IsTypeUpdated',
    sys.fn_cdc_is_bit_set(@amount_ordinal,__$update_mask) as 'IsAmountUpdated',
    sys.fn_cdc_is_bit_set(@last_change_id_ordinal,__$update_mask) as 'IsLastChangeIdUpdated'
   FROM cdc.fn_cdc_get_all_changes_dbo_contract( @from_lsn, @to_lsn, 'all')
     left join Users 
       on Users.user_id = fn_cdc_get_all_changes_dbo_contract.last_change_id
 end
else
 begin
  -- return empty row
  select
    cdc.dbo_contract_CT.*,
    getdate() as 'last_change_date',
    '' as 'last_change_name',
    cast(0 as bit) as 'IsNameUpdated',
    cast(0 as bit) as 'IsYearUpdated',
    cast(0 as bit) as 'IsLengthUpdated',
    cast(0 as bit) as 'IsTypeUpdated',
    cast(0 as bit) as 'IsAmountUpdated',
    cast(0 as bit) as 'IsLastChangeIdUpdated'
   from cdc.dbo_contract_CT
   where __$start_lsn = 0x00000000000000000000 
 end
end
GO

This stored procedure takes two parameters, a start LSN date and an end LSN date. It uses the function fn_cdc_map_time_to_lsn to map the date to a LSN. The start LSN date will be the last date in the cdcHistory table. The next run should start from here. Create a stored procedure to get that date:

USE [CDCDW]
GOCREATE PROCEDURE [dbo].[spgGetCDCHistoryLastLSNDate]
  @cdcTable as varchar(50)
AS
begin
SET NOCOUNT ON
SET ROWCOUNT 0
select max(cdc_date) as 'LastCDCLSNDate'
 from cdcHistory
 where cdc_table = @cdcTableend
GO
 
The end LSN date will be the maximum date in the lsn_time_mapping table. Use the functions fn_cdc_get_max_lsn and fn_cdc_map_lsn_to_time to get this date. Create another stored procedure to return that date from the CDC source database:

USE [CDCDemo]
goCREATE PROCEDURE [dbo].[spgGetCDCMaxLSNDate]
AS
begin
SET NOCOUNT ON
SET ROWCOUNT 0
declare @maxCDCDate as datetime
declare @to_lsn binary(10)
set @to_lsn = sys.fn_cdc_get_max_lsn();
set @maxCDCDate = sys.fn_cdc_map_lsn_to_time(@to_lsn);
select @maxCDCDate as 'MaxCDCLSNDate'
end
GO
 
Finally, create a stored procedure to set the last run date in the history table when the job completes:

USE [CDCDW]
GOCREATE PROCEDURE [dbo].[spgSetCDCHistory]
 @cdcTable as varchar(50),
 @cdcDate as datetime
AS
begin
SET NOCOUNT ON
SET ROWCOUNT 0
insert into [cdcHistory]
 values(@cdcTable, @cdcDate)
end
GO

Create the package

Now we can create a simple SSIS package that can be used to pull the change data from the change table and insert it into the data warehouse. Start by creating a new SSIS project, a package to hold the tasks, and two data sources, one for the CDC source database and the other for the data warehouse:



Next add a sequence container to the package. Create two variables to hold the stored procedure parameters FromLSNDate and ToLSNDate, making sure to highlight the container so the variable scope is set at the container level. Set the variable Type to DateTime:



Now add the tasks to move the change table data. First add an Execute SQL Task to set the FromLSNDate variable to the last run date. Right click and select Edit. Set ResultSet to Single Row, set Connection to the data warehouse data source, and set SQL Statement to call the stored procedure spgGetCDCHistoryLastLSNDate, passing in the table name as a parameter:

exec [spgGetCDCHistoryLastLSNDate] @cdcTable='dbo_contract'



To assign the return value to the variable, click Result Set and add a Result Name value for the result LastCDCLSNDate returned from the stored procedure and map it to the variable FromLSNDate:



Click Ok to save. Next add a second Execute SQL Task to set the ToLSNDate variable. Right click Edit, set the ResultSet to Single Row, set the Connection to the source database data source, and set the SQL Statement to:

exec [spgGetCDCMaxLSNDate]
 




Click the Result Set option and add a Result Name value for the result MaxCDCLSNDate returned from the stored procedure and map it to the variable ToLSNDate:




 
Click Ok to save. Next add a third Execute SQL task to truncate the table stagingCDCContract so it can be populated with this run’s data:




 
Now we are ready to copy the change table data to the staging table in the data warehouse. Add a Data Flow task, right click Edit and add an OLE DB Source data flow source and an OLE DB Destination data flow destination. Right click Edit on the OLE DB Source, set the connection manager to the CDC database, select SQL Command for the Data access mode, and enter this command to call the stored procedure:

exec [spgGetCDCContract] @FromLSNDate=?, @ToLSNDate=?



 
Click the Parameters button and set the input parameters to the corresponding variables set in the previous tasks:




 
Click Ok to save. On the OLE DB Destination data flow task, set the connection to the data warehouse data source, set the mode to Table or view – fast load, and select the staging table as the name of the table:




 
Click OK to save. Next add another Execute SQL task to copy the staging table data to the fact table. Set the Result Set to None, the connection to the data warehouse data source, and the SQL Statement to:

insert into factCDCContractselect * 
 from stagingCDCContract
 where __$start_lsn not in (select __$start_lsn
                             from factCDCContract)


 
Click OK to save. Finally, add one last Execute SQL task to update the history table with the last run date. Set the Result Set to None, the connection to the data warehouse data source, and the SQL Statement to:

exec [spgSetCDCHistory] @cdcTable='dbo_contract', @cdcDate=?
 



 
Set the Parameter Mapping section to set the stored procedure parameter to the variable ToLSNDate. Set the Data Type to DATE:




 
Click OK to save. You should have a complete package that looks something like this:



But does it work

Run the package by right clicking on the container and select Execute Container. Everything should return green for success:



And looking in the data warehouse factCDCContract table, we can see the results:



This SSIS package can now be scheduled to run nightly to move the CDC data to the data warehouse. The CDC cleanup job will delete data after the retention period to manage the source change data tables and keep them from growing too large.

Create the report

Now that we have marted the change table data, we can create a simple SSRS report that can be used to show field level changes. Create a stored procedure that will pull the warehoused data to display in the report. Create this stored procedure in the data warehouse database adding Name as a parameter:


use [CDCDW]
goCREATE PROCEDURE [dbo].[spgGetCDCContractHistory]
 @Name as varchar(50)
AS
begin
SET NOCOUNT ON
SET ROWCOUNT 0
if (@Name is not null)
 begin
  select
    factCDCContract.id,
    __$seqval,
    factCDCContract.name,
    factCDCContract.year,
    factCDCContract.length,
    factCDCContract.type,
    factCDCContract.amount,
    factCDCContract.last_change_id,
    factCDCContract.last_change_date,
    factCDCContract.last_change_name,
    IsNameUpdated,
    IsYearUpdated,
    IsLengthUpdated,
    IsTypeUpdated,
    IsAmountUpdated,
    IsLastChangeIdUpdated
   from factCDCContract
   where factCDCContract.name = @Name
   order by __$seqval desc
 end
end
GO
Now create a simple SSRS report to pull the change data from the data warehouse and display it in a report. Start by creating a new SSRS project, a report, a shared data source for the data warehouse, a shared dataset for the stored procedure to pull the data, and a parameter called Name:



Next add a report Data Source using the Shared Data Source as the reference:





Add a Data Set using the Shared Dataset for the stored procedure and set the parameter to the Name parameter added earlier:




 
Finally add a Table report item with a column for each source data field in the warehouse change table. Your report should look something like this:



Now let’s add some color. Change the header background color to something besides white and the font weight to Bold. For each of the data fields, add an expression to change the color to Green if the field was updated:




 
Do the same thing for the font weight to change it to Bold if the field was changed:



Now select Preview to see what the results look like:



We now have a simple report that shows the field level changes for a database record over time. While this report is a basic audit report of record changes, the change data can used for a variety of different reports from user productivity to reports on daily or weekly load tracking to better predict resource needs during peak times.

Conclusion

Change Data Capture, as well as Change Tracking, introduce built in database auditing to SQL Server. In these articles we have enabled CDC on a database and table, pulled the change table data into a data warehouse, and created a report to view those changes. This helps deliver a robust auditing solution about what is happening to the data in a database over time. What could be better than that?