CDC (Change Data Capture) is a pretty robust technology. We’ve been using it for a while now and it works great. As it is built on top of another established robust technology Replication, I was more confident to continue using it in our production environment.
One aspect of CDC that has consumed a little more time for us was the CDC Cleanup operation (purging). Since there are a number of CDC tables available to query the internals of CDC progress, at at given time, it makes troubleshooting easier.
Our environment and my question from a few days ago:
We have CDC enabled on SQL Server 2008 with retention re-configured to a number other than the default 3 days (or something similar).
When we first configured it (with default options), it worked fine. But since the day we changed the retention period from default to something like 25 days, it stopped working correctly. It would run for many hours (15 to 20 hours). So, after a few days we disabled the job (not good, but we had other priorities to deal with than the data retention).
Now after 10 months, the disk is close to getting full. So we need to re-configure it and start purging data. A few days ago, just to test how it works, I ran a test run. I checked the timestamp of the oldest LSN and added a month to it and changed the retention period to something like 10 months + 2 days. So when the clean up job runs, it should come back with success immediately, as there is no data older than 10 months. But it does not. Its still runs for many hours.
Now I am starting to wonder if the retention period (column) in msdb.dbo.cdc_jobs table works or not.Aaron Bertrand mentioned in his blog that he did not have perfect success with this setting. Maybe this setting is not used by the Clean up job.
I posted this question a few days ago and after some experiments, I arrived at this answer and posted it back in the same site (StackOverflow).
CDC cleanup job relies on msdb.dbo.cdc_jobs table. As far as I know, the only way to configure retention period is by updating the retention column in the
msdb.dbo.cdc_jobs table. You want to update the retention column of the record with
job_type column. Retention is mentioned in number of minutes of data to retain. So if you want to retain data only for 1 day, update the column with 1440 (number of minutes in a day).
It could be done in two ways:
EXECUTE sys.sp_cdc_change_job @job_type = N'cleanup', @retention = 349082;
UPDATE msdb.dbo.cdc_jobs SET retention = '349082' WHERE job_type = 'cleanup'
If you have more than one CDC running, then using the appropriate
CDC instance name.
A Little more info:
CDC cleanup job runs, it purges all data in the
CDC tables up to this retention date. You can stop the
CDC cleanup job any time while it is running. When you start it, it will pick up from where it left off. So if you stopped the job while it is running today and start to run it tomorrow it will start from where it left off yesterday and perform purge until the retention period mentioned in the
CDC is pretty robust technology.
Hope it help,
Read Full Post »