Archive for July, 2013

Earlier this week, I was playing with Table Partitioning in my test environment and ran into this issue. I had never run into it before, so it was interesting to see it this time. Below is the error message:

Msg 7705, Level 16, State 2, Line 1
Could not implicitly convert range values type specified at ordinal 1 to partition function parameter type.

CREATE PARTITION FUNCTION pf_by_integer_month (SMALLINT)
AS
RANGE LEFT FOR VALUES (201301, 201302, 201303, 201304, 201305, 201306, 201307, 201308, 201309, 201310, 201311, 201312)
GO

In the past, I've created table partitioning on columns with more common data types, like INT or DATETIME. But this time I wanted to try something different: SMALLINT. It made perfect sense in my mind to use a smaller data type than INT, hoping to save a couple of bytes on every record. But the partitioning key values I have do not fit into the SMALLINT range, and I did not realize this until now.

But now, I do. My partitioning keys are like this:

201301 — for January
201302 — for February
201303 — for March
…
201312 — for December

CREATE PARTITION FUNCTION pf_by_integer_month (INT)
AS
RANGE LEFT FOR VALUES (201301, 201302, 201303, 201304, 201305, 201306, 201307, 201308, 201309, 201310, 201311, 201312)
GO

Obviously, the partition function range values are six-digit integers, well beyond the range of SMALLINT (which tops out at 32,767). I modified the data type of the partition function to INT and it worked fine.
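A quick way to see the mismatch for yourself:

-- SMALLINT ranges from -32,768 to 32,767, so a six-digit YYYYMM key cannot fit.
DECLARE @YearMon SMALLINT;
SET @YearMon = 201301;   -- fails with an arithmetic overflow error
GO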
Simple, but interesting.

Hope this helps,
_Sqltimes

 

PS: When the data type of the partition function and the data type of the partitioning column in the table do not match, we get this error:

Msg 7726, Level 16, State 1, Line 1
Partition column ‘YearMon’ has data type smallint which is different from the partition function ‘pf_by_integer_month’ parameter data type int.
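For what it's worth, that second error is easy to reproduce against the INT partition function above: build a scheme on it and then try to partition a table whose partitioning column is SMALLINT. The scheme and table names below are made up for illustration; YearMon is the column name from the error message.

-- Uses the INT partition function pf_by_integer_month created above.
CREATE PARTITION SCHEME ps_by_integer_month
AS PARTITION pf_by_integer_month ALL TO ([PRIMARY]);
GO

-- The partitioning column is SMALLINT while the function parameter is INT,
-- so this CREATE TABLE fails with Msg 7726.
CREATE TABLE dbo.MonthlyData
(
    YearMon SMALLINT NOT NULL
) ON ps_by_integer_month (YearMon);
GO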

 


Early this week, I had an interesting challenge. One of our clients has a large database environment. They have a farm of Sql Server database servers, with databases on each crossing the 1 terabyte mark.

Side bar: Terabyte-sized databases are common these days, but what is not common is a supporting, scalable storage architecture for these VLDBs. It is important that VLDBs are architected appropriately to accommodate the size and load and meet the SLA's performance metrics. Databases everywhere have been growing continuously for several years now, and a lot of them are hitting (and crossing) the 1 terabyte mark.

But I digress.

This is one such client.

We have two tables (Table_Main & Table_Archive), each with close to a billion records. As part of a daily purge, records from the Main table are moved to the Archive table and then deleted from the Main table. The purge logic is not simple 'date based' archiving, but at the same time it's not too complicated either. The ideal way to do this would have been to implement 'Table Partitioning' and let it work its magic every time we need to purge data. But it's not implemented yet. So, for now, we need a lightweight alternative solution to get past the current backlog and gain some breathing room before we implement a long-term plan.
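For reference, once Table Partitioning is in place, the purge step boils down to a metadata-only partition switch; something like the sketch below, assuming both tables are partitioned identically on the purge key and the target partition is empty (the partition number is just a placeholder):

-- Long-term idea (not what is deployed today): move an entire partition
-- from the main table to the archive table as a metadata-only operation.
-- Assumes identical partitioning on both tables and an empty target partition.
ALTER TABLE dbo.Table_Main
SWITCH PARTITION 1 TO dbo.Table_Archive PARTITION 1;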

The current purge process uses some convoluted logic, but it primarily does four things:

  • Get a total count of records
  • Using UPDATE, flag qualifying records for purge. Not all of them, though; just a small subset at a time, using SET ROWCOUNT 1800 and a ROWLOCK hint (a simplified sketch of this old pattern follows the list).
    • The reasoning was to avoid escalation to a table lock, which could lock the table for a long time and prevent any other queries from reading it.
    • Another side bar: In Sql Server, query processing starts with the least amount of locking needed to complete a given task, and locks are escalated as needed to finish the job. Acquiring row locks on 10 rows is very different from acquiring row locks on 100 million rows in a 200-million-row table; after a point, the cost of maintaining all those locks gets so high that it makes sense to acquire a table lock instead (lock escalation). Sql Server makes this judgement call efficiently, which is why I am not a particular fan of query hints; I use them occasionally, and only after careful analysis. The Query Optimizer is very good at figuring out the best way to retrieve data. But I digress.
  • INSERT flagged records into an Archive table.
  • DELETE the flagged records (again using SET ROWCOUNT 1800).
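Here is that old pattern, stripped down to its essence. This is only a sketch, not the actual procedure: PurgeFlag, the QtyRemaining archive column, and the 90-day retention window are my assumptions, while the filter and archive expressions mirror the new code further below.

-- Simplified sketch of the old purge logic.
DECLARE @ExpireDate DATETIME = DATEADD(DAY, -90, GETDATE());   -- retention window assumed

-- 1. Get a total count of records.
SELECT COUNT(*) FROM dbo.Table_Main;

-- 2. Flag a small batch of qualifying records (PurgeFlag is an assumed column name).
SET ROWCOUNT 1800;

UPDATE dbo.Table_Main WITH (ROWLOCK)
SET PurgeFlag = 1
WHERE ExpireACK = 0
  AND ExpireDate < @ExpireDate
  AND QtyEarned > QtyUsed;

-- 3. Copy the flagged batch into the archive table.
INSERT INTO dbo.Table_Archive (LocalID, QtyRemaining, StatusFlag, LastUpdate, LastLocationID, TotalValueEarned, HHID)
SELECT LocalID, (QtyEarned - QtyUsed), 3, GETDATE(), -9, (QtyEarned - QtyUsed) * Value, HHID
FROM dbo.Table_Main
WHERE PurgeFlag = 1;

-- 4. Delete the flagged batch from the main table (also under ROWCOUNT 1800).
DELETE FROM dbo.Table_Main
WHERE PurgeFlag = 1;

SET ROWCOUNT 0;   -- reset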

There is a lot of fluff code around the above steps to navigate through the process, but I think most of that fluff could be done without. So, essentially, Table_Main is touched 4 times for every iteration of the purge. Obviously, this is not very efficient.

Using the OUTPUT clause, I modified the DELETE statement to perform most of the above steps in one go. Below is the modified version of the same code:


DELETE TOP (1800)                                 -- TOP requires parentheses here
FROM dbo.Table_Main
OUTPUT DELETED.LocalID,
       (DELETED.QtyEarned - DELETED.QtyUsed),
       3,
       GETDATE(),
       -9,
       ((DELETED.QtyEarned - DELETED.QtyUsed) * DELETED.Value * (1)),
       DELETED.HHID
INTO dbo.Table_Archive ( LocalID
                       , QtyRemaining             -- column name assumed; the original listing omits the target for (QtyEarned - QtyUsed)
                       , StatusFlag
                       , LastUpdate
                       , LastLocationID
                       , TotalValueEarned
                       , HHID)
WHERE ExpireACK = 0
  AND ExpireDate < @ExpireDate                    -- @ExpireDate is set earlier in the purge procedure
  AND QtyEarned > QtyUsed;

Breaking it down:

  • Just one command that does most of the tasks, all in one shot:
    • DELETE
    • ARCHIVE

It DELETEs records based on the same logic, and as they are being deleted, they are archived into 'Table_Archive'. It also performs some column manipulations along the way, and since you can list the individual columns you want to archive, the schemas of the main and archive tables do not need to match.

Next steps:

The next step is to identify the number of records that can be DELETEd per batch without causing undue overhead on the server. The number 1800 was identified as the 'sweet spot' for the old logic (with the ROWLOCK hint). With the new logic, I believe that number could be different. So the next step, for me, is to run some tests and see where the new 'sweet spot' is. Hopefully, the new number will be greater than 1800 while still minimizing the overall impact on server resources.
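Something along the lines of the sketch below is what such a test could look like. The candidate batch size, the 90-day window, and the QtyRemaining column are placeholders and assumptions on my part, not measured results:

-- Rough harness for trying out candidate batch sizes with the new logic.
DECLARE @BatchSize  INT       = 5000;                              -- candidate size; placeholder
DECLARE @ExpireDate DATETIME  = DATEADD(DAY, -90, GETDATE());      -- retention window assumed
DECLARE @Start      DATETIME2 = SYSDATETIME();
DECLARE @Rows       INT;

DELETE TOP (@BatchSize)
FROM dbo.Table_Main
OUTPUT DELETED.LocalID,
       (DELETED.QtyEarned - DELETED.QtyUsed),
       3,
       GETDATE(),
       -9,
       ((DELETED.QtyEarned - DELETED.QtyUsed) * DELETED.Value * (1)),
       DELETED.HHID
INTO dbo.Table_Archive (LocalID, QtyRemaining, StatusFlag, LastUpdate, LastLocationID, TotalValueEarned, HHID)
WHERE ExpireACK = 0
  AND ExpireDate < @ExpireDate
  AND QtyEarned > QtyUsed;

SET @Rows = @@ROWCOUNT;   -- capture immediately after the DELETE

SELECT @BatchSize AS BatchSize,
       @Rows      AS RowsDeleted,
       DATEDIFF(MILLISECOND, @Start, SYSDATETIME()) AS ElapsedMs;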

This is pretty exciting!!

Hope this helps,
_Sqltimes
