The Mobile Field Base: 2016

Monday, November 21, 2016

Advanced Debugging in SSIS no Script Task Related

http://www.techbrothersit.com/2013/12/ssis-how-to-watch-value-of-variables.html

I wanted to just have this out there. More of a re-blog because it needs to be pushed up. This is how you do this properly when you are having an issue and are unsure of the exact value of a record. I also tend to log every variable using a script task as well during run time so that I can review the ssis logs for it.

Thursday, November 3, 2016

In Place Updates of Two Fields To Each Other in SQL

Not really sure how to classify this, but there are cases when you want to do an update of data in your system where you update one field to another field and set the other field to NULL. Those of us that have had to approach this problem before know what the answer is, but I decided to create an example.

So the situation is this: I have a table with two columns, and I want the first column to be overridden by the second column, and the second column to be set to NULL, if and ONLY if the second column is NOT null.

The final output I want to be is this:

As you can see, Four was not overridden by Null, but One and Two in Column UserOne were updated by the values in the field UserTwo.

The reason I thought this was interesting as a problem was because of the question I was asked by a junior developer asking how he would track the update step to make sure that he updated the fields to NULL and I realized that I had never really thought about it because SQL Server handles this situation quite elegantly. You don't HAVE to track what was updated, when doing the update it will use the values that are available at time of update, it does not do in order updates at all.

In reality the only statement you have to use is provided as a single update in the example below. Order of the columns being set doesn't matter either.

IF object_id('tempdb..#tmp_testexample') IS NOT NULL

BEGIN

DROP TABLE #tmp_testexample

END

create table #tmp_testexample

(

ID int identity(1,1)

,UserOne varchar(255)

,UserTwo varchar(255)

)

insert into #tmp_testexample

(

UserOne,UserTwo

)

SELECT

'One','Two'

UNION

Select 'Two','Three'

UNION SELECT 'Four',null

SELECT * FROM #tmp_testexample

select

from #tmp_testexample

where UserTwo is not null

and UserOne <> UserTwo

update tta

set tta.UserOne = tta.UserTwo

, tta.UserTwo = NULL

from #tmp_testexample tta

where UserTwo is not null

and UserOne <> UserTwo

select

from #tmp_testexample

This produces the output:

Original Data Set:

ID	UserOne	UserTwo
1	Four	NULL
2	One	Two
3	Two	Three

Example of what I want to update:

ID	UserOne	UserTwo
2	One	Two
3	Two	Three

Final Result after single update statement:

ID	UserOne	UserTwo
1	Four	NULL
2	Two	NULL
3	Three	NULL

Thursday, April 28, 2016

SSIS - Script Component Transform to TrimToNull Generically.

I have finally hit a problem that I had 10 years ago in SSIS. The need for something to trim and then null any empty strings generically in SSIS. Back then we used a compiled C# component, and the heavy use of that component caused headache for years.

Eventually we were able to use a vb script based script task to do the work, but that too has become unavailable in recent versions of SSIS (at least the method we were using).

So I had to whip one up yesterday. Why? Why does someone feel that they need to do this instead of using a derived column task or manually taking all input columns and managing them individually?

Well, it takes a long time to do that for 100 columns, and I had about 30 minutes to make a fix and not all columns were string etc... etc... (excuses!).

So I did a little research and here is where I have come to this solution:

 //Include   
 using Microsoft.SqlServer.Dts.Pipeline;  
 .......  
 public override void ProcessInput(int InputID, PipelineBuffer Buffer)  
   {  
     while (Buffer.NextRow())  
     {  
       for (int columnIndex = 0;  
    columnIndex < Buffer.ColumnCount;  
    columnIndex++)  
       {  
         //string columnData = null;  
         if (Buffer.IsNull(columnIndex))  
         {  
           //columnData = "is NULL";  
         }  
         else  
         {  
           BufferColumn columnInfo = Buffer.GetColumnInfo(columnIndex);  
           switch (columnInfo.DataType)  
           {  
             case DataType.DT_BOOL:  
               //columnData = Buffer.GetBoolean(columnIndex).ToString();  
               break;  
             case DataType.DT_WSTR:  
               Buffer.SetString(columnIndex, Buffer.GetString(columnIndex).Trim());  
               if (Buffer.GetString(columnIndex) == "")  
               {  
                 Buffer.SetNull(columnIndex);  
               }  
               break;  
             case DataType.DT_STR:  
               //columnData += Buffer.GetString(columnIndex);  
               Buffer.SetString(columnIndex, Buffer.GetString(columnIndex).Trim());  
               if (Buffer.GetString(columnIndex) == "")  
               {  
                 Buffer.SetNull(columnIndex);  
               }  
               break;  
             // add code to support more data types here  
             default:  
              // columnData = "";  
               break;  
           }  
         }  
       }  
     }  
     base.ProcessInput(InputID, Buffer);  
   }

So how does this code get into your script task? Basically, you drop in a script component, connect your columns you want to clean as input, and put this script in the script component.

This code takes every single input that is a DT_STR or DT_WSTR and trims the data. If it is empty, it produces a NULL.

Not the best written code, I could have done a bit more, but it gives the mechanics of it. The implementation requires adding a single include and pasting code into the buffer. I got the base code a while ago from another source, but I don't remember the source. I will add it if someone gives me a heads up.

Tuesday, April 12, 2016

Quick Reference For Ranking and Sums with Ranking

Developer: Christopher Harsch

A quick and simple example of ranking and summing functions against a rank function.

Ranking Functions in SQL (some syntax only applies to 2012 sql server)

https://msdn.microsoft.com/en-us/library/ms189798.aspx

-- This shows the use of over clauses in sql which are used below.

https://msdn.microsoft.com/en-us/library/ms189461.aspx

IF Object_id('tempdb..#tmp_rank') IS NOT NULL

BEGIN

DROP TABLE #tmp_rank

END

CREATE TABLE #tmp_rank

(

myid INT,

value1 VARCHAR(255),

value2 VARCHAR(255),

DecimalValue decimal(18,2)

)

INSERT INTO #tmp_rank

(myid,

value1,

value2,

DecimalValue)

SELECT 1,

'A',

'B',

UNION ALL

SELECT 1,

'A',

'C' ,

UNION ALL

SELECT 1,

'A',

'D' ,

UNION ALL

SELECT 1,

'A',

'B' ,

UNION ALL

SELECT 2,

'B',

'B' ,

UNION ALL

SELECT 2,

'B',

'B' ,

UNION ALL

SELECT 1,

'C',

'D' ,

UNION ALL

SELECT 1,

'D',

'B' ,

UNION ALL

SELECT 1,

'C',

'A' ,

IF object_id('tempdb..#tmp_ranks') IS NOT NULL

BEGIN

DROP TABLE #tmp_ranks

END

SELECT *,

Rank()

OVER (

partition BY tr.value1

ORDER BY value2) AS myRank -- Ranks with numbering being numerically continuous IE 1,1,3,4

Dense_rank()

OVER (

partition BY tr.value1

ORDER BY value2) AS mydenserank -- This provides contiguous rank numbering within ranked sets, 1,1,2,3

Row_number()

OVER (

partition BY tr.value1

ORDER BY value2) AS myrownum -- Individually numbers

Dense_rank()

OVER (

partition BY 1

ORDER BY value1) AS myDenseRankgroupnumber -- by partitioning by a static number you can use it as a group number.

Rank()

OVER (

partition BY 1

ORDER BY value1) AS myRankGroupNumber -- Ranks with numbering being numerically continuous IE 1,1,3,4 not the best for group numbers

, sum(DecimalValue) over(partition by myid) as SumForAllInId -- Lets you sum by a grouping

, sum(DecimalValue) over(partition by myid,value1) as SumForAllValue1InID -- Lets you sum within groups and sub groups

into #tmp_ranks

-- Individually numbers within a partitioned set.

FROM #tmp_rank tr

-- This shows how you can use a two step ranking, then summing process.

-- In later versions of SQL this can be moved upward into the above query.

select

, SUM(1) over (Partition By myDenseRankgroupnumber) as TotalInDenseGroup -- Count total records in a group that was created by the dense rank

, SUM(DecimalValue) over (Partition By myDenseRankgroupnumber) as SumTotalInDenseGroup -- Sum records in a group that was created by the dense rank

from #tmp_ranks

Friday, April 8, 2016

SQL Server 2012 not properly opening .sql files and setting it in default files doesn't work.

Exactly as the title says. Sometimes I like to post things just to get their search rankings higher. This is the solution I had for the problem where something went wrong in my installation of sql server 2012 and it _WOULD NOT_ properly open sql files by clicking on them directly, even using regular windows assignments. This post on stack overflow has the best top answer for this issue and it fixed my problem.

http://stackoverflow.com/questions/14564469/opening-sql-files-in-ssms-2012-as-default-program-and-with-existing-instance

Thursday, March 31, 2016

SQL 2012 - SUM syntax in 2012. IE: I need to do more research on the latest SQL Server versions.

So I have written SQL for a long time, and what happens sometimes is that you get entrenched in a specific way of doing things without realizing that new, better, methods can exist. Not only are these methods new and better, they are also much faster as well (not just easier to write).

I had this happen today when someone mentioned to me that the method of doing month over month sums by year that I was doing were archaic. This piqued my interest because that is not something I would normally have expected to hear.

Basically, the problem is this, I have a bunch of data being summed and I want to make sure that for any specific date I have the rolling number of the previous months sums. So if I had 10 dollars in January, 10 dollars in February, and 10 dollars in March, then by March I have a total of 30 bucks, February 20 bucks, and January is just 10. Seems pretty straight forward, and this is a totally common task.

What I didn't realize is that in SQL Server 2012 and beyond, the code to accomplish this goal is less obtuse because functionality was added to the SUM syntax.

So I created a small example that you can review:

IF Object_id('tempdb..#tmp') IS NOT NULL
  BEGIN
      DROP TABLE #tmp
  END

CREATE TABLE #tmp
  (
     id      INT IDENTITY(1, 1),
     mth_nb  INT,
     year_nb INT,
     value   DECIMAL(16, 2)
  )

INSERT INTO #tmp
            (mth_nb,
             year_nb,
             value)
SELECT '1'  MTH_NB,
       2016 year_nb,
       '10' AS value
UNION
SELECT '2'  MTH_NB,
       2016 year_nb,
       '10' AS value
UNION
SELECT '3'  MTH_NB,
       2016 year_nb,
       '10' AS value
UNION
SELECT '4'  MTH_NB,
       2016 year_nb,
       '10' AS value
UNION
SELECT '5'  MTH_NB,
       2016 year_nb,
       '10' AS value
UNION
SELECT '1'  MTH_NB,
       2015 year_nb,
       '1'  AS value
UNION
SELECT '2'  MTH_NB,
       2015 year_nb,
       '1'  AS value
UNION
SELECT '3'  MTH_NB,
       2015 year_nb,
       '1'  AS value
UNION
SELECT '4'  MTH_NB,
       2015 year_nb,
       '1'  AS value
UNION
SELECT '5'  MTH_NB,
       2015 year_nb,
       '1'  AS value

-- The sql server 2008 method
SELECT ta.*,
       tart.runningtotal
FROM   #tmp ta
       OUTER apply (SELECT year_nb,
                           Sum(value) AS RunningTotal
                    FROM   #tmp tsub
                    WHERE  ta.year_nb = tsub.year_nb
                           AND ta.mth_nb >= tsub.mth_nb
                    GROUP  BY year_nb) tart
ORDER  BY ta.year_nb,
          mth_nb

-- sql server 2012 method
SELECT *,
       Sum(value)
         OVER(
           partition BY year_nb
           ORDER BY mth_nb) RunningTotal
FROM   #tmp

The output of both is the correct answer! But the SQL Server 2012 method is far faster!

id	MTH_NB	year_nb	value	RunningTotal
1	1	2015	1	1
3	2	2015	1	2
5	3	2015	1	3
7	4	2015	1	4
9	5	2015	1	5
2	1	2016	10	10
4	2	2016	10	20
6	3	2016	10	30
8	4	2016	10	40
10	5	2016	10	50

Here is the MSDN article on SUM.

https://msdn.microsoft.com/en-us/library/ms187810.aspx

NOTE: Even though the MSDN article says it starts in 2008, I try to run the new syntax on my 2008 instance and it doesn't work.