http://www.techbrothersit.com/2013/12/ssis-how-to-watch-value-of-variables.html
I wanted to just have this out there. More of a re-blog because it needs to be pushed up. This is how you do this properly when you are having an issue and are unsure of the exact value of a record. I also tend to log every variable using a script task as well during run time so that I can review the ssis logs for it.
Monday, November 21, 2016
Advanced Debugging in SSIS no Script Task Related
Thursday, November 3, 2016
In Place Updates of Two Fields To Each Other in SQL
Not really sure how to classify this, but there are cases when you want to do an update of data in your system where you update one field to another field and set the other field to NULL. Those of us that have had to approach this problem before know what the answer is, but I decided to create an example.
So the situation is this: I have a table with two columns, and I want the first column to be overridden by the second column, and the second column to be set to NULL, if and ONLY if the second column is NOT null.
The final output I want to be is this:
As you can see, Four was not overridden by Null, but One and Two in Column UserOne were updated by the values in the field UserTwo.
The reason I thought this was interesting as a problem was because of the question I was asked by a junior developer asking how he would track the update step to make sure that he updated the fields to NULL and I realized that I had never really thought about it because SQL Server handles this situation quite elegantly. You don't HAVE to track what was updated, when doing the update it will use the values that are available at time of update, it does not do in order updates at all.
In reality the only statement you have to use is provided as a single update in the example below. Order of the columns being set doesn't matter either.
IF object_id('tempdb..#tmp_testexample') IS NOT NULL
BEGIN
DROP TABLE #tmp_testexample
END
create table #tmp_testexample
(
ID int identity(1,1)
,UserOne varchar(255)
,UserTwo varchar(255)
)
insert into #tmp_testexample
(
UserOne,UserTwo
)
SELECT
'One','Two'
UNION
Select 'Two','Three'
UNION SELECT 'Four',null
SELECT * FROM #tmp_testexample
select
*
from #tmp_testexample
where UserTwo is not null
and UserOne <> UserTwo
update tta
set tta.UserOne = tta.UserTwo
, tta.UserTwo = NULL
from #tmp_testexample tta
where UserTwo is not null
and UserOne <> UserTwo
select
*
from #tmp_testexample
This produces the output:
Original Data Set:
ID
|
UserOne
|
UserTwo
|
1
|
Four
|
NULL
|
2
|
One
|
Two
|
3
|
Two
|
Three
|
Example of what I want to update:
ID
|
UserOne
|
UserTwo
|
2
|
One
|
Two
|
3
|
Two
|
Three
|
Final Result after single update statement:
ID
|
UserOne
|
UserTwo
|
1
|
Four
|
NULL
|
2
|
Two
|
NULL
|
3
|
Three
|
NULL
|
Thursday, April 28, 2016
SSIS - Script Component Transform to TrimToNull Generically.
I have finally hit a problem that I had 10 years ago in SSIS. The need for something to trim and then null any empty strings generically in SSIS. Back then we used a compiled C# component, and the heavy use of that component caused headache for years.
Eventually we were able to use a vb script based script task to do the work, but that too has become unavailable in recent versions of SSIS (at least the method we were using).
So I had to whip one up yesterday. Why? Why does someone feel that they need to do this instead of using a derived column task or manually taking all input columns and managing them individually?
Well, it takes a long time to do that for 100 columns, and I had about 30 minutes to make a fix and not all columns were string etc... etc... (excuses!).
So I did a little research and here is where I have come to this solution:
So how does this code get into your script task? Basically, you drop in a script component, connect your columns you want to clean as input, and put this script in the script component.
This code takes every single input that is a DT_STR or DT_WSTR and trims the data. If it is empty, it produces a NULL.
Not the best written code, I could have done a bit more, but it gives the mechanics of it. The implementation requires adding a single include and pasting code into the buffer. I got the base code a while ago from another source, but I don't remember the source. I will add it if someone gives me a heads up.
Eventually we were able to use a vb script based script task to do the work, but that too has become unavailable in recent versions of SSIS (at least the method we were using).
So I had to whip one up yesterday. Why? Why does someone feel that they need to do this instead of using a derived column task or manually taking all input columns and managing them individually?
Well, it takes a long time to do that for 100 columns, and I had about 30 minutes to make a fix and not all columns were string etc... etc... (excuses!).
So I did a little research and here is where I have come to this solution:
//Include
using Microsoft.SqlServer.Dts.Pipeline;
.......
public override void ProcessInput(int InputID, PipelineBuffer Buffer)
{
while (Buffer.NextRow())
{
for (int columnIndex = 0;
columnIndex < Buffer.ColumnCount;
columnIndex++)
{
//string columnData = null;
if (Buffer.IsNull(columnIndex))
{
//columnData = "is NULL";
}
else
{
BufferColumn columnInfo = Buffer.GetColumnInfo(columnIndex);
switch (columnInfo.DataType)
{
case DataType.DT_BOOL:
//columnData = Buffer.GetBoolean(columnIndex).ToString();
break;
case DataType.DT_WSTR:
Buffer.SetString(columnIndex, Buffer.GetString(columnIndex).Trim());
if (Buffer.GetString(columnIndex) == "")
{
Buffer.SetNull(columnIndex);
}
break;
case DataType.DT_STR:
//columnData += Buffer.GetString(columnIndex);
Buffer.SetString(columnIndex, Buffer.GetString(columnIndex).Trim());
if (Buffer.GetString(columnIndex) == "")
{
Buffer.SetNull(columnIndex);
}
break;
// add code to support more data types here
default:
// columnData = "";
break;
}
}
}
}
base.ProcessInput(InputID, Buffer);
}
So how does this code get into your script task? Basically, you drop in a script component, connect your columns you want to clean as input, and put this script in the script component.
This code takes every single input that is a DT_STR or DT_WSTR and trims the data. If it is empty, it produces a NULL.
Not the best written code, I could have done a bit more, but it gives the mechanics of it. The implementation requires adding a single include and pasting code into the buffer. I got the base code a while ago from another source, but I don't remember the source. I will add it if someone gives me a heads up.
Tuesday, April 12, 2016
Quick Reference For Ranking and Sums with Ranking
/*
Developer: Christopher Harsch
A quick and simple example of ranking
and summing functions against a rank function.
Ranking Functions in SQL (some syntax
only applies to 2012 sql server)
https://msdn.microsoft.com/en-us/library/ms189798.aspx
-- This shows the use of over clauses
in sql which are used below.
https://msdn.microsoft.com/en-us/library/ms189461.aspx
*/
IF Object_id('tempdb..#tmp_rank') IS NOT NULL
BEGIN
DROP TABLE #tmp_rank
END
CREATE TABLE #tmp_rank
(
myid INT,
value1 VARCHAR(255),
value2 VARCHAR(255),
DecimalValue decimal(18,2)
)
INSERT INTO #tmp_rank
(myid,
value1,
value2,
DecimalValue)
SELECT 1,
'A',
'B',
10
UNION ALL
SELECT 1,
'A',
'C' ,
10
UNION ALL
SELECT 1,
'A',
'D' ,
10
UNION ALL
SELECT 1,
'A',
'B' ,
10
UNION ALL
SELECT 2,
'B',
'B' ,
10
UNION ALL
SELECT 2,
'B',
'B' ,
10
UNION ALL
SELECT 1,
'C',
'D' ,
10
UNION ALL
SELECT 1,
'D',
'B' ,
10
UNION ALL
SELECT 1,
'C',
'A' ,
10
IF object_id('tempdb..#tmp_ranks') IS NOT NULL
BEGIN
DROP TABLE #tmp_ranks
END
SELECT *,
Rank()
OVER (
partition BY tr.value1
ORDER BY value2) AS myRank -- Ranks with
numbering being numerically continuous IE 1,1,3,4
,
Dense_rank()
OVER (
partition BY tr.value1
ORDER BY value2) AS mydenserank -- This
provides contiguous rank numbering within ranked sets, 1,1,2,3
,
Row_number()
OVER (
partition BY tr.value1
ORDER BY value2) AS myrownum -- Individually numbers
,
Dense_rank()
OVER (
partition BY 1
ORDER BY value1) AS myDenseRankgroupnumber -- by
partitioning by a static number you can use it as a group number.
,
Rank()
OVER (
partition BY 1
ORDER BY value1) AS myRankGroupNumber -- Ranks with
numbering being numerically continuous IE 1,1,3,4 not the best for group
numbers
, sum(DecimalValue) over(partition by myid) as SumForAllInId -- Lets you sum by a grouping
, sum(DecimalValue) over(partition by myid,value1) as SumForAllValue1InID -- Lets
you sum within groups and sub groups
into #tmp_ranks
-- Individually numbers within a
partitioned set.
FROM #tmp_rank tr
-- This shows how you can use a two
step ranking, then summing process.
-- In later versions of SQL this can be
moved upward into the above query.
select
*
, SUM(1) over (Partition By myDenseRankgroupnumber)
as TotalInDenseGroup -- Count
total records in a group that was created by the dense rank
, SUM(DecimalValue) over (Partition By myDenseRankgroupnumber) as SumTotalInDenseGroup
-- Sum records in a group that was created by
the dense rank
from #tmp_ranks
Friday, April 8, 2016
SQL Server 2012 not properly opening .sql files and setting it in default files doesn't work.
Exactly as the title says. Sometimes I like to post things just to get their search rankings higher. This is the solution I had for the problem where something went wrong in my installation of sql server 2012 and it _WOULD NOT_ properly open sql files by clicking on them directly, even using regular windows assignments. This post on stack overflow has the best top answer for this issue and it fixed my problem.
http://stackoverflow.com/questions/14564469/opening-sql-files-in-ssms-2012-as-default-program-and-with-existing-instance
http://stackoverflow.com/questions/14564469/opening-sql-files-in-ssms-2012-as-default-program-and-with-existing-instance
Thursday, March 31, 2016
SQL 2012 - SUM syntax in 2012. IE: I need to do more research on the latest SQL Server versions.
So I have written SQL for a long time, and what happens sometimes is that you get entrenched in a specific way of doing things without realizing that new, better, methods can exist. Not only are these methods new and better, they are also much faster as well (not just easier to write).
BEGIN
DROP TABLE #tmp
END
CREATE TABLE #tmp
(
id INT IDENTITY(1, 1),
mth_nb INT,
year_nb INT,
value DECIMAL(16, 2)
)
INSERT INTO #tmp
(mth_nb,
year_nb,
value)
SELECT '1' MTH_NB,
2016 year_nb,
'10' AS value
UNION
SELECT '2' MTH_NB,
2016 year_nb,
'10' AS value
UNION
SELECT '3' MTH_NB,
2016 year_nb,
'10' AS value
UNION
SELECT '4' MTH_NB,
2016 year_nb,
'10' AS value
UNION
SELECT '5' MTH_NB,
2016 year_nb,
'10' AS value
UNION
SELECT '1' MTH_NB,
2015 year_nb,
'1' AS value
UNION
SELECT '2' MTH_NB,
2015 year_nb,
'1' AS value
UNION
SELECT '3' MTH_NB,
2015 year_nb,
'1' AS value
UNION
SELECT '4' MTH_NB,
2015 year_nb,
'1' AS value
UNION
SELECT '5' MTH_NB,
2015 year_nb,
'1' AS value
-- The sql server 2008 method
SELECT ta.*,
tart.runningtotal
FROM #tmp ta
OUTER apply (SELECT year_nb,
Sum(value) AS RunningTotal
FROM #tmp tsub
WHERE ta.year_nb = tsub.year_nb
AND ta.mth_nb >= tsub.mth_nb
GROUP BY year_nb) tart
ORDER BY ta.year_nb,
mth_nb
-- sql server 2012 method
SELECT *,
Sum(value)
OVER(
partition BY year_nb
ORDER BY mth_nb) RunningTotal
FROM #tmp
The output of both is the correct answer! But the SQL Server 2012 method is far faster!
I had this happen today when someone mentioned to me that the method of doing month over month sums by year that I was doing were archaic. This piqued my interest because that is not something I would normally have expected to hear.
Basically, the problem is this, I have a bunch of data being summed and I want to make sure that for any specific date I have the rolling number of the previous months sums. So if I had 10 dollars in January, 10 dollars in February, and 10 dollars in March, then by March I have a total of 30 bucks, February 20 bucks, and January is just 10. Seems pretty straight forward, and this is a totally common task.
What I didn't realize is that in SQL Server 2012 and beyond, the code to accomplish this goal is less obtuse because functionality was added to the SUM syntax.
So I created a small example that you can review:
IF Object_id('tempdb..#tmp') IS NOT NULL BEGIN
DROP TABLE #tmp
END
CREATE TABLE #tmp
(
id INT IDENTITY(1, 1),
mth_nb INT,
year_nb INT,
value DECIMAL(16, 2)
)
INSERT INTO #tmp
(mth_nb,
year_nb,
value)
SELECT '1' MTH_NB,
2016 year_nb,
'10' AS value
UNION
SELECT '2' MTH_NB,
2016 year_nb,
'10' AS value
UNION
SELECT '3' MTH_NB,
2016 year_nb,
'10' AS value
UNION
SELECT '4' MTH_NB,
2016 year_nb,
'10' AS value
UNION
SELECT '5' MTH_NB,
2016 year_nb,
'10' AS value
UNION
SELECT '1' MTH_NB,
2015 year_nb,
'1' AS value
UNION
SELECT '2' MTH_NB,
2015 year_nb,
'1' AS value
UNION
SELECT '3' MTH_NB,
2015 year_nb,
'1' AS value
UNION
SELECT '4' MTH_NB,
2015 year_nb,
'1' AS value
UNION
SELECT '5' MTH_NB,
2015 year_nb,
'1' AS value
-- The sql server 2008 method
SELECT ta.*,
tart.runningtotal
FROM #tmp ta
OUTER apply (SELECT year_nb,
Sum(value) AS RunningTotal
FROM #tmp tsub
WHERE ta.year_nb = tsub.year_nb
AND ta.mth_nb >= tsub.mth_nb
GROUP BY year_nb) tart
ORDER BY ta.year_nb,
mth_nb
-- sql server 2012 method
SELECT *,
Sum(value)
OVER(
partition BY year_nb
ORDER BY mth_nb) RunningTotal
FROM #tmp
The output of both is the correct answer! But the SQL Server 2012 method is far faster!
id
|
MTH_NB
|
year_nb
|
value
|
RunningTotal
|
1
|
1
|
2015
|
1
|
1
|
3
|
2
|
2015
|
1
|
2
|
5
|
3
|
2015
|
1
|
3
|
7
|
4
|
2015
|
1
|
4
|
9
|
5
|
2015
|
1
|
5
|
2
|
1
|
2016
|
10
|
10
|
4
|
2
|
2016
|
10
|
20
|
6
|
3
|
2016
|
10
|
30
|
8
|
4
|
2016
|
10
|
40
|
10
|
5
|
2016
|
10
|
50
|
Here is the MSDN article on SUM.
https://msdn.microsoft.com/en-us/library/ms187810.aspx
NOTE: Even though the MSDN article says it starts in 2008, I try to run the new syntax on my 2008 instance and it doesn't work.
Subscribe to:
Posts (Atom)