Monday, February 4, 2013

SQL Code Snippets - Split String and Examples

The original implementation of the code below was created because of a project I did that involved converting COBOL code into SQL server code.  The problem stemmed from having to emulate COBOL code which would split data based on a character type, and then place the data into columns of a COBOL defined export.  The values could not be duplicated, and we could only grab the first 3 unique values in order.

This means that I had to have a method of easily splitting a string, and then taking the data and grabbing non duplicate data, and ignoring empty strings.  This was a considerably more interesting problem than I originally thought it would be when I first approached it.

So the first step is to get a good split string implementation.  I went with a common one you can get after searching for a second on Google.


CREATE
--DROP
FUNCTION [dbo].[SplitString]
(
      @InputString VARCHAR(MAX),  -- Input string
      @Delimiter CHAR(1) = ',',  -- Delimiter
      @RemoveEmpty bit = 1,   -- Remove Empty IE empty string is removed from the output
      @RemoveDupes bit = 0   -- Remove Dupes -- Removes duplicates from the list if executed.
)
RETURNS
      @OutList TABLE
      (
            Position    int identity(1,1)  -- The position of the returned parsed string
            ,Value VARCHAR(50)                     -- The value of the returned parsed string.
      )
AS
BEGIN
      --Parse the string and insert each list item into the @tblList
      DECLARE @ListItem varchar(max)
      DECLARE @Pos1 int, @Pos2 int, @Length int, @ListLength int
      SET @Pos1 = 1
      SET @ListLength = LEN(@InputString)

      WHILE @Pos1 <= @ListLength
            BEGIN
                  SET @Pos2 = CHARINDEX(@Delimiter, @InputString, @Pos1)
                  IF @Pos2 < 0 SET @Pos2 = 0
                  SET @Length = @Pos2 - @Pos1  
                  IF @Length < 0 SET @Length = 0
                 
                  IF @Pos2 > 0
                        BEGIN
                              SET @ListItem = SUBSTRING(@InputString, @Pos1, @Length)
                              SET @Pos1 = @Pos2 + 1
                        END
                  ELSE
                        BEGIN
                              SET @ListItem = SUBSTRING(@InputString, @Pos1, @ListLength)
                              SET @Pos1 = @ListLength + 1
                        END
                                   
                  IF(@RemoveEmpty = 1 and RTRIM(isnull(@ListItem,'')) = '')
                  BEGIN
                        CONTINUE;
                  END
                  ELSE
                  IF(@RemoveDupes = 0)
                  BEGIN
                        INSERT @OutList (Value) VALUES(@ListItem)
                  END
                  ELSE
                  BEGIN     
                    INSERT @OutList (Value)
                    SELECT @ListItem
                    WHERE NOT EXISTS (SELECT * FROM @OutList where Value = @ListItem)
                  END
      END
      RETURN
END

I then create a problem set of data that I want to split.


tmpexampleid
datatoparse
1
ParseThisValue1,ParseThisValue2,ParseThisValue3
2
ParseThisValue1,ParseThisValue2,ParseThisValue4
3
ParseThisValue1,,ParseThisValue2,ParseThisValue4
4
ParseThisValue1,,ParseThisValue2,ParseThisValue3,ParseThisValue4
5

6
,,,,
7
,ParseThisValue1,,,
8
,ParseThisValue1,,ParseThisValue2,
9
,ParseThisValue1,,ParseThisValue2,ParseThisValue3
10
,ParseThisValue1,          ,ParseThisValue2,ParseThisValue3


And then I do the string split.  Note the following should run as long as SplitString exists in the same database context.


-- EXAMPLE of RETURN
SELECT * FROM [dbo].[SplitString] (
   'test,test2,test3,,,,,, '
  ,','
  ,1
  ,0)
 
  -- No values returned if no data to split!
 SELECT * FROM [dbo].[SplitString] (
   ''
  ,','
  ,1
  ,0)
 
  SELECT * FROM [dbo].[SplitString] (
   null
  ,','
  ,1
  ,0)
  GO
 
 
  -- This shows all options turned on, returns a postioned deduped list that removes empty string values.
IF object_id('tempdb..#tmpexample') IS NOT NULL
BEGIN
   DROP TABLE #tmpexample
END
 
create table #tmpexample
(
      tmpexampleid int identity(1,1)
      , datatoparse varchar(255)
)

INSERT INTO #tmpexample
(
      datatoparse
)
SELECT
      'ParseThisValue1,ParseThisValue2,ParseThisValue3'
UNION ALL
SELECT
      'ParseThisValue1,ParseThisValue2,ParseThisValue4'
UNION ALL
SELECT
      'ParseThisValue1,,ParseThisValue2,ParseThisValue4'
UNION ALL
SELECT
      'ParseThisValue1,,ParseThisValue2,ParseThisValue3,ParseThisValue4'
UNION ALL
SELECT
      ''
UNION ALL
SELECT
      ',,,,'
UNION ALL
SELECT
      ',ParseThisValue1,,,'
UNION ALL
SELECT
      ',ParseThisValue1,,ParseThisValue2,'
UNION ALL
SELECT
      ',ParseThisValue1,,ParseThisValue2,ParseThisValue3'
UNION ALL
SELECT
      ',ParseThisValue1,          ,ParseThisValue2,ParseThisValue3'
     
select * from #tmpexample
;with cte_parsemap as
      (
            select 1 as idpos, 'PRIMARY' as namepart
            UNION
            select 2, 'SECONDARY'
            UNION
            select 3, 'THIRD---ARY'
      ) , cte_candidates as
      (
      SELECT *
      FROM  #tmpexample PPOSRC
      --WHERE
      --    and FileID = @FileID
      )
      , cte_parsenames as
      (
            select
            c.[tmpexampleid],O.Value,cp.namepart
            from cte_candidates c
            CROSS APPLY
            dbo.[SplitString](c.[datatoparse], ',',1,1) AS O
            JOIN cte_parsemap cp
                  on O.Position = cp.idpos

      )
      , cte_parsed as
      (
            select pvt.[tmpexampleid],[PRIMARY],[SECONDARY],[THIRD---ARY]
            from cte_parsenames c
            PIVOT
            (
            max (Value)
            FOR namepart IN
            ( [PRIMARY],[SECONDARY],[THIRD---ARY])
            ) AS pvt
      )
      SELECT *   
      FROM cte_parsed  
     
GO

Example Output

1
ParseThisValue1
ParseThisValue2
ParseThisValue3
2
ParseThisValue1
ParseThisValue2
ParseThisValue4
3
ParseThisValue1
ParseThisValue2
ParseThisValue4
4
ParseThisValue1
ParseThisValue2
ParseThisValue3
7
ParseThisValue1
NULL
NULL
8
ParseThisValue1
ParseThisValue2
NULL
9
ParseThisValue1
ParseThisValue2
ParseThisValue3
10
ParseThisValue1
ParseThisValue2
ParseThisValue3

No comments:

Post a Comment