SQL #9: Programmable objects

6 minute read

This is the ninth in a series of SQL notes I made during the Kubrick Data Engineering training course. The others are:
#1: Database Design
#2: SQL Design
#3: DQL
#4: DDL
#5: DML
#6: Sorting
#7: Joining
#8: Temporary Tables
#10: Indexing
#11: Operation & Optimization

All the posts in this series are my personal note taking and may be updated as the course progresses.


Functions

As well as default system functions users can define new functions, the category of function is either:

  • Scalar - Returns a single value, like the system functions above (SELECT <fnc>)
  • In Line tabled value - Returns a table from a SELECT statement, can take an arguement as input (SELECT FROM <fnc>)
  • Multi statement table valued - Returns a table from multiple SELECT statements, can take input (SELECT FROM <fnc>)

Scalar

A scalar functions act on a single input variable and outputs a single value. They can be computationally inefficient when they involve IF functions that cycle through the table row by row.

CAST

  • CAST(<val> AS DATE) - Treats the supplied value as some other format
  • TRY_CAST(<val> AS DATE) - Same as CAST but enters NULL if there’s an error
  • CONVERT(DATE, <val>, <optn>) - SQL server specific version of cast which allows for extra options

Missing Data

NULL in SQL is an annoying value as testing for it will always return UNKOWN, even if the query is clearly asking for a character that is not null, or asking for a null. The answer is to use IS NULL or IS NOT NULL which will return TRUE or FALSE. For many reasons it’s a good idea to avoid the use of NULL entirely.

  • ISNULL(n1,n2) - Return n1 if it is NOT NULL, otherwise return n2
  • NULLIF(n1, n2) - Return NULL if n1=n2, otherwise return n1
  • COALESCE(n1,n2,..) - Gives first value from list that is NOT NULL

Maths

Most of the mathematical aggregation functions are self-explanatory. They are MAX(), MIN(), COUNT(), SUM(), AVG(), STDEV(), and VAR(). These functions are often combined with other aggregation methods to draw insight. Other functions are:

  • CEILING() - Round up
  • FLOOR() - Round down
  • ROUND(,D) - Round to D decimal places -DECIMAL(,P,S) - Round to P total digits and S significant figures -CHECKSUM() - Computes a hash as a number -NEWID() - Computes a random ID string RAND() - Random number betweeen 0-1

String functions

  • LEN() - Length
  • LTRIM(), RTRIM() - Trims blank space either side
  • SUBSTRING() - Splits string up
  • CHARINDEX() - Index of a character in string
  • CONCAT() - Combines strings
  • DIFFERENCE() - Returns a rating of how different 2 strings are (1=diff, 4=same)

The following statement will split an email into the username and the domain name

DECLARE @email VARCHAR(100) = 'joe@gmail.com'
SELECT  SUBSTRING(@email, 1, CHARINDEX('@', @email)-1) username
,SUBSTRING(@email, CHARINDEX('@', @email)+1, LEN(@email)) domainname

Date Functions

There are many date functions in excel that can be used on the DATE datatype fields. The following script can be used to extract information extra from date columns using system date functions, this is the basis of a dimensional date table.

SELECT 
	 SalesOrderID
	,OrderDate
	,ShipDate
	,GETDATE() AS TodaysDate
	,DATEDIFF(dd,OrderDate,ShipDate) AS ProcessDays
	,CAST(DATEDIFF(dd,ShipDate,GETDATE()) / 365. AS DECIMAL(4,1)) AS YearsToToday
	,CAST(DATEADD(MONTH, DATEDIFF(MONTH, 0, OrderDate), 0) AS DATE) AS StartOfMonth
	,EOMONTH(OrderDate) AS EndOfMonth
	,DATENAME(dw,OrderDate) AS OrderDayName
	,DATEPART(dw,OrderDate) AS OrderdayNo
FROM Sales.SalesOrderHeader

User defined functions

A user can create functions using CREATE FUNCTION, a scalar function requires an input and an output and must have BEGIN and END keywords denoting where the function operation happens. The following script creates a function that does the same as ISNULL().

CREATE FUNCTION dbo.ufnisnull(@col SQL_VARIANT, @rep SQL_VARIANT)
RETURNS SQL_VARIANT
AS
BEGIN
DECLARE @answer SQL_VARIANT

-- Calculation:
IF @col IS NULL
SET @answer = @rep
ELSE SET @answer = @col

RETURN @answer
END

Table Valued Functions

These functions return a table variable as a result, the function itself takes the form of a select statement that references an input variable and does not need BEGIN or END. A simple function of this type just returns the values from a table for some condition.

CREATE FUNCTION dbo.fn_productbyminprice(@minprice MONEY)
RETURNS TABLE -- TVF definition
AS
RETURN(
SELECT *
FROM Production.Product
WHERE ListPrice > @minprice
)

Multi Statement Table Valued Functions

These functions are the same as table valued functions except they contain multiple SELECT queries. For example, a function could return a table with values populated from multiple SELECT statements. The following script returns a table.

CREATE FUNCTION dbo.lastShipped(@CustomerID INT)
RETURNS @CustomerOrder TABLE
(SaleOrderID INT NOT NULL,
CustomerID INT NOT NULL,
OrderDate DATETIME NOT NULL,
OrderQty INT NOT NULL)
AS
BEGIN
    DECLARE @MaxDate DATETIME

    SELECT @MaxDate = MAX(OrderDate)
    FROM Sales.SalesOrderHeader
    WHERE CustomerID = @CustomerID

    INSERT @CustomerOrder
    SELECT a.SalesOrderID, a.CustomerID, a.OrderDate, b.OrderQty
    FROM Sales.SalesOrderHeader a 
INNER JOIN Sales.SalesOrderHeader b
        ON a.SalesOrderID = b.SalesOrderID
INNER JOIN Production.Product c 
        ON b.ProductID = c.ProductID
WHERE a.OrderDate = @MaxDate
        AND a.CustomerID = @CustomerID
    RETURN
END

Views

Views are like a CTE but without containing anything other than the query. Accessing a view will just run the query that is stored in that view making them very lightweight to store but not to run.

CREATE VIEW vw_price
AS
(
    SELECT ListPrice 
    FROM Production.Product
)

vw_price can then be used in any query and it will run the section of code in the brackets.


Stored Procedures

Procs are used in db’s to allow analysts to query data without doing anything dodgy. Often the analyst will be denied any access EXCEPT use of procs.

System defined stored procedures

  • Sp_who – who’s connected to system
  • Sp_who2 – Detailed version of sp_who
  • Sp_who2 n – Detailed info on user ‘n’
  • Sp_attach_db – Attach a DB file to server using filepath

User defined stored procedures

SP to add a row to table:

-- DEFINE/EDIT SP --
GO
CREATE proc usp_addrow
@productID VARCHAR(50) = NULL,
@cat VARCHAR(50) = NULL,
@subcat VARCHAR(50) = NULL,
AS
-- integrity --
IF left(@productID, 1) NOT LIKE '[A-Z]'
BEGIN
	PRINT 'SP FAILED, productID should start with A-Z'
	RETURN 99 -- Returns no. to front end and quits proc
END
-- check if already exists --
DECLARE @cnt INT
SELECT @cnt = count(*) FROM Product WHERE productID = @productID
IF @cnt > 0
BEGIN
	PRINT 'Product code already in table	'
	RETURN
END
-- proc --
INSERT INTO Product(productID, cat, subcat, [desc])
SELECT @productID, @cat, @subcat, @desc
RETURN 0
GO

-- EXECUTION --
DECLARE @retvalue INT
EXEC @retvalue = usp_addrow '5AA', 'AAA', 'A' 'TEST CHAPTER'
SELECT @retvalue -- 99 if fail, 0 if pass

SP to get Metadata about a table:

CREATE PROC usp_tablesyntax
@tabname NVARCHAR(255)
AS
SELECT 'Create table ' + @tabname + ' ('
UNION ALL
SELECT
	 c.name
	+ ' ' + tp.name
	+ ' (' + CAST(c.max_length AS VARCHAR(10)) + '),'
FROM sys.tables t
INNER JOIN sys.columns c
	ON t.object_id = c.object_id
INNER JOIN sys.types tp
	ON c.user_type_id = tp.user_type_id
WHERE t.name = '' + @tabname + ''
UNION ALL
SELECT ')'
GO

-- EXECUTION --
EXEC usp_tablesyntax Production.Product
GO

Triggers

Triggers are scripts that are set to run when a specific condition is fullfilled. This is useful for automating a workflow or blocking certain edits.
The following trigger will update modifieddate to todays date when listprice is changed.

CREATE trigger tr_scd2_pricechange
ON Production.Product
FOR UPDATE
AS
SET NOCOUNT ON -- Does not return 'count of rows changed'
UPDATE p
SET p.ModifiedDate = getdate()  -- Update modified date to todays date
FROM Production.Product p
JOIN inserted i  -- Join on new values
	ON p.ProductID = i.ProductID
JOIN deleted d   -- Join on old values
	ON p.ProductID = d.ProductID
WHERE i.ListPrice <> d.ListPrice -- Where price has changed
GO

-- TESTING --
UPDATE Production.Product
SET ListPrice = 19.99
WHERE ProductID = 5

Geographic functions

SQL has some special functions for converting and displaying geographic data. To use these functions a variable must be defined as geographic type using GEOGRAPHY.
The following script declares a geographic variable and inputs the coordinates for Oxford, UK. A SELECT statement then gets address data from Adventureworks, displays the distance from Oxford and orders by this distance. STAsText() returns the coordinates for SpatialLocation and STDistance returns the distance (in millimetres) between SpatialLocation and ox.

DECLARE @ox GEOGRAPHY
SELECT @ox = GEOGRAPHY::STGeomFromText('POINT(-1.258289 51.753475)', 4326)  -- Oxford co-ordinates

SELECT
	 AddressLine1
	,City
	,PostalCode
	,SpatialLocation.STAsText() AS latlong
	,SpatialLocation.STDistance(@ox) /1000000. km_from_ox
FROM Person.Address
ORDER BY km_from_ox ASC

Updated: