| Mastering the SQL UPDATE Syntax
By Kalman Toth, M.Phil., M.Phil., MCDBA, MCITP
February 21 , 2008
The UPDATE statement in SQL has unusual and confusing syntax.
Typically mastered by expert DBA-s and SQL developers, and the
rest of the database community uses it in an insecure fashion.
The following simple example demonstrates some of the big problems
with the UPDATE syntax.
The following is a normal simple update in AdventureWorks database.
Although the introduction of the TOP clause is a telltale sign
of problems to come. Vacation hours increases by 2% to a random
set of 10 employees.
UPDATE TOP (10) HumanResources.Employee
SET VacationHours = VacationHours * 1.02 ;
(10 row(s) affected)
We now try to alias the table.
UPDATE TOP (10) HumanResources.Employee
e
SET VacationHours = VacationHours * 1.02 ;
Msg 102, Level 15, State 1, Line 1
Incorrect syntax near 'e'.
It failed. SYNTAX PROBLEM #1: table updated cannot be aliased.
This is very strange by itself. Lets introduce the FROM
clause.
UPDATE TOP (10) HumanResources.Employee
SET VacationHours = VacationHours * 1.02
FROM HumanResources.Employee ;
(10 row(s) affected)
This statement is really incomprehensible. Why does it work with
2 references to the same table without a self-join? Let' introduce
alias on the FROM table.
UPDATE TOP (10) HumanResources.Employee
SET VacationHours = e.VacationHours * 1.02
FROM HumanResources.Employee e;
(10 row(s) affected)
Well as it turn out the right hand side of the assignment is
coming from the FROM table, while the left hand side from the
UPDATE table. But are they matched? Are we assigning the presidents
vacation hours plus 2% to the head of engineering? We do hope
that does not happen but we cant really tell based on the
syntax. Pehaps SQL Server 2005 adds a secret self-join on EmployeeID.
More likely though that SQL Server interpretes the UPDATE table
as a reference to the FROM table by the same name. This is from
BOL: "If the object being updated (the UPDATE table) is the
same as the object in the FROM clause (the FROM table) and there
is only one reference to the object in the FROM clause, an object
alias may or may not be specified. If the object being updated
appears more than one time in the FROM clause, one, and only one,
reference to the object must not specify a table alias. All other
references to the object in the FROM clause must include an object
alias."
We want to make sure that the same row is updated by 2%.
UPDATE TOP (10) HumanResources.Employee
SET e.VacationHours = e.VacationHours * 1.02
FROM HumanResources.Employee e;
Msg 4104, Level 16, State 1, Line 1
The multi-part identifier "e.VacationHours" could not
be bound.
This did not work. Apperantly the left side of the assignment
belongs to the UPDATE table. Lets see if we can alias the
UPDATE table.
UPDATE TOP (10) e
SET e.VacationHours = e.VacationHours * 1.02
FROM HumanResources.Employee e;
(10 row(s) affected)
That works fine. SYNTAX PROBLEM #2: The FROM table alias can
be used for the UPDATE table. It reads quiet strange. This is
not really how it is being used in practice. Let's associate the
TOP with an ORDER BY.
UPDATE TOP (10) e
SET e.VacationHours = e.VacationHours * 1.02
FROM HumanResources.Employee e
ORDER BY e.VacationHours desc;
Msg 156, Level 15, State 1, Line 5
Incorrect syntax near the keyword 'ORDER'.
No dice. SYNTAX PROBLEM #3: ORDER BY cannot be used even though
usage of TOP is allowed in the UPDATE statement. So we are forced
to introduce a DERIVED TABLE to get the ordered set of employees
with the least number of vacation hours.
UPDATE TOP (10) e
SET e.VacationHours = e.VacationHours * 1.02
FROM HumanResources.Employee e
JOIN (SELECT TOP (10) EmployeeID FROM HumanResources.Employee
ORDER BY VacationHours desc) eord
ON e.EmployeeID=eord.EmployeeID;
(10 row(s) affected)
Finally, we achieved our goal. We also feel assured that the
syntax produces the desired results.
If you use the normal UPDATE statement like below,
you do hope that the rows dont get mixed up between the
UPDATE table and FROM table.
UPDATE Sales.SalesPerson
SET SalesYTD = sp.SalesYTD + sod.SubTotal
FROM Sales.SalesPerson sp
JOIN Sales.SalesOrderHeader sod
ON sp.SalesPersonID = sod.SalesPersonID
AND sod.OrderDate = (SELECT MAX(OrderDate)
FROM Sales.SalesOrderHeader
WHERE SalesPersonID =
sp.SalesPersonID);
(17 row(s) affected)
If you want to be sure though, you use alias in the UPDATE clause.
UPDATE sp
SET sp.SalesYTD = sp.SalesYTD + sod.SubTotal
FROM Sales.SalesPerson sp
JOIN Sales.SalesOrderHeader sod
ON sp.SalesPersonID = sod.SalesPersonID
AND sod.OrderDate = (SELECT MAX(OrderDate)
FROM Sales.SalesOrderHeader
WHERE SalesPersonID =
sp.SalesPersonID);
(17 row(s) affected)
The following script demonstrates an update with a JOIN to SELECT - GROUP BY query (a derived table) .
USE tempdb
GO
SELECT * into dbo.Department
FROM AdventureWorks.HumanResources.Department
ALTER TABLE dbo.Department
ADD NoOfEmployees int not null default (0)
GO
UPDATE DE
SET NoOfEmployees =DG.DeptEmployees,
ModifiedDate = getdate()
FROM dbo.Department DE
JOIN (SELECT D.DepartmentID, COUNT(*) AS DeptEmployees
FROM AdventureWorks.HumanResources.EmployeeDepartmentHistory EDH
JOIN dbo.Department D
ON d.DepartmentID = EDH.DepartmentID
and EDH.EndDate is null
JOIN AdventureWorks.HumanResources.Employee E
ON EDH.EmployeeID = E.EmployeeID
GROUP BY D.DepartmentID ) DG
ON DE.DepartmentID = DG.DepartmentID
GO
SELECT * FROM dbo.Department
GO
|