DATETIME SELECT SELECT INTO DATE PAD STRING DYNAMIC SQL CURSOR MONEY FORMAT PERCENT STORED PROCEDURE SQL SERVER AGENT JOB OPTIMIZATION WHILE LOOP OVER PARTITION BY UPDATE
SITE SEARCH SQLUSA.com HEADLINES NEWS
SQL E/BOOKS   SQL 2014 PROGRAMMING   DOWNLOADS
SCRIPTS SQL 2005 SQL 2008 ARTICLES
SQL JOBS TWITTER FORMAT VIDEOS

 

How to Achieve High Performance with Linked Servers
By Kalman Toth, M.Phil. Physics, M.Phil. Computing Science, MCDBA, MCITP

February 10, 2009

Linked servers may be few feet apart in the same computer room, or thousands of miles away, one in New York, the other in London. Communications with a linked server are significantly slower than communications with a database on the same server. The reason: instead of the motherboard and native Windows communications, the information exchange has to go through network boxes, security authentication and additional layers of communications software.

You can join tables in different databases on different servers: cross database, cross server query. Here is an example:

 

SELECT a.*,

       b.*

FROM   TableA a

       INNER JOIN DATASRV888.accounting.dbo.TableB b

         ON a.ColumnP = b.ColumnQ

For complex queries the performance may be unpredictable. That depends how the query optimizer prepares the execution plan. SQL Server may copy over the entire TableB before executing the join. If the table is small, and the communications link is fast, it is not an issue. However for large tables and/or slow links, it is a performance killer. On a typical platform, to copy 5 GB of data file from the linked server to another on the Windows level may take 10 minutes. To copy over a table 5 GB from database to database may take much longer during normal operations. Here is a simple test with small tables and linked servers on same physical server or connected with few feet of LAN cable through a network switch:

-- JOIN tables in same database

USE AdventureWorks;

DECLARE @StartTime datetime

DBCC DROPCLEANBUFFERS

SET @StartTime = getdate()

SELECT SlsOrds = count(*)

FROM Sales.SalesOrderHeader soh -- 31K

INNER JOIN Sales.SalesOrderDetail sod -- 121K

on soh.SalesOrderID = sod.SalesOrderID

SELECT DurationMsec = datediff(ms, @StartTime, getdate())

GO

-- 121313

-- 50 msec (avg)

 

 

-- JOIN tables in different databases on same server instance

DECLARE @StartTime datetime

DBCC DROPCLEANBUFFERS

SET @StartTime = getdate()

SELECT SlsOrds = count(*)

FROM Sales.SalesOrderHeader soh

INNER JOIN CopyALPHAOfAdventureWorks.Sales.SalesOrderDetail sod --121K

on soh.SalesOrderID = sod.SalesOrderID

SELECT DurationMsec = datediff(ms, @StartTime, getdate())

GO

-- 121313

-- 120 msec (avg)

Related msdn forum link:

http://social.msdn.microsoft.com/Forums/en/transactsql/thread/250dac80-f35c-46ea-9bf6-2489e055280d

 

-- JOIN tables in different databases on different server instances (linked server)

DECLARE @StartTime datetime

-- DBCC DROPCLEANBUFFERS -- EXECUTE it on DELLSTAR\ALPHA

DBCC DROPCLEANBUFFERS

SET @StartTime = getdate()

SELECT SlsOrds = count(*)

FROM Sales.SalesOrderHeader soh

INNER JOIN [DELLSTAR\ALPHA].AdventureWorks.Sales.SalesOrderDetail sod --121K

on soh.SalesOrderID = sod.SalesOrderID

SELECT DurationMsec = datediff(ms, @StartTime, getdate())

GO

-- 121313

-- 500 msec (avg)

 

-- JOIN tables in different databases on different servers (linked server)

DECLARE @StartTime datetime

-- DBCC DROPCLEANBUFFERS – EXECUTE it on GATESTAR

DBCC DROPCLEANBUFFERS

SET @StartTime = getdate()

SELECT SlsOrds = count(*)

FROM Sales.SalesOrderHeader soh

INNER JOIN [GATESTAR].AdventureWorks.Sales.SalesOrderDetail sod --121K

on soh.SalesOrderID = sod.SalesOrderID

SELECT DurationMsec = datediff(ms, @StartTime, getdate())

GO

-- 121313

-- 400 msec (avg)

-- Execute stored procedure on linked server (Remote Procedure Call)

-- Linked server must be configured for RPC

-- Use Linked Server properties to set RPC /RPC OUT true

/* Create stored procedure on linked server GATESTAR

 

CREATE PROCEDURE sprocSalesOrderCount

AS

SELECT SlsOrds = count(*)

FROM AdventureWorks.Sales.SalesOrderHeader soh

INNER JOIN AdventureWorks.Sales.SalesOrderDetail sod --121K

on soh.SalesOrderID = sod.SalesOrderID

GO

*/

DECLARE @StartTime datetime

-- Execute DBCC DROPCLEANBUFFERS on GATESTAR

DBCC DROPCLEANBUFFERS

SET @StartTime = getdate()

EXECUTE [GATESTAR].AdventureWorks.dbo.sprocSalesOrderCount

SELECT DurationMsec = datediff(ms, @StartTime, getdate())

GO

-- 121313

-- 38 msec (avg)

The linked server performance on the same server is the slowest probably due to disk contention. The linked server performance 8 times slower for a cross database, cross server query.

There is another aspect to linked server performance. Even if the 38 msec RPC execution or the 400 msec cross server query execution time is acceptable, due to network delay, the response occasionally can increase to 20, 000 msec, for example, in other words subject to unpredictability.

There are several ways you can take control back.

First, you can build and execute a stored proc on the linked server. That is fast. Example:

EXEC DATASRV888.accounting.dbo.WeeklyAccountingSummary

Second, you can pull the data you need from the linked server into a temporary or staging table. After that you operate with the data on same server at the usual speed. The advantage of this approach is that you can execute this step individually and get timing on the transfer. Example:

SELECT *

INTO   #TableB

FROM   DATASRV888.accounting.dbo.TableB

WHERE  ColumnR = '2010'

Third, you can "push" the data from the linked server instead of "pulling" it. The advantage of this approach is that the data partitioning takes place on the linked server efficiently, and only the needed data is moved. One way to push the data is a stored procedure on the linked server which is called from the "local" server.

The discussion above assumed high speed linking (1 GB/min) between the servers. With slow links, you have to be extremely careful. Basically, you just want to pull a single row with a remote procedure call or with OPENQUERY. If you need some significant data transfer, the best to schedule it as night process or physical transfer on CD, DVD or external disk drive.

DECLARE @StartTime datetime;

DBCC DROPCLEANBUFFERS;

-- Execute DBCC DROPCLEANBUFFERS on GATESTAR

SET @StartTime = getdate();

SELECT * FROM OPENQUERY( GATESTAR,

'SELECT SlsOrds = count(*)

FROM AdventureWorks.Sales.SalesOrderHeader soh -- 31K

INNER JOIN AdventureWorks.Sales.SalesOrderDetail sod -- 121K

on soh.SalesOrderID = sod.SalesOrderID;

')

SELECT DurationMsec = datediff(ms, @StartTime, getdate())

GO

-- 121313

-- 40 msec (avg)

The OPENQUERY above is even faster than execution on the local DELLSTAR server since the query is executed on the linked server and only the few bytes of result is transferred on the network. Nevertheless it is still subject to unpredictability due to network performance.

The following article states that the PULL method (if available) is much faster than the PUSH method:

Linked servers and performance impact: Direction matters!

A WORKAROUND suggested by Paul Nielsen (http://www.sqlserverbible.com/) : don't use linked servers, instead push the task into the application layer because the application can connect to both servers at the usual (high) speed.

 

Exam Prep 70-461
Exam 70-461
DATETIME SELECT SELECT INTO DATE PAD STRING DYNAMIC SQL CURSOR MONEY FORMAT PERCENT STORED PROCEDURE SQL SERVER AGENT JOB OPTIMIZATION WHILE LOOP OVER PARTITION BY UPDATE