|
How to Achieve High Performance with Linked Servers
By Kalman Toth, M.Phil., M.Phil., MCDBA
March 22, 2005
Linked servers maybe few feet apart in the same computer room,
or thousands of miles away, one on New York, the other in London.
Communications with a linked server is slower than with a database
on the same server. The reason: instead of the motherboard and
native Windows communications, the information exchange has to
go through network boxes and additional layers of communications
software.
You can join tables in different databases on different servers.
Here is an example:
Select a.*, b.* from TableA a, DATASRV888.accounting.dbo.TableB b
Where a.ColumnP=b.ColumnQ
For complex queries the performance maybe unpredictable. Depends
how the query optimizer prepares the execution plan. SQL Server
may copy over the entire TableB before executing the join. If
the table is small, and the communications link is fast, it is
not an issue. However for large tables and/or slow links, it is
a performance killer. On a typical platform, to copy 5 GB of data
file from the linked server to another on the Windows level may
take 10 minutes. To copy over a table 5 GB from database to database
may take much longer during normal operations.
There are several ways you can take control back.
First, you can build and execute a stored proc on the linked
server. That is fast. Example:
Exec DATASRV888.accounting.dbo.WeeklySummary
Second, you can pull the data you need from the linked server
into a temporary or staging table. After that you operate with
the data on same server at the usual speed. The advantage of this
approach is that you can execute this step individually and get
timing on the transfer. Example:
Select * into #TableB from DATASRV888.accounting.dbo.TableB
Where ColumnR='2010'
Third, you can "push" the data from the linked server
instead of "pulling" it. The advantage of this approach
is that the data partitioning takes place on the linked server
efficiently, and only the needed data is moved. One way to push
the data is a stored procedure on the linked server which is called
from the "local" server.
The discussion above assumed high speed linking (1 GB/min) between
the servers. With slow links, you have to be extremely careful.
Basically, you just want to pull a single row with a remote procedure
call or with OPENQUERY. If you need some significant data transfer,
the best to schedule it as night process or physical transfer
on a CD-ROM.
|