leo_wyn
作者leo_wyn·2020-04-29 10:34
商业智能工程师·Security

Pagination with OFFSET / FETCH : A better way

字数 5287阅读 921评论 0赞 0

Pagination is a common use case throughout client and web applications everywhere. Google shows you 10 results at

a time, your online bank may show 20 bills per page, and bug tracking and source control software might display 50

items on the screen.

Based on the indexing of the table, the columns needed, and the sort method chosen, paging can be relatively

painless. If you're looking for the "first" 20 customers and the clustered index supports that sorting (say, a clustered

index on an IDENTITY column or DateCreated column), then the query is going to be pretty efficient. If you need to

support sorting that requires non-clustered indexes, and especially if you have columns needed for output that aren't

covered by the index (never mind if there is no supporting index), the queries can get more expensive. And even the

same query (with a different @PageNumber parameter) can get much more expensive as the @PageNumber gets

higher – since more reads may be required to get to that "slice" of the data.

Some will say that progressing toward the end of the set is something that you can solve by throwing more memory at

the problem (so you eliminate any physical I/O) and/or using application-level caching (so you're not going to the

database at all). Let's assume for the purposes of this post that more memory isn't always possible, since not every

customer can add RAM to a server that's out of memory slots, or just snap their fingers and have newer, bigger

servers ready to go. Especially since some customers are on Standard Edition, so are capped at 64GB (SQL Server 2012)

or 128GB (SQL Server 2014), or are using even more limited editions such as Express (1GB) or whatever they're calling

Azure SQL Database this week (many different servicing tiers).

So I wanted to look at the common paging approach on SQL Server 2012 – OFFSET / FETCH – and suggest a variation

that will lead to more linear paging performance across the entire set, instead of only being optimal at the beginning.

Which, sadly, is all that a lot of shops will test.

Setup

I'm going to borrow from a recent post, Bad habits : Focusing only on disk space when choosing keys, where I

populated the following table with 1,000,000 rows of random-ish (but not entirely realistic) customer data:

CREATE TABLE [dbo].[Customers_I]

(

[CustomerID] [int] IDENTITY(1,1) NOT NULL,

[FirstName] nvarchar NOT NULL,

[LastName] nvarchar NOT NULL,

[EMail] nvarchar NOT NULL,

[Active] [bit] NOT NULL DEFAULT ((1)),

[Created] [datetime] NOT NULL DEFAULT (sysdatetime()),

[Updated] [datetime] NULL,

CONSTRAINT [C_PK_Customers_I] PRIMARY KEY CLUSTERED ([CustomerID] ASC)

);

GO

CREATE NONCLUSTERED INDEX [C_Active_Customers_I]

ON [dbo].[Customers_I]

Home Archives Authors Categories Newsletters About Search... GO

10/11/2019 Pagination with OFFSET / FETCH : A better way

https://sqlperformance.com/2015/01/t-sql-queries/pagination-with-offset-fetch 2/23

([FirstName] ASC, [LastName] ASC, [EMail] ASC)

WHERE ([Active] = 1);

GO

CREATE UNIQUE NONCLUSTERED INDEX [C_Email_Customers_I]

ON [dbo].[Customers_I]

([EMail] ASC);

GO

CREATE NONCLUSTERED INDEX [C_Name_Customers_I]

ON [dbo].[Customers_I]

([LastName] ASC, [FirstName] ASC)

INCLUDE ([EMail]);

GO

Since I knew I would be testing I/O here, and would be testing from both a warm and cold cache, I made the test at

least a little bit more fair by rebuilding all of the indexes to minimize fragmentation (as would be done less

disruptively, but regularly, on most busy systems that are performing any type of index maintenance):

ALTER INDEX ALL ON dbo.Customers_I REBUILD WITH (ONLINE = ON);

After the rebuild, fragmentation comes in now at 0.05% – 0.17% for all indexes (index level = 0), pages are filled over

99%, and the row count / page count for the indexes are as follows:

Index Page Count Row Count

C_PK_Customers_I (clustered index) 19,210 1,000,000

C_Email_Customers_I 7,344 1,000,000

C_Active_Customers_I (filtered index) 13,648 815,235

C_Name_Customers_I 16,824 1,000,000

Indexes, page counts, row counts

This obviously isn't a super-wide table, and I've left compression out of the picture this time. Perhaps I will explore

more configurations in a future test.

Paging Scenarios

Typically, users will formulate a paging query like this (I'm going to leave the old-school, pre-2012 methods out of this

post):

SELECT [a_bunch_of_columns]

FROM dbo.[some_table]

ORDER BY [some_column_or_columns]

OFFSET @PageSize * (@PageNumber - 1) ROWS

FETCH NEXT @PageSize ROWS ONLY;

As I mentioned above, this works just fine if there is an index that supports the ORDER BY and that covers all of the

columns in the SELECT clause (and, for more complex queries, the WHERE and JOIN clauses). However, the sort costs

might be overwhelming with no supporting index, and if the output columns aren't covered, you will either end up

with a whole bunch of key lookups, or you may even get a table scan in some scenarios.

Let's get more specific. Given the table and indexes above, I wanted to test these scenarios, where we want to show

100 rows per page, and output all of the columns in the table:

  1. Default – ORDER BY CustomerID (clustered index).

10/11/2019 Pagination with OFFSET / FETCH : A better way

https://sqlperformance.com/2015/01/t-sql-queries/pagination-with-offset-fetch 3/23

  1. Phone book – ORDER BY LastName, FirstName (supporting non-clustered index).
  2. User-de

如果觉得我的文章对您有用,请点赞。您的支持将鼓励我继续创作!

0

添加新评论0 条评论

Ctrl+Enter 发表

作者其他文章

相关文章

相关问题

相关资料

X社区推广