0000421: LIMIT start, ppp Does Not Scale Well

ID	Project	Category	View Status	Date Submitted	Last Update

0000421	XMB1	Research Tasks	public	2011-01-14 16:52	2025-09-20 06:43

Reporter	miqrogroove	Assigned To
Priority	normal	Severity	minor	Reproducibility	N/A
Status	acknowledged	Resolution	open
Target Version	1.12.00

Summary	0000421: LIMIT start, ppp Does Not Scale Well
Description	During optimization testing on Exhale, I found that the LIMIT command runs about 10 times faster using an index instead of the table data. Thus, we now have two queries instead of one when we need to query paged information. The remaining problem is that this wont be helpful for a forum that grows 10 times larger than Exhale. If a forum had on the order of 10^6 threads or if a thread had 10^6 replies, then even the index scan would become rather onerous for the sake of locating page boundaries. It may be worth researching alternative strategies for paging information with greater magnitude. It may not be compatible with XMB 1.9.11, but it would be nice to know what direction these optimizations will take in the future. Partitioning? Stats tables? Views?
Tags	No tags attached.

MySQL Version
PHP Version
Web Server
Browser
Flags	Schema Updates
Original Reporter
SVN Revision
Git Commit

miqrogroove 2011-01-15 13:06 administrator ~0000283	Here is a post confirming the behavior, but does not really offer an alternative strategy for locating an offset. http://forums.mysql.com/read.php?24,112440,112461

miqrogroove 2011-01-15 13:22 administrator ~0000284	This post has a nice discussion, basically concluding that every situation requires a different strategy. http://www.mysqlperformanceblog.com/2006/09/01/order-by-limit-performance-optimization/

miqrogroove 2011-01-15 13:45 administrator ~0000285	This one by Yahoo basically says don't do what we are trying to do :) http://www.scribd.com/doc/14683263/Efficient-Pagination-Using-MySQL

miqrogroove 2011-01-15 14:53 administrator ~0000286	My conclusion so far is that we would have to maintain stats tables for key distribution data. It seems like that shouldn't be necessary, but here's how it would work: Without a specific WHERE condition, the best index scan rate I've seen is 300,000 rows in 0.1 sec. Let's assume a fixed time of 0.0000003 per row. SELECT dateline, pid FROM xmb_posts WHERE tid=288108 ORDER BY dateline ASC, pid ASC LIMIT 300000, 1 If we have a table identifying a key value (date) for each 1,000 rows in the table, then we could query that table to find any record within a range of 1,000 records. In theory, MySQL would be able run the resulting WHERE condition in just 0.0003 sec regardless of how many rows are in the table. This is essentially an index on an index. The above query returned dateline = 1238176252 so what would happen if... SELECT dateline, pid FROM xmb_posts WHERE tid=288108 AND dateline > 1238176252 ORDER BY dateline ASC, pid ASC LIMIT 1000, 1 Actual run time was 0.0008 :D

miqrogroove 2025-09-20 06:43 administrator ~0000599	It's time to consider this for v1.12. Also, I think there are good alternatives to stats tables that need to be considered. If the main issue is lack of indexed query criteria for page numbers, then we could just not use page numbers in URLs and queries. The existing lastpost column is too awkward for this, and autonumber fields are technically not appropriate, so what if we simply add a 'sort' column and assign serial numbers to each post in the same thread? This would give us all the advantages of speed in a single table and single query. The disadvantage is that it doesn't allow for graceful post deletion, unless we sacrifice the ability to display page numbers.

Date Modified	Username	Field	Change
2011-01-14 16:52	miqrogroove	New Issue
2011-01-14 16:53	miqrogroove	Relationship added	related to 0000416
2011-01-14 16:55	miqrogroove	Description Updated
2011-01-15 13:06	miqrogroove	Note Added: 0000283
2011-01-15 13:22	miqrogroove	Note Added: 0000284
2011-01-15 13:45	miqrogroove	Note Added: 0000285
2011-01-15 14:53	miqrogroove	Note Added: 0000286
2025-09-20 06:43	miqrogroove	Priority	low => normal
2025-09-20 06:43	miqrogroove	Status	new => acknowledged
2025-09-20 06:43	miqrogroove	Target Version	=> 1.12.00
2025-09-20 06:43	miqrogroove	Flags	=> Schema Updates
2025-09-20 06:43	miqrogroove	Note Added: 0000599

View Issue Details

Relationships

Activities

Issue History