_about
_photos

On speed for paginated queries

The problem

Finally, something that is work-related and not some random stuff I do in my free time.

Imagine, you have a huge table, or even a view and you need to query it from the database and present in the UI, with sorting and pagination.

Looks pretty simple: we have LIMIT and OFFSET specifically for that task. But I would not have written this article, if it was that easy.

The problem here is that if the OFFSET value is too high your query is going to become extremely slow.

More details

I'm using Postgresql with Prisma in Node.js so code snippets would be with those tools. So, consider this query:


const data = await prisma.someTable
.findMany({
    where: {
        someProperty: 'someValue',
    },
    include: {
        otherTable: {
            andAnotherOne: true
        }
    },
    orderBy: {
        field: 'asc'
    },
    { 
        skip: 1000,
        take: 10
    }
});
        

And... this takes a while to run, easily requiring several seconds. And the more the "skip", the longer it takes.

The sample query takes 6.52s

Why is that?

I have found a great article on this topic, here it is. The key takeaway is that with the large skip, database requires a lot of memory, and the more it is, the more memory will be needed (image from the article above):

/7/1.png

That article suggests using indexes, which is indeed a good solution, but might not be applicable. In my app, there is an ability to sort this table and some fields are really hard to index. Some of them have the same value (which is not ok, but I don't control the data), some are composite.

Another issue is that for pagination to work I have to send the total number of rows along with the paginated data. Therefore, I have to run another query to get that count (since the long-awaited findManyAndCount had not been implemented yet)

My solution

Once I've realized that the issue is the memory, I've tried to reduce the consumption.

So, just to test things, I have removed everything from the "include" property (for those who are unfamiliar with Prisma - it is pretty much SQL join with some extra stuff).

The sample query takes 55ms

Great! But useless - I still need those included columns, and let's not forget about the count query.

But now we know, that queries with includes and without an offset run fast - so let's try to have one.

The 1st idea was to pre-select the array of object ids we're going to need for this exact page of pagination and then use it in the big one.

Now, let's have 2 queries:


const ids = await prisma.someTable
.findMany({
    where: {
        someProperty: 'someValue',
    },
    select: {
        id: true,
    },
    orderBy: {
        field: 'asc'
    },
    { 
        skip: 1000,
        take: 10
    }
});

const data = await prisma.someTable
.findMany({
    where: {
        id: {
           in: Object.values(ids).map((id) => id.id),
        },
    },
    include: {
        otherTable: {
            andAnotherOne: true
        }
    },
});
        

The sample query takes 99ms

Not as good as without data, but still much better than the initial solution. But I still need the count!

Let's rewrite it a bit more. "ids" is just a simple array, therefore it should be actually easy to get the part we need without using sql at all. As as side effect - the full array of ids has the exact same length I needed to sent to the UI.

So, I have removed pagination from the request at all, leaving sorting in the 1st one (the one that gets ids) and moved pagination task to js:


const paginatedIds = ids.slice(skip, skip + take));
          

And then use this paginatedIds array in the "in" clause of the 2nd query.

The sample query takes 81ms

So, yeah, I have not beaten the 55ms result, but that one was pretty much useless, and now I've got the count as well - therefore this is all I needed from this function and it won't take any more time than that.

done

I can't share the whole code because I've wrote it for work - but you don't have to trust me, better try this approach yourself and see if it is any good in your case.