MongoDB duplicate results issue - Stack Overflow

admin2025-04-26  6

i'm working on a project that uses a mongodb query to get results and handle pagination and result limiation. The issue is i'm starting to see duplicate results for every page.. I'm not sure what's the issue.

Here is my query:

async function getAllResults(uid, jobId, page = 1, limit = 10, sortField = 'id', sortOrder = 'asc') {
  const db = await connectToDatabase();
  const resultsCollection = db.collection('results');

  const totalCount = await resultsCollection.aggregate([
    { $match: { userId: uid, jobId: jobId } },
    { $unwind: "$results" },
    { $count: "total" }
  ]).toArray();

  const total = totalCount[0]?.total || 0;

  const sortObj = {};
  
  
    sortObj[`results.${sortField}`] = sortOrder === 'asc' ? 1 : -1;

  const pipeline = [
    { $match: { userId: uid, jobId: jobId } },
    { $unwind: "$results" },
    { $sort: sortObj },
  ];

  if (limit > 0) {
    const skip = (page - 1) * limit;
    pipeline.push({ $skip: skip });
    pipeline.push({ $limit: limit });
  }

  pipeline.push({
    $group: {
      _id: null,
      results: { $push: "$results" }
    }
  });

  const jobResults = await resultsCollection.aggregate(pipeline).toArray();

  if (jobResults.length === 0) {
    return { results: [], total: 0, pages: 0 };
  }

  return {
    results: jobResults[0].results,
    total: total,
    pages: limit > 0 ? Math.ceil(total / limit) : 1
  };
}

I tried so many solutions using Ai but none works. I know the issue might be with unwind.. there was a solution that worked but it required handling results in memory which might cause me some performance issues if the results were large (10k up to 500k).

This is an example docuemnt for a result. Each JSON object in results have data.

{
  "_id": "6735781e6707ab39c1f694e5",
  "userId": "OJctGjbmxJN7iLhIKIXeItYd0BT2",
  "jobId": "68c96821-d601-4c77-b9eb-b2df1e09a838",
  "documentId": "68c96821-d601-4c77-b9eb-b2df1e09a838-0",
  "type": "email",
  "date": "2024-11-14T04:10:06.440+00:00",
  "results": [
    {}, {}, {}, {}, {}, {}, {}, {}, 
    {}, {}, {}, {}, {}, {}, {}, {}, 
    {}, {}, {}, {}, {}, {}, {}, {}, 
    {}, {}, {}, {}, {}, {}, {}
  ]
}

And this is in the returned response:

{
    "results": [
        {
            "input": "...",
            "success": true,
            "avatar": "...",
            "city": "...",
            "country": "...",
            "email": "..."
        },
        {
            ...
        }
    ],
    "total": 100,
    "pages": 10
}

i'm working on a project that uses a mongodb query to get results and handle pagination and result limiation. The issue is i'm starting to see duplicate results for every page.. I'm not sure what's the issue.

Here is my query:

async function getAllResults(uid, jobId, page = 1, limit = 10, sortField = 'id', sortOrder = 'asc') {
  const db = await connectToDatabase();
  const resultsCollection = db.collection('results');

  const totalCount = await resultsCollection.aggregate([
    { $match: { userId: uid, jobId: jobId } },
    { $unwind: "$results" },
    { $count: "total" }
  ]).toArray();

  const total = totalCount[0]?.total || 0;

  const sortObj = {};
  
  
    sortObj[`results.${sortField}`] = sortOrder === 'asc' ? 1 : -1;

  const pipeline = [
    { $match: { userId: uid, jobId: jobId } },
    { $unwind: "$results" },
    { $sort: sortObj },
  ];

  if (limit > 0) {
    const skip = (page - 1) * limit;
    pipeline.push({ $skip: skip });
    pipeline.push({ $limit: limit });
  }

  pipeline.push({
    $group: {
      _id: null,
      results: { $push: "$results" }
    }
  });

  const jobResults = await resultsCollection.aggregate(pipeline).toArray();

  if (jobResults.length === 0) {
    return { results: [], total: 0, pages: 0 };
  }

  return {
    results: jobResults[0].results,
    total: total,
    pages: limit > 0 ? Math.ceil(total / limit) : 1
  };
}

I tried so many solutions using Ai but none works. I know the issue might be with unwind.. there was a solution that worked but it required handling results in memory which might cause me some performance issues if the results were large (10k up to 500k).

This is an example docuemnt for a result. Each JSON object in results have data.

{
  "_id": "6735781e6707ab39c1f694e5",
  "userId": "OJctGjbmxJN7iLhIKIXeItYd0BT2",
  "jobId": "68c96821-d601-4c77-b9eb-b2df1e09a838",
  "documentId": "68c96821-d601-4c77-b9eb-b2df1e09a838-0",
  "type": "email",
  "date": "2024-11-14T04:10:06.440+00:00",
  "results": [
    {}, {}, {}, {}, {}, {}, {}, {}, 
    {}, {}, {}, {}, {}, {}, {}, {}, 
    {}, {}, {}, {}, {}, {}, {}, {}, 
    {}, {}, {}, {}, {}, {}, {}
  ]
}

And this is in the returned response:

{
    "results": [
        {
            "input": "...",
            "success": true,
            "avatar": "...",
            "city": "...",
            "country": "...",
            "email": "..."
        },
        {
            ...
        }
    ],
    "total": 100,
    "pages": 10
}
Share Improve this question edited Nov 16, 2024 at 15:42 Joint asked Nov 16, 2024 at 5:13 JointJoint 11 silver badge1 bronze badge 1
  • 1 Show some example documents. Your default sort field is id - rather than _id --> so results.id. Does this field actually exist? Is it unique? If not, then add a second sort condition. Otherwise, it will sort on that one field and documents which have same value for that field will be sorted in an undetermined sequence / randomly. Add this stage to the end and make sure all the counts are '1': { $group: { _id: "results.id", count: { $count: {} } } }. – aneroid Commented Nov 16, 2024 at 5:38
Add a comment  | 

1 Answer 1

Reset to default 1

You don't need $unwind and $count, just use

{ $set: {total: {$size: "$results"}} }

Instead of $unwind, $skip, $limit and $group you can use $slice

You can do it in one query, something like this (not tested):

const jobResults= await resultsCollection.aggregate([
    { $match: { userId: uid, jobId: jobId } },
    { $set: {
       total: {$size: "$results"},
       results: { $slice: [ "$results", skip, limit ] }
       }
    }
  ]).toArray();

return {
    results: jobResults[0].results,
    total: jobResults[0].total,
    pages: limit > 0 ? Math.ceil(total / limit) : 1
  };

You may use

results: {
   $sortArray: {
      input: { $slice: [ "$results", skip, limit ] }, 
      sortBy: sortObj }
}

to get an ordered result

转载请注明原文地址:http://conceptsofalgorithm.com/Algorithm/1745664880a312969.html

最新回复(0)