I have a collection containing entries in the following format:

{ 
    "_id" : ObjectId("5538e75c3cea103b25ff94a3"), 
    "userID" : "USER001", 
    "userName" : "manish", 
    "collegeIDs" : [
        "COL_HARY",
        "COL_MARY",
        "COL_JOHNS",
        "COL_CAS",
        "COL_JAMES",
        "COL_MARY",
        "COL_MARY",
        "COL_JOHNS"
    ]
}

I need to find the collegeIDs that are repeated. So the result should give "COL_MARY" and "COL_JOHNS" and, if possible, how many times each one repeats. Please give a mongo query to find this.

JohnnyHK
lime_pal
  • possible duplicate of [How to remove duplicate entries from an array?](http://stackoverflow.com/questions/9862255/how-to-remove-duplicate-entries-from-an-array) – Thomas Sep 10 '15 at 13:09
  • Please search for similar questions before posting your own. I found [this](http://stackoverflow.com/questions/9862255/how-to-remove-duplicate-entries-from-an-array) by Googling "mongodb find duplicate values array" in under a minute. There are plenty of resources out there to help you with this. Also try to show us what you have done, so that we can guide you better. – Thomas Sep 10 '15 at 13:11

1 Answer

There would probably be many of these documents, so you likely want the repeated collegeIDs per document (per ObjectId):

db.myCollection.aggregate([
  {"$project": {"collegeIDs":1}},
  {"$unwind":"$collegeIDs"},
  {"$group": {"_id":{"_id":"$_id", "cid":"$collegeIDs"}, "count":{"$sum":1}}},
  {"$match": {"count":{"$gt":1}}},
  {"$group": {"_id": "$_id._id", "collegeIDs":{"$addToSet":"$_id.cid"}}}
])
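
The pipeline above returns only the repeated IDs, not how often each one repeats. If the count is wanted as well, one possible variant is sketched below. It has not been run against a live server, and the output field names `duplicates` and `collegeID` are illustrative choices, not something from the question. The idea is to `$push` each ID together with its count in the last stage instead of using `$addToSet` on the ID alone:

```javascript
// Sketch: same stages as above, but the final $group keeps each count.
// The output field names "duplicates" and "collegeID" are assumptions.
var duplicatesWithCounts = [
  {"$project": {"collegeIDs": 1}},
  {"$unwind": "$collegeIDs"},
  // one group per (document, college ID) pair, counting occurrences
  {"$group": {"_id": {"_id": "$_id", "cid": "$collegeIDs"}, "count": {"$sum": 1}}},
  // keep only the IDs seen more than once within the same document
  {"$match": {"count": {"$gt": 1}}},
  // regroup per document, collecting ID/count pairs
  {"$group": {"_id": "$_id._id",
              "duplicates": {"$push": {"collegeID": "$_id.cid", "count": "$count"}}}}
];

// In the mongo shell: db.myCollection.aggregate(duplicatesWithCounts)
```

Since each (document, ID) pair is already unique after the first `$group`, `$push` should not introduce duplicates here.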

Or this might be what you want instead (it is not clear from your question), the repeated collegeIDs and their counts for a single user:

db.myCollection.aggregate([
  {"$match": {"userID":"USER001"}},
  {"$project": {"collegeIDs":1, "_id":0}},
  {"$unwind":"$collegeIDs"},
  {"$group": {"_id":"$collegeIDs", "count":{"$sum":1}}},
  {"$match": {"count":{"$gt":1}}},
])
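
To sanity-check what the `$unwind`, `$group` and `$match` stages compute for the sample document, the same tally can be reproduced in plain JavaScript without a database. This is only an illustrative sketch; `findRepeats` is a hypothetical helper name, not part of MongoDB:

```javascript
// Plain-JavaScript sketch of what $unwind + $group + $match compute
// for a single document's array. findRepeats is a hypothetical helper.
function findRepeats(collegeIDs) {
  var counts = {};
  // like $unwind followed by $group with {"$sum": 1}: tally each ID
  collegeIDs.forEach(function (id) {
    counts[id] = (counts[id] || 0) + 1;
  });
  // like $match {"count": {"$gt": 1}}: keep only the repeated IDs
  return Object.keys(counts)
    .filter(function (id) { return counts[id] > 1; })
    .map(function (id) { return {"_id": id, "count": counts[id]}; });
}

// collegeIDs array from the sample document in the question
var sample = ["COL_HARY", "COL_MARY", "COL_JOHNS", "COL_CAS",
              "COL_JAMES", "COL_MARY", "COL_MARY", "COL_JOHNS"];

console.log(findRepeats(sample));
```

For the sample array this reports COL_MARY with a count of 3 and COL_JOHNS with a count of 2. Note that `$group` does not guarantee any particular output order, so the aggregation may return the same two entries in a different order.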
Cetin Basoz
  • Yes, I want to find it for a particular user in that collection. I mean, filter by giving the userID in the query. – lime_pal Sep 10 '15 at 13:21
  • Then there is your solution. Just ignore the downvoter, whoever it is; they voted without thinking. With your data the result is: { "_id" : ObjectId("5538e75c3cea103b25ff94a3"), "collegeIDs" : [ "COL_MARY", "COL_JOHNS" ] } – Cetin Basoz Sep 10 '15 at 13:22
  • Where should I give the userID in that query? – lime_pal Sep 10 '15 at 13:25
  • Ahh, you want it for a particular userID? Then add a $match at the top. – Cetin Basoz Sep 10 '15 at 13:31