Bucket in mongodb

Hi, i was going through from the mognodb documentation and the topic was bucket.
i tried the given example which was as followed:

    db.artists.insertMany([
      { "_id" : 1, "last_name" : "Bernard", "first_name" : "Emil", "year_born" : 1868, "year_died" : 1941, "nationality" : "France" },
      { "_id" : 2, "last_name" : "Rippl-Ronai", "first_name" : "Joszef", "year_born" : 1861, "year_died" : 1927, "nationality" : "Hungary" },
      { "_id" : 3, "last_name" : "Ostroumova", "first_name" : "Anna", "year_born" : 1871, "year_died" : 1955, "nationality" : "Russia" },
      { "_id" : 4, "last_name" : "Van Gogh", "first_name" : "Vincent", "year_born" : 1853, "year_died" : 1890, "nationality" : "Holland" },
      { "_id" : 5, "last_name" : "Maurer", "first_name" : "Alfred", "year_born" : 1868, "year_died" : 1932, "nationality" : "USA" },
      { "_id" : 6, "last_name" : "Munch", "first_name" : "Edvard", "year_born" : 1863, "year_died" : 1944, "nationality" : "Norway" },
      { "_id" : 7, "last_name" : "Redon", "first_name" : "Odilon", "year_born" : 1840, "year_died" : 1916, "nationality" : "France" },
      { "_id" : 8, "last_name" : "Diriks", "first_name" : "Edvard", "year_born" : 1855, "year_died" : 1930, "nationality" : "Norway" }
    ])

and query on that data.

    db.artists.aggregate( [
      // First Stage
      {
        $bucket: {
          groupBy: "$year_born",                        // Field to group by
          boundaries: [ 1840, 1850, 1860, 1870, 1880 ], // Boundaries for the buckets
          default: "Other",                             // Bucket id for documents which do not fall into a bucket
          output: {                                     // Output for each bucket
            "count": { $sum: 1 },
            "artists" :
              {
                $push: {
                  "name": { $concat: [ "$first_name", " ", "$last_name"] },
                  "year_born": "$year_born"
                }
              }
          }
        }
      }
    ] )

and here is the output of the above query:

    /* 1 */
    {
        "_id" : 1840.0,
        "count" : 1.0,
        "artists" : [ 
            {
                "F_name" : "Odilon",
                "L_name" : "Redon",
                "year_born" : 1840.0,
                "year_died" : 1916.0,
                "nationality" : "France"
            }
        ]
    }

    /* 2 */
    {
        "_id" : 1850.0,
        "count" : 2.0,
        "artists" : [ 
            {
                "F_name" : "Vincent",
                "L_name" : "Van Gogh",
                "year_born" : 1853.0,
                "year_died" : 1890.0,
                "nationality" : "Holland"
            }, 
            {
                "F_name" : "Edvard",
                "L_name" : "Diriks",
                "year_born" : 1855.0,
                "year_died" : 1930.0,
                "nationality" : "Norway"
            }
        ]
    }

    /* 3 */
    {
        "_id" : "other",
        "count" : 5.0,
        "artists" : [ 
            {
                "F_name" : "Emil",
                "L_name" : "Bernard",
                "year_born" : 1868.0,
                "year_died" : 1941.0,
                "nationality" : "France"
            }, 
            {
                "F_name" : "Joszef",
                "L_name" : "Rippl-Ronai",
                "year_born" : 1861.0,
                "year_died" : 1927.0,
                "nationality" : "Hungary"
            }, 
            {
                "F_name" : "Anna",
                "L_name" : "Ostroumova",
                "year_born" : 1871.0,
                "year_died" : 1955.0,
                "nationality" : "Russia"
            }, 
            {
                "F_name" : "Alfred",
                "L_name" : "Maurer",
                "year_born" : 1868.0,
                "year_died" : 1932.0,
                "nationality" : "USA"
            }, 
            {
                "F_name" : "Edvard",
                "L_name" : "Munch",
                "year_born" : 1863.0,
                "year_died" : 1944.0,
                "nationality" : "Norway"
            }
        ]
    }

1- My question is that the artists array is grouped by year_born and data isn’t in ascending order or descending order?
2- Can we set limit of documents to be placed in one bucket.
3- Can we add number of buckets while creating create statement. is this is possible or not?
so that when we insert a document it will be placed/moved to the matched condition bucket.

1- My question is that the artists array is grouped by year_born and data isn’t in ascending order or descending order?

artists is not an array; it is a collection. “…is grouped by year_born and data isn’t in ascending order or descending order” - correct.

2- Can we set limit of documents to be placed in one bucket.

Actually you cannot specify the number of documents, or use some kind of limit expression. The boundaries field specifies the limits (upper and lowerbound) for each bucket based on the groupBy expression.

3- Can we add number of buckets while creating create statement. is this is possible or not? so that when we insert a document it will be placed/moved to the matched condition bucket.

You mean while creating the collection? It is not clear what you mean by “create statement”.

Check the attachment. year_born isn’t in any order.
image

When we create the collection can we specify the bucket?

No.There is no such feature with collection creation.

1 Like

Okay thanks. It help me alot. But what about the sorting ?
kindly check the image above.

The year_born values are not sorted.

So how to view them in sorted ordered.

{ $sort : {year_born:1} }
I just added this line but it doesn’t work for me.

You can try this:

db.collection.aggregate( [ 
  { $unwind: "$artists" }, 
  { $sort: { "artists.year_born": 1 } }, 
  { $group: { _id: "$id", artists: { $push: "$artists" }, count: { $first: "$count" } } }
] )

This will sort the unwound array elements and group the sorted ones back into an array.

Kindly review the Question again. I have collection and i made a Query and i need it to be sorted.

You can add the three stages from the aggregation I have provided to your aggregation query (at the end).

It doesn’t work for me.

db.artists.aggregate( [
  // First Stage
  {
    $bucket: {
      groupBy: "$year_born",                        // Field to group by
      boundaries: [ 1840, 1850, 1860, 1870, 1880 ], // Boundaries for the buckets
      default: "NULL VALUES",                             // Bucket id for documents which do not fall into a bucket
      output: {                                     // Output for each bucket
        "count": { $sum: 1 },
        "artists" :
          {
            $push: {
              "F_name": "$first_name",
              "L_name": "$last_name",
              "year_born": "$year_born",
              "year_died" : "$year_died",
              "nationality": "$nationality"
            }
          }
      }
    }
  },
 { $unwind: "$artists" }, 
 { $sort: { "artists.year_born": 1 } },
 { $group: { _id: "$_id", artists: { $push: "$artists" } } }

] )