Concat arrays using PyMongo failed with unknown group operator '$concatArrays'


Question

I have MongoDB data like this:

{'word': 'good', 'info': [{'tbl_id': 'd1', 'term_freq': 2}, {'tbl_id': 'd2', 'term_freq': 56}, {'tbl_id': 'd3', 'term_freq': 3}]}
{'word': 'spark', 'info': [{'tbl_id': 'd1', 'term_freq': 6}, {'tbl_id': 'd3', 'term_freq': 11}, {'tbl_id': 'd4', 'term_freq': 10}]}
{'word': 'good', 'info': [{'tbl_id': 'd4', 'term_freq': 12}, {'tbl_id': 'd5', 'term_freq': 8}, {'tbl_id': 'd8', 'term_freq': 7}]}
{'word': 'spark', 'info': [{'tbl_id': 'd5', 'term_freq': 6}, {'tbl_id': 'd6', 'term_freq': 11}, {'tbl_id': 'd7', 'term_freq': 10}]}

I want to process it with PyMongo so that the result is:

{'word': 'good',
 'info': [{'tbl_id': 'd1', 'term_freq': 2}, {'tbl_id': 'd2', 'term_freq': 56}, {'tbl_id': 'd3', 'term_freq': 3},
          {'tbl_id': 'd4', 'term_freq': 12}, {'tbl_id': 'd5', 'term_freq': 8}, {'tbl_id': 'd8', 'term_freq': 7}]}
{'word': 'spark',
 'info': [{'tbl_id': 'd1', 'term_freq': 6}, {'tbl_id': 'd3', 'term_freq': 11}, {'tbl_id': 'd4', 'term_freq': 10},
          {'tbl_id': 'd5', 'term_freq': 6}, {'tbl_id': 'd6', 'term_freq': 11}, {'tbl_id': 'd7', 'term_freq': 10}]}

I tried a $group stage in a PyMongo aggregation:

a = mycol.aggregate([{"$group": {"_id":"$word", 'infos': {"$concatArrays": 1}}}])
for i in a:
    print(i)

It failed with: pymongo.errors.OperationFailure: unknown group operator '$concatArrays'. I also tried the group method:

a = mycol.group(key='word',condition=None, initial={'infos': []}, reduce={"$concatArrays": "info"})
for i in a:
    print(i)

That failed as well:

Traceback (most recent call last):
  File "F:/programs/SearchEngine/test.py", line 167, in <module>
    a = mycol.group(key='word',condition=None, initial={'infos': []}, reduce={"$concatArrays": "info"})
  File "C:\Users\ll\.virtualenvs\SearchEngine\lib\site-packages\pymongo\collection.py", line 2550, in group
    group["$reduce"] = Code(reduce)
  File "C:\Users\ll\.virtualenvs\SearchEngine\lib\site-packages\bson\code.py", line 54, in __new__
    "instance of %s" % (string_type.__name__))
TypeError: code must be an instance of str

Recommended answer

You get this error because $concatArrays is an expression operator, not a $group accumulator.

That said, you can get the result you want with the following pipeline:

[
    {
        "$group": {
            "_id": "$word",
            "info": {
                "$push": "$info"
            }
        }
    },
    {
        "$project": {
            "_id": 0,
            "word": "$_id",
            "info": {
                "$reduce": {
                    "input": "$info",
                    "initialValue": [

                    ],
                    "in": {
                        "$concatArrays": [
                            "$$value",
                            "$$this"
                        ]
                    }
                }
            }
        }
    }
]
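
To run this pipeline from PyMongo, something like the sketch below should work. The names PIPELINE and merge_info are made up for illustration; the collection argument is assumed to be a PyMongo Collection such as mycol from the question.

```python
# The pipeline from the answer, written as plain Python data.
PIPELINE = [
    # Stage 1: one output document per word. "$push" collects each input
    # document's "info" array, so the result is a list of lists.
    {"$group": {"_id": "$word", "info": {"$push": "$info"}}},
    # Stage 2: fold the list of lists into one flat list.
    {
        "$project": {
            "_id": 0,
            "word": "$_id",
            "info": {
                "$reduce": {
                    "input": "$info",
                    "initialValue": [],
                    "in": {"$concatArrays": ["$$value", "$$this"]},
                }
            },
        }
    },
]


def merge_info(collection):
    """Hypothetical helper: run PIPELINE and return the merged documents."""
    # Collection.aggregate returns a cursor; list() reads it to the end.
    return list(collection.aggregate(PIPELINE))
```

With the collection from the question, this would be called as merge_info(mycol). Note that $reduce requires MongoDB 3.4 or later, and that $group does not guarantee the order of its output documents.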

The $group stage uses the $push operator to build a two-dimensional list of the info arrays. A second stage, $project, then flattens that list using $reduce and $concatArrays.
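
To see what the two stages compute without a running server, here is a plain-Python imitation of them on the sample data from the question. It is only an illustration of the logic, not how MongoDB implements it; group_push and flatten are made-up names.

```python
from collections import defaultdict
from functools import reduce

docs = [
    {'word': 'good', 'info': [{'tbl_id': 'd1', 'term_freq': 2}, {'tbl_id': 'd2', 'term_freq': 56}, {'tbl_id': 'd3', 'term_freq': 3}]},
    {'word': 'spark', 'info': [{'tbl_id': 'd1', 'term_freq': 6}, {'tbl_id': 'd3', 'term_freq': 11}, {'tbl_id': 'd4', 'term_freq': 10}]},
    {'word': 'good', 'info': [{'tbl_id': 'd4', 'term_freq': 12}, {'tbl_id': 'd5', 'term_freq': 8}, {'tbl_id': 'd8', 'term_freq': 7}]},
    {'word': 'spark', 'info': [{'tbl_id': 'd5', 'term_freq': 6}, {'tbl_id': 'd6', 'term_freq': 11}, {'tbl_id': 'd7', 'term_freq': 10}]},
]


def group_push(documents):
    """Imitate $group with $push: collect each word's info arrays."""
    grouped = defaultdict(list)
    for doc in documents:
        grouped[doc['word']].append(doc['info'])  # builds a list of lists
    return grouped


def flatten(list_of_lists):
    """Imitate $reduce with $concatArrays: fold the arrays into one."""
    # "value" is the accumulated array, "this" is the current element,
    # matching $$value and $$this in the pipeline.
    return reduce(lambda value, this: value + this, list_of_lists, [])


grouped = group_push(docs)
result = [{'word': word, 'info': flatten(infos)} for word, infos in grouped.items()]

for doc in result:
    print(doc)
```

After the grouping step each word maps to two arrays of three entries; after flattening each word has a single array of six entries, which is the shape asked for in the question.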
