Search for combinations in JSON nested object


Problem description

I have a large JSON object. Part of it is:

data = [
{  
   'make': 'dacia',
   'model': 'x',
   'version': 'A',
   'typ': 'sedan',
   'infos': [
            {'id': 1, 'name': 'steering wheel problems'}, 
            {'id': 32, 'name': 'ABS errors'}
   ]
},
{  
   'make': 'nissan',
   'model': 'z',
   'version': 'B',
   'typ': 'coupe',
   'infos': [
         {'id': 3, 'name': 'throttle problems'}, 
         {'id': 56, 'name': 'broken handbreak'}, 
         {'id': 11, 'name': 'missing seatbelts'}
   ]
}
]

I created a list of all possible combinations of infos that can occur in my JSON (one car may have only a single info, while another may have many):

import itertools

inf = list(set(i.get('name') for d in data for i in (d['infos'] if isinstance(d['infos'], list) else [d['infos']])))
inf_comb = [combo for n in range(1, len(inf) + 1) for combo in itertools.combinations(inf, n)]
infos_combo = [list(elem) for elem in inf_comb]

Now I need to iterate over the whole JSON data and count how many times each set in infos_combo occurs, so I wrote this code:

tab = []
s = 0
for x in infos_combo:
   s = sum([1 for k in data if (([i['name'] for i in (k['infos'] if isinstance(k['infos'], list) else [k['infos']])] == x))])
   if s!= 0:
     tab.append({'infos': x, 'sum': s})
print(tab)

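The problem I'm facing is that tab returns only some of the elements I expect. Many more combinations occur in my JSON object and have to be counted, but I can't get them. How can I fix this?

Recommended answer

OK, so first you need to get all the actual infos out of your JSON data, like this: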

infos = [
    # wrap a single info dict in a list so both shapes are handled the same way
    [i["name"] for i in (d["infos"] if isinstance(d["infos"], list) else [d["infos"]])]
    for d in data
]
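
The same extraction can be written as a small helper, which may be easier to test on its own. This is only a sketch: the name `info_names` is made up for illustration, and treating a missing "infos" key as "no infos" is an assumption, not something the question states.

```python
def info_names(car):
    """Return the list of info names for one car record.

    Assumes "infos" is either a list of {"id", "name"} dicts or a single
    such dict; a missing key is treated as no infos (an assumption).
    """
    raw = car.get("infos", [])
    if isinstance(raw, dict):
        raw = [raw]  # wrap a lone dict so it is handled like a list
    return [item["name"] for item in raw]


# Hypothetical records: one with a list of infos, one with a single dict.
cars = [
    {"make": "dacia", "infos": [{"id": 1, "name": "steering wheel problems"},
                                {"id": 32, "name": "ABS errors"}]},
    {"make": "nissan", "infos": {"id": 3, "name": "throttle problems"}},
]
names_per_car = [info_names(car) for car in cars]
print(names_per_car)
```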

This gives you something like the following, which we will use later:

[['steering wheel problems', 'ABS errors'], ['throttle problems', 'broken handbreak', 'missing seatbelts']]

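Now, to get all combinations, we first need to flatten the infos array and remove duplicates:

# dict.fromkeys drops duplicates while keeping first-seen order
unique_infos = list(dict.fromkeys(x for l in infos for x in l))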
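
To see why duplicates matter at this step, here is a small check on made-up data in which two cars share a problem. Flattening alone keeps the repeated name, so `itertools.combinations` produces pairs such as the same name twice; `dict.fromkeys` is one way (among others) to drop repeats while keeping first-seen order.

```python
import itertools

# Hypothetical sample in which two cars share the same problem.
infos = [["ABS errors", "broken handbreak"], ["ABS errors"]]

flat = [x for l in infos for x in l]   # duplicates kept
unique = list(dict.fromkeys(flat))     # duplicates dropped, order preserved

pairs_flat = list(itertools.combinations(flat, 2))
pairs_unique = list(itertools.combinations(unique, 2))

print(pairs_flat)    # contains ('ABS errors', 'ABS errors')
print(pairs_unique)  # only ('ABS errors', 'broken handbreak')
```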

To get all combinations:

infos_combo = itertools.chain.from_iterable(
    itertools.combinations(unique_infos, r) for r in range(len(unique_infos) + 1)
)
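
Two properties of this expression are worth keeping in mind. `itertools.chain.from_iterable` returns a one-shot iterator, so it can be looped over only once, and the number of combinations is 2**n for n unique names (including the empty tuple), which grows quickly. The sketch below uses placeholder names to show both.

```python
import itertools

unique_infos = ["a", "b", "c"]  # placeholder names for illustration

infos_combo = itertools.chain.from_iterable(
    itertools.combinations(unique_infos, r) for r in range(len(unique_infos) + 1)
)

first_pass = list(infos_combo)   # consumes the iterator
second_pass = list(infos_combo)  # nothing left to iterate

print(len(first_pass))   # 2**3 combinations, including the empty tuple
print(len(second_pass))  # 0, because the iterator is exhausted
```

If the combinations are needed more than once, storing them in a list (as `first_pass` does) is one option, at the cost of memory.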

This produces the following:

()
('steering wheel problems',)
('ABS errors',)
('throttle problems',)
('broken handbreak',)
('missing seatbelts',)
('steering wheel problems', 'ABS errors')
('steering wheel problems', 'throttle problems')
('steering wheel problems', 'broken handbreak')
...
# output truncated (too long)
...
('steering wheel problems', 'throttle problems', 'broken handbreak', 'missing seatbelts')
('ABS errors', 'throttle problems', 'broken handbreak', 'missing seatbelts')
('steering wheel problems', 'ABS errors', 'throttle problems', 'broken handbreak', 'missing seatbelts')

After that, count how often each combination appears in the original infos list:

occurences = {}
for combo in infos_combo:
    occurences[combo] = infos.count(list(combo))

print(occurences)
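
Note that `infos.count(list(combo))` compares lists, so it only matches when a car lists its infos in the same order as the generated combination. A possible alternative, sketched below on made-up data, is to count only the combinations that actually occur and to ignore order by using `frozenset` keys with `collections.Counter`. This is a swapped-in technique rather than the answer's method, and it assumes order and repeated names within one car do not matter.

```python
from collections import Counter

# Hypothetical per-car info names; the first two cars differ only in order.
infos = [
    ["steering wheel problems", "ABS errors"],
    ["ABS errors", "steering wheel problems"],
    ["throttle problems"],
]

# frozenset makes each combination hashable and order-insensitive.
observed = Counter(frozenset(names) for names in infos)

# Same shape as the `tab` list in the question.
tab = [{"infos": sorted(combo), "sum": n} for combo, n in observed.items()]
print(tab)
```

Combinations that never occur are simply absent from `observed`, so there is no need to enumerate all 2**n candidates first.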

Full code:

import itertools

data = [
    {
        "make": "dacia",
        "model": "x",
        "version": "A",
        "typ": "sedan",
        "infos": [
            {"id": 1, "name": "steering wheel problems"},
            {"id": 32, "name": "ABS errors"},
        ],
    },
    {
        "make": "nissan",
        "model": "z",
        "version": "B",
        "typ": "coupe",
        "infos": [
            {"id": 3, "name": "throttle problems"},
            {"id": 56, "name": "broken handbreak"},
            {"id": 11, "name": "missing seatbelts"},
        ],
    },
]

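infos = [
    # wrap a single info dict in a list so both shapes are handled the same way
    [i["name"] for i in (d["infos"] if isinstance(d["infos"], list) else [d["infos"]])]
    for d in data
]

# flatten and drop duplicates, keeping first-seen order
unique_infos = list(dict.fromkeys(x for l in infos for x in l))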

infos_combo = itertools.chain.from_iterable(
    itertools.combinations(unique_infos, r) for r in range(len(unique_infos) + 1)
)

occurences = {}
for combo in infos_combo:
    occurences[combo] = infos.count(list(combo))

print(occurences)
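
The code above counts exact matches only. If the intent is instead to count how many cars contain a given combination as part of their infos (one possible reading of the question, not confirmed by it), a sketch along these lines could work. It enumerates the sub-combinations of each car's own infos rather than of all unique names; the sample data is made up.

```python
import itertools
from collections import Counter

# Hypothetical per-car info names.
infos = [
    ["steering wheel problems", "ABS errors"],
    ["ABS errors", "steering wheel problems", "throttle problems"],
]

contained = Counter()
for names in infos:
    # Sort so the same combination always yields the same tuple key.
    unique_sorted = sorted(set(names))
    for r in range(1, len(unique_sorted) + 1):
        contained.update(itertools.combinations(unique_sorted, r))

pair = ("ABS errors", "steering wheel problems")
print(contained[pair])                     # cars containing both problems
print(contained[("throttle problems",)])   # cars containing this problem
```

This still grows as 2**k per car with k infos, so it is only practical when each car has a modest number of infos.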
