Convert a dataframe to a dictionary of lists of tuples

Problem description

I have a dataframe that looks like the following:

                                       user                             item  rating
0  b80344d063b5ccb3212f76538f3d9e43d87dca9e          The Cove - Jack Johnson       1
1  b80344d063b5ccb3212f76538f3d9e43d87dca9e  Entre Dos Aguas - Paco De Lucia       2
2  b80344d063b5ccb3212f76538f3d9e43d87dca9e            Stronger - Kanye West       1
3  b80344d063b5ccb3212f76538f3d9e43d87dca9e    Constellations - Jack Johnson       1
4  b80344d063b5ccb3212f76538f3d9e43d87dca9e      Learn To Fly - Foo Fighters       1

and I want to achieve the following structure:

dict -> list of tuples
user -> (item, rating)

b80344d063b5ccb3212f76538f3d9e43d87dca9e -> list((The Cove - Jack Johnson, 1), ...)

I can do:

item_set = dict((user, set(items)) for user, items in data.groupby('user')['item'])

But that only gets me halfway. How do I get the corresponding "rating" value from the groupby?

Recommended answer

Set user as the index, convert each row to a tuple with df.apply, group by the index with groupby(level=0), collect each group into a list with agg, and convert the result to a dictionary with to_dict:

In [1417]: df
Out[1417]: 
                                       user                             item  rating
0  b80344d063b5ccb3212f76538f3d9e43d87dca9e          The Cove - Jack Johnson       1
1  b80344d063b5ccb3212f76538f3d9e43d87dca9e  Entre Dos Aguas - Paco De Lucia       2
2  b80344d063b5ccb3212f76538f3d9e43d87dca9e            Stronger - Kanye West       2
3  b80344d063b5ccb3212f76538f3d9e43d87dca9e    Constellations - Jack Johnson       2
4  b80344d063b5ccb3212f76538f3d9e43d87dca9e      Learn To Fly - Foo Fighters       2

In [1418]: (df.set_index('user').apply(tuple, axis=1)
              .groupby(level=0).agg(lambda x: list(x.values))
              .to_dict())
Out[1418]: 
{'b80344d063b5ccb3212f76538f3d9e43d87dca9e': [('The Cove - Jack Johnson', 1),
  ('Entre Dos Aguas - Paco De Lucia', 2),
  ('Stronger - Kanye West', 2),
  ('Constellations - Jack Johnson', 2),
  ('Learn To Fly - Foo Fighters', 2)]}
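
As a possible alternative to the chained call above, the same structure can be built by iterating over the groups and zipping the two columns together. This is only a sketch, not part of the original answer: the dataframe is reconstructed from the printed output, and the names user and user_items are made up for illustration.

```python
import pandas as pd

# Sample data reconstructed from the answer's printed frame (an assumption;
# the original data source is not shown).
user = "b80344d063b5ccb3212f76538f3d9e43d87dca9e"
df = pd.DataFrame({
    "user": [user] * 5,
    "item": [
        "The Cove - Jack Johnson",
        "Entre Dos Aguas - Paco De Lucia",
        "Stronger - Kanye West",
        "Constellations - Jack Johnson",
        "Learn To Fly - Foo Fighters",
    ],
    "rating": [1, 2, 2, 2, 2],
})

# Group the rows by user, then pair each item with its rating.
# user_items is a hypothetical name for the resulting dictionary.
user_items = {
    u: list(zip(group["item"], group["rating"]))
    for u, group in df.groupby("user")
}

print(user_items[user][:2])
```

This avoids the row-wise apply, which may matter on larger frames, and it keeps the rows of each user in their original order.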
