问题描述
我对 python 非常陌生,并且一直在研究我的树莓派,以启动并运行脚本以将数百万个传感器数据记录导入 sqlite.我想在事务中执行此操作以提高流程效率.我试图将交易分解为 10k 块,如下所示:Python CSV to SQLite
I'm very new to python and have been working on my raspberry pi to get a script up and running to import millions of sensor data records into sqlite. I want to do this in transactions to make the process more efficient. I am trying to break the transactions down into 10k chunks as done here: Python CSV to SQLite
到目前为止我有
import csv, sqlite3, time
def chunks(data, rows=10000):
for i in range (0, len(data), rows):
yield data[i:i+rows]
if __name__ == "__main__":
t = time.time()
con = sqlite3.connect('test.db')
cur = con.cursor()
cur.execute("DROP TABLE IF EXISTS sensor;")
cur.execute("CREATE TABLE sensor(key INT, reading REAL);")
filename = 'dummy.csv'
reader = csv.reader(open(filename,"r"))
divdata = chunks(reader)
for chunk in divdata:
cur.execute('BEGIN TRANSACTION')
for col1, col2 in chunk:
cur.execute('INSERT INTO sensor (key, reading) VALUES (?, ?)', (col1, col2))
con.execute('COMMIT')
我在 python 3.2.3 中遇到以下错误:
I'm getting the following error in python 3.2.3:
Traceback (most recent call last):
File "/home/pi/test1.py", line 20, in <module>
for chunk in divdata:
File "/home/pi/test1.py", line 4, in chunks
for i in range (0, len(data), rows):
TypeError: object of type '_csv.reader' has no len()
我显然在某个地方搞砸了块部分,因为没有块和事务,一切(基本插入)都可以正常工作.任何帮助表示赞赏.
I'm obviously messing up in the chunks part somewhere as everything (basic insertion) works fine without the chunks and transaction. Any help appreciated.
推荐答案
您的 SQL 看起来不错.不过,我确实发现您的 CSV 阅读器存在问题:它不支持您在 chunks()
中使用它的方式len()
.
Your SQL looks okay. I do see a problem with your CSV reader, though: it doesn't support len()
the way you're using it in chunks()
.
您可以使用更典型的 for row in data
循环,或者使用 本主题中描述的一种技术 如果您需要将文件分成多个块.
You could either use the more typical for row in data
loop, or use one of the techniques described in this thread if you need to break the file into chunks.
这篇关于用于 CSV 导入的 SQLite 事务的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!