MariaDB - Every batch of extended inserts takes progressively longer to execute

Problem description

I need to insert six hundred fifty million (650,000,000) records into a single table. I have tried both the extended INSERT and LOAD DATA LOCAL INFILE methods.

Extended insert results:

I split the data so that each extended INSERT contains 1,250,000 records. The problem I'm facing is that each batch takes progressively longer to execute:

Batch 1 completed in 40s
Batch 2 completed in 50s
Batch 3 completed in 60s
Batch 4 completed in 80s
Batch 5 completed in 100s

At this rate it will take 15-20 hours to complete.
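For readers unfamiliar with the term, an extended insert packs many rows into a single INSERT statement. A minimal sketch (the table and column names are placeholders, not from the question):

```sql
-- one extended-insert batch: many rows per statement
-- (the question used 1,250,000 rows per batch)
INSERT INTO big_table (id, val) VALUES
  (1, 'a'),
  (2, 'b'),
  (3, 'c');
```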

LOAD DATA LOCAL INFILE results

I have a 40 GB CSV file, which I split into 500 files and then loaded into MySQL. Here I face the same issue: every file load takes longer than the one before.
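The split-and-load workflow can be sketched in shell; the file, database, and table names below are placeholders, and a tiny generated CSV stands in for the real 40 GB file:

```shell
# Make a small stand-in CSV (the real file is 40 GB).
seq 1 100 | awk '{print $1",x"}' > data.csv

# Split it into 4 line-aligned chunks with GNU split (the question used 500).
split -n l/4 -d data.csv chunk_

# Emit one LOAD DATA statement per chunk; in real use, pipe each into `mysql mydb`.
for f in chunk_*; do
  printf "LOAD DATA LOCAL INFILE '%s' INTO TABLE big_table FIELDS TERMINATED BY ',';\n" "$f"
done
```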

Table structure for your reference:

If this is expected behavior, is there any other way to import the data quickly?

Recommended answer

Sort the data by the PRIMARY KEY. It is likely to be faster to sort the 40 GB file with the sort command than to have MySQL do it.
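A minimal sketch of pre-sorting with the sort command, assuming column 1 of the CSV is the PRIMARY KEY (a tiny demo file stands in for the real 40 GB one):

```shell
# Tiny stand-in file; the real case is the 40 GB CSV.
printf '3,c\n1,a\n2,b\n' > demo.csv

# Numeric sort on the first comma-separated field. On the real file, options
# such as -S 4G (sort buffer) and --parallel=4 can speed this up considerably.
LC_ALL=C sort -t, -k1,1n demo.csv -o demo_sorted.csv
cat demo_sorted.csv
```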

Set innodb_buffer_pool_size to about 70% of available RAM. How much RAM do you have?
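A my.cnf sketch; the 64 GB figure is an assumed example for illustration, not from the question:

```ini
[mysqld]
# roughly 70% of an assumed 64 GB of RAM
innodb_buffer_pool_size = 45G
```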

Have the PRIMARY KEY already in place on the table.

Don't have any secondary indexes or FOREIGN KEYs or triggers during the load. Add them later. Yes, it takes time.
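Deferred index creation might look like this (table, index, and column names are placeholders):

```sql
-- run only after the bulk load has finished
ALTER TABLE big_table
  ADD INDEX idx_col_a (col_a);
```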

Confirm that you will really need all those indexes.

The data could be shrunk slightly by picking something smaller than the 10-byte DECIMAL(20,6).
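For illustration: MySQL packs nine decimal digits into four bytes, so if the values fit, a narrower column such as DECIMAL(12,2) takes 6 bytes per value instead of 10 (the table and column names are placeholders):

```sql
ALTER TABLE big_table MODIFY amount DECIMAL(12,2);
```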

