Problem description
I wanted to compare reading lines of string input from stdin using Python and C++ and was shocked to see my C++ code run an order of magnitude slower than the equivalent Python code. Since my C++ is rusty and I'm not yet an expert Pythonista, please tell me if I'm doing something wrong or if I'm misunderstanding something.
(TLDR answer: include the statement cin.sync_with_stdio(false) or just use fgets instead.
TLDR results: scroll all the way down to the bottom of my question and look at the table.)
C++ code:
#include <iostream>
#include <string>
#include <time.h>
using namespace std;

int main() {
    string input_line;
    long line_count = 0;
    time_t start = time(NULL);
    int sec;
    int lps;

    // Read until the stream fails; a read that hits EOF is not counted.
    while (cin) {
        getline(cin, input_line);
        if (!cin.eof())
            line_count++;
    };

    sec = (int) time(NULL) - start;
    cerr << "Read " << line_count << " lines in " << sec << " seconds.";
    if (sec > 0) {
        lps = line_count / sec;
        cerr << " LPS: " << lps << endl;
    } else
        cerr << endl;
    return 0;
}

// Compiled with:
// g++ -O3 -o readline_test_cpp foo.cpp
Python equivalent:
#!/usr/bin/env python
import time
import sys

count = 0
start = time.time()

for line in sys.stdin:
    count += 1

delta_sec = int(time.time() - start)
if delta_sec > 0:
    lines_per_sec = int(round(count / delta_sec))
    print("Read {0} lines in {1} seconds. LPS: {2}".format(count, delta_sec,
          lines_per_sec))
else:
    # Under one second: skip the rate to avoid dividing by zero.
    print("Read {0} lines in {1} seconds.".format(count, delta_sec))
Here are my results:
$ cat test_lines | ./readline_test_cpp
Read 5570000 lines in 9 seconds. LPS: 618889
$ cat test_lines | ./readline_test.py
Read 5570000 lines in 1 seconds. LPS: 5570000
I should note that I tried this under both Mac OS X v10.6.8 (Snow Leopard) and Linux 2.6.32 (Red Hat Linux 6.2). The former is a MacBook Pro and the latter is a very beefy server, not that this is too pertinent.
$ for i in {1..5}; do echo "Test run $i at `date`"; echo -n "CPP:"; cat test_lines | ./readline_test_cpp ; echo -n "Python:"; cat test_lines | ./readline_test.py ; done
Test run 1 at Mon Feb 20 21:29:28 EST 2012
CPP: Read 5570001 lines in 9 seconds. LPS: 618889
Python:Read 5570000 lines in 1 seconds. LPS: 5570000
Test run 2 at Mon Feb 20 21:29:39 EST 2012
CPP: Read 5570001 lines in 9 seconds. LPS: 618889
Python:Read 5570000 lines in 1 seconds. LPS: 5570000
Test run 3 at Mon Feb 20 21:29:50 EST 2012
CPP: Read 5570001 lines in 9 seconds. LPS: 618889
Python:Read 5570000 lines in 1 seconds. LPS: 5570000
Test run 4 at Mon Feb 20 21:30:01 EST 2012
CPP: Read 5570001 lines in 9 seconds. LPS: 618889
Python:Read 5570000 lines in 1 seconds. LPS: 5570000
Test run 5 at Mon Feb 20 21:30:11 EST 2012
CPP: Read 5570001 lines in 10 seconds. LPS: 557000
Python:Read 5570000 lines in 1 seconds. LPS: 5570000
Tiny benchmark addendum and recap
For completeness, I thought I'd update the read speed for the same file on the same box with the original (synced) C++ code. Again, this is for a 100M line file on a fast disk. Here's the comparison, with several solutions/approaches:
Implementation | Lines per second |
---|---|
python (default) | 3,571,428 |
cin (default/naive) | 819,672 |
cin (no sync) | 12,500,000 |
fgets | 14,285,714 |
wc (not a fair comparison) | 54,644,808 |
Recommended answer
tl;dr: Because of different default settings in C++ requiring more system calls.
By default, cin is synchronized with stdio, which causes it to avoid any input buffering. If you add this to the top of your main, you should see much better performance:
std::ios_base::sync_with_stdio(false);
Normally, when an input stream is buffered, instead of reading one character at a time, the stream will be read in larger chunks. This reduces the number of system calls, which are typically relatively expensive. However, since the FILE*-based stdio and iostreams often have separate implementations and therefore separate buffers, this could lead to a problem if both were used together. For example:
int myvalue1;
cin >> myvalue1;
int myvalue2;
scanf("%d",&myvalue2);
If more input was read by cin than it actually needed, then the second integer value wouldn't be available for the scanf function, which has its own independent buffer. This would lead to unexpected results.
To avoid this, by default, streams are synchronized with stdio. One common way to achieve this is to have cin read each character one at a time as needed using stdio functions. Unfortunately, this introduces a lot of overhead. For small amounts of input, this isn't a big problem, but when you are reading millions of lines, the performance penalty is significant.
Fortunately, the library designers decided that you should also be able to disable this feature to get improved performance if you knew what you were doing, so they provided the sync_with_stdio method.