Why is reading lines from stdin much slower in C++ than Python?

Question

I wanted to compare reading lines of string input from stdin using Python and C++ and was shocked to see my C++ code run an order of magnitude slower than the equivalent Python code. Since my C++ is rusty and I'm not yet an expert Pythonista, please tell me if I'm doing something wrong or if I'm misunderstanding something.

(TLDR answer: include the statement: cin.sync_with_stdio(false) or just use fgets instead.

TLDR results: scroll all the way down to the bottom of my question and look at the table.)

C++ code:

#include <iostream>
#include <string>
#include <time.h>

using namespace std;

int main() {
    string input_line;
    long line_count = 0;
    time_t start = time(NULL);
    int sec;
    int lps;

    while (cin) {
        getline(cin, input_line);
        if (!cin.eof())
            line_count++;
    };

    sec = (int) time(NULL) - start;
    cerr << "Read " << line_count << " lines in " << sec << " seconds.";
    if (sec > 0) {
        lps = line_count / sec;
        cerr << " LPS: " << lps << endl;
    } else
        cerr << endl;
    return 0;
}

// Compiled with:
// g++ -O3 -o readline_test_cpp foo.cpp

Python equivalent:

#!/usr/bin/env python
import time
import sys

count = 0
start = time.time()

for line in  sys.stdin:
    count += 1

delta_sec = int(time.time() - start)
if delta_sec > 0:  # skip the rate when under a second to avoid dividing by zero
    lines_per_sec = int(round(count/delta_sec))
    print("Read {0} lines in {1} seconds. LPS: {2}".format(count, delta_sec,
       lines_per_sec))

Here are my results:

$ cat test_lines | ./readline_test_cpp
Read 5570000 lines in 9 seconds. LPS: 618889

$ cat test_lines | ./readline_test.py
Read 5570000 lines in 1 seconds. LPS: 5570000

I should note that I tried this under both Mac OS X v10.6.8 (Snow Leopard) and Linux 2.6.32 (Red Hat Linux 6.2). The former is a MacBook Pro and the latter is a very beefy server, not that this is too pertinent.

$ for i in {1..5}; do echo "Test run $i at `date`"; echo -n "CPP:"; cat test_lines | ./readline_test_cpp ; echo -n "Python:"; cat test_lines | ./readline_test.py ; done

Test run 1 at Mon Feb 20 21:29:28 EST 2012
CPP:   Read 5570001 lines in 9 seconds. LPS: 618889
Python:Read 5570000 lines in 1 seconds. LPS: 5570000
Test run 2 at Mon Feb 20 21:29:39 EST 2012
CPP:   Read 5570001 lines in 9 seconds. LPS: 618889
Python:Read 5570000 lines in 1 seconds. LPS: 5570000
Test run 3 at Mon Feb 20 21:29:50 EST 2012
CPP:   Read 5570001 lines in 9 seconds. LPS: 618889
Python:Read 5570000 lines in 1 seconds. LPS: 5570000
Test run 4 at Mon Feb 20 21:30:01 EST 2012
CPP:   Read 5570001 lines in 9 seconds. LPS: 618889
Python:Read 5570000 lines in 1 seconds. LPS: 5570000
Test run 5 at Mon Feb 20 21:30:11 EST 2012
CPP:   Read 5570001 lines in 10 seconds. LPS: 557000
Python:Read 5570000 lines in  1 seconds. LPS: 5570000


Tiny benchmark addendum and recap

For completeness, I thought I'd update the read speed for the same file on the same box with the original (synced) C++ code. Again, this is for a 100M line file on a fast disk. Here's the comparison, with several solutions/approaches:

Implementation              Lines per second
python (default)                   3,571,428
cin (default/naive)                  819,672
cin (no sync)                     12,500,000
fgets                             14,285,714
wc (not fair comparison)          54,644,808

Recommended answer

tl;dr: Because of different default settings in C++ requiring more system calls.

By default, cin is synchronized with stdio, which causes it to avoid any input buffering. If you add this to the top of your main, you should see much better performance:

std::ios_base::sync_with_stdio(false);

Normally, when an input stream is buffered, instead of reading one character at a time, the stream will be read in larger chunks. This reduces the number of system calls, which are typically relatively expensive. However, since the FILE* based stdio and iostreams often have separate implementations and therefore separate buffers, this could lead to a problem if both were used together. For example:

int myvalue1;
cin >> myvalue1;
int myvalue2;
scanf("%d",&myvalue2);

If more input was read by cin than it actually needed, then the second integer value wouldn't be available for the scanf function, which has its own independent buffer. This would lead to unexpected results.

To avoid this, by default, streams are synchronized with stdio. One common way to achieve this is to have cin read each character one at a time as needed using stdio functions. Unfortunately, this introduces a lot of overhead. For small amounts of input, this isn't a big problem, but when you are reading millions of lines, the performance penalty is significant.

Fortunately, the library designers decided that you should also be able to disable this feature to get improved performance if you knew what you were doing, so they provided the sync_with_stdio method.
