Calculating variance with large numbers

Question

I haven't used variance calculations much, and I don't quite know what to expect. I'm actually not very good at math.

I have an array of 1000000 random numeric values in the range 0-10000.

The array could grow even larger, so I use a 64-bit int for the sum.

I have tried to find code that calculates the variance, but I don't know whether I'm getting correct output.

The mean is 4692 and the median is 4533. I get a variance of 1483780.469308 using the following code:

// size is the element count, in this case 1000000
// value_sum is __int64

double p2 = pow( (double)(value_sum - (value_sum/size)), (double)2.0 );
double variance = sqrt( (double)(p2 / (size-1)) );

Am I getting a reasonable value?

Is there anything wrong with the calculation?

Recommended answer

Note: It doesn't look like you're calculating the variance.

Variance is calculated by subtracting the mean from every element, squaring each of these differences, and taking their weighted sum (with equal weights, that is their average).
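Written out as a formula, the definition above looks like the following. The symbols are assumptions introduced here for illustration: N stands for size, x_i for MyArray[i], and every element is given the same weight 1/N.

```latex
\mu = \frac{1}{N} \sum_{i=1}^{N} x_i ,
\qquad
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} \left( x_i - \mu \right)^2
```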

So what you need to do is:

// Get mean
double mean = static_cast<double>(value_sum)/size;

// Calculate variance
double variance = 0;
for(int i = 0;i<size;++i) 
{
  variance += (MyArray[i]-mean)*(MyArray[i]-mean)/size;
}

// Display
cout<<variance;

Note that this is the sample variance, which is used when the underlying distribution is unknown (so every element is given the same weight, 1/size).

Also, after some digging around, I found that this is not an unbiased estimator. Wolfram Alpha has something to say about this, but as an example, when MATLAB computes the variance, it returns the "bias-corrected sample variance".

The bias-corrected variance can be obtained by dividing each term by size-1 instead of size, or:

//Please check that size > 1
variance += (MyArray[i]-mean)*(MyArray[i]-mean)/(size-1); 

Also note that the value of mean remains the same.
