问题描述
我有一个定点 bignumber 库,想实现没有精度损失的快速阶乘.
I have a fixed-point bignumber library and want to implement fast factorial with no precision loss.
在纸上做了一些数学技巧后,我得到了这个公式:
After some math tricks on paper I got this formula:
(4N)!=((2N)!).((2N)!).{ (2N+1).(2N+3).(2N+5)...(4N-1) }.(2^N)/(N!)
这已经相当快了,而且通过一些编程技巧,复杂度接近~O(log(n))
.
This is already pretty fast, and with some programming tricks the complexity nears ~ O(log(n))
.
明确地说,我目前的实现是这样的:
To be clear, my current implementation is this:
//---------------------------------------------------------------------------
longnum fact(const DWORD &x,longnum &h) // h return (x>>1)! to speed up computation
{
if (x==0) { h=1; return 1; }
if (x==1) { h=1; return 1; }
if (x==2) { h=1; return 2; }
if (x==3) { h=1; return 6; }
if (x==4) { h=2; return 24; }
int N4,N2,N,i; longnum c,q;
N=(x>>2);
N2=N<<1;
N4=N<<2;
h=fact(N2,q); // get 2N! and N!
c=h*h; for (i=(N2+1)|1;i<=N4;i+=2) c*=i; c/=q; // c= ((2N!)^2)*T1 / N!
for (i=N4+1;i<=x;i++) c*=i; c.round(); c<<=N ; // convert 4N! -> x!, cut off precision losses
for (i=(N2+1)|1,N2=x>>1;i<=N2;i++) h*=i; h.round(); // convert 2N! -> (x/2)!, cut off precision losses
return c;
}
//---------------------------------------------------------------------------
longnum fact(const DWORD &x)
{
longnum tmp;
return fact(x,tmp);
}
//---------------------------------------------------------------------------
现在我的问题:
是否有一种快速获取
N!
从此术语:T1 = { (2N+1).(2N+3).(2N+5)...(4N-1) }
?
已经回答.
为了清楚起见,我需要提取这个未知术语:
So to be clear, I need to extract this unknown term:
T2 = (4N)! / (((2N)!).((2N)!))
所以:
(4N)! = (((2N)!).((2N)!)).T2
这会很有帮助,因为这样就不需要为阶乘计算 .../(N!)
.
This would help a lot because then it would not be needed to compute .../(N!)
for factorial.
T1
项总是整数分解为:
T1 = T2 * N!
最后,它打动了我 :) 我做了一个小程序来分解阶乘的素数,然后突然一切都变得清晰了:
Finally, it hit me :) I have done a little program for primes decomposition of factorials and then suddenly all becomes much clearer:
4! = 2!.2!.(2^1).(3^1) = 24
8! = 4!.4!.(2^1).(5^1).(7^1) = 40320
12! = 6!.6!.(2^2).(3^1).(7^1).(11^1) = 479001600
16! = 8!.8!.(2^1).(3^2).(5^1).(11^1).(13^1) = 20922789888000
20! = 10!.10!.(2^2).(11^1).(13^1).(17^1).(19^1) = 2432902008176640000
24! = 12!.12!.(2^2).(7^1).(13^1).(17^1).(19^1).(23^1) = 620448401733239439360000
28! = 14!.14!.(2^3).(3^3).(5^2).(17^1).(19^1).(23^1) = 304888344611713860501504000000
32! = 16!.16!.(2^1).(3^2).(5^1).(17^1).(19^1).(23^1).(29^1).(31^1) = 263130836933693530167218012160000000
36! = 18!.18!.(2^2).(3^1).(5^2).(7^1).(11^1).(19^1).(23^1).(29^1).(31^1) = 371993326789901217467999448150835200000000
40! = 20!.20!.(2^2).(3^2).(5^1).(7^1).(11^1).(13^1).(23^1).(29^1).(31^1).(37^1) = 815915283247897734345611269596115894272000000000
在分析了 T2
项(半阶乘之后的其余部分 ^ 2)的素指数之后,我推导出它们的公式:
After analyzing the prime exponents of the T2
term (the rest after half factorials ^ 2) I derive the formula for them:
T2(4N) = multiplication(i=2,3,5,7,11,13,17,...) of ( i ^ sum(j=1,2,3,4,5,...) of (4N/(i^j))-(2N/(i^j)) )
- 乘法是通过所有
质数<= 4N
- 求和是直到
i^j <= 4N
我如何计算:
exponent(i) = sum(j=1,2,3,4,5,...) of (N/(i^j))
有效吗?How can I compute this:
exponent(i) = sum(j=1,2,3,4,5,...) of (N/(i^j))
effectively?
问题是4N/(i^j)
和2N/(i^j)
的除法必须在整数数学 所以它们不能轻易简化.
The problem is that the divisions 4N/(i^j)
and 2N/(i^j)
must be done in integer math so they cannot be simplified easily.
所以我还有一个问题:
i
是任何质数,其中 i<=N
.应该很容易.
i
is any prime where i<=N
. It should be easy.
现在我像这样计算 T2(N)
项中素数 i
的指数 e
(但这对我来说太复杂了味道):
Now I calculate the exponent e
for prime i
inside the T2(N)
term like this (but this is too complex for my taste):
for (e=0,a=N/i,b=(N>>1)/i;(a)||(b);e+=a-b-b,a/=i,b/=i);
...我将尝试将 T2
实现到 fact(x)
并比较速度...
... I will try implement T2
into fact(x)
and compare speeds ...
推荐答案
我有一个解决方案:
(4N!)=((2N!)^2) . mul(i=all primes<=4N) of [i^sum(j=1,2,3,4,5,...4N>=i^j) of [(4N/(i^j))%2]]
T2
的子项总是 prime^exponent
其中指数可以在这样的小整数上计算:
sub-terms of T2
are always prime^exponent
where exponent can be computed on small integers like this:
for (e=0,j=N4;j;e+=j&1,j/=p);
其中 e
是指数,p
是质数,N4
是 4*N
where e
is exponent, p
is prime and N4
is 4*N
新方程的代码:
// edit beg:
// Sorry, forget to copy sorted list of all primes up to max n here it is
// end of table is marked with 0
// Primes are in DWORDs so they only 4Byte per number
// so the table is very small compared with lookup table for the same max n!
// and also primes are needed for many other routines in bignum
// can compute n! for n <= max prime in table
DWORD _arithmetics_primes[]={2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97,101,103,107,109,113,127,131,137,139,149,151,157,163,167,173,0};
// edit end.
longnum fact(const DWORD &x)
{
if (x<=4)
{
if (x==4) return 24;
if (x==3) return 6;
if (x==2) return 2;
if (x==1) return 1;
if (x==0) return 1;
}
int N4,N2,p,i,j,e; longnum c,pp;
N4=(x>>2)<<2;
N2=N4>>1;
c=fact(N2); c*=c; // c=((2N)!)^2;
for (i=0;;i++) // c*= T2
{
p=_arithmetics_primes[i];
if (!p) break;
if (p>N4) break;
for (e=0,j=N4;j;e+=j&1,j/=p);
if (e) // c*=p^e
{
if (p==2) c<<=e;
else for (pp=p;;)
{
if (int(e&1)) c*=pp;
e>>=1; if (!e) break;
pp*=pp;
}
}
}
for (i=N4+1;i<=x;i++) { c*=i; } c.round();
return c;
}
这里是前 128 个阶乘的粗略时间测量,以便您可以估计实际复杂性.
Here are rough time measurements for the first 128 factorials so you can estimate real complexity.
Fixed point 768.128 bits arithmetics ... 231.36 decimals.
[ 0.001 ms ] 1! = 1
[ 0.000 ms ] 2! = 2
[ 0.000 ms ] 3! = 6
[ 0.000 ms ] 4! = 24
[ 0.006 ms ] 5! = 120
[ 0.006 ms ] 6! = 720
[ 0.007 ms ] 7! = 5040
[ 0.005 ms ] 8! = 40320
[ 0.006 ms ] 9! = 362880
[ 0.007 ms ] 10! = 3628800
[ 0.008 ms ] 11! = 39916800
[ 0.012 ms ] 12! = 479001600
[ 0.013 ms ] 13! = 6227020800
[ 0.014 ms ] 14! = 87178291200
[ 0.016 ms ] 15! = 1307674368000
[ 0.014 ms ] 16! = 20922789888000
[ 0.015 ms ] 17! = 355687428096000
[ 0.017 ms ] 18! = 6402373705728000
[ 0.019 ms ] 19! = 121645100408832000
[ 0.016 ms ] 20! = 2432902008176640000
[ 0.017 ms ] 21! = 51090942171709440000
[ 0.019 ms ] 22! = 1124000727777607680000
[ 0.021 ms ] 23! = 25852016738884976640000
[ 0.023 ms ] 24! = 620448401733239439360000
[ 0.025 ms ] 25! = 15511210043330985984000000
[ 0.027 ms ] 26! = 403291461126605635584000000
[ 0.029 ms ] 27! = 10888869450418352160768000000
[ 0.032 ms ] 28! = 304888344611713860501504000000
[ 0.034 ms ] 29! = 8841761993739701954543616000000
[ 0.037 ms ] 30! = 265252859812191058636308480000000
[ 0.039 ms ] 31! = 8222838654177922817725562880000000
[ 0.034 ms ] 32! = 263130836933693530167218012160000000
[ 0.037 ms ] 33! = 8683317618811886495518194401280000000
[ 0.039 ms ] 34! = 295232799039604140847618609643520000000
[ 0.041 ms ] 35! = 10333147966386144929666651337523200000000
[ 0.039 ms ] 36! = 371993326789901217467999448150835200000000
[ 0.041 ms ] 37! = 13763753091226345046315979581580902400000000
[ 0.044 ms ] 38! = 523022617466601111760007224100074291200000000
[ 0.046 ms ] 39! = 20397882081197443358640281739902897356800000000
[ 0.041 ms ] 40! = 815915283247897734345611269596115894272000000000
[ 0.044 ms ] 41! = 33452526613163807108170062053440751665152000000000
[ 0.046 ms ] 42! = 1405006117752879898543142606244511569936384000000000
[ 0.049 ms ] 43! = 60415263063373835637355132068513997507264512000000000
[ 0.048 ms ] 44! = 2658271574788448768043625811014615890319638528000000000
[ 0.050 ms ] 45! = 119622220865480194561963161495657715064383733760000000000
[ 0.054 ms ] 46! = 5502622159812088949850305428800254892961651752960000000000
[ 0.056 ms ] 47! = 258623241511168180642964355153611979969197632389120000000000
[ 0.056 ms ] 48! = 12413915592536072670862289047373375038521486354677760000000000
[ 0.060 ms ] 49! = 608281864034267560872252163321295376887552831379210240000000000
[ 0.063 ms ] 50! = 30414093201713378043612608166064768844377641568960512000000000000
[ 0.066 ms ] 51! = 1551118753287382280224243016469303211063259720016986112000000000000
[ 0.065 ms ] 52! = 80658175170943878571660636856403766975289505440883277824000000000000
[ 0.069 ms ] 53! = 4274883284060025564298013753389399649690343788366813724672000000000000
[ 0.072 ms ] 54! = 230843697339241380472092742683027581083278564571807941132288000000000000
[ 0.076 ms ] 55! = 12696403353658275925965100847566516959580321051449436762275840000000000000
[ 0.077 ms ] 56! = 710998587804863451854045647463724949736497978881168458687447040000000000000
[ 0.162 ms ] 57! = 40526919504877216755680601905432322134980384796226602145184481280000000000000
[ 0.095 ms ] 58! = 2350561331282878571829474910515074683828862318181142924420699914240000000000000
[ 0.093 ms ] 59! = 138683118545689835737939019720389406345902876772687432540821294940160000000000000
[ 0.089 ms ] 60! = 8320987112741390144276341183223364380754172606361245952449277696409600000000000000
[ 0.093 ms ] 61! = 507580213877224798800856812176625227226004528988036003099405939480985600000000000000
[ 0.098 ms ] 62! = 31469973260387937525653122354950764088012280797258232192163168247821107200000000000000
[ 0.096 ms ] 63! = 1982608315404440064116146708361898137544773690227268628106279599612729753600000000000000
[ 0.090 ms ] 64! = 126886932185884164103433389335161480802865516174545192198801894375214704230400000000000000
[ 0.100 ms ] 65! = 8247650592082470666723170306785496252186258551345437492922123134388955774976000000000000000
[ 0.104 ms ] 66! = 544344939077443064003729240247842752644293064388798874532860126869671081148416000000000000000
[ 0.111 ms ] 67! = 36471110918188685288249859096605464427167635314049524593701628500267962436943872000000000000000
[ 0.100 ms ] 68! = 2480035542436830599600990418569171581047399201355367672371710738018221445712183296000000000000000
[ 0.121 ms ] 69! = 171122452428141311372468338881272839092270544893520369393648040923257279754140647424000000000000000
[ 0.109 ms ] 70! = 11978571669969891796072783721689098736458938142546425857555362864628009582789845319680000000000000000
[ 0.119 ms ] 71! = 850478588567862317521167644239926010288584608120796235886430763388588680378079017697280000000000000000
[ 0.104 ms ] 72! = 61234458376886086861524070385274672740778091784697328983823014963978384987221689274204160000000000000000
[ 0.124 ms ] 73! = 4470115461512684340891257138125051110076800700282905015819080092370422104067183317016903680000000000000000
[ 0.113 ms ] 74! = 330788544151938641225953028221253782145683251820934971170611926835411235700971565459250872320000000000000000
[ 0.118 ms ] 75! = 24809140811395398091946477116594033660926243886570122837795894512655842677572867409443815424000000000000000000
[ 0.118 ms ] 76! = 1885494701666050254987932260861146558230394535379329335672487982961844043495537923117729972224000000000000000000
[ 0.123 ms ] 77! = 145183092028285869634070784086308284983740379224208358846781574688061991349156420080065207861248000000000000000000
[ 0.129 ms ] 78! = 11324281178206297831457521158732046228731749579488251990048962825668835325234200766245086213177344000000000000000000
[ 0.133 ms ] 79! = 894618213078297528685144171539831652069808216779571907213868063227837990693501860533361810841010176000000000000000000
[ 0.121 ms ] 80! = 71569457046263802294811533723186532165584657342365752577109445058227039255480148842668944867280814080000000000000000000
[ 0.119 ms ] 81! = 5797126020747367985879734231578109105412357244731625958745865049716390179693892056256184534249745940480000000000000000000
[ 0.131 ms ] 82! = 475364333701284174842138206989404946643813294067993328617160934076743994734899148613007131808479167119360000000000000000000
[ 0.150 ms ] 83! = 39455239697206586511897471180120610571436503407643446275224357528369751562996629334879591940103770870906880000000000000000000
[ 0.141 ms ] 84! = 3314240134565353266999387579130131288000666286242049487118846032383059131291716864129885722968716753156177920000000000000000000
[ 0.148 ms ] 85! = 281710411438055027694947944226061159480056634330574206405101912752560026159795933451040286452340924018275123200000000000000000000
[ 0.154 ms ] 86! = 24227095383672732381765523203441259715284870552429381750838764496720162249742450276789464634901319465571660595200000000000000000000
[ 0.163 ms ] 87! = 2107757298379527717213600518699389595229783738061356212322972511214654115727593174080683423236414793504734471782400000000000000000000
[ 0.211 ms ] 88! = 185482642257398439114796845645546284380220968949399346684421580986889562184028199319100141244804501828416633516851200000000000000000000
[ 0.151 ms ] 89! = 16507955160908461081216919262453619309839666236496541854913520707833171034378509739399912570787600662729080382999756800000000000000000000
[ 0.157 ms ] 90! = 1485715964481761497309522733620825737885569961284688766942216863704985393094065876545992131370884059645617234469978112000000000000000000000
[ 0.166 ms ] 91! = 135200152767840296255166568759495142147586866476906677791741734597153670771559994765685283954750449427751168336768008192000000000000000000000
[ 0.161 ms ] 92! = 12438414054641307255475324325873553077577991715875414356840239582938137710983519518443046123837041347353107486982656753664000000000000000000000
[ 0.169 ms ] 93! = 1156772507081641574759205162306240436214753229576413535186142281213246807121467315215203289516844845303838996289387078090752000000000000000000000
[ 0.173 ms ] 94! = 108736615665674308027365285256786601004186803580182872307497374434045199869417927630229109214583415458560865651202385340530688000000000000000000000
[ 0.188 ms ] 95! = 10329978488239059262599702099394727095397746340117372869212250571234293987594703124871765375385424468563282236864226607350415360000000000000000000000
[ 0.181 ms ] 96! = 991677934870949689209571401541893801158183648651267795444376054838492222809091499987689476037000748982075094738965754305639874560000000000000000000000
[ 0.187 ms ] 97! = 96192759682482119853328425949563698712343813919172976158104477319333745612481875498805879175589072651261284189679678167647067832320000000000000000000000
[ 0.194 ms ] 98! = 9426890448883247745626185743057242473809693764078951663494238777294707070023223798882976159207729119823605850588608460429412647567360000000000000000000000
[ 0.201 ms ] 99! = 933262154439441526816992388562667004907159682643816214685929638952175999932299156089414639761565182862536979208272237582511852109168640000000000000000000000
[ 0.185 ms ] 100! = 93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000
[ 0.191 ms ] 101! = 9425947759838359420851623124482936749562312794702543768327889353416977599316221476503087861591808346911623490003549599583369706302603264000000000000000000000000
[ 0.202 ms ] 102! = 961446671503512660926865558697259548455355905059659464369444714048531715130254590603314961882364451384985595980362059157503710042865532928000000000000000000000000
[ 0.207 ms ] 103! = 99029007164861804075467152545817733490901658221144924830052805546998766658416222832141441073883538492653516385977292093222882134415149891584000000000000000000000000
[ 0.242 ms ] 104! = 10299016745145627623848583864765044283053772454999072182325491776887871732475287174542709871683888003235965704141638377695179741979175588724736000000000000000000000000
[ 0.210 ms ] 105! = 1081396758240290900504101305800329649720646107774902579144176636573226531909905153326984536526808240339776398934872029657993872907813436816097280000000000000000000000000
[ 0.215 ms ] 106! = 114628056373470835453434738414834942870388487424139673389282723476762012382449946252660360871841673476016298287096435143747350528228224302506311680000000000000000000000000
[ 0.221 ms ] 107! = 12265202031961379393517517010387338887131568154382945052653251412013535324922144249034658613287059061933743916719318560380966506520420000368175349760000000000000000000000000
[ 0.217 ms ] 108! = 1324641819451828974499891837121832599810209360673358065686551152497461815091591578895743130235002378688844343005686404521144382704205360039762937774080000000000000000000000000
[ 0.226 ms ] 109! = 144385958320249358220488210246279753379312820313396029159834075622223337844983482099636001195615259277084033387619818092804737714758384244334160217374720000000000000000000000000
[ 0.232 ms ] 110! = 15882455415227429404253703127090772871724410234473563207581748318444567162948183030959960131517678520479243672638179990208521148623422266876757623911219200000000000000000000000000
[ 0.240 ms ] 111! = 1762952551090244663872161047107075788761409536026565516041574063347346955087248316436555574598462315773196047662837978913145847497199871623320096254145331200000000000000000000000000
[ 0.213 ms ] 112! = 197450685722107402353682037275992488341277868034975337796656295094902858969771811440894224355027779366597957338237853638272334919686385621811850780464277094400000000000000000000000000
[ 0.231 ms ] 113! = 22311927486598136465966070212187151182564399087952213171022161345724023063584214692821047352118139068425569179220877461124773845924561575264739138192463311667200000000000000000000000000
[ 0.240 ms ] 114! = 2543559733472187557120132004189335234812341496026552301496526393412538629248600474981599398141467853800514886431180030568224218435400019580180261753940817530060800000000000000000000000000
[ 0.252 ms ] 115! = 292509369349301569068815180481773552003419272043053514672100535242441942363589054622883930786268803187059211939585703515345785120071002251720730101703194015956992000000000000000000000000000
[ 0.248 ms ] 116! = 33931086844518982011982560935885732032396635556994207701963662088123265314176330336254535971207181169698868584991941607780111073928236261199604691797570505851011072000000000000000000000000000
[ 0.598 ms ] 117! = 3969937160808720895401959629498630647790406360168322301129748464310422041758630649341780708631240196854767624444057168110272995649603642560353748940315749184568295424000000000000000000000000000
[ 0.259 ms ] 118! = 468452584975429065657431236280838416439267950499862031533310318788629800927518416622330123618486343228862579684398745837012213486653229822121742374957258403779058860032000000000000000000000000000
[ 0.261 ms ] 119! = 55745857612076058813234317117419771556272886109483581752463927935846946310374691578057284710599874844234646982443450754604453404911734348832487342619913750049708004343808000000000000000000000000000
[ 0.254 ms ] 120! = 6689502913449127057588118054090372586752746333138029810295671352301633557244962989366874165271984981308157637893214090552534408589408121859898481114389650005964960521256960000000000000000000000000000
[ 0.263 ms ] 121! = 809429852527344373968162284544935082997082306309701607045776233628497660426640521713391773997910182738287074185078904956856663439318382745047716214841147650721760223072092160000000000000000000000000000
[ 0.270 ms ] 122! = 98750442008336013624115798714482080125644041369783596059584700502676714572050143649033796427745042294071023050579626404736512939596842694895821378210620013388054747214795243520000000000000000000000000000
[ 0.281 ms ] 123! = 12146304367025329675766243241881295855454217088483382315328918161829235892362167668831156960612640202170735835221294047782591091570411651472186029519906261646730733907419814952960000000000000000000000000000
[ 0.290 ms ] 124! = 1506141741511140879795014161993280686076322918971939407100785852066825250652908790935063463115967385069171243567440461925041295354731044782551067660468376444194611004520057054167040000000000000000000000000000
[ 0.322 ms ] 125! = 188267717688892609974376770249160085759540364871492425887598231508353156331613598866882932889495923133646405445930057740630161919341380597818883457558547055524326375565007131770880000000000000000000000000000000
[ 0.303 ms ] 126! = 23721732428800468856771473051394170805702085973808045661837377170052497697783313457227249544076486314839447086187187275319400401837013955325179315652376928996065123321190898603130880000000000000000000000000000000
[ 0.313 ms ] 127! = 3012660018457659544809977077527059692324164918673621799053346900596667207618480809067860692097713761984609779945772783965563851033300772326297773087851869982500270661791244122597621760000000000000000000000000000000
[ 0.307 ms ] 128! = 385620482362580421735677065923463640617493109590223590278828403276373402575165543560686168588507361534030051833058916347592172932262498857766114955245039357760034644709279247692495585280000000000000000000000000000000
refernce 128! = 385620482362580421735677065923463640617493109590223590278828403276373402575165543560686168588507361534030051833058916347592172932262498857766114955245039357760034644709279247692495585280000000000000000000000000000000
我的测量表明 N!
使用
- 最大
2.2N
个快速低级长操作(+,-,<<<,>
) - 比
N/2
次长的乘法略少,但大多大小方便,可以加快乘法,所以测得的次数与明显的O(N/2*O(乘法))
.使用时间分析后,观察到的复杂度最匹配O(N.log^2(N))
但错误仍然太大,所以它很可能是一些稍微奇怪的函数......
- max of
2.2N
fast low level long operations (+,-,<<,>>
) - slightly less than
N/2
long multiplications, but most of them are convenient in size which speeds up the multiplication, so the measured times do not match the obviousO(N/2*O(multiplication))
. After using analysis on the times the observed complexity best matchesO(N.log^2(N))
but still the error is too big so it will be most likely some slightly weirder function...
我也尝试过阶乘作为素数的非递归乘法(类似于 T2
项),但结果要慢得多.
Also I have tried factorial as non recursive multiplication of primes only (similar to T2
term), but the results was much slower.
PS: 问题中发布的代码也 100%
工作,但比新代码慢(即使它使用较少的乘法 - 因为递归需要更多内存并没有优化乘法顺序).
P.S.: Code posted in the question is also 100%
working, but slower than new one (even if it uses fewer multiplications - because of more memory needed for recursion and not optimized multiplicants order).
这篇关于快速精确 bigint 阶乘的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持编程学习网!