Dear Rex
You must have a really fast computer.
When I tried your code using global index variables it took 84s.
Changing to local variables, except for the big arrays, sped it up
by a factor of 10!
This is apparently because BB allocates registers, when available,
for local variables, BUT NOT for global ones.
You should be careful with the use of global variables as they are meant to
represent the state of a module and not much more.
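A minimal sketch of the change described above, assuming the only difference from the posted TestTime module is where the variables are declared. The module name TestTimeLocal is hypothetical. The big arrays stay global; the loop indices, the sum and the timer become locals of Do, which is what appears to let the compiler keep them in registers:

```
MODULE TestTimeLocal;  (* hypothetical name, otherwise the posted TestTime *)
IMPORT StdLog, Services;

VAR
        a, b: ARRAY 1000000 OF REAL;  (* the big arrays stay global *)

PROCEDURE Do*;
        VAR
                i, j: INTEGER;  (* local loop indices *)
                sum: REAL;
                t0: LONGINT;
BEGIN
        t0 := Services.Ticks();
        sum := 0;
        FOR j := 0 TO 199 DO
                FOR i := 0 TO 999999 DO
                        a[i] := i - j;
                        b[i] := i - j
                END;
                FOR i := 0 TO 999999 DO
                        sum := sum + a[i] * b[i]
                END
        END;
        StdLog.Int(Services.Ticks() - t0);
        StdLog.Real(sum); StdLog.Ln
END Do;

END TestTimeLocal.
```

The timings will of course depend on the machine; the factor of 10 is only what I measured here.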
Unfortunately the BB compiler does not do many optimisations. It does
fold constant expressions and, as I have now discovered, uses registers
for local variables. But, for example, it doesn't reuse the results of
identical operations even within a short distance, e.g. your i-j.
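Since the compiler does not seem to reuse the repeated i-j, the reuse can be written out by hand. The following is only a sketch under that assumption: the module name TestTimeReuse and the temporary d are hypothetical, and I make no claim about how much time it saves.

```
MODULE TestTimeReuse;  (* hypothetical name *)
IMPORT StdLog, Services;

VAR
        a, b: ARRAY 1000000 OF REAL;

PROCEDURE Do*;
        VAR
                i, j, d: INTEGER;  (* d holds the shared result of i - j *)
                sum: REAL;
                t0: LONGINT;
BEGIN
        t0 := Services.Ticks();
        sum := 0;
        FOR j := 0 TO 199 DO
                FOR i := 0 TO 999999 DO
                        d := i - j;  (* computed once, used twice *)
                        a[i] := d;
                        b[i] := d
                END;
                FOR i := 0 TO 999999 DO
                        sum := sum + a[i] * b[i]
                END
        END;
        StdLog.Int(Services.Ticks() - t0);
        StdLog.Real(sum); StdLog.Ln
END Do;

END TestTimeReuse.
```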
By the way, I have started looking into the code of the compiler and have
added some small syntactic features.
I had even considered making it a personal project to integrate an
advanced intermediate program representation such as Static Single
Assignment, in order to add more optimisations step by step.
But the thing is that as soon as memory becomes the bottleneck, there is
no use in compiling better.
And if the CPU only has to wait for memory, then things are still OK;
mostly it waits for the IO anyway!
That is why I'm working on a framework to improve development speed,
which is much more crucial than code performance, which anyway depends
90% on the programmer and not on the compiler.
Marco (mc)
-----Original Message-----
From: blackbox{([at]})nowhere.xy On Behalf Of Rex Couture
Sent: Friday, November 03, 2006 22:05
To: BlackBox Mailing List
Subject: [BlackBox] - Speed test. Clarification
That ought to teach me not to post partial code. The complete
programs are posted below (I added timers).
I made the arrays big, (1) so I didn't have to bother with timers in
three compilers, and (2) to make sure they wouldn't fit in the cache. For
comparing compilers, I wanted to avoid erratic results due to the cache.
I like Wojtek's program, though. Caching makes almost a threefold
improvement in speed on my computer, so yes, memory is the bottleneck.
I played a little more with unrolling the loop. Unrolling 5 times
gave an 18% improvement with an array length of 100, but it was 2% worse
with a length of 1,000,000. Your mileage may vary.
There is one other inconvenient truth, however. With the FORTRAN
program, with an array size of around 15,000 or lower (number of arithmetic
operations constant), so everything fit in the cache, the operation was
astonishingly fast -- 2 1/2 seconds, compared to 10 seconds at best for BB.
But again, that's an unfair comparison, because with range checking, FORTRAN
takes 10 minutes 30 seconds!
Rex Couture
=======
MODULE TestTime;
IMPORT StdLog, Services;
VAR
        a, b: ARRAY 1000000 OF REAL;
        i, j: INTEGER;
        sum: REAL;
        t0: LONGINT;
PROCEDURE Do*;
BEGIN
        t0:= Services.Ticks();  (* start the timer *)
        sum:= 0;
        FOR j:= 0 TO 199 DO  (* repeat the whole test 200 times *)
                FOR i:= 0 TO 999999 DO  (* fill both arrays *)
                        a[i]:= i-j;
                        b[i]:= i-j;
                END;
                FOR i:= 0 TO 999999 DO  (* accumulate the dot product *)
                        sum:= sum +a[i]*b[i];
                END;
        END;
        StdLog.Int((Services.Ticks() - t0));  (* elapsed ticks *)
        StdLog.Real( sum); StdLog.Ln
END Do;
END TestTime.
========
MODULE TestTime2; (* Unrolls loop *)
IMPORT StdLog, Services;
VAR
        a, b: ARRAY 1000000 OF REAL;
        i, j: INTEGER;
        sum: REAL;
        t0: LONGINT;
PROCEDURE Do*;
BEGIN
        t0:= Services.Ticks();  (* start the timer *)
        sum:= 0;
        FOR j:= 0 TO 199 DO
                (* each element gets its own index minus j, as in TestTime *)
                FOR i:= 0 TO 999999 BY 5 DO
                        a[i]:= i-j;
                        b[i]:= i-j;
                        a[i+1]:= i+1-j;
                        b[i+1]:= i+1-j;
                        a[i+2]:= i+2-j;
                        b[i+2]:= i+2-j;
                        a[i+3]:= i+3-j;
                        b[i+3]:= i+3-j;
                        a[i+4]:= i+4-j;
                        b[i+4]:= i+4-j;
                END;
                FOR i:= 0 TO 999999 BY 5 DO  (* dot product, five terms per pass *)
                        sum:= sum +a[i]*b[i];
                        sum:= sum +a[i+1]*b[i+1];
                        sum:= sum +a[i+2]*b[i+2];
                        sum:= sum +a[i+3]*b[i+3];
                        sum:= sum +a[i+4]*b[i+4];
                END;
        END;
        StdLog.Int((Services.Ticks() - t0));  (* elapsed ticks *)
        StdLog.Real( sum); StdLog.Ln
END Do;
END TestTime2.
--- BlackBox
--- send subject HELP or UNSUBSCRIBE to blackbox{([at]})nowhere.xy
Received on Sat Nov 04 2006 - 00:21:48 UTC