Re: CAPS again

From: René A. Krywult
Date: Thu, 16 Sep 2004 16:10:58 +0200 (DFT)

> Hi,

>> The idea of "Pascal-like languages" is: Don't do at runtime, what
>> you can already do at compile-time. Type-safety-checks can ALWAYS
>> be done at runtime.

>
>you are quite right - I was too loose with my language. What I really
>mean is that the specification of CAP should not lead to an undefined
>result, because it forces the programmer to include run-time checks of
>the input. So the efficiency of the CAP implementation is buried under
>the inefficiency of the range checks.

But that's the fun part about it! There is no "undefined result". It is very much defined. It is not necessarily a USEFUL result (at least when you want to print it out on screen or printer), but it is defined.

If you are in a situation where only printable characters are allowed, then you have to do the type checking or use Strings.Upper. If that's none of your concern (and there are applications where this doesn't matter), then CAP will do the trick.
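To make "defined, but not necessarily useful" concrete, here is a small sketch in Python (not Component Pascal). It assumes that CAP simply clears one bit of the character code; the names cap_unchecked and cap_checked are made up for this illustration and are not part of BlackBox.

```python
CASE_BIT = 0x20  # the one bit in which "a" (0x61) differs from "A" (0x41)

def cap_unchecked(ch: str) -> str:
    """Clear the case bit without looking at what the character is."""
    return chr(ord(ch) & ~CASE_BIT)

def cap_checked(ch: str) -> str:
    """Same operation, but guarded by a range check for 'a'..'z'."""
    if "a" <= ch <= "z":
        return cap_unchecked(ch)
    return ch

print(cap_unchecked("a"))  # A
print(cap_unchecked("{"))  # [  (defined, but hardly an "upper case" brace)
print(cap_checked("{"))    # {
```

The unchecked version always produces the same, predictable character for a given input. Whether that character is what you want depends on the application.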

>Every possible CHAR maps to an upper case equivalent, even if it is a
>map back to itself. So, a->A, b->B, space->space, @->@ etc. I see no
>reason why this should not be so, and no reason why it should be
>inefficient to implement.

So maybe you should do some studies in Assembler then. The explanation why CAP is so incredibly fast (compared to Strings.Upper) has already been given by Chris Burrows, IIRC.

But since it seems you missed his excellent explanation, here it is again, in brief (it's been a long time since I dabbled in ASM, so forgive me if something in the following is inexact):

In the Latin-1 character set, "a" differs from "A" by 1 bit that is set in the case of "a", and not set in the case of "A".

a = 1100001
A = 1000001

b = 1100010
B = 1000010

As you see, the upper case letter differs from the lower case letter only in the "second bit from the left", where the LC has the bit set, whereas the UC has not.

So, if you want to shift from LC to UC, all you have to do is clear the "second bit from the left". No translation table is involved! It's just ONE machine instruction: a single bitwise AND operation with a constant mask in which that bit is "clear" (0)!

And AND is a very simple instruction, machine internally.
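As a rough sketch of that single operation, here it is in Python, applied to the 7-bit patterns listed above. The names MASK and clear_case_bit are invented for the example.

```python
MASK = 0b1011111  # constant with only the "second bit from the left" cleared

def clear_case_bit(code: int) -> int:
    """The whole conversion: one bitwise AND with a constant."""
    return code & MASK

for lower in "ab":
    upper_code = clear_case_bit(ord(lower))
    print(f"{lower} = {ord(lower):07b}  ->  {chr(upper_code)} = {upper_code:07b}")
```

Applying the mask to a letter that is already upper case changes nothing, because the bit is already clear.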

Now, this was easy, I think.

So what is done in case of the translation table you propose?

Somewhere in memory, you need a table. That table has lower case letters in column 1 and upper case letters in column 2. You have to read the table, and if you have indexed it cleverly, this is just one instruction and not a LOOP UNTIL lower case letter is found END. So you move the contents of the indexed row of your table to a register (1st instruction), then you move the contents of the register to the variable that should go uppercase (2nd instruction). And, ta-da, you need TWO instructions instead of one.

Further, MOV is not a simple instruction: its encoding is probably the most complex in the instruction set.

More complex means longer execution time!

So you could say that each single instruction of the "table version" (of which there are two) is slower than the one instruction the AND takes! So the table version actually needs MORE than double the time.
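For comparison, the "cleverly indexed" table can be sketched like this in Python. It is only an illustration of the logic, not of instruction counts. It assumes 8-bit character codes and, to keep it short, maps only a..z (Latin-1 accented letters are left unchanged). UPPER_TABLE and upper_via_table are made-up names.

```python
# Built once: index = character code, value = code of the upper case equivalent.
UPPER_TABLE = list(range(256))            # by default every code maps to itself
for code in range(ord("a"), ord("z") + 1):
    UPPER_TABLE[code] = code - 0x20       # a..z map to A..Z

def upper_via_table(ch: str) -> str:
    """Look the character up by its code; no searching loop is needed."""
    code = UPPER_TABLE[ord(ch)]  # read the indexed table entry
    return chr(code)             # hand the result back to the caller

print(upper_via_table("a"))  # A
print(upper_via_table("{"))  # {
```

Here every character maps to something sensible, at the price of a table in memory and a memory read per character.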

Does that help?

Rene






Received on Thu Sep 16 2004 - 16:10:58 UTC

This archive was generated by hypermail 2.3.0 : Thu Sep 26 2013 - 06:28:36 UTC