Why program in Component Pascal?

  • As simple as possible but not simpler
  • Safe
  • Modular
  • Object oriented (as need be)
  • Commands
  • Garbage collection
  • No explicit linking
  • Fast compilation
  • Compiler markers
  • Code analyzer

Component Pascal is a general-purpose language in the tradition of Pascal, Modula-2, and Oberon. Its most important features are block structure, modularity, separate compilation, static typing with strong type checking (also across module boundaries), type extension with methods, dynamic loading of modules, and garbage collection.
Type extension makes Component Pascal an object-oriented language. An object is a variable of an abstract data type consisting of private data (its state) and procedures that operate on this data. Abstract data types are declared as extensible records. Component Pascal covers most terms of object-oriented languages by the established vocabulary of imperative languages in order to minimize the number of notions for similar concepts.
Complete type safety and the requirement of a dynamic object model make Component Pascal a component-oriented language.

Compiler

When a Component Pascal module is compiled (Ctrl-K shortcut), errors are indicated by a marker placed in the text at the point where the error occurs. The following hello world example shows that a typo occurred in the word String:

HelloWorld.PNG

This feature is (to our knowledge) unique to Oberon and Component Pascal. Most other compilers report a line number or, at best, place the cursor at the beginning of the line where the error occurs, but do not tell you where in the line the error resides. An error marker reduces both the cognitive load on the programmer and the time needed to correct errors.

The second facet beneficial to programming is the use of commands. A command is a parameterless procedure that is marked for export with the export mark '*', as in 'World*'. A command can be executed directly by clicking on a 'commander' (Ctrl-Q inserts a commander into one's code). A commander looks like this: Commander.PNG. When the commander is clicked, the procedure whose name follows it is executed.

A third feature of the development environment is the Analyzer, which can be run over one's code to find variables that are never used or that are used before being initialized. The analyzer uses markers just like the compiler.

Analyzer.PNG

The unused variable 'foo' has been found and marked.  Clicking on the mark will expand it to show the reason for the mark.


Comparison to other programming languages

Component Pascal vs. C++

C++ does not have a module concept in the language proper but simulates it in the well-known way via the C preprocessor (cpp) and appropriate programming conventions (header files). The global name space that C++ inherited from C does not preclude name clashes during the linking step. To avoid this problem, classes are sometimes used to simulate the name scope of modules. In the case of interrelated classes, or procedures which refer to more than one class, so-called friends must be used. Friends are a sort of scope-goto: a construct that circumvents the usual scoping rules of the language and makes names visible where they would not be visible otherwise. Since this mechanism is still unsatisfying for large software systems, extensions to the scoping mechanisms - namespaces - are being discussed by the C++ standardization committee [5]. However, namespaces still depend on the C preprocessor; thus, they cannot be regarded as a proper module concept in the language.

Another often cited criticism of C++, the missing initialization order, is also solved by Component Pascal's modules. In contrast to cpp's include mechanism, the import relationship forbids cycles. Thus, the imported modules can always be initialized before their clients.
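
On the C++ side, the initialization-order problem is usually worked around by hand. The following minimal sketch is not from the source; it shows the widely used "construct on first use" idiom, with `Registry`, `registry` and `client_value` as hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical service that code in other translation units depends on.
struct Registry {
    bool ready = false;
    std::string name;
    Registry() : ready(true), name("ready") {}
};

// A function-local static is initialized the first time control passes
// through its declaration, so every caller sees a fully constructed object,
// whatever the order in which translation units are initialized.
Registry &registry() {
    static Registry instance;
    return instance;
}

// A client that might itself run during static initialization.
std::string client_value() {
    Registry &r = registry();
    assert(r.ready);  // constructed before first use
    return r.name;
}
```

Here the order is established by the call graph at run time rather than guaranteed by the language, which is the contrast with the acyclic import relationship described above.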

For system-level programming, Component Pascal offers the pseudo module SYSTEM, which provides implementation-dependent and machine-dependent operations. Modules which import SYSTEM are inherently unportable and unsafe but easily identified by the word SYSTEM in their import list. C++ allows the use of system-level operations without specially marking such programs. When porting programs from one machine to another, this might lead to unpleasant surprises and long debugging sessions.

Safety in programming languages

Nowadays nobody expects that an electric shaver can be plugged into a high-voltage socket. Furthermore, for the case of a short circuit or similar malfunctioning of a correctly connected appliance there are additional fuses. Surprisingly, these concepts of safety are not well-established in most programming languages. Of course, not every programming error can be precluded by the design of a programming language. Nevertheless, the avoidance of certain error classes and the detection of runtime errors are important quality aspects [4, 5]. Both Component Pascal and C++ rely on the notion of strong typing. Their approaches, however, are opposite. In Component Pascal (as in Pascal) a variable is associated with an arbitrarily complex type; in C++ (as in C) a type is associated with an arbitrarily complex designator (lvalue). This lvalue acts as a prototype for the usage of the variable and defines the variable's type implicitly. By inverting the declaration and isolating the variable, the variable's type can be reconstructed. A concrete example is the definition of a pointer v to a structure x as in:

struct x *v;

This means that the lvalue *v is of type struct x. '*' denotes dereferencing, therefore the type of v can be deduced as pointer to struct x. In Component Pascal one would write:

VAR v: POINTER TO x;

The variable v in this declaration is already isolated. In the case of more complex declarations, Component Pascal's approach is definitely simpler and more regular. In the end, both languages associate a type with every variable, which defines the set of values and applicable operators. As a result, many erroneous uses of variables and procedures can be detected before program execution, which helps to avoid mysterious program crashes.
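
To make the "more complex declarations" case concrete, here is a small C++ sketch (an illustration, not part of the source). The array `handlers`, the aliases `XPtr` and `Handler`, the function `read_value` and the field `value` are hypothetical names; the static assertions confirm that the inside-out declarator and the isolated alias form denote the same types.

```cpp
#include <cassert>
#include <type_traits>

struct x { int value; };

// C-style declaration: the declarator mirrors how the variable is used.
// "*v is a struct x", hence v is a pointer to struct x.
struct x *v = nullptr;

// A more complex case, read inside-out: handlers[i] can be dereferenced and
// called with a pointer to x, yielding int.
int (*handlers[4])(struct x *) = {};

// Type aliases isolate the variable from its type, much as the Component
// Pascal form "VAR v: POINTER TO x" does.
using XPtr = x *;
using Handler = int (*)(x *);

static_assert(std::is_same<decltype(v), XPtr>::value,
              "v is a pointer to x");
static_assert(std::is_same<decltype(handlers), Handler[4]>::value,
              "handlers is an array of 4 function pointers");

// A function that fits the Handler type.
int read_value(x *p) {
    assert(p != nullptr);
    return p->value;
}
```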

Pointer arithmetic in C++ is dangerous

For those errors that cannot be detected before program execution, Component Pascal goes one step further by guaranteeing type safety and memory consistency even at run time. The necessary fuses, for example for array-bound checking, can be implemented with almost no overhead in execution time and program size. C++ defines an array as identical with a pointer to the first element and allows pointer arithmetic. This precludes index checking in practice. A further safety loophole in C++ exists in the management of dynamic storage where Component Pascal still guarantees memory consistency by means of automatic garbage collection.
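
The point about arrays and pointers can be sketched in C++ as follows. This is an illustration under stated assumptions, not code from the source: `unchecked_get` and `checked_get` are hypothetical helpers, and `std::array::at` is a standard library facility that postdates the source text.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// A built-in array passed to a function decays to a pointer to its first
// element. The length is not part of what the callee receives, so the callee
// cannot check an index against it.
int unchecked_get(const int *data, std::size_t i) {
    assert(data != nullptr);  // the only check the callee is able to make
    return data[i];           // same as *(data + i)
}

// std::array keeps the length in the type, and at() performs the run-time
// check, throwing std::out_of_range on a bad index.
bool checked_get(const std::array<int, 3> &data, std::size_t i, int &out) {
    try {
        out = data.at(i);
        return true;
    } catch (const std::out_of_range &) {
        return false;
    }
}
```

The checked variant is an opt-in library call; plain indexing with `[]` remains unchecked, whereas in Component Pascal the check is part of the language.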

In contrast to BASIC and most scripting or fourth generation languages both Component Pascal and C++ offer the possibility to construct dynamic data structures which are interrelated by means of pointers. Such structures not only grow but also shrink. In the latter case, the C++ programmer has to explicitly free the unused storage. To support this task, C++ offers the notion of destructors, which are automatically activated whenever an object is deallocated.

Destructors, however, do not solve the problem that objects are deallocated too early or too late. Many hours of debugging time have already been spent detecting and fixing such errors, and in vain for extensible programming systems: it can easily be shown that the programmer cannot know the correct time to free an object in this case. Therefore, and not only for convenience, Component Pascal relies on a conceptually infinite heap storage, in which objects can only be allocated, never explicitly deallocated.
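
The "too early or too late" problem arises as soon as two independent clients share one object. The sketch below is an illustration, not code from the source; `Node` and both functions are hypothetical. It uses `std::shared_ptr` and `std::weak_ptr`, standard library facilities that postdate the source text. They implement reference counting, which is not tracing garbage collection and does not reclaim cyclic structures.

```cpp
#include <cassert>
#include <memory>

// Hypothetical node type that counts how many instances are alive.
struct Node {
    static int alive;
    Node() { ++alive; }
    ~Node() { --alive; }  // the destructor runs when the object is deallocated
};
int Node::alive = 0;

// Two independent clients share one Node. Neither can know on its own when
// the other is finished, so neither can safely delete the object. Reference
// counting lets the library make that decision instead of the programmer.
int alive_after_first_release() {
    std::shared_ptr<Node> client_a = std::make_shared<Node>();
    std::shared_ptr<Node> client_b = client_a;  // second owner
    client_a.reset();                           // first client is done
    return Node::alive;                         // client_b still holds the Node
}

// A weak_ptr observes the object without owning it and can detect that the
// object is gone instead of dangling.
bool observer_sees_expiry() {
    std::weak_ptr<Node> observer;
    {
        std::shared_ptr<Node> owner = std::make_shared<Node>();
        observer = owner;
        assert(!observer.expired());
    }  // last owner leaves scope; the Node is destroyed here
    return observer.expired();
}
```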

Component Pascal with integrated garbage collection

In contrast to the programmer, the runtime system can easily decide when an object is no longer in use and deallocate the associated storage. This technique, also called automatic garbage collection, implements the illusion of an infinite heap and leads to a significant gain in productivity. Probably, it is garbage collection and not the syntax that attracted the users of languages such as Smalltalk or Lisp. It is not surprising that the introduction of garbage collection is a hot topic within the C++ community. Due to pointer arithmetic, however, it is much more difficult, if not impossible, to introduce it in C++.

OOP-concept is record extension

Roughly speaking, both Component Pascal and C++ are object-oriented extensions of existing languages. The approaches to this, however, are fairly different. C++ essentially supports object-oriented programming (OOP) a la Simula-67, whereas Component Pascal does not suggest a particular OOP-style but leaves it to the programmer to select the appropriate technique for a given task. All these techniques are based on the notion of record extension, which replaces the variant records (unions in C) of its predecessors. Record extension means that a new record type can be defined as an extension of an existing one.

The base type and the extended type are upward compatible: all operations which can be applied to the base type can also be applied to the extended type, but not vice versa. Two fundamental OOP-styles can be identified in Component Pascal. They are distinguished by whether a message is represented explicitly as a Component Pascal data structure or implicitly as a procedure call.

In the first case, messages are represented as records (message records) and are passed explicitly to a procedure (the message handler) as variable parameters. The handler is typically bound to the receiving object by means of a procedure variable (cf. listing 2). Objects are usually allocated on the heap and referenced via pointers.

   TYPE
        Object = POINTER TO ObjectDesc;
        ObjectMsg = RECORD END ;
        Handler = PROCEDURE (O: Object; VAR M: ObjectMsg);
        ObjectDesc = RECORD
                handle: Handler
        END ;

Listing 2: Message records are explicitly passed to the handler procedure, which is bound to the receiving object by a procedure variable.

Applying record extension to messages makes it possible to create a hierarchy of message types. The message type DisplayMsg, for example, is derived from the base type ObjectMsg (cf. listing 3). Further specialization of DisplayMsg is possible. The handler distinguishes different message kinds by means of the type test operator IS and responds to the message in an object-specific way. Using message records and handlers seems rather inconvenient and inefficient at first glance. However, they do have certain advantages as well, which explains why they are the dominant OOP-style in the Component Pascal system.

The advantages are:

  • Messages can be introduced where they are needed. It is not necessary to declare them together with the base type.
  • Messages can be handled generically without knowing their type or interpreting their contents. A container object, for example, can forward messages to its members without knowing all these messages. Generic broadcast, forwarding and delegation are possible.
  • The effect of extensible parameter lists can be achieved by extending message records.
  • There is a clean separation between subtyping (record extension) and subclassing (code inheritance). Code inheritance, including multiple and even dynamic inheritance, can be achieved by programming an appropriate message dispatching mechanism in the handler (cf. ELSE branch in listing 3).

   TYPE
        CopyMsg = RECORD (ObjectMsg)
                deep: BOOLEAN;
                cpy: Object
        END ;

        DisplayMsg = RECORD (ObjectMsg)
                F: Frame;
                x, y: INTEGER
        END ;

   PROCEDURE HandleMyObject (O: Object; VAR M: ObjectMsg);
   BEGIN
        IF M IS CopyMsg THEN ...
        ELSIF M IS DisplayMsg THEN ...
        ...
        ELSE Objects.Handle(O, M)
        END
   END HandleMyObject;

Listing 3: Record extension can also be applied to message records leading to a hierarchy of message types.

The dominant role of message records is also evident from systems such as MacOS, X11 or Windows where they appear as event records. In these systems, however, message records are expressed as non-extensible variant records (unions).

If efficiency rather than flexibility is crucial, further mechanisms are available. In Oberon-2, which is supported by all commercial vendors, they also include type-bound procedures, which are similar to virtual functions in C++. A procedure Display, for example, can be bound to a type Line in the following way:

PROCEDURE (L: Line) Display (F: Frame; x, y: INTEGER);
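
For comparison, the C++ counterpart of this type-bound procedure might look as follows. This is a sketch under stated assumptions: the contents of `Frame`, the base type `Shape`, the function `draw` and the logged strings are hypothetical and serve only to make the dispatch observable.

```cpp
#include <cassert>
#include <string>

// Stand-in for the Frame type used in the listings; its contents are assumed.
struct Frame {
    std::string log;
};

struct Shape {
    virtual ~Shape() = default;
    // Counterpart of a type-bound procedure: dispatched on the dynamic type.
    virtual void Display(Frame &f, int x, int y) const {
        f.log += "shape@" + std::to_string(x) + "," + std::to_string(y) + ";";
    }
};

struct Line : Shape {
    void Display(Frame &f, int x, int y) const override {
        f.log += "line@" + std::to_string(x) + "," + std::to_string(y) + ";";
    }
};

// The caller only knows the base type; the version bound to the object's
// actual type is still the one that runs.
void draw(const Shape *s, Frame &f) {
    assert(s != nullptr);
    s->Display(f, 1, 2);
}
```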

C++ introduces object-oriented programming via a special syntactic construct, the class, which is a textual bracket around an extensible structure definition and functions bound to this structure. Although message records and handlers would also be possible in principle, this technique is not practicable due to the missing type test operator.

In the typical C++ OOP-style with classes and virtual functions, code inheritance cannot be expressed explicitly as with Component Pascal's message handlers. Therefore, the language already contains several important inheritance relations, including multiple inheritance and virtual base classes. These predefined mechanisms cannot, however, compete with the flexibility of an explicitly programmed message handler. Generic forwarding of messages, for example, is not possible.
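
The type test whose absence is noted above was among the extensions under discussion at the time and was later standardized as `dynamic_cast`. With it, the message-record style of listing 3 can be approximated in C++. The following is a minimal sketch, not the author's code: the `last` field, the `Container` type and the message fields are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Base message type. The virtual destructor makes it polymorphic, which
// dynamic_cast requires for its run-time type test.
struct ObjectMsg {
    virtual ~ObjectMsg() = default;
};
struct CopyMsg : ObjectMsg { bool deep = false; };
struct DisplayMsg : ObjectMsg { int x = 0, y = 0; };

struct Object {
    std::string last;  // records what the handler did, for illustration only
    virtual ~Object() = default;
    // Counterpart of the handler procedure in listing 3.
    virtual void Handle(ObjectMsg &m) {
        if (dynamic_cast<CopyMsg *>(&m)) {                      // M IS CopyMsg
            last = "copy";
        } else if (auto *d = dynamic_cast<DisplayMsg *>(&m)) {  // M IS DisplayMsg
            last = "display@" + std::to_string(d->x) + "," + std::to_string(d->y);
        } else {
            last = "ignored";                                   // default handling
        }
    }
};

// A container forwards any message to its members without knowing its type.
struct Container : Object {
    std::vector<Object *> members;
    void Handle(ObjectMsg &m) override {
        for (Object *o : members) {
            assert(o != nullptr);
            o->Handle(m);
        }
        last = "forwarded";
    }
};
```

New message types derived from `ObjectMsg` pass through `Container::Handle` unchanged, which is the generic forwarding described in the list of advantages.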

Exception handling

Most language designers now agree that I/O operations, processes, threads, semaphores and similar things should not be defined within the language since there are too many different concepts and none of them is appropriate for all applications. This does not mean that programmers should not use these concepts but that they should be provided by means of modules instead of language constructs. This idea is also applied to exception handling in Component Pascal, whereas in C++ a particular exception handling mechanism is already defined within the language.


Genericity

A program is called generic if it is not specific to a particular programming task. In a strongly typed programming language types are usually constant, i.e. specific. It is, however, also possible to think of program components such as procedures or classes which are parameterized with types. The most prominent examples of such generic programs are container classes (lists, sets or trees), which consist of elements of a given type.

C++ allows the usage of 'templates', i.e. building blocks which can be parameterized even with types in order to increase program reuse. Unfortunately, there is no better implementation technique known than expanding templates for all different argument combinations. Templates actually represent another kind of preprocessor, one that knows about the scoping rules of C++. Maintenance of expanded templates across compilation units further complicates template implementation and usage. In current implementations this problem is mostly unsolved and frequent use of templates often leads to surprising code sizes due to unintended code duplication.
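
The expansion of templates per argument combination can be observed directly. In this sketch (an illustration, not from the source; `count_calls` and `FixedStack` are hypothetical names), each instantiation of `count_calls` gets its own static variable, which shows that `count_calls<int>` and `count_calls<double>` are separate functions.

```cpp
#include <cassert>
#include <cstddef>

// Each distinct argument type produces a separate instantiation of this
// function, with its own code and its own static variable.
template <typename T>
int count_calls() {
    static int calls = 0;
    return ++calls;
}

// A generic container fragment: element type and capacity are template
// parameters, so FixedStack<int, 2> and FixedStack<double, 2> are
// unrelated types that are compiled separately.
template <typename T, std::size_t N>
struct FixedStack {
    T items[N];
    std::size_t size = 0;

    bool push(const T &v) {
        if (size == N) return false;  // full
        items[size++] = v;
        return true;
    }

    T pop() {
        assert(size > 0);  // popping an empty stack is a caller error
        return items[--size];
    }
};
```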

For these reasons, Component Pascal does not include a template mechanism within the language but delegates the task of expanding code fragments to the programmer. In principle, a template preprocessor would also be possible for Component Pascal, but as a separate tool.

Overloading

One of the central design decisions of C++ was that it should be possible to define new data types that look exactly like built-in types. Consequently, it is necessary to allow user-defined operators such as + or -. It is straightforward to extend this idea to overloading of functions as well.

In contrast to C++, one of Component Pascal's central design decisions was that imported objects should always be prefixed by the exporting module name in order to ease reading of programs. This is in conflict with operator and function overloading. Therefore, Component Pascal uses overloaded operators and functions only for language-defined types. This restriction also helps to guarantee that overloaded operators are similar enough to justify overloading and helps to reduce the unintended introduction of inefficiencies. Note also that overloading, although an established mathematical concept, has its pitfalls. The interested reader might want to find out which one of the following two functions is called (if any) and what happens if one of them is removed.

void f(char*);
void f(int);
... f(0);
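
One way to work through this puzzle is to let the functions report which one ran. In the sketch below, the return type is changed from void to a string purely for observation, and the namespaces `both` and `pointer_only` are hypothetical devices for placing the two scenarios side by side.

```cpp
#include <cassert>
#include <string>

namespace both {
std::string f(char *) { return "f(char*)"; }
std::string f(int)    { return "f(int)"; }
// The literal 0 has type int, so f(int) is an exact match and is selected.
std::string call() { return f(0); }
}  // namespace both

namespace pointer_only {
std::string f(char *) { return "f(char*)"; }
// With f(int) removed, the same call still compiles: the literal 0 is a
// null pointer constant and converts silently to char*.
std::string call() { return f(0); }
}  // namespace pointer_only
```

Removing one declaration therefore changes which function an unchanged call site reaches, without any diagnostic being required.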

Summarizing, it can be stated that the exceptional shortness of Component Pascal's language definition - less than 20 pages - does not stem from any lack of expressiveness in the language. Quite to the contrary, Component Pascal already contains some features which C++ programmers can only dream about. To mention just a few: modules, runtime type information and garbage collection. The latter is a necessary prerequisite for robust and reliable extensible software systems that will become more and more important in the future. It is hoped that the excellent educational Component Pascal implementations will soon be accompanied by equally well-implemented industrial programming tools, to give the practitioner, too, a real alternative to C++.

Criteria                           Component Pascal    C++
----------------------------------------------------------------
type test                          yes                 no¹
type BOOLEAN                       yes                 no¹
modules                            yes                 no¹
marking system-level programs      yes                 no
defined initialization order       yes                 no
garbage collection                 yes                 no
dynamic arrays                     yes                 no
run time tests                     yes                 no
completely type safe               yes                 no
preprocessor necessary             no                  yes
exceptions                         no                  yes
templates                          no                  yes
overloading                        no                  yes
typing                             explicit            implicit
precedence levels                  4                   17
language report (pages)            20                  150

¹ Extensions are being discussed.

[1] J. Templ, “Oberon vs. C++,” Erlangen’s First Independent Modula_2 Journal!, pp. 1–8, 1994.
 
 
 

Brief comparison of Component Pascal and Java

Cuno Pfister, Oberon microsystems

Component Pascal is to Pascal what Java is to C and C++: a modern and safe next-generation language that combines the flexibility of dynamic languages with the robustness of static languages. Like Java, it builds on decades of experience, and makes reuse of existing skills easy. Like Java, it supports component software much better than its predecessors. Compared to Java, it is easier to learn and allows more efficient code to be produced.

The most obvious difference is the "look and feel" of the two languages. Component Pascal is syntactically clearly in the Pascal family, while Java is in the C family. But this is a relatively superficial difference. Concerning the more important "design for safety", Java and Component Pascal are closely related, while C and even original Pascal (e.g., untagged variant records) are comparatively unsafe. Among other things, safety also implies automatic garbage collection. Garbage collection is necessary to avoid memory leaks and, more importantly, dangling pointers. Both Java and Component Pascal support the dynamic loading of code and metaprogramming (reflection). As a result, both languages can use virtually the same run-time system. This is proven by the real-time operating system JBed (http://www.jbed.com/) and a Component Pascal compiler that produces standard Java byte code class files. Note that C or C++ could not be translated into Java byte code, due to their inherent lack of safety. In this fundamental respect, Component Pascal and Java are closer than Java is to C++!

Component Pascal is more efficient than Java. Unlike Java, it also supports static (stack-allocated) data structures and has variable (VAR) parameters (with IN and OUT variants). Depending on the application, this can have a measurable impact on efficiency. For example, consider the Component Pascal method call:

font.GetMeasures(ascender, descender, maxWidth)

which returns three values as OUT parameters. In Java, there is no efficient way to express this simple statement. Instead, a Java method could allocate, set up, and return an auxiliary object which contains the three result values. This is usually too heavyweight, since it not only implies heap allocation, but also the introduction of a new class and class file. Another solution would be to provide three different methods, one for every result parameter. This means three method calls instead of only one. Finally, VAR parameters could be emulated using arrays of length 1. Apart from the ugly misuse of arrays, this would imply unnecessary range checks, type checks, and heap allocation of auxiliary objects. Extensive heap allocation is undesirable in particular in real-time systems, because the need for real-time allocation severely restricts the freedom to choose efficient and flexible memory management algorithms. Finally, Component Pascal has more efficiently implementable arrays, which can be important for numerical and other computation-intensive applications.
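
The mechanism behind such a call can be illustrated in C++, whose reference parameters play a role roughly comparable to OUT parameters. This is only an analogy, not Component Pascal or Java code, and the `Font` type, its `size` field and the metric formulas are made up for illustration.

```cpp
#include <cassert>

// Hypothetical font type; the metric values are invented for illustration.
struct Font {
    int size;

    // Three results are written through reference parameters. No auxiliary
    // object is allocated and only one call is made.
    void GetMeasures(int &ascender, int &descender, int &maxWidth) const {
        assert(size > 0);
        ascender = size * 3 / 4;
        descender = size / 4;
        maxWidth = size;
    }
};

// Usage: the results land directly in the caller's stack-allocated locals.
int line_height(const Font &font) {
    int ascender = 0, descender = 0, maxWidth = 0;
    font.GetMeasures(ascender, descender, maxWidth);
    return ascender + descender;
}
```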

Java has built-in support for threads. Component Pascal doesn't provide this feature in the language, because it leads to unportable software: Java threads can behave differently on different platforms. Furthermore, the Java thread scheme is too limited for hard real-time software: Java threads only support ten different user-assignable priorities. Hard real-time systems on the other hand require deadline-driven scheduling policies such as Earliest Deadline First scheduling, which dynamically apply a virtually unlimited number of priorities.

Unlike Component Pascal, Java has exception handling support. This works quite well, except for real-time systems. The problem is that Java exceptions are used also for debugging purposes, which in principle has nothing to do with exception handling. Due to this unfortunate mixture, the Java stack must be frozen and copied when an exception occurs, which is expensive and causes unpredictable delays.

Java has another feature which is problematic for real-time systems: its initialization semantics. The initialization of a Java class is delayed until it is used for the first time. This is reasonable for Internet software, since it avoids (down)loading classes that are never used. For real-time systems, however, this is fatal: an interrupt handler that should respond in 20 microseconds must not incur several milliseconds of initialization time just because it is used for the first time... Moreover, implementing lazy initialization is fine for an interpreter, but it forces a compiler to generate suboptimal code.

[ Note that JBed adds a real-time API that overcomes the limitations of the built-in thread mechanism of Java, and it strictly separates exception handling from debugging. As for the initialization problem, it remains to be seen whether Sun can relax the language definition to allow earlier initialization. Lazy initialization is not difficult to implement, but it is never what you want in a real-time system. In spite of this minor ugliness, Java is still a far better language than C++, even for real-time systems.]

Component Pascal adds a few new features (e.g., implement-only export, explicit NEW attribute for newly introduced methods, EMPTY methods, LIMITED records) that specifically help to keep large evolving component-software systems under control, by making refactoring less risky. Basically, Component Pascal makes it possible to express design patterns explicitly in a framework's interface, so that the compiler can check for consistency of components with their component frameworks. This is crucial for getting closer to the ultimate goal of safe plug & play of independently developed components - in other words: to inexpensive component software.

Probably the most important difference is complexity: the language definition of Java includes over thirty classes with over 300 methods. Moreover, the language features (and sometimes even the standard classes) interlock very tightly, which often makes it difficult to truly understand one feature without already having understood the others. As a result, casual use of Java has proven difficult and more time-consuming than expected. Component Pascal is a much smaller and less complex language than Java. There are two main reasons. First, Component Pascal is, despite its small size, a language that can be understood and used incrementally, by progressively disclosing new features when they are needed. Second, Component Pascal leaves many Java features to libraries. The above mentioned thread and synchronization mechanisms are an example. As a result, further clean layering of mechanisms and abstractions is encouraged, while Java and large parts of the class library seem to form a vast and conceptually flat "web" of concepts.

(c) 1997-1999 Oberon microsystems, Inc.


Oberon microsystems, Inc.
Technoparkstrasse 1
8005 Zürich
Switzerland
    

Tel (+41 1 ) 445 1751
Fax (+41 1) 445 1752
Net info (at) oberon.ch