Coco/R

This is a page (last updated 11 October 2013) of useful information about Coco/R, an easy to use compiler generator, whose development originated in a project done by the software group of the University of Linz, was continued at ETH Zürich, and is now again based in Linz.

The Linz team now have a mailing list server at www.ssw.uni-linz.ac.at/cgi-bin/mailman/listinfo/coco

My book, "Compiling with C# and Java", an introductory text on compiler construction using C# and Java, was published by Pearson Education on 5th November 2004. Following standard publisher practice, the book records a publication year of 2005, and the ISBN is 0-321-26360-X. Pearson have done a shocking job of marketing the book. It has never appeared on the www.amazon.com lists, and many people who enquire about it are told that it is out of print.

However, as far as I know, it is in print, and available from http://www.amazon.co.uk.

A feature of the book is that it demonstrates the use of Coco/R to implement compilers for the JVM and CLR platforms.



Introduction

Coco/R combines the functionality of the well-known UNIX tools lex and yacc to form an extremely easy-to-use compiler generator. From an attributed grammar (written using EBNF syntax with attributes and semantic actions) it generates a recursive descent parser, its associated scanner, and (in some versions) a driver program. The grammar must conform to the restrictions imposed by LL(1) parsing (rather than LALR parsing, as allowed by yacc). The user has to add modules for symbol table handling, optimization, and code generation in order to get a running compiler. Coco/R can also be used to construct other syntax-based applications that have less of a "compiler" flavour than a parser for a programming language.

Coco/R has been used successfully in academia and industry. The original Coco system was developed for a diploma thesis in 1983 by Hanspeter Mössenböck under the supervision of Professor Peter Rechenberg in Linz, Austria. In the years that followed Mössenböck improved the tool and Peter Rechenberg and he wrote a book about it.

When Mössenböck moved to ETH Zürich in 1987 the development lines of Coco split. While Heinz Dobler developed Coco-2 in Linz, Mössenböck developed Coco/R in Zürich; together with other compiler tools, Coco/R (in Oberon) became part of his PhD thesis in 1987. The main difference between the two generators is that Coco-2 produces table-driven parsers while Coco/R produces recursive descent parsers. Table-driven parsers make it easier to have automated error handling, while recursive descent parsers are faster.

Coco/R was first ported into Modula-2 by Mössenböck himself, for the Apple MacMeth system. A port was done to JPI TopSpeed Modula-2 at ETH Zürich by Marc Brandis and Christof Brass. This was made available to Pat Terry at Rhodes University in South Africa, who added a few features, enhanced the portability in conjunction with John Gough at Queensland University of Technology in Australia, and provided the Modula-2 versions which are now distributed. Coco/R has subsequently been ported into Turbo Pascal by Volker Pohlers in Germany and Pat Terry, and into C by Frankie Arzu in Guatemala. The first Java version was released by Mössenböck at the end of 1997. Delphi and Ada versions have also been produced.

In the last few years Mössenböck has released new C#, Java and C++ versions. These have extensions that allow for LL(k) lookahead to enable some LL(1) conflicts to be resolved easily. These features are not found in the other versions known to the author.

As mentioned above, Coco/R is one of a number of developments of the earlier work. Heinz Dobler produced a variant known as Coco-2 which generated table-driven parsers. A later development of this has recently been announced by an Australian concern. The product, named Cogencee, generates systems for Delphi (tm). Further details can be obtained from http://www.cocolsoft.com/cogen/cogen.htm .

Coco/R is only one of a number of compiler generators. Other useful tools are listed in various places, for example:

German National Resource Centre
Compiler Connection
Compiler Internet resources
Burks (Brighton University Resource Kit)
Dragon Fodder
CodeCranker



Sample input

Here is a simple demonstration of input suitable for the Oberon version of Coco/R. Equivalent input for the other versions of Coco/R would differ from this essentially only in the snippets of code used in the semantic actions.
    COMPILER Demo
      IMPORT MyMod;

    CHARACTERS
      letter = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz".
      digit = "0123456789".
      EOL = CHR(13).

    TOKENS
      ident = letter {letter | digit}.
      number = digit {digit}.

    COMMENTS FROM "(*" TO "*)" NESTED
    IGNORE  EOL

    PRODUCTIONS
      Demo = Statement {";" Statement}.
    (*------------------------------------------------------------------------------------*)
      Statement                        (. VAR x: ARRAY 32 OF CHAR; y: INTEGER; .)
      = Ident <x> ":=" Number <y>      (. MyMod.Assign(x, y) .)
      .
    (*------------------------------------------------------------------------------------*)
      Ident < VAR x: ARRAY OF CHAR>
      = ident                          (. DemoS.GetName(DemoS.pos, DemoS.len, x).)
      .
    (*------------------------------------------------------------------------------------*)
      Number < VAR n: INTEGER>         (. VAR s: ARRAY 32 OF CHAR; .)
      = number                         (. DemoS.GetName(DemoS.pos, DemoS.len, s);
                                          MyMod.Convert(s, n) .)
      .
    END Demo.
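
To give a feel for what a generated recursive descent parser does with such a grammar, here is a minimal hand-written sketch in Python. It is not output of Coco/R, and no Python version of Coco/R is implied. The names scan, Parser and the assign callback (standing in for MyMod.Assign) are assumptions made for illustration, and all white space is skipped rather than only EOL.

```python
import re

# Token patterns mirroring the TOKENS section, plus the two literals
# (":=" and ";") that appear in the productions.
TOKEN_RE = re.compile(
    r"(?P<ident>[A-Za-z][A-Za-z0-9]*)|(?P<number>[0-9]+)|(?P<becomes>:=)|(?P<semi>;)"
)


def scan(text):
    """Yield (kind, lexeme) pairs, skipping white space and nested (* *) comments."""
    pos, depth = 0, 0
    while pos < len(text):
        if text.startswith("(*", pos):
            depth += 1                      # comments may nest
            pos += 2
        elif depth and text.startswith("*)", pos):
            depth -= 1
            pos += 2
        elif depth or text[pos].isspace():
            pos += 1                        # inside a comment, or ignorable white space
        else:
            m = TOKEN_RE.match(text, pos)
            if m is None:
                raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
            yield m.lastgroup, m.group()
            pos = m.end()
    if depth:
        raise SyntaxError("unterminated comment")
    yield "EOF", ""


class Parser:
    """One method per production; one token of lookahead (self.kind)."""

    def __init__(self, text, assign):
        self.tokens = scan(text)
        self.assign = assign                # hypothetical stand-in for MyMod.Assign
        self.kind, self.lexeme = next(self.tokens)

    def expect(self, kind):
        if self.kind != kind:
            raise SyntaxError(f"{kind} expected, found {self.kind}")
        lexeme = self.lexeme
        self.kind, self.lexeme = next(self.tokens, ("EOF", ""))
        return lexeme

    def demo(self):                         # Demo = Statement {";" Statement}.
        self.statement()
        while self.kind == "semi":
            self.expect("semi")
            self.statement()
        self.expect("EOF")

    def statement(self):                    # Statement = Ident ":=" Number.
        x = self.ident()
        self.expect("becomes")
        y = self.number()
        self.assign(x, y)                   # the semantic action

    def ident(self):                        # Ident = ident.
        return self.expect("ident")

    def number(self):                       # Number = number.
        return int(self.expect("number"))


table = {}
Parser("alpha := 1; (* note (* nested *) *) beta := 22", table.__setitem__).demo()
print(table)  # {'alpha': 1, 'beta': 22}
```

Each production maps onto one method, and the single lookahead token in self.kind is all that is needed to decide whether the repetition in Demo continues, which is the LL(1) property in miniature.
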


Literature

The original design of Coco/R is described in two papers by Hanspeter Mössenböck:

Mössenböck, H. : A Generator for Fast Compiler Front-Ends. Report 127, Dept. Informatik, ETH Zürich (1990)

A PDF version of this paper is available at http://www.scifac.ru.ac.za/resourcekit/pdf/CocoReport.pdf

Mössenböck, H. : A generator for production quality compilers. Proc 3rd Int'l Workshop on Compiler-Compilers, Schwerin FRG, 1990

A Postscript version of this paper is available on the Web from http://www.ssw.uni-linz.ac.at/Research/Papers/Moe90.html

A description of the data structures used in the latest Linz versions is available in PDF form at http://www.ssw.uni-linz.ac.at/Coco/Doc/DataStructures.pdf

The technique for LL(1) conflict resolution used in the latest Linz versions is described in the paper

Wöß, A., Löberbauer, M., Mössenböck, H. (2003) LL(1) Conflict Resolution in a Recursive Descent Compiler Generator, Joint Modular Languages Conference (JMLC'03), Klagenfurt, 2003, reprinted in Lecture Notes in Computer Science 2789, Springer-Verlag, 2003.

A revised version of this paper is available in PDF format at http://www.ssw.uni-linz.ac.at/Coco/Doc/ConflictResolvers.pdf

The latest User Manual for the Linz versions for Java, C# and C++ is available in PDF format from http://www.ssw.uni-linz.ac.at/Coco/Doc/UserManual.pdf

A 1996 textbook made extensive use of Coco/R. This was "Compilers and Compiler Generators: An Introduction with C++", by Pat Terry (International Thomson, London, England: ISBN 1-85032-298-8). This book came with a diskette containing numerous case studies, each one supplied in Modula-2, Pascal and C++ equivalents.

The book is, unfortunately, now out of print. International Thomson decided to drop their Computer Press division, and this title was one of those to be dropped after copies ran out. The copyright was returned to the author, who then created an online version at http://www.scifac.ru.ac.za/compilers. This site also contains compressed downloadable files of the online edition, files from which a paper copy may be produced on an HP Laserjet compatible printer, and the source code for the case studies.

A sequel, "Compiling with C# and Java", an introductory text on compiler construction using C# and Java, was published by Pearson Education on 5th November 2004. Following standard publisher practice, the book records a publication year of 2005, and the ISBN is 0-321-26360-X. The book may be ordered from various stores, including http://www.amazon.co.uk, and a feature of the book is that it demonstrates the use of Coco/R to implement compilers for the JVM and CLR platforms.

For reasons best known to Pearson Education, the book has not yet been distributed in the USA, and in particular is not yet available from http://www.amazon.com. Readers who would like to follow this up where other efforts (including my own) have failed might write to Owen Knight, who was in charge of the publication effort in the United Kingdom.

There is an extensive website containing the "Resource Kit" for "Compiling with C# and Java" at http://www.scifac.ru.ac.za/resourcekit - a collection that includes the source code for all the case studies in the book, links to useful sites, ancillary documents, and instructions for downloading and installing the components of the Resource Kit (including Coco/R for Java and Coco/R for C# and user manuals for these systems) on a reader's own computer.



Available versions

There are several versions of Coco/R available in various distribution kits, which provide the full source of the system as well as some documentation. This section gives links to sites where the latest versions should always be available. There is also a fairly extensive set of mirror sites, not all of which carry all versions.

Users who implement Coco/R on other systems are encouraged to share their experience and to make their implementations available for incorporation into later releases of the system.

Oberon

The Oberon version, while not as highly developed as some of the later ports, is available in Mössenböck's original version for various Oberon systems.

The latest version should be available from ftp://ftp.ssw.uni-linz.ac.at/pub/Oberon/LinzTools/Coco.Cod


Modula-2

The Modula-2 distribution kits for MS-DOS are known to be immediately usable with any of the pre-ISO TopSpeed (JPI) compilers; the shareware Fitted Software Tools (FST) compilers, versions 2.x through 4.0; the Stony Brook QuickMod 2.2 compiler; and the Logitech compilers, versions 3.03 and 4.

As from version 1.43, support has been provided for ISO compliant compilers, including XDS from XTech, Gardens Point, Stony Brook V4, and the p1 compiler for the Apple Macintosh. The latest versions of Coco can be used with frame files that are compatible with the earlier (non-ISO) releases; they can also be used with ISO-compliant frame files that will generate scanners, parsers and main routines that are independent of the traditional "FileIO" module, and which link in directly to the ISO I/O libraries.

The latest version is available at http://www.scifac.ru.ac.za/coco/coco153.zip (800K)  (this should be extracted with a -d option to retrieve the directory structure)

On UNIX systems, Coco/R was first ported to Gardens Point Modula-2 by John Gough. (Gardens Point Modula-2 is available for a wide variety of platforms, including Intel, Sparc and MIPS machines.)

The latest version is available at http://www.scifac.ru.ac.za/coco/gpm153.tgz (194K)

Coco/R has also been ported to Mocka Modula-2 by Pat Terry and Toshinori Maeno, for both Linux and BSD386.

The latest (non-ISO) version is available at http://www.scifac.ru.ac.za/coco/mocka153.tgz (174K)

Full source code for Coco/R compatible with all these compilers is supplied. It is hoped that this will be trivially portable to other Modula-2 compilers presently available, save for FileIO, the I/O module used by Coco/R and parsers generated by it in pre-ISO mode; code for the I/O module for the pre-ISO compilers listed above is, naturally, supplied, and will act as a model for ports to other compilers.

An unmaintained port of Coco/R was performed for TDI Modula-2 for the Atari by Rolf Schrader. The sources are also supplied in a complete kit. These have numerous minor differences from the MS-DOS versions, although they were derived from them.

The last version (1.36) is available at http://www.scifac.ru.ac.za/coco/atari136.zip


Pascal

As from release 1.39, versions of Coco/R that produce Turbo Pascal units have been available through a port first done by Volker Pohlers and Pat Terry.

The latest version is available at http://www.scifac.ru.ac.za/coco/turbo153.zip (214K).

As from release 1.50, these sources should also compile using the excellent Free Pascal compiler available from http://www.freepascal.org.


Delphi

Users should check out http://www.tetzel.com/cocor.shtml where a port of the system to Delphi by Mike Reith is available.

Details of Cogencee, another Delphi based product that was developed from the variant of Coco known as Coco-2, can be obtained from http://www.cocolsoft.com/cogen/cogen.htm .

Pat Connors has released a version called ParserBuilder - a Win32 application which provides an IDE from which you can edit, build and test your parsers. The only programming language supported by ParserBuilder 1.0 is Borland's Delphi Object Pascal. Further details and source code can be found at http://parserbuilder.sourceforge.net/


C/C++

The first C/C++ version of Coco/R was ported by Frankie Arzu.

This is known to be immediately compatible with a great many compilers. As from version 1.06, a user has had the option of generating either C or C++ code. The C++ version generates scanner and parser classes based on a simple but effective class hierarchy.

The latest release of this version is available at http://www.scifac.ru.ac.za/coco/cocorc17.zip (194K) for MS-DOS based systems, and as http://www.scifac.ru.ac.za/coco/cocorc17.tgz (104K) for UNIX systems.

Details of the latest C++ implementation developed by Hanspeter Mössenböck and Balazs Csaba can be found on the Linz web site at http://www.ssw.uni-linz.ac.at/Coco/, where you can find documentation, and links to the source files, frame files, and a toy example compiler.

This version is closely compatible with the Linz versions for Java and C#, but not immediately compatible with the Arzu version. In particular it provides the ability to resolve some LL(1) conflicts by multi-symbol lookahead.


Java

Details of the Java implementation developed by Hanspeter Mössenböck can be found on the Linz web site at http://www.ssw.uni-linz.ac.at/Coco/, where you can find documentation, and links to the source files, frame files, and a toy example compiler.

This version is closely compatible with the Linz versions for C++ and C#, but not immediately compatible with other versions (in particular the Arzu version). In particular it provides the ability to resolve some LL(1) conflicts by multi-symbol lookahead.

A modified version of this implementation, with some extensions, can be found in the Resource Kit for my book "Compiling with C# and Java" (Pearson, 2005) available at http://www.scifac.ru.ac.za/resourcekit


C#

Details of the C# implementation developed by Hanspeter Mössenböck can be found on his web site at http://www.ssw.uni-linz.ac.at/Coco/, where you can find documentation, and links to the source files, frame files, and a toy example compiler.

This version is closely compatible with the Linz versions for C++ and Java, but not immediately compatible with other versions (in particular the Arzu version). In particular it provides the ability to resolve some LL(1) conflicts by multi-symbol lookahead.

A modified version of this implementation, with some extensions, can be found in the Resource Kit for my book "Compiling with C# and Java" (Pearson, 2005) available at http://www.scifac.ru.ac.za/resourcekit

Icon/UnIcon

Jeremy Powers has developed a Unicon/Icon port of Coco/R. Further details can be found at http://www.wyrdtech.com/cocor/ along with the source distribution, which is also available at http://www.scifac.ru.ac.za/coco/cocoicon.zip.


Ruby

Ryan Davis has developed a port of Coco/R for Ruby 1.0.0. Further details can be found at http://sourceforge.net/projects/coco-ruby/

Coco/R(uby) is a port of Coco/R to Ruby and generates pure Ruby parsers and scanners. This version of Coco/R is not related to Mark Probert's version, found at http://raa.ruby-lang.org/list.rhtml?name=coco-rb.

Ryan's version of Coco/R generates pure Ruby. Mark's version generates C for Ruby extensions. If you find Ryan's version too slow, you might want to check out Mark's. If, however, you need pure Ruby or can't deploy where there is a C compiler, you finally have an LL solution.


Ada

Oleksandr Havva has developed a port of Coco/R for Ada. This is available at http://www.ada-ru.org/files/cocor_ada-1.53.1.tar.gz

Component Pascal

Bernhard Treutwein of Ludwig-Maximilians-Universität, Munich has supplied details of a Component Pascal version of Coco (based on one by Stewart Greenhill with contributions from Bernhard). It can be found in Helmut Zinn's Component Pascal Collection at http://www.zinnamturm.eu/downloadsAC.htm#Coco



Case studies

Besides allowing users to generate their own systems, Coco/R can also bootstrap itself to generate a driver, parser, scanner, and semantic evaluator from its own attributed grammar CR.ATG. This grammar thus serves as a fairly large example of how to write compiler descriptions for Coco/R.

An example of a simple, but usable, Modula-2 to Pascal converter for use with the Modula-2 version can be downloaded from http://www.scifac.ru.ac.za/coco/mod2pas.zip (33K)

An example of a Pascal-S compiler coded for the Turbo Pascal version of Coco/R can be downloaded from http://www.scifac.ru.ac.za/coco/pascals.zip (41K)

Pat Terry's books also contain numerous detailed case studies.

Some of the distribution kits themselves contain various simpler complete case studies, including a suite of compiler/interpreter, pretty-printer and cross-reference generator for a very simple Pascal-like language. There are also a number of further sample grammars for Coco/R that are not necessarily suitable for immediate use, but which would need massaging and adaptation before sensible parsers could be constructed.



Disclaimers

It is important to realise that Coco/R was originally intended for use with grammars that meet the LL(1) conditions. Many grammars require some massaging before these conditions are met. Some of the examples in the kit are (deliberately) non-LL(1) and are intended as examples for study and experiment. However, the use of Extended BNF (EBNF) instead of simple BNF makes it easy to avoid most LL(1) conflicts. The latest versions for C#, C++ and Java allow for LL(1) conflicts to be resolved by multi-symbol lookahead or by semantic checks, but these facilities are not (so far as is known) implemented in other versions yet.
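
As a rough illustration of the LL(1) condition, the Python sketch below checks whether the alternatives of each nonterminal start with disjoint sets of terminals. It is a simplified check, not the analysis Coco/R itself performs: it assumes that no alternative can derive the empty string, so FOLLOW sets are ignored. The two small grammars and the helper nonterminal Rest are invented for the example.

```python
def first_sets(grammar, terminals):
    """Compute the start set of each nonterminal by iterating to a fixed point.

    Simplifying assumption: no alternative derives the empty string, so only
    the leading symbol of each alternative matters.
    """
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                head = alt[0]
                new = {head} if head in terminals else first[head]
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def ll1_conflicts(grammar, terminals):
    """Return (nonterminal, shared terminals) wherever two alternatives overlap."""
    first = first_sets(grammar, terminals)
    conflicts = []
    for nt, alternatives in grammar.items():
        seen, clash = set(), set()
        for alt in alternatives:
            head = alt[0]
            start = {head} if head in terminals else first[head]
            clash |= seen & start           # terminals that begin an earlier alternative too
            seen |= start
        if clash:
            conflicts.append((nt, clash))
    return conflicts


terminals = {"ident", "number", ":=", "(", ")"}

# Plain BNF style: both alternatives of Statement begin with ident.
conflicting = {
    "Statement": [["ident", ":=", "Expr"], ["ident", "(", "Expr", ")"]],
    "Expr": [["ident"], ["number"]],
}

# Factored, as EBNF would allow: Statement = ident ( ":=" Expr | "(" Expr ")" ).
factored = {
    "Statement": [["ident", "Rest"]],
    "Rest": [[":=", "Expr"], ["(", "Expr", ")"]],
    "Expr": [["ident"], ["number"]],
}

print(ll1_conflicts(conflicting, terminals))  # [('Statement', {'ident'})]
print(ll1_conflicts(factored, terminals))     # []
```

In the first grammar a parser that sees ident cannot choose an alternative without also inspecting the following token, which is the kind of situation that multi-symbol lookahead addresses; in the second, factoring out the common ident removes the conflict altogether.
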

While every attempt has been made to ensure that Coco/R performs satisfactorily, the developers can accept no liability for any damage or loss, including special, incidental, or consequential, caused by the use of the software, directly or indirectly.



Mirror sites for the Modula-2, Pascal and C versions

Versions of Coco/R, including some earlier releases, should be available for anonymous ftp from the mirrors listed below (not all servers carry all versions; the one in South Africa tends to be the most up to date):
In Europe          ftp://ftp.ssw.uni-linz.ac.at/pub/Coco

In Australia       ftp://ftp.fit.qut.edu.au/pub/coco

In South Africa    http://www.scifac.ru.ac.za/coco

In the USA         ftp://ftp.psg.com/pub/modula-2/coco

Coco/R is also available by mail server in the USA. The distributions come in various files (in each case "xxx" denotes the release number, for example 133 for version 1.33). The kits contain sources, objects, examples and vanilla-ASCII documentation files.



Contacts

Prof. Hanspeter Mössenböck
University of Linz
Institute of Computer Science
Altenbergerstr 69
A-4040 LINZ
Austria
Tel: +43-732-2468-7131
e-mail: moessenboeck@ssw.uni-linz.ac.at
WWW: http://www.ssw.uni-linz.ac.at/General/Staff/HM/
 

Prof. Pat Terry
Computer Science Department
Rhodes University
6140 GRAHAMSTOWN
South Africa
Tel: +27-46-603-8292
e-mail: p.terry@ru.ac.za
WWW: http://www.scifac.ru.ac.za/cspt/

Francisco Arzu
e-mail: frankie_arzu@yahoo.com or farzu@uvg.edu.gt



Trademarks

Any and all trademarks used on this page are duly acknowledged. In particular, UNIX is a trademark of AT&T Bell Laboratories, MS-DOS is a registered trademark of Microsoft Corporation, and Turbo Pascal is a trademark of Borland International Corporation.