LINFO

GCC Definition



The GCC (GNU Compiler Collection) is widely regarded as one of the most important pieces of free software. Formerly called the GNU C Compiler, the GCC now contains compilers for the C, C++, Objective-C, Fortran, Java and Ada programming languages.

A compiler is usually a program that converts the source code versions of other programs into assembly language or into machine language, the form that can be executed directly by a processor (i.e., a logic chip). Some compilers, however, are designed to convert source code written in one programming language into the equivalent written in another programming language. Source code is the original version of software as it is written by a human in plain text (i.e., alphanumeric characters) using a programming language.

Free software is software whose license allows everyone to use it for any desired purpose, including installing on as many computers as desired, studying, modifying, extending and redistributing in its original or modified form. It is usually also available at no monetary cost, although the term refers to freedom rather than to price. Among the best known examples are the Linux operating system and the Firefox web browser.

The GCC contains a separate program for each of the programming languages for which it can be used. All share a common internal structure that consists of a language-specific front end (i.e., the first stage of a compiler), which parses (i.e., analyzes) the programs and generates an abstract syntax tree, and a back end, which (1) converts the tree to GCC's Register Transfer Language (i.e., an intermediate form of the code), (2) performs various optimizations of it and (3) then generates the final assembly language. The assembly language is produced using architecture-specific (i.e., processor-specific) pattern matching techniques based on an algorithm written by Jack Davidson and Christopher Fraser.
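The division of labor between a front end and a back end can be illustrated with a toy example. The following is only a minimal sketch under stated assumptions: the node type, the function name emit and the stack-machine pseudo-assembly are all hypothetical and are not the GCC's actual data structures, Register Transfer Language or pattern matching algorithm. It assumes that a front end has already parsed an expression into a small syntax tree, and it shows a back end walking that tree to generate lower-level instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical node kinds for a toy expression language. */
enum node_kind { NODE_NUM, NODE_ADD, NODE_MUL };

/* A toy syntax tree node; the GCC's real tree representation is far richer. */
struct node {
    enum node_kind kind;
    int value;                 /* used only when kind == NODE_NUM */
    const struct node *left;   /* operands of NODE_ADD and NODE_MUL */
    const struct node *right;
};

/*
 * Toy "back end": walk the tree in post-order and append one line of
 * stack-machine pseudo-assembly per node to out.
 * Returns the number of characters written, or -1 if out is too small.
 */
int emit(const struct node *n, char *out, size_t size)
{
    int used = 0;
    int r;

    assert(n != NULL && out != NULL);

    if (n->kind == NODE_NUM) {
        r = snprintf(out, size, "push %d\n", n->value);
        return (r < 0 || (size_t)r >= size) ? -1 : r;
    }

    /* Emit both operands first, then the operator that consumes them. */
    r = emit(n->left, out, size);
    if (r < 0)
        return -1;
    used += r;

    r = emit(n->right, out + used, size - (size_t)used);
    if (r < 0)
        return -1;
    used += r;

    r = snprintf(out + used, size - (size_t)used, "%s\n",
                 n->kind == NODE_ADD ? "add" : "mul");
    if (r < 0 || (size_t)r >= size - (size_t)used)
        return -1;

    return used + r;
}
```

For the tree that a parser would build for 2 + 3 * 4, this walk produces the lines push 2, push 3, push 4, mul and add. On a real installation, the assembly language that the GCC generates for a C file can be inspected by stopping after that stage with a command such as gcc -S file.c.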

The GCC contains a full-featured ANSI (American National Standards Institute) C compiler that also supports K&R C (the classic first version of C). This compiler provides multiple levels of source code error checking that are usually provided by other tools (e.g., Lint and Splint), outputs extensive debugging information and can perform many types of optimizations on the resulting object code.
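As a rough illustration of these capabilities, the sketch below shows a small ANSI C file together with one possible command line for compiling it. The file name mean.c and the function mean are assumptions made up for this example; the options shown are standard GCC options, but the exact warnings that are reported vary between GCC versions.

```c
/*
 * mean.c: a small ANSI C translation unit (hypothetical example).
 *
 * One possible command line:
 *   gcc -ansi -pedantic -Wall -Wextra -g -O2 -c mean.c
 *
 *   -ansi -pedantic   check the code against ANSI C and report extensions
 *   -Wall -Wextra     enable many additional source code warnings
 *   -g                emit debugging information for a debugger
 *   -O2               enable a commonly used level of optimization
 */
#include <assert.h>
#include <stddef.h>

/* Returns the arithmetic mean of n integers, or 0.0 when n is zero. */
double mean(const int *values, size_t n)
{
    long sum = 0;   /* without this initializer, -Wall would typically
                       warn that sum may be used uninitialized */
    size_t i;

    assert(values != NULL || n == 0);

    for (i = 0; i < n; i++)
        sum += values[i];

    return n == 0 ? 0.0 : (double)sum / (double)n;
}
```

The warning options play the role that a separate checking tool would otherwise fill, while -g and -O2 correspond to the debugging information and optimizations mentioned above.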

C is the dominant language for systems programming (i.e., for developing operating systems, compilers and other programming languages), and it is also widely used for developing application programs. This is due to its simplicity, efficiency and flexibility and to the ability of C programs to be easily adapted to new platforms (i.e., processors and operating systems). The GCC itself is written almost entirely in C, although much of the Ada front end is written in Ada.

C++ is an object-oriented language that is based on C and is widely used for applications development. Objective-C is an object-oriented superset of ANSI C created by Brad Cox in the early 1980s. Fortran (FORmula TRANslator), whose development began at IBM in 1954, was one of the first high-level programming languages, and it is still widely used for scientific computing because of its compact notation for equations, ease in handling large arrays and huge selection of library routines for solving mathematical problems efficiently. Java is an object-oriented language developed by Sun Microsystems whose syntax is derived largely from that of C++ and which is widely used for enterprise-class, networked applications. Ada, a large, complex, block-structured language aimed primarily at embedded applications, was developed at CII Honeywell in 1979 and was made mandatory for U.S. Department of Defense software projects.

Front ends for several additional languages that are not yet integrated into the main distribution of the GCC have also been written, or are being written. They include Pascal, Mercury, COBOL and Modula-2.

The GCC was originally developed by Richard Stallman as part of his GNU project, which is aimed at developing a completely free, POSIX-compliant operating system that can work on multiple architectures and in diverse environments. A beta version of the GCC, version 0.9, was released in March 1987, and that was followed two months later by version 1.0. POSIX (Portable Operating System Interface for uniX) is a set of programming interface standards governing how to write source code so that the applications are portable between operating systems; Linux and other Unix-like operating systems are POSIX compliant.

In 1997, a group of developers who were dissatisfied with the slow pace and closed nature of the official GCC development process launched a project called EGCS (Experimental/Enhanced GNU Compiler System), which merged several experimental branches into a single project. In response to the greater productivity of their work, in April 1999 Stallman's Free Software Foundation (FSF) discontinued its own development of the GCC and designated the EGCS project as the official maintainer of the GCC.

The GCC is now maintained by a diversified group of programmers from a number of countries. Major decisions are made by a steering committee, which was established for the purpose of preventing any particular individual, group or organization from obtaining control of the project and ensuring that it adheres to its fundamental principles as described in its mission statement. These principles include (1) developing and improving a world-class optimizing compiler that will work on multiple architectures and in diverse environments and (2) supporting the goals of the GNU project.

The GCC has been ported to (i.e., modified to run on) more than 60 platforms, which is reportedly more than for any other compiler. These include 3b1, AMD 29k, AIX385, DEC Alpha, Altos3068, Amix, ARM, Convex, CRDS, Elxsi, FX2800, FX80, Genix, HP320, Clipper, x86 (MS-DOS, ISC, SCO, SysV.3, SysV.4, Mach, BSD, Linux, Microsoft Windows, OS/2), Intel IA-64, Iris, i860, i960, Irix4, 68000, Motorola m88k SysV.3, MIPS-news, mot3300, NeXT, NS32K, NWS3250-v.4, HP-PA, PC532, Plexus, Pyramid, ROMP, RS/6000, SPARC-SunOS, SPARC-Solaris2, SPARC-SysV.4, SPUR, Sun386, Tahoe, TOW, Umpis and VAX.

Among the GCC's target processors (i.e., processors for which it can compile software that will run on them) are Alpha, ARM, H8/300, System/370, System/390, x86 and x86-64, IA-64 (Itanium), Motorola 68000, Motorola 88000, MIPS, PA-RISC, PDP-11, PowerPC, SuperH, SPARC and VAX.
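Source code can find out which of these targets it is being compiled for, because the GCC predefines macros that identify the target processor. The following is a minimal sketch: the function name target_arch and the returned labels are made up for this example, and the list of macros checked is deliberately partial. The full set of macros predefined on a given installation can usually be listed with a command such as gcc -dM -E - < /dev/null.

```c
/*
 * Returns a short label for the processor family that this file was
 * compiled for, based on macros that the GCC predefines for each target.
 * Only a few targets are checked; all others are reported as "unknown".
 */
const char *target_arch(void)
{
#if defined(__x86_64__)
    return "x86-64";
#elif defined(__i386__)
    return "x86";
#elif defined(__ia64__)
    return "IA-64";
#elif defined(__arm__) || defined(__aarch64__)
    return "ARM";
#elif defined(__powerpc__)
    return "PowerPC";
#elif defined(__sparc__)
    return "SPARC";
#elif defined(__mips__)
    return "MIPS";
#elif defined(__alpha__)
    return "Alpha";
#else
    return "unknown";
#endif
}
```

Because the choice is made by the preprocessor at compile time, the same source file yields a different result for each target without any runtime detection.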

The GCC has been adopted as the main compiler for building and developing a number of operating systems, including Linux, the BSDs (i.e., FreeBSD, OpenBSD, NetBSD and Darwin), Mac OS X, NeXTSTEP (the operating system developed for NeXT computers) and BeOS.

Moreover, nearly every other piece of free software depends to some extent on the GCC. This includes the interpreters for programming languages such as Perl and Python, which are written in C and compiled using the GCC. The GCC has also been crucial to the success of Linux, whose kernel is itself compiled with it.

The GCC is far from being a finished work. Among the current goals are adding more languages, optimizations and targets, improving runtime libraries and increasing the speed of debugging cycles. As of July 2006 the newest release was 4.1.1, a bug-fix release for major release 4.1.0, which was launched at the end of February 2006 and incorporated a number of new features.

The GCC is automatically installed as a standard component of major Linux distributions and other Unix-like operating systems. It is also available online [1], including precompiled binaries for several platforms.


________
[1] The GCC's home page is http://gcc.gnu.org/.






Created July 23, 2004. Updated July 6, 2006.
Copyright © 2004 - 2006 The Linux Information Project. All Rights Reserved.