Software and Computer
Programming Languages

Adapted (with much revision and expansion!) from http://fast.faa.gov/pricing/c1919-2.htm

J.Janossy 1/17/2006

What is Computer Software?

Computers touch nearly every aspect of modern life. They range in size and complexity from the large mainframe computers used in major companies to the small personal computers that have become household items. The microprocessor chips at the heart of personal computers are now also used in such commonplace consumer goods as automobile engines, televisions, and microwave ovens. To function, computers must be programmed in the only "language" their circuits can act on directly. This language consists of electronic signals, represented by on/off electric charges, that we commonly refer to by the binary number symbols "1" and "0". A computer interprets groups of these binary digits as character data (for example, ASCII characters), as numeric quantities, or as instructions. Sets of related instructions form computer programs, which are a basic element in the concept of software. Computer software may be defined as:

computer programs, procedures, rules, associated documentation
and data pertaining to the operation of a computer system
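
The point that one group of binary digits can be read either as a numeric quantity or as an ASCII character can be sketched in a few lines of Python. This is only an illustration; the bit pattern and the variable names are chosen for the example and do not come from any particular computer.

```python
# One 8-bit pattern, written with the binary symbols "1" and "0".
bits = "01000001"

# Read the pattern as an unsigned numeric quantity.
as_number = int(bits, 2)

# Read the very same pattern as an ASCII character code.
as_character = chr(as_number)

print(as_number)     # 65
print(as_character)  # A
```

The bits themselves carry no label saying which reading is intended. The program that uses them decides.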

Types of Software

The basic types of software are listed in Table 1. All four types of software are available for and used with personal computers. Windows, for example, is the trade name for the Microsoft operating system used on many PCs; Unix is another operating system, a version of which underlies Mac OS X on Macintosh computers. Listing a directory of user files is an example of a utility software function. Microsoft Excel is an example of a general purpose software package. A system that provides the means to manage a set of customer contacts and sales calls is an example of application software. Software systems may be composed of a combination of multiple types of software.

Type of Software | Function
-----------------|---------
Application software | Software that performs specialized functions like payroll, human resources, student records keeping, air traffic control, and management information system (MIS) functions, or other useful work not related directly to the operation of the computer itself.
General purpose packages | Off-the-shelf software that can do common chores. These include desktop-based word processors, spreadsheets, database programs, presentation programs such as PowerPoint, and other "personal productivity" programs.
System software, also known as the operating system | A collection of programs that manages all the concurrent tasks being performed by a computer, including the execution of application software programs.
Utility software | A set of programs that perform routine tasks, such as listing or compressing data, copying files, and so forth.

Table 1. Types of Software

Software (Computer Programming) Languages

Computer programming languages are commonly grouped into four generations (or perhaps five, if you count futuristic predictions). These languages are the means by which the computer programs forming any type of computer software are actually constructed. The generations begin with the first, which was the only one available on the early experimental and research computers of the 1940s. Later generations demand more processing power, since code written in them must be translated by the computer into first generation language before it can run. The later generations were developed as faster computer circuitry and greater storage capacity became available in the late 1950s and beyond.

First generation language (1GL). Machine language, sometimes referred to as object code or machine instructions, consists of the actual binary instructions that a computer executes. Machine language is as low level as you can get. It is the instruction set defined by the engineers who designed a given type of computer (CPU) chip. While it is extremely efficient, it is very difficult to write directly. It is also not portable: a given machine language program will work only on the type of CPU it was intended for. Since scores of different CPUs exist, machine language is especially unsuitable for writing general purpose software intended for wide distribution.

Second generation language (2GL). Assembler code usually corresponds to machine instructions on a one-to-one basis; that is, one line of assembler code usually translates into one machine language instruction. Although assembler code is easier to write than machine code, most programmers still find it arcane and cumbersome. Since computer horsepower now costs much less than a programmer's time, very few employers will pay a person to write applications in assembler language: it takes too long, is too error prone, and is too difficult to maintain (change when requirements change). The main exception occurs in very specialized instances where a complex software requirement must be met on a relatively low-powered CPU, such as those in spacecraft or the small computer chips embedded in machines such as a car's engine controls. Because assembler is essentially a shorthand way of writing machine language code, assembler programs are also not portable between different CPUs. For example, a program written in assembler for the Intel chip in a PC won't work on a Macintosh built around a PowerPC chip.
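
The one-line-to-one-instruction relationship between assembler and machine language can be sketched with a toy assembler. Everything here is hypothetical: the mnemonics (LOAD, ADD, STORE, HALT), the opcode values, and the 8-bit word layout belong to an imaginary CPU invented for this example, not to any real chip.

```python
# Hypothetical opcode table for an imaginary CPU (not a real instruction set).
OPCODES = {"LOAD": 0b0001, "ADD": 0b0010, "STORE": 0b0011, "HALT": 0b1111}


def assemble(source):
    """Translate each assembler line into exactly one 8-bit machine word."""
    machine_code = []
    for line in source.strip().splitlines():
        parts = line.split()
        mnemonic = parts[0]
        operand = int(parts[1]) if len(parts) > 1 else 0
        # High 4 bits hold the opcode, low 4 bits hold the operand.
        word = (OPCODES[mnemonic] << 4) | (operand & 0b1111)
        machine_code.append(format(word, "08b"))
    return machine_code


program = """
LOAD 5
ADD 3
STORE 9
HALT
"""

machine_code = assemble(program)
for word in machine_code:
    print(word)
# 00010101
# 00100011
# 00111001
# 11110000
```

Four assembler lines go in and four machine words come out. A different imaginary CPU would need a different opcode table, which is the portability problem in miniature.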

Third generation language (3GL). Next come the higher level languages. A program written in a high level language consists of human-understandable instructions, which software called a compiler translates into machine language. These languages are far removed from the actual bit manipulation of a given CPU chip. Popular higher level programming languages have included FORTRAN, COBOL, Basic, PL/I, C, C++, Java, Visual Basic, Ada and many other lesser-used languages. Compared to assembler language, higher level languages are easier to read and write, but they do not always make optimal use of a computer's resources. In older times (about 1980 and before) this was a real concern; since computers are so much faster now, it seldom matters for ordinary applications.

Instructions written in a higher level language are often referred to as source code to differentiate them from machine language instructions. One line of a higher level language such as C++ will be converted behind the scenes by the compiler into many assembler or machine language instructions. The programmer deals only with the higher level language; the generated assembler or machine language code is usually not seen. Third generation languages are also known as "procedural" languages, since the programmer must develop the specific sequence of actions by which data will be located, brought into memory, interpreted, manipulated and then formatted for presentation on an output device.

Among third generation languages, FORTRAN (FORmula TRANslator) was widely used by scientists and engineers for mathematically-oriented computational programs, while COBOL (COmmon Business-Oriented Language) was very widely used for business software development. Both of these languages had their heyday from the late 1950s through the 1990s. Literally billions of lines of COBOL code are still in widespread use throughout the world, while C and C++ have largely taken over from FORTRAN. Because national standards were developed for these languages, programs written in them could be used on computers from different manufacturers with identical results (each different machine had its own compiler producing the required machine language from the standard program statements). This contributes significantly to the longevity of programs written in third generation languages.
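
The way one line of high level source code expands into several lower level instructions can be glimpsed with Python's standard dis module. This is an analogy only: Python translates source code into bytecode for its own virtual machine rather than into the machine language of a CPU chip, and the function total_price is a made-up example.

```python
import dis


def total_price(quantity, unit_price):
    # One line of high level source code.
    return quantity * unit_price + 5


# List the lower level instructions generated for the function.
instructions = list(dis.get_instructions(total_price))
for instruction in instructions:
    print(instruction.opname, instruction.argrepr)

print(len(instructions), "lower level instructions for one source line")
```

The exact instructions listed differ between Python versions, much as different compilers produce different machine language from the same standard program statements.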

Fourth generation language (4GL). Fourth generation languages are known as "non-procedural" languages. The best known of these is SQL (Structured Query Language). In this form of programming language the programmer specifies by name what information is wanted and where it is to come from, but need not be concerned with the explicit actions to input the data, interpret it, and display it. In addition, SQL provides many powerful functions that handle most data conversion and output formatting tasks. At least the essential elements of SQL are consistent among the various relational database products, so learning how to use SQL is a valuable skill. Many other 4GLs are "proprietary." Unlike an open language like SQL, for which published standards exist, a proprietary language is owned by a single vendor, and using it ties an organization closely to that vendor. Widely-used proprietary non-procedural fourth generation languages include SAS and SPSS, both used for statistical analysis of data.
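
The difference between the procedural and non-procedural styles can be sketched by answering one question both ways. The example below uses Python with its built-in sqlite3 module; the customer table, its columns, and the sample rows are hypothetical and exist only for this illustration.

```python
import sqlite3

# Hypothetical sample data: (customer name, city, amount owed).
customers = [
    ("Acme", "Chicago", 120.0),
    ("Baker", "Denver", 75.5),
    ("Cole", "Chicago", 310.25),
]

# Procedural (3GL) style: spell out each step of locating and filtering the data.
procedural_result = []
for name, city, owed in customers:
    if city == "Chicago":
        procedural_result.append((name, owed))
procedural_result.sort()

# Non-procedural (4GL) style: state what is wanted and let the database decide how.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE customer (name TEXT, city TEXT, owed REAL)")
connection.executemany("INSERT INTO customer VALUES (?, ?, ?)", customers)
sql_result = connection.execute(
    "SELECT name, owed FROM customer WHERE city = 'Chicago' ORDER BY name"
).fetchall()
connection.close()

print(procedural_result)  # [('Acme', 120.0), ('Cole', 310.25)]
print(sql_result)         # [('Acme', 120.0), ('Cole', 310.25)]
```

Both approaches give the same answer. The SELECT statement names the data wanted and the order to present it in, while the loop has to describe how to find it.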

Beyond the fourth generation? Although programs today are written in higher level languages, there is a growing interest in languages that even more closely resemble human language. Spreadsheets, word processors, and similar applications often provide what amounts to a very high level language, one that allows a person with little or no programming background to direct a computer. The formulas you plug into a spreadsheet, for example, could actually be considered a very high level language. They are interpreted by code written in third generation languages such as C++ or Java. Programs that let a computer accept spoken instructions directly already exist and can be interfaced to programming environments, but they are not widely used except in support of the disabled. Scientists in the field of cybernetics postulate that somewhere in the distant future computers will have the capability to accept thought commands as inputs. Thought-driven programming? Come back in about 50 years. As of 2006 that level of human-computer interface is still very much in the realm of science fiction!
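
The idea that spreadsheet formulas are a very high level language interpreted by ordinary program code can be sketched with a tiny formula evaluator. This is a minimal illustration, not how any real spreadsheet product works: the cell names, the "=" prefix convention, and the functions evaluate_node and cell_value are assumptions made for the example, and only the four basic arithmetic operators are supported.

```python
import ast
import operator
import re

# Hypothetical sheet contents: cell name -> number or formula text.
cells = {"A1": 10, "A2": 32, "A3": "=A1+A2*2"}

OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate_node(node):
    """Walk the parsed arithmetic expression and compute its value."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
        left = evaluate_node(node.left)
        right = evaluate_node(node.right)
        return OPERATORS[type(node.op)](left, right)
    raise ValueError("unsupported formula element")


def cell_value(name):
    """Return a cell's value, interpreting it as a formula if it starts with '='."""
    content = cells[name]
    if not (isinstance(content, str) and content.startswith("=")):
        return content
    # Replace each cell reference such as A1 with that cell's computed value.
    expression = re.sub(
        r"[A-Z]+[0-9]+", lambda match: repr(cell_value(match.group())), content[1:]
    )
    return evaluate_node(ast.parse(expression, mode="eval").body)


print(cell_value("A3"))  # 74
```

The person typing =A1+A2*2 never sees the parsing and arithmetic steps. The interpreter written in the lower level language carries them out.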