[Draft document. See link below for details on status.]

Perl: a technology white paper from O'Reilly & Associates


Quicker and more reliable IT results with Perl

by Tim O'Reilly, Cameron Laird, Larry Wall, and Nathan Torkington

 

 

Your organization needs Perl. Its strengths in data processing are so many, and its costs so small and manageable, that we're ready to bet you'll profit from Perl, whatever the specifics of your situation. This white paper presents the facts for you to examine that claim for yourself.

The focus here is entirely on Perl's role in solving problems typical of information technology (IT) departments. Perl's history, the details of its implementation, and its appropriateness for use in educational settings are examples of topics which appear in this white paper only to the extent they're pertinent to Perl's practical application to solving problems.

One theme that does recur throughout this white paper is how Perl is like other computer technologies, and how it's different. Perl's full value to your organization depends on the balance between these two aspects. Perl is sufficiently similar to other languages that you can use your existing work practices: methods of assigning work, arranging for training, managing configurations, designing implementations, and so on. To get the most benefit from Perl, though, you'll eventually learn to exploit its singular virtues, including its astounding run-time library of freely re-usable modules.

 

What is Perl?

Perl is a computer language, like RPG, C, or Java. You might have heard that Perl is a "scripting language", or is "interpreted", or several other characterizations. These are distractions; if you want to know more about such categories, several of the references in the Appendix explain such technical characteristics as Perl's byte-code compilation, syntax scheme, and embeddability. What's important to understand about Perl as a computer language are these qualities: it is ubiquitous, highly expressive, flexible, reminiscent of several other languages, inexpensive to use, and endowed with a remarkably powerful run-time library.

This section explains these qualities and their importance in your department's efficiency. The different qualities complement each other. Their culmination, explained in the final paragraphs of this section, is the Comprehensive Perl Archive Network (CPAN). You'll want to read all the way through to understand the incredible benefits CPAN can bring you.

Perl is ubiquitous. Expect to be able to run Perl on every computer. Good Perl processors are available not only for all flavors of Unix, Windows, and MacOS, but also for OS/400, OpenVMS, MS-DOS, and many more specialized operating systems. Perl's availability is far, far greater than Java's, for example, and it's much easier to write Perl programs that work identically across operating systems than it is to write such programs in C.

You can experience this platform-independence for yourself. Choose the desktop you prefer for your own work. Starting from scratch, you should be able to locate, download, and install a freely-available copy of a Perl processor for that operating system in less than an hour. In under an hour more, even if you have no previous experience with Perl, you can write a simple program to handle a modest but useful chore for you. You might report on all files on your system of a minimum size which haven't been changed recently, or automate retrieval through File Transfer Protocol (FTP) of daily operating statistics on a particular server, or enhance the sophistication of your spam filters. You probably have experienced first-hand the tedium of porting a substantial application coded in C or Java; the process seems always to uncover surprising details that must be accommodated. You'll enjoy the contrast with Perl. It will surprise you how sensible and trouble-free it is to port most Perl programs.
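
As an illustration of the first chore mentioned above, here is a minimal sketch using the File::Find module that ships with Perl. The subroutine name large_stale_files and the thresholds (1 MB, 90 days) are assumptions chosen for the example, not anything prescribed by Perl itself.

```perl
use strict;
use warnings;
use File::Find;

# Return the paths of plain files under $dir that are at least $min_bytes
# in size and were last modified more than $min_days ago.
sub large_stale_files {
    my ($dir, $min_bytes, $min_days) = @_;
    my @found;
    find(
        sub {
            return unless -f $_;      # plain files only
            my $size = -s $_;         # size in bytes
            my $age  = -M $_;         # days since last modification
            push @found, $File::Find::name
                if $size >= $min_bytes && $age > $min_days;
        },
        $dir
    );
    my @sorted = sort @found;
    return @sorted;
}

# Hypothetical thresholds: 1 MB and 90 days; the starting directory comes
# from the command line, defaulting to the current directory.
print "$_\n" for large_stale_files($ARGV[0] // '.', 1_000_000, 90);
```

Because File::Find hides the differences in how each operating system walks a directory tree, a script of this shape would be expected to run unchanged on the platforms listed above.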

Perl is expressive. Perl has developed during more than a decade to simplify programmers' work. It has many abbreviations and shortcuts to streamline the effort of development, and is notorious for its ability to support a variety of coding styles. This is an advantage to you because developers generate brief programs with more efficiency and accuracy than long ones.

Perl dominates several different application realms. Estimates for Perl's portion of all Common Gateway Interface (CGI) programming, over all platforms, range up to 90%. System administrators have been writing small-but-powerful filters and converters in Perl for over ten years.

Perl is flexible. Perl is flexible to a fault. First, at the level of syntax, Perl is famous for offering a variety of ways to express the same semantics. Perl programmers with good senses of style can make their work more readable and idiomatic by choosing among such functional equivalents as:

if ($ytd < $budget) { $monthly = 14 };

$monthly = 14 if $ytd < $budget;

$ytd < $budget and $monthly = 14;

and

unless ($ytd >= $budget) { $monthly = 14 };

There's a more fundamental sense in which Perl is flexible, though. The problems it solves range over a wide span:

C, C++, and Java are generally regarded in the industry as the "standard" languages of implementation for new projects. They're seen as safe choices for a wide span of projects. Perl's record of success at solving problems that require a range of architectural approaches is entirely comparable to those of C, C++, and Java--except that Perl programmers generally get results quicker. A team with active competence in Perl is well equipped for most challenges.

 

Perl will remind your programmers of other languages. Perl looks like other languages, so it won't surprise your programmers. Whether the core competency of your programming staff is in Visual Basic, C, Java, COBOL, FORTRAN, or RPG, they'll find plenty in Perl that's familiar. Perl permits object-oriented coding, but doesn't insist on it. Whatever your developers' background, they'll find the transition to Perl a gentle one.

Perl is low-cost. It's not just that there are no licensing fees for running Perl. Its load on your machines is light; Perl programs typically perform within a small factor of those coded in C or assembler. Perl's superiority in speed of development generally more than compensates for any shortfalls in runtime speed a particular Perl program exhibits. In the few cases where Perl's performance constrains an application, it's a simple matter to re-link Perl with hand-coded C or C++ to eliminate bottlenecks.

Perl's run-time library is remarkable. There's a crucial technical and cultural point that lies behind all the success among Perl practitioners you've been reading about. Perl's run-time library (RTL) is qualitatively different from what you've experienced with other languages. It's important to examine this difference in detail.

Generally when people talk about the essentials of a computer language, they focus first on syntax. With C, for example, the source code for a program is a sequence of statements, most of which have expressions which are themselves made up of smaller elements. To understand C syntax requires comprehension of right-hand sides, left-hand sides, pre-operations and post-operations, rules of precedence, and more.

Even with all that, though, you're only half-way to understanding. To accomplish real work under a particular operating system, you need to use proper syntax in invoking the appropriate "run-time library". For a Windows application, this might be the Microsoft Foundation Classes (MFC); under Unix, the POSIX.1 library is a good choice. Such run-time libraries as these provide all the essential building blocks--reading, writing, establishing network connections, reporting on the environment, and so on--that a useful program combines to achieve a result. C is often criticized for its lack of portability. Generally what is meant by this is that the run-time libraries used in developing C-coded applications are poorly standardized, not that the C syntax is interpreted differently on different platforms.

You've read above that the Perl syntax has certain virtues--it's generally readable to those who know other languages, it's admirably concise, and so on. While all these qualities are true, they don't explain Perl's record of achievement. To appreciate that, you need to know about Perl's core run-time library and its amazing extension, CPAN.

First, Perl's core run-time library is portable across all platforms (with minor exceptions--some facilities have no meaning under a particular OS). While the names of the functions and operations in the RTL often show their descent from Unix conventions, they work reliably and accurately on all supported platforms. In general, Perl programmers don't need different reference manuals for each operating system on which they deliver. In this, Perl is more like Java than C or Basic.
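
A small sketch of what this portability looks like in practice, using the File::Spec and File::Basename modules from the core distribution. The subroutine name log_path and the "reports" directory are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;
use File::Spec;
use File::Basename qw(basename);

# Build the path of a log file under the system's temporary directory,
# letting File::Spec supply the host's own directory separators.
sub log_path {
    my ($name) = @_;
    return File::Spec->catfile(File::Spec->tmpdir, 'reports', "$name.log");
}

my $path = log_path('daily');
print "full path: $path\n";
print "file name: ", basename($path), "\n";
```

The same source runs unmodified wherever Perl does; only the printed separators and the temporary directory differ from one operating system to the next.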

Along with the core RTL included with all Perl distributions, volunteers have assembled an incredible archive of specialized auxiliary run-time libraries called CPAN. While Java and many other technological innovations have presented themselves as the harbingers of "component-ware", it is CPAN and the controls (also known as OCX) market around Visual Basic that have made this promise real. Do you want to automate e-mail transactions, encrypt transmissions, or connect to a niche database? CPAN probably has a module that'll meet your need "out of the box".

Let's be clear on the magnitude of this achievement. Many languages have the technical capability to support re-use of pieces others have written. Perl has developed an entire culture of re-use, though. When a new technology emerges, for example, eXtended Markup Language (XML), or Lightweight Directory Access Protocol (LDAP), it's now natural and expected for volunteers to write a Perl interface to the technology, to contribute it freely to CPAN for re-use, for it to be properly indexed and archived for public access, for documentation to be available in a standard format, and for all this to be maintained efficiently. The combination is so powerful that it constitutes a professional imperative. Exploitation of CPAN's riches is an undeniable "best practice". Failure to take advantage of CPAN might soon be seen as malpractice.

 

How will Perl help you?

What specific benefits will Perl bring your department? Which problems do you face?

Perl is at its best in providing specific solutions that fit particular situations--exactly the kind of help that's hard to describe in general terms. What this white paper can do, though, is give a few examples of the ways Perl is likely to pay off for you.

Accelerate development. Practitioners consistently report that they code solutions in Perl with a fraction of the effort to do the same with C, C++, or even Java. The line-count for Perl programs is typically a third to a tenth the size of comparable C codings. The effort to write programs scales at roughly the same multiplier.

Let cheap Perl programs do mundane tasks; save expensive humans for more important work. Perl's power and expressiveness expand the horizons of what programs are worth writing. Are nodes on your LAN diverging in their configurations? Perl makes it easy to write a program which examines the Registry of each Windows desktop through the network, and reports discrepancies periodically. Do you find yourself doing a lot of repetitive sorting of your daily e-mail? Automate a Perl filter to file it all for you in separate folders. Is too much of a receptionist's time going to telling in-bound travelers about the local weather, or which salesmen are on the road today? Script a minimal Web page on your intranet so people can see the answers for themselves.

Expand portability. If you do most of your development in PowerBuilder or C++ or most other languages, you probably think of portability as a problem. That will change as you come to rely on Perl. Its portability is good enough that you'll come to assume your work is available on a variety of platforms. This is particularly important for organizations where information processing is a strategic asset. Portability is a requirement for escaping from the pitfall of a single-sourced operating system.

There's more to portability than that, though. Because Perl is so flexible, you'll find that your teams can re-use what they've written across not just different operating systems, but across different software architectures and kinds of applications. The algorithms from the data-mining reports that started out on a mainframe can be imported directly into the CGI for your intranet applications. Little "hygienic" routines that your system administrators use to clean up file system corruption can be polished into automatic performance monitors that work around the clock. Command-line utilities that your engineers happily use can be wrapped up with inviting graphical user interfaces (GUIs) for the convenience of more casual users.

Java promises much the same--in principle, Java also can play all these roles. Perl is generally much quicker to write for small- to medium-size programs, and Perl is simply far easier to use today--and for the foreseeable future--across a variety of operating systems.

Perhaps Perl's strongest achievement in the dimension of portability is that Perl is likely to change your attitude about portability. Instead of the last-minute anxiety that portability brings with many development technologies, Perl's portability is so strong and reliable that you'll find yourself expecting it simply to work right from the start.

 

Correcting the fables told about Perl

Perl is sufficiently different from other languages that many misconceptions about it are repeated. Here's the truth:

Perl is well-supported. Perl is an open-source project. Larry Wall originally created it, and distributes it under a very liberal license which allows others to use and re-distribute it quite freely.

Because Wall doesn't receive any money directly for his work on Perl, many people make the mistake of thinking that Perl is "unsupported". The fact is that, in any objective sense, Perl's support is better than that for more traditional languages. Do you want training in Perl? Dozens of firms offer training. Does the possibility you'll run into an error in Perl worry you? First, Perl's achievement in quality for over ten years has been at least as fault-free as those for commercial vendors of C compilers or Java development environments. When errors have been found, fixes have made their way into releases more rapidly than is typical for commercial language processors; in one case, Wall issued an official release within a week of receiving a security alert from the Computer Emergency Response Team (CERT).

A large community of programmers around the world is expert at modifying and correcting the core Perl sources; many of these programmers are available for hire. Finally, several companies--most prominently, ActiveState Tool Corporation (under the Perl Clinic brand), and PerlSupport--offer contractual guarantees for Perl shops ready to pay for support.

It helps to realize that there are many aspects to a software product; its price is only one. While there's no charge for using Perl, its use is governed by a license and intellectual property law just as much as any product sold in a box on a shelf. You can, for example, freely write and run any Perl program you want, on any platform. You can't, though, simply change "Perl" to "Diamond", and start selling the "Diamond" language processor as your own work.

If you haven't used open-source products before--that is, those for which the source code is freely available--you'll want to familiarize yourself with the provisions of Perl's license. An open-source product isn't necessarily more or less convenient than a conventional commercial one; mostly it's just different. Keep in mind that the people who work with Perl are motivated to encourage its use. License restrictions are in place only to prevent such abuses as fraud.

It's fast. One wide-spread myth about Perl is that it's slow in operation. This often appears as a faulty syllogism: "Perl is just a scripting language, so it must be slow"; or "Perl is interpreted, so it must be slow." In fact, network or hard disk latencies are more likely to limit performance than is Perl's execution model. The easiest way to confirm this is probably to do the experiment yourself. As explained above, you can quickly install a Perl release on a handy machine, implement a few test programs, and time the results yourself. You're likely to find not only that Perl programs are about as fast as those coded in C or Java, but also how much quicker it is to write in Perl rather than those other languages.
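
One way to run such an experiment is sketched below, using the Time::HiRes module from the core distribution. The workload (counting words in 100,000 synthetic lines) and the subroutine name word_counts are assumptions made for the example; substitute a task representative of your own work.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Count word frequencies in a list of lines, a typical small
# text-processing chore.
sub word_counts {
    my @lines = @_;
    my %count;
    for my $line (@lines) {
        $count{lc $_}++ for $line =~ /\w+/g;
    }
    return \%count;
}

# Synthetic input, assumed for illustration: 100,000 identical lines.
my @lines = ('the quick brown fox jumps over the lazy dog') x 100_000;

my $start   = time;
my $counts  = word_counts(@lines);
my $elapsed = time - $start;

printf "counted %d distinct words in %.3f seconds\n",
    scalar(keys %$counts), $elapsed;
```

The elapsed time will vary with the machine, so the useful comparison is against an equivalent program in C or Java run on the same host with the same input.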

It's portable. The trade press echoes Java zealots' creed that that language makes it possible to "Write once, run anywhere". The truth is that Perl has achieved much greater portability than Java. Unlike Java, Perl has a single implementation with identical semantics on all platforms.

Perl also gives easy and powerful access to the facilities of its host operating system. A consequence is that many Perl applications are OS-specific. For example, it's common for Unix coders working with account management to read the system file /etc/passwd directly. This has no meaning under MacOS or Win3.1. However, the Perl language processor itself has a proven record of maturity on a much wider range of OSs than Java workers now enjoy. Programs which avoid OS-specific facilities are at least as portable as Java programs which do not exploit the Java Native Interface (JNI). In the example above, Perl programmers can code more portable access to user account information so that the same program works for both Unix and WinNT.
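
A hedged sketch of the more portable approach: ask Perl's built-in getpwuid for the current account rather than parsing /etc/passwd, and fall back to environment variables on hosts where that call is not implemented. The subroutine name current_user and the choice of the USERNAME and USER variables as fallbacks are assumptions for illustration.

```perl
use strict;
use warnings;

# Return the login name of the current user without reading /etc/passwd.
sub current_user {
    # getpwuid in scalar context yields the login name. It raises an
    # error on platforms that lack it, so the call is guarded with eval.
    my $name = eval { getpwuid($<) };
    return $name if defined $name && length $name;

    # Assumed fallbacks: environment variables commonly set by the host.
    return $ENV{USERNAME} // $ENV{USER} // 'unknown';
}

print "running as ", current_user(), "\n";
```

The program logic stays the same on every host; only the source of the account name changes.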

Perl applications are robust. The Perl language does much more for developers than do C or Java. It's possible to reference uninitialized variables or invoke subroutines with mis-matched arguments. Some language theorists argue that a proper language should restrict coders from such errors. They conclude that applications coded in Perl are fragile.

Quite the opposite turns out to be the case. Perl applications are "failsoft". Rather than core dumping at random times, Perl applications tend to degrade gracefully in the face of unexpected data and coding errors. Moreover, Perl's exception-handling is the equal of Java's and far superior to C's.
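
The following sketch shows the exception-handling idiom behind that graceful degradation: die raises an exception, and an eval block catches it so processing can continue. The record format ("name=amount") and the subroutine names are hypothetical.

```perl
use strict;
use warnings;

# Parse a "name=amount" record; raise an exception on malformed input.
sub parse_record {
    my ($line) = @_;
    my ($name, $amount) = $line =~ /^(\w+)=(\d+(?:\.\d+)?)$/
        or die "malformed record: '$line'\n";
    return ($name, $amount);
}

# Sum the amounts of the good records and collect the error message
# from each bad one instead of aborting the whole run.
sub total_amounts {
    my @lines = @_;
    my ($total, @errors) = (0);
    for my $line (@lines) {
        my $ok = eval {
            my (undef, $amount) = parse_record($line);
            $total += $amount;
            1;
        };
        push @errors, $@ unless $ok;
    }
    return ($total, \@errors);
}

my ($total, $errors) = total_amounts('rent=100', 'bogus line', 'food=50.5');
printf "total=%.2f, %d bad record(s)\n", $total, scalar @$errors;
# prints "total=150.50, 1 bad record(s)"
```

One bad record costs one error message; the remaining input is still processed.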

A particularly stark example of the range of attitudes possible on this subject has to do with type declarations. In C or Java, variables must be declared before they are first used. Perl typing is implicit--variables can be used without ever being declared. This difference certainly is important. It's a strength of Perl, though, not a weakness. Automatic type inferencing and casting are exactly the sorts of jobs that are better done by computers. While C coding is frequently regarded as more "rigorous" or "structured", what's most certain is that it is more verbose and redundant. These defects in C lead to such characteristic frailties as dangling and inconsistent type definitions.
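
A brief sketch of implicit typing at work: the same scalar values move between string and number as the operations require, with no casts or type declarations. The sketch does scope its variables with "my", which is a matter of style under "use strict" rather than a type declaration; the subroutine name sum_fields is hypothetical.

```perl
use strict;
use warnings;

# Add up numeric fields that arrive as text, for example from a report.
sub sum_fields {
    my ($record) = @_;
    my $total = 0;
    # split yields strings; Perl converts each to a number when adding.
    $total += $_ for split /,/, $record;
    return $total;
}

my $total = sum_fields('12,7.5,30');    # strings in, number out
my $label = 'Total: ' . $total;         # number back to string, no cast
print "$label\n";                       # prints "Total: 49.5"
```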

Some computer theoreticians and decision-makers fall into the trap of trying to define languages like George Orwell's Newspeak, in which it is impossible to think bad thoughts. What they end up doing is killing the creativity of programming.

It certainly is possible to achieve quality while coding in Perl. The Perl processor includes several options to scan programs for syntactic dangers. Perl also detects many runtime errors that C compilers don't catch. Good stylists in Perl know how to use its power and flexibility to express solutions more precisely, succinctly, and safely than is possible with "third-generation languages" such as BASIC, C, and Java.
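
As a sketch of those safeguards: "perl -c" checks a script's syntax without running it, "use strict" rejects undeclared variables at compile time, and "use warnings" (or the -w switch) reports suspicious operations at run time. The example below captures one such run-time warning; the helper name warnings_from and the budget data are hypothetical.

```perl
use strict;      # compile time: undeclared variables are errors
use warnings;    # run time: flags uninitialized values and similar slips

# Run a code reference and return the warnings it triggered, rather
# than letting them print to STDERR.
sub warnings_from {
    my ($code) = @_;
    my @caught;
    local $SIG{__WARN__} = sub { push @caught, $_[0] };
    $code->();
    return @caught;
}

my @caught = warnings_from(sub {
    my %budget = (travel => 1200);
    my $total  = $budget{travel} + $budget{trvael};    # misspelled key
    return $total;
});

print "caught: $_" for @caught;
```

The misspelled hash key yields an "uninitialized value" warning that names the offending line, a class of run-time slip that a C compiler would not report.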

Perl is Y2K-compliant. Briefly, Perl is Y2K-compliant. The Perl Institute summarizes: "Perl is every bit as Y2K compliant as the C language upon which its interfaces are based ... That is, the interfaces giving access to date information in Perl, when used as designed, are Y2K compliant in every sense of that word."

Faulty programs can, of course, be written in any language. The best way to think about Perl in regard to the so-called "millennium bug" is that Perl is part of the solution, not part of the problem. Perl is so good at improvisational pattern-recognition that it's very likely you can use Perl to detect and correct errors in your existing software.
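
Two short sketches make this concrete. The first uses the date interface "as designed": gmtime and localtime return the year as a count of years since 1900, so adding 1900 gives the correct four-digit year on either side of 2000. The second uses a regular expression to flag source lines that glue "19" onto a year variable. The subroutine names, and the particular pattern searched for, are assumptions for illustration; a real audit would look for more variants.

```perl
use strict;
use warnings;

# Format an epoch time as YYYY-MM-DD (UTC). The year field counts years
# since 1900 (so 2000 is 100), and the month field starts at 0.
sub iso_date {
    my ($epoch) = @_;
    my ($mday, $mon, $year) = (gmtime $epoch)[3, 4, 5];
    return sprintf '%04d-%02d-%02d', $year + 1900, $mon + 1, $mday;
}

# Return the source lines that build a year by concatenating a literal
# "19" with a variable whose name contains "year" or "yr".
sub suspicious_year_lines {
    my @lines = @_;
    return grep { /["']19["']\s*\.\s*\$\w*y(?:ea)?r/i } @lines;
}

print iso_date(946684800), "\n";    # prints "2000-01-01"
```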

Good help isn't hard to find. Here's a nightmare you should never have to worry about: you begin using Perl. Most of your Perl coding is done by one system administrator and one enthusiastic development architect. Six months after Perl gains a toehold in your department, both these employees relocate, and none of your current employees have confidence in their ability to code in Perl.

Why shouldn't this worry you? It certainly is important to maintain a critical mass of development expertise on your staff. Perl is like other languages in this regard. The remedies for it are like those for other languages: you'll want to arrange for training in Perl, buy useful Perl tutorials and reference manuals, ensure that your coders use good style (including in-line comments), properly recognize achievement in Perl programming, and alert your recruiters that Perl expertise matters to you. There are no absolute constraints to replacing Perl talents. The market in freelance Perl consultants is a healthy one; several of the traditional leaders in consulting (including IBM, EDS, and Cap Gemini) keep Perl programmers in their stables; and recruiters now are accustomed to looking for mentions of Perl on resumes. Several estimates on the number of programmers worldwide comfortable with Perl have centered on a half-million--more than RPG or JavaScript, and perhaps as many as know Java.

 

Perl is safe. Perl's most telling distinction is one that's hard to quantify: individuals and organizations that learn Perl don't regret their decision. We know of many teams that have had to abandon projects coded in C or C++ or Java after encountering a difficulty inherent in those languages. There are very, very few cases of the same with Perl. Beginners who worry about Perl's speed or readability almost always find that, in practice, Perl gives them all they need.

Sometimes engineers move on from Perl to less well-known languages that meet specialized needs. It's our universal experience in such cases that they continue to use Perl. They're not abandoning Perl for a different language, but replacing it for some uses. Perl's values are permanent. Once you've learned or adopted Perl, you'll find more and more reasons to confirm your choice.

 

What can go wrong when you rely on Perl: possible cultural hurdles you'll face, and how not to stumble over them

Installation of apps. One of the biggest practical challenges of working with Perl is that Perl programs have generally assumed a Perl interpreter is already installed. There's only a weak tradition of constructing stand-alone executables. The most frequent habit of experienced Perl programmers is to distribute solutions as bundles of scripts. This requires that Perl already be properly installed on a target host--an assumption generally valid on the Unix nodes where much early Perl work was done.

The general expectation of workers with Windows and MacOS is that an application can be installed without "prerequisites." Good technical solutions to this challenge exist: despite what many people believe, it is feasible to compile Perl applications into standalone installations. However, they're not yet widely understood. If you need standalone installations, be sure you're working with a consultant who understands the issues for the operating systems on which you deliver.

Licensing. A user has the choice of whether the GNU General Public License (GPL) or the Artistic License governs Perl. In any case, your organization has complete freedom to create any applications with Perl it needs for its own use.

Source code management. Perl encourages brevity and personal coding styles. This is particularly appropriate for the one-liners and special-purpose tools system administrators often need. Larger-scale projects, and especially those with a long enough life cycle to involve more than one person, merit more explicit and thorough coding. Java and Eiffel aim to enforce readability in their syntaxes; sometimes the true consequence is to enforce verbosity. While Perl isn't so restrictive, it certainly permits readability. A wise organization recognizes the importance of supporting programmers in disciplined coding practices. This applies with particular force for Perl.

Perl's renowned flexibility also challenges project scalability in more technical ways. Perl is a good object-oriented language, that is, one in which it's efficient and straightforward to code good object-oriented designs. Perl doesn't enforce object orientation, though. A programming team must establish explicit standards for its work so that individual work products will mesh smoothly. To repeat: it is practical to write large-scale programs with Perl. It's not entirely automatic, though. Perl allows you to choose your level of discipline--that is, it demands leadership, just as C and Java do, in their own way.
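
For readers who have not seen object-oriented Perl, here is a minimal sketch of a class: a package whose constructor blesses a hash reference, plus two methods. The Account class and its fields are hypothetical; a team standard would typically add conventions for naming, accessors, and error handling on top of this.

```perl
use strict;
use warnings;

package Account;

# Constructor: a hash reference blessed into the class.
sub new {
    my ($class, %args) = @_;
    my $self = { owner => $args{owner}, balance => $args{balance} // 0 };
    return bless $self, $class;
}

# Read-only accessor for the current balance.
sub balance { $_[0]{balance} }

# Add a positive amount; returns the object so calls can be chained.
sub deposit {
    my ($self, $amount) = @_;
    die "deposit must be positive\n" unless $amount > 0;
    $self->{balance} += $amount;
    return $self;
}

package main;

my $acct = Account->new(owner => 'ops', balance => 100);
$acct->deposit(25)->deposit(5);
print $acct->balance, "\n";    # prints 130
```

Nothing forces a program to be organized this way, which is why explicit team standards matter.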

 

Piloting Perl into real use

This white paper claims that an investment in Perl will return positive net results for your organization. One of Perl's charms is that the initial investment can be very modest. It's much different in that respect from Java, for example. With Java, you need a bulky Java Development Kit (JDK), and indoctrination in the Java approach to object orientation, and often a Java Integrated Development Environment (IDE), before you can produce useful results.

Perl's much lighter. Everything you're likely to need can fit on a single floppy. Many people have taught themselves enough Perl from a single magazine article to achieve some result--a dynamic Web page or new administrative report--they wanted.

Give Perl a chance. Identify a problem area for your organization that seems apt for Perl, and run a minimal pilot program. Experience for yourself Perl's features, so you can make a good decision about how it can best contribute to your success.

 

Appendix: learning more about Perl

You can learn more about Perl through all the usual mechanisms. Dozens of books, some of them quite good, are available. Training classes are abundant, many of them on-site. Perl topics frequently appear on the schedules of professional conferences, and a few focus exclusively on Perl. Many magazines cover Perl, both in feature articles and monthly columns. There's even one English-language monthly, "The Perl Journal", devoted exclusively to Perl.

More than with other languages you use, though, much of the intelligence available about Perl is on-line. Both Perl.COM <URL:http://www.perl.com/> and The Perl Institute <URL:http://www.perl.org/> are well-organized sites that should lead you quickly to on-line information you might need.