Where is the archive for newsgroup X?

This document is a first stab at collecting answers to the titular question, "where can I find an archive for a newsgroup?" Information appears in these sections:

Remarks on formats

The volunteers who archive newsgroups do so in a variety of formats. Indexes, compression schemes, and so on vary in sophistication.

Indexes

Remarks.

Compression

Section 2 of the main comp.compression FAQ explains the compression schemes commonly encountered. The two most widely used are GNU zip (gzip), from the GNU Project, and compress.
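Anyone fetching archives will meet both formats, often without a reliable file extension. The sketch below is my own illustration, not something taken from the comp.compression FAQ: it tells the two apart by their leading "magic" bytes (1f 8b for gzip, 1f 9d for compress). The function names are hypothetical. Python's standard library decodes gzip but not compress, so compress data is only reported here.

```python
import gzip

# Magic numbers found in the first two bytes of each format.
GZIP_MAGIC = b"\x1f\x8b"      # gzip
COMPRESS_MAGIC = b"\x1f\x9d"  # Unix compress (.Z)


def detect_compression(data: bytes) -> str:
    """Return 'gzip', 'compress', or 'none' based on the leading bytes."""
    if data.startswith(GZIP_MAGIC):
        return "gzip"
    if data.startswith(COMPRESS_MAGIC):
        return "compress"
    return "none"


def read_archive_bytes(data: bytes) -> bytes:
    """Decompress gzip data; pass uncompressed data through unchanged.

    The standard library has no decoder for compress (.Z) data, so it is
    reported with an error rather than silently mishandled.
    """
    kind = detect_compression(data)
    if kind == "gzip":
        return gzip.decompress(data)
    if kind == "compress":
        raise ValueError("compress (.Z) data: decode it with uncompress or gzip -d first")
    return data
```

To use it, read an archive file in binary mode and pass its contents to read_archive_bytes.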

Remarks on sites

Credit goes to all the sites which make newsgroup archives available to the Internet community. I know that some of the most remarkable achievements are those done on a shoestring. At the same time, the Sunsites and Mark Kantrowitz's AI archives at CMU (mail ai+news-archives@cs.cmu.edu with comments) deserve special recognition for the volume of material they maintain. [Explain more.]

Advice to those who want to construct an archive

Automation tools

HURL

The HURL project demonstrates one innovative approach to constructing an archive of newsgroup postings.

Hypermail

rkive

rkive, the USENET newsgroup archiver, is a project of Kent Landfield. There is enough interest in it to support a mailing list.

Usenet-Web 1.0

Usenet-Web 1.0 is Benjamin Franz's archiving software.

Notable archives

Some newsgroup archives distinguish themselves by ...

Archive angels

...XXXX, YYY, Benjamin Franz, ... have experience in constructing and maintaining newsgroup archives, and have graciously offered to make their email addresses available to those seeking advice.

The moderated-groups mailing list [give ref] comprises a wealth of experience on archive maintenance.

One bit of counsel: read the basic information retrieval (IR) literature.
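For a taste of what that literature covers, here is a minimal sketch of an inverted index, one of the first structures any IR text introduces. It is my own illustration under simple assumptions (articles held in memory, keyed by Message-ID, lowercase alphanumeric tokens); it does not describe how any archive or tool named in this document works, and the function names are hypothetical.

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"[a-z0-9]+")


def build_index(articles: dict[str, str]) -> dict[str, set[str]]:
    """Map each word to the set of article IDs whose text contains it."""
    index = defaultdict(set)
    for article_id, text in articles.items():
        for word in TOKEN.findall(text.lower()):
            index[word].add(article_id)
    return dict(index)


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the IDs of articles containing every word in the query."""
    words = TOKEN.findall(query.lower())
    if not words:
        return set()
    # Start from a copy so the index itself is never modified.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

Build the index once over the archived articles, then call search for each query; the returned IDs can be mapped back to the stored postings.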

What to do if you can't find an archive

Feel free to email me; it occasionally happens that, with the motivation of a specific need, I can turn up an archive that hasn't yet made its way into this index. Before resorting to that, though, there are a few other resources which archive-seekers ought to know:

NetNews databases and indexes

Altavista

Altavista and DejaNews are the two archives I use most often.

DejaNews

DejaNews is simply great. For all the attention Yahoo receives, fulfilling a different need, I find DejaNews a far more impressive technical feat. If you think you read something in some newsgroup within the last two months, and you can remember a few keywords from the article, or the author's name, or the newsgroup, or perhaps some other shred of information, you're likely to find exactly what you need with DejaNews in short order.

Excite

Excite implements interesting new indexing technology, which it applies both to Web and newsgroup documents. I have yet to succeed in convincing it to deliver me any of the latter, though; it definitely searches and locates relevant postings, but my experience is that something is wrong with delivery.

InfoSeek

InfoSeek indexes something like the most recent month of over 10,000 NetNews newsgroups. InfoSeek charges a fee, but permits limited demonstration searches at no charge. I'm still experimenting with InfoSeek; it looks quite promising.

InReference

Internet Archive

I haven't researched this site at all. I do know that founder Brewster Kahle is a thoughtful pro, and I look forward to studying his work here.

NPAC Oracle 7

In 1995, Gang Cheng of the Northeast Parallel Architectures Center began experimental (?) archives of a number of newsgroups (and mailing lists). The user interface has yet to communicate to me all that I believe is intended, but I do know how to reach the Hypermail view there of several dozen newsgroups. I've included them in their proper places in this index. Perhaps others can locate even more information at this site; I know it's not all getting through to me.

Usenet Newstand

Critical Mass Communications' Usenet Newstand most closely resembles DejaNews, from what I can tell. It indexes a far, far more restricted universe of newsgroups, but seems to afford more precision in its searches. I'm still researching this one.

The CD-ROM answer

For the first six months of the life of this document, I included excerpts from the promotional literature of CD Publishing Corporation and InfoMagic. I've since decided that hypertext references to their WWW pages suffice. I have no relation to either organization, not even that of customer. Both offer newsgroups-on-CD-ROM. See them for more details. Also, PCM-Productions publishes CD-ROMs of several alt.binaries.* and Visual Basic-related newsgroups.

Steve Murray tells me that InfoMagic's CD-ROM includes a useful selection of comp.* newsgroups, their FAQs, and miscellaneous other FAQs, but no more than that, and that CD Publishing does not reply to his emailed inquiries. Be wary of WWW pages that have superficial appeal but haven't been maintained for many months. Good luck, all.

More recently, I've learned of "Internet on a Cd-Rom", from Logica Servizi Edizioni Software s.r.l. Again, I have not seen this for myself, but promotional literature promises tens of thousands of quality articles per CD, for under $75 US each. Contact Logica at ndr@logica.it or log-info@logica.it, or telephone 39-6-44291214, or FAX 39-6-44291390.

Public newsreading sites with long expirations

Perhaps you're searching for a particular article posted recently to NetNews, but long enough ago that it has already expired from your home system. Is there someplace else you might read it? Perhaps so. Here are some possibilities:

Why

Why I composed this Index

Like many of the things I've done--and most of my best work, in particular--I created this index on a day when doing it seemed less trouble than not doing it. A net_journalist pushed me recently, and I estimated my original motivations as ...

My realized rewards after a year and a half have been more like ...

In my life as a software engineer, I've learned wild enthusiasm for reviews. I've also learned that for results to match expectations even this closely should be counted a notable success. I do; on the whole, I'm glad I started this Index to Newsgroup Archives.

Why newbies read it

I know Ron Meisenheimer only through a few email exchanges. He's far too thoughtful and articulate for me to imagine that he ever was a "newbie", but he once called himself that, and explained:
"Before I signed on with an access provider, I read something on netiquette and style. One tip that made an impression on me was that you should get a feel for a newsgroup before you barge in with something that might be inappropriate. Archival material is perfect for getting that feel. You don't have to lurk for weeks or even months and still be unsure that your posting won't draw a let's-not-go-into-that-again response."

Why net anthropologists study it

Why long-t...

Mirrors

Copies of this meta-index, slightly lagged, are available also at

Acknowledgements

Thanks to Jorn Barger, Jamie Blustein, Josh Hayes, Jennifer Hodgdon, Jim Jewett, Kent Landfield, Larry London, Ron Meisenheimer, Gerald Oskoboiny, Jon Reeves, Edward Vielmetti, and Danny Yee, who supplied me with various combinations of support, inspiration and information. Also credit David Pascoe, who methodically searched for comp.binaries.* archives, and reported the results to me for the benefit of all of us, and many other individuals, who each filled in one slot in the list nearby. Finally, there are about two dozen others who chipped in at various times, and whom I intended to thank explicitly, but I mislaid the list of their names--sorry!

Thanks to NeoSoft, a commercial provider of, among other services, Internet connectivity. They have made this space available at reduced charge.

Administrative details

Status

This is a work in progress.

In late 1995, there were around three to four hundred accesses each day to the *newsgroup_archives web.

New directions

I'm Cameron Laird. I'm looking for help:

History

I began a draft of this document on 16 August 1994, and immediately pushed it into FTPspace with the dozen archives I knew at that time. I have big plans to beef up the list as I learn more. Suggestions, corrections, ... couldn't be more welcome. Please note that I'm maintaining this reference with less quality assurance than is appropriate for many other projects; in particular, I make no guarantees about the completeness of archives (some I know to be truncated), currency of addresses (I update them as often as I make time, but that's not much), or even hospitality of the hosting sites (I don't *think* any of these archives are supposed to be confidential, but I have explicitly confirmed that in only a few cases).

Validation methods

I'm slowly automating aspects of validation of these WWW pages, particularly in regard to [Explain details, some day.]

First drafted in HTML on 1 August 1994 by Cameron Laird. http://starbase.neosoft.com/~claird/news.lists/newsgroup_archives.html