Googling Turritella, or
The Present and Future Value of the Web for Paleontological Research

Introduction: Thanksgiving in the Adirondacks

Last Fall I went looking for fossils. As is often the case with such exploring, I had something particular in mind, but wasn’t really sure in what form it would be if and when I found it. I was very successful; I discovered several significant fossil occurrences that had not been previously reported in the technical literature and that promise to reveal interesting patterns about the history of a particular group of organisms and its environment. I wrote up the results and submitted them for publication. I found some other occurrences that I could make less sense of, and added them to my "to do" list with the intention of looking for more information in the future.

This description probably fits experiences that all paleontologists have had. What made this one noteworthy, at least to me, is that I did almost all of it on-line. Sitting in front of a warm fireplace in the Adirondacks over Thanksgiving weekend 2003, I for the first time typed the genus name of my favorite group of gastropods—"turritella"—into Google.

The results of this exercise were really no different from any other more traditional paleontological exploring, whether in the field or in the drawers of a museum collection. Coming upon new fossil occurrences fortuitously but with a prepared mind has been a crucial part of our field from its very beginnings. What surprised me, however, were the sources of my "discoveries". They were not in the databases of professional researchers, nor in the electronic catalogs of institutional collections, nor in the virtual libraries of data or images (now increasingly referred to as "cyberinfrastructure") that are currently the focus of so much of our field’s activity and funding. They were on websites selling fossils or displaying the personal collections of amateurs. They were on the decorative homepages of museums or departments or small towns. They were in the on-line versions of local fossil club newsletters.

It is a commonplace observation that the Internet has changed our personal and professional lives. The almost daily effects for paleontology include not only the ubiquity of email but also the increasing ease with which fossils are bought and sold (for discussions of the fossil trade, on- and off-line, see, e.g., Forster 2001; Long 2002; Secher 1999; NRC 2002, and references therein). Like so many things today, fossils have been globally commodified. The online fossil trade has in some cases brought institutions more acquisitions, but also more headaches. Many of the fossils easily available on-line have been illegally collected from other countries. Many specimens on-line are fakes, or composites, or restored without being advertised as such. Many have no or incorrect locality information. (These problems are not unique to on-line sales, of course, but they are magnified by the ease and volume of on-line transactions.) Increasingly attuned to potential commercial value by the Internet, furthermore, collectors (or their heirs) are sometimes more reluctant to donate specimens to institutions. When someone does donate a specimen or collection to my institution, I now routinely direct them to eBay as the easiest place to determine its cash value for tax purposes, and I have heard staff from other museums say the same thing.

Saturated though I am with these day-to-day influences, as well as with seemingly endless professional meetings, grant proposals, initiatives, consortia, and workshops devoted to making the collections and data of our field available on-line, I was startled by the manner in which the Internet had abruptly affected my own research. The Web—or at least Google—did make information available to me that would not have been available otherwise; but it was not the information that I hear colleagues and funders talking so much about. From this experience, I eventually found myself asking two questions: (1) What is the actual (as opposed to the potential) utility of the Web as a research tool for paleontology right now (not just in the distant future)? (2) What do the answers to question 1 suggest about the directions that current and future on-line initiatives in our field should take?

The World of Google

Just as it is a gateway to the Web for its users, in some seemingly very real technical (and financial) respects, Google is a microcosm of the promise and peril of the Internet—in general and for paleontology in particular.

Every minute of every day, in more than 90 languages, Google is queried more than 138,000 times. That’s almost 200 million searches daily of more than 6 billion web pages, images, or postings (Newsweek, 2004-03-22). Even before its recent and highly publicized initial public stock offering, Google was a cultural phenomenon, or at least it sounded like one. Commentators and reporters spoke regularly of how it has changed our lives, our behavior, our relationship to information and to each other. Ultimately, they have tried to convince us, Google promises to "burn a hole in the zeitgeist... changing it forever" (M. Malone, Wired, 2004-03), producing a "Google Zeitgeist" (S. Levy, Newsweek, 2002-12-16), inevitably contracted to "Google-geist" (Forbes, 2003-05-26).

Some items from the recent popular press (all located using Google) testifying to this social transformation are presented in Appendix 1. If we don’t want to depend on just such breathless exaltations of the media for evidence of Google’s wider cultural impact, there are plenty of other signs out there (Appendix 2). As a blogger named Michael Tucker wrote: "It’s fascinating to see how Google has changed Internet usage; not only does it dazzle and entertain, but its logs are apparently becoming valuable social reflection". 

Certainly Google’s founders, Sergey Brin and Larry Page, have encouraged these grand assessments. They seek to make Google the solution to what Brin calls "a really important, big problem for the world", helping people find information that’s important to them (San Jose Mercury News, 2003-05-04). "Google’s long-term dream is to index all the world’s public information and make it searchable..." (Q. Hardy, Forbes, 2003-05-26). Brin says: "I’d like to get to a state where people think that if you’ve Googled something, you’ve researched it, and otherwise you haven’t and that’s it" (S. Levy, Newsweek, 2002-12-16). "‘Users love Google’ says Brin, ‘because they find things there when they are desperate to know an answer.’ Page adds that Google has become ‘like a person to them, helping them and giving them intelligence any hour of the day’" (Q. Hardy, Forbes, 2003-05-26).

Even with all this hype, serious reservations and qualifications have been expressed in many quarters about Google’s effects and accomplishments. These misgivings center on at least two areas. First, regardless of aspirations, Google doesn’t do it all. Even the company’s inner circle realizes this. Page, for example, says that "the ultimate search engine would understand exactly what you type and would give you the right things back," but he then admits, "We’re pretty good, but we’re nowhere close to being perfect. We won’t be for a long time" (San Jose Mercury News, 2003-05-04). Similarly, Google’s Director of Technology, Craig Silverstein, has been quoted as saying that the very reason for search engines was to "seem as smart as a reference librarian", but he acknowledged that this goal was "hundreds of years away" (Kenney et al. 2003). Google does not in fact provide easy access to the entire Internet. "It gives you a false sense that you are close to the entire Internet, that it’s all just a click away" says Siva Vaidhyanathan, who teaches communication at New York University (D. LaGesse, US News and World Report, 2004-05-10). As science writer Joel Achenbach notes, there are many large proprietary or fee-based databases, such as Lexis-Nexis or the Oxford English Dictionary, that Google cannot crawl through. "The Library of Congress," he points out, "has about 19 million books with unique call numbers, plus another 9 million or so in unusual formats, but most have not made it onto the Web. That may change, but for the moment, a tremendous amount of human wisdom is invisible to researchers who just use the Internet".

Second, Google is changing behavior, but perhaps not always for the better. For example, reporters who do research mainly or solely by Googling have been cited for lazy and sloppy work. Instead of actually checking on whether something is really a "trend" by talking to actual people, a reporter will "google" a topic and report simply how many hits resulted (L. Beehner, The Christian Science Monitor, 2004-02-27). Librarians similarly worry that students (and even librarians) are growing lazy, using only Google instead of looking at other available resources. "I use it myself, every day," says Joe James, assistant professor in the information school of the University of Washington, "but I worry about how over reliance on it might affect the skill-set of librarians" (S. Levy, Newsweek, 2002-12-16).

Librarians are worried about several things. One is their jobs. Since the rise of Google, use of traditional reference library services has been declining; one estimate suggests that Google handles more queries in a day and a half than all the nation’s libraries handle in a year. Cornell librarians recently did a modest study to compare the performance of GoogleAnswers, the fee-based Google service, with their own staff. The results (Kenney et al. 2003) don’t indicate a clear "winner". Google did better on some things, Cornell’s librarians on others. Librarians are also worried that students will use only Google in their research. As University of Richmond librarian James Rettig puts it, the average student’s "cluelessness" about the relative value and complexity of information, combined with his or her heightened desire for immediacy, may be a recipe for disaster, unless librarians learn to respond adequately to their users' needs and values. More than a decade ago, I had already personally seen widespread evidence that many college students thought that "research" consisted of typing a word or two into a computer (back when it was just a library catalog). Now the very nature of libraries and information science is changing, and we have only begun to see the effects. "In Westport, Conn. consultant Elena Amboyan’s kids use Google daily; even when they research something at the library, they say they’re Googling it" (Q. Hardy, Forbes, 2003-05-26).

Computer scientists themselves understand both the promise and the inadequacy of Google, and all other search engines. Google has been successful because it detected and harnessed the structure of the Web as it currently is: the way one part is linked to others. It reports back Web pages in order of their importance as measured by these linkages, linkages created by thousands or millions of other users. Thus, as Achenbach puts it in reflecting on the younger and younger age of the average Web-surfer, "The results of a Web search reflect the tastes of a broad swath of ordinary Americans who in some cases are still wearing short pants". Computer scientists know that this is not an optimal situation, and that better search engines will analyze individual users’ queries and, over time, personalize future searches. This burgeoning field of research is called "user modeling", and such a search engine is called an "intelligent agent". Primitive versions of such agents already exist; Amazon.com’s custom recommendations of books, movies, and music are a familiar example.
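
The idea of ranking pages by their linkages can be illustrated with a toy calculation. What follows is only a minimal sketch, assuming a simplified, PageRank-style iteration; the four page names, the link graph, and the damping value of 0.85 are all hypothetical, and none of this should be read as a description of Google's actual system.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Score pages by incoming links with a simplified PageRank-style iteration.

    links maps each page name to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    score = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_score = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score in equal shares to the pages it links to.
                share = damping * score[page] / len(targets)
                for target in targets:
                    new_score[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * score[page] / n
                for other in pages:
                    new_score[other] += share
        score = new_score
    return score


# Hypothetical toy web: three sites all link to an amateur's fossil page.
toy_web = {
    "museum_home": ["amateur_collection"],
    "fossil_dealer": ["amateur_collection", "museum_home"],
    "club_newsletter": ["amateur_collection"],
    "amateur_collection": ["museum_home"],
}

scores = rank_pages(toy_web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(f"{page}: {scores[page]:.3f}")
```

In this invented example the amateur's page ranks highest simply because the most pages link to it, which is the sense in which the ordering of results reflects the collective choices of those who build links rather than any judgment about scientific value.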

Google thus resembles the Web itself:

(1) It has huge power and potential for manipulating information and putting it in front of people who otherwise would not have access to it.

(2) It is the subject of enormous expectations and plentiful superlatives about its capability to change everything we do, and enormous resources are devoted to it on this basis.

(3) It doesn’t (yet) do everything we want it to and has not (yet) replaced all traditional sources of information.

(4) Its structure and function allow it to do things that were unintended, good and bad.

(5) Points 3 and 4 are sometimes forgotten or ignored by those engaged in point 2.

With all of these plusses and minuses, it was Google that I used to search the Internet for information on fossil turritellid gastropods.

Googling Turritella

I googled "turritella". Here’s what I found:

(1) Besides being a group of snails, "turritella" is also:

(a) the name of a mountain in a "mythic fantasy role-playing game" called Everway;

(b) the name of a princess in at least one children’s story (which confusingly seems to have two titles: "The Blue Bird", and "The Green Fairy Book");

(c) the name of a restaurant in the town of Castel Viscardo, near Orvieto in Umbria, about 100 miles south of Florence, Italy;

(d) the name of a British ship of 5,528 tons captured by a German raider on 27 February 1917, during World War I, and scuttled off the Isle of Skye.

More directly relevant to the subject at hand, I further learned:

(2) As any search engine user knows, it’s all in the terms you enter (Table 1). Date and search engine also matter, but to a lesser degree. I got 7,930 results from the simplest search ("turritella") on 2004-08-07, but 8,340 pages with the same search on 2004-08-31. On both dates, Google returned a larger number than any other search engine for this simplest search.

(3) There are lots of fossil Turritella out there for sale (almost 15% of the top 400 results; Table 2).

(4) Almost 20% of the results are not true Turritella (a marine group), but "turritella agate", which is a coquinite of shells of the freshwater pleurocerid gastropod Elimia (= Goniobasis) from the Eocene Green River Formation of Wyoming. Most of these sites were selling it, and more than a quarter of them were concerned with the purported mystical and "new age" powers of this gemstone.

(5) There is a lot of "turritella kitsch" (i.e., shell arts-and-crafts of various sorts) out there for sale (see images of my own collection). 

(6) Of the top 400 results located in the 2004-08-07 search (Table 2), only 8 (2%) were museums, and none of these were links to their collections databases. The closest was Malacolog, a Recent mollusk database at the Academy of Natural Sciences in Philadelphia (Rosenberg 1993; Morris and Rosenberg 2002). The top-ranking museum (number seven among all results) was the Paleontological Museum at the University of Oslo. About 13% of the top results were in some way "professional" or "institutional" science, including faculty or student research or datasets, or departmental or personal sites.

(7) Less than 10% of the top 400 results were connections to technical publications. Particularly prominent were the BioOne service of Biosis, PitBossAnnie.com, and abstracts and other publications of the Geological Society of America.

(8) The most numerous type of site, with almost 24% of the top results, is what I refer to as "private science". These included postings of personal collections ("virtual museums"); extensive lists or photo galleries of fossil or living shells; or descriptions (photos and text) of local faunas from particular sites or stratigraphic units.

(9) My "discoveries" of previously unknown or poorly known turritellid occurrences came from websites either selling fossils (item no. 3, above) or built by amateurs (no. 8). In some cases these discoveries led me back to technical literature I had overlooked; in some they were the only known records.

Specifically, Google led me to seven instances of "turritelline-dominated assemblages" ("TDAs") that had not been previously reported as such in the literature. TDAs show a peculiar distribution through time: they are widespread in siliciclastic sediments in both the Cretaceous and Cenozoic, but their occurrence in carbonate facies is limited (with one exception) to the Cretaceous and Paleocene. Explaining this pattern may have implications for the overall evolutionary history of turritellines, and especially for our understanding of their changing relationship with marine nutrient levels (Allmon 1988; Allmon and Knight 1993; Allmon 2004a).

By email, I contacted individuals associated with each of these seven occurrences. The results were decidedly mixed:

(a) Walnut Formation (limestone), upper Lower Cretaceous (Albian), Coryell County, Texas: Specimens of a TDA collected in the 1930s or 40s by Don Brenholtz of Abilene, Texas and displayed in his "virtual museum". Mr. Brenholtz recalled in detail the locations where the specimens were found, and donated one for my study. I subsequently ran across a photo of an identical specimen in a popular guidebook to Texas fossils (Finsley 1989), and was led by it to the collections of the Dallas Museum of Natural History, from which I borrowed a similar specimen that had more detailed stratigraphic data than Brenholtz was able to provide. I hit even more paydirt when I ran across the Texas Roadrunners website (a weblog-type site devoted to just about anything you might see on a roadtrip in Texas), which mentioned "some Turritella Limestone as described in the June [2003] issue of Rock and Gem magazine". An Internet search allowed me to purchase a copy, and in it I found a marvelous article by William Rader of Austin (Rader 2003), who described in detail his work at a turritelline-packed site in Coryell County, providing apparently the only published account of this occurrence in situ. An Internet phone book allowed me to find his address and make contact, and he subsequently donated a large slab for my research. 

(b) Woodbine Formation (sandstone), upper Lower Cretaceous (Albian), Tarrant County, Texas: Two specimens collected and offered for sale by Lee Duchouquette of Gentry, Arkansas. I purchased both specimens and Mr. Duchouquette was very helpful in providing detailed locality and stratigraphic information. Although a classic monograph exists on the Woodbine fauna (Stephenson 1952), no such turritelline concentration had previously been reported.

(c) Weno Formation (claystone), upper Lower Cretaceous (Albian), Marshall County, Oklahoma: Specimens were offered for sale by Glen Kuban of Houston, Texas. He had collected them near a creek that runs into the northern side of Lake Texoma in Marshall County, south-central Oklahoma in the mid-1990s. Mr. Kuban was very forthcoming with stratigraphic details of the site, but unfortunately said that the owners of the property would not allow access to it nor for him to divulge its exact location. Through traditional means, I eventually located relevant technical literature (Bullard 1926, 1928) that put the find in context.

(d) Nekum Member, Maastricht Formation (limestone), Upper Cretaceous (Maastrichtian), Netherlands: A specimen was illustrated for decorative purposes on the homepage of the Maastricht Museum of Natural History. Dr. John Jagt of the Museum informed me that Binkhorst (1861) and Kaunhowen (1898) had both mentioned this occurrence, but not the abundance. He could not locate the specimen illustrated on the website, but kindly offered to collect a specimen for me.

(e) Bordeaux, France, Miocene (Burdigalian): The attractive white fossil turritellids mounted on tan matrix that are for sale at every gem and mineral show and shop come from an area in Bordeaux, France. I communicated with a number of fossil dealers who handle these and learned that almost all have had their matrix consolidated with an artificial fixative. I eventually found Pierre Lozouet, who shared with me photos of the "outcrops" from which these fossils are taken. They are just holes dug in flat ground, from which come thousands of the beautiful white shells. He is currently working on these faunas, and is unaware of, and I have been unable to find, any technical publications describing this occurrence.

(g) Germany: Near the German city of Ulm is the only public park in the world (that I know of) dedicated to turritellids. The Erminger Turritellenplatten is a Miocene sandstone packed with turritellids. It is well-known in the local community, so much so that there is a small roadside interpretive site, complete with detailed geological panels. I struck up email communication with Klaus-Dieter Hildebrandt, a local amateur. The only publications known to mention turritellids from this locality (Quenstedt 1885; Lutzeier 1921) do not explicitly discuss their abundance.

(h) Bodjong Formation, Java (Pliocene). The University of California Berkeley Museum of Paleontology website has a page devoted to fossils of the Pliocene Bodjong Formation in Java. Written by undergraduates in 2000, it mentions "turritella sandstone" and gives a long list of references. Unfortunately, none of the references mention abundant turritellids. I contacted the Museum and found that there were collections made by J. Wyatt Durham in Java in the 1930s. Although the UCMP collections contain some of the turritellids, they unfortunately do not include any material in its original matrix that would allow assessment of original abundance.

I also found five other previously unknown TDAs in more traditional, but no less accidental, ways. Two of these were found by browsing the traditional literature.

(i) Fort Terrett and Segovia Formations (limestone), Pecos and Kimble Cos., Texas (Lower Cretaceous): I was leafing through back issues of Transactions of the Gulf Coast Association of Geological Societies and ran across an article by Brian Lock (Lock and Roberts 1999) describing "high-spired snails" in Lower Cretaceous limestones in west Texas. I contacted Dr. Lock by email, and he has generously shared material from two localities. 

(j) Woodbridge Clay Member, Raritan Formation, Upper Cretaceous, New Jersey: I ran across this one in a partial photocopy of a fieldtrip guidebook article (Owen et al. 1977), which I found while cleaning up old files in my office.

Three others were found in the drawers of my own institution’s collections (see Allmon and Poulton 2000; Allmon 2004b for further discussion of this very important mode of "exploration"):

(k) Eocene of Egypt: A moldic claystone from "the pyramids of Giza or Saquara".

(l) Neogene of Venezuela: A sandstone with no other data.

(m) Esmeraldas, Ecuador (Angostura Formation, Late Miocene): A sandstone collected from a loose block on the beach.

Unfortunately these specimens did not have much locality data, and I am now pursuing additional information in the literature and other museum collections on these three occurrences.

So What?

What (if anything) does all this teach us about the Web and modern paleontological research?

(1) Access to non-traditional information sources. The much-heralded "democratic" aspect of the Web is real, in that it connects professionals and non-professionals. Just as blogs allow anyone to have their own magazine, the web allows amateur (non-professional or avocational, if you prefer) paleontologists an opportunity to contribute to professional science. The Web can provide anyone with a venue to connect collections to users. A much undercited paper (Teichert et al. 1987) discussed the wealth of mostly unused information held in unpublished theses and dissertations. The same could be said for non-professional collections, which may outnumber those in the world’s museums (Allmon 1997, 2000). It may be that the Web is the best way to connect these collections to the wider world.

(2) Where are the big databases? My search using the most powerful search engine the Web has to offer did not locate any of the major database initiatives under construction. This of course doesn’t mean that I couldn’t go to Paleobiology Database, or NMITA, or CHRONOS, or individual museum collections databases and find considerably more information. It just means that Google did not access them, and that if I didn’t know about them from some other source I wouldn’t know to look at them. In any case, none of these large efforts, to my knowledge, are aimed at compiling all existing paleontological information. They are designed for specific research purposes, not as encyclopedias. Some of the many efforts to catalog the world’s living species (e.g., Species 2000, the All Species Foundation, the Integrated Taxonomic Information System (ITIS)) come closer to this as their stated purpose. (It has been pointed out to me by Paul Morris at the Academy of Natural Sciences that the design of most of these databases has up to now focused mostly on search interfaces (i.e., "having a bunch of text boxes in a search form map onto database fields, using a few text boxes map intelligently into more complex data structures, or using a single text box with a free text search or a search parser and interpreter"), and for the most part, has not included interfaces to browse into the database in a way that would make its contents available to a search engine such as Google. Presumably, this could be changed.)
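
The distinction between a search form and a browse interface can be made concrete. What follows is only a minimal sketch, assuming a hypothetical list of specimen records with invented field names and file names; it writes one plain HTML page per record plus an index page linking to all of them, so that a crawler following ordinary links could reach every record without filling in a search form. It does not describe how any of the databases named above are actually built.

```python
import html
import tempfile
from pathlib import Path


def build_browse_pages(records, out_dir):
    """Write one static HTML page per record and an index linking to each.

    records is a list of dicts with hypothetical keys: id, taxon, locality.
    Returns the path of the index page.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    index_items = []
    for record in records:
        page_name = f"specimen_{record['id']}.html"
        # Escape field values so they are safe to embed in HTML.
        taxon = html.escape(record["taxon"])
        locality = html.escape(record["locality"])
        body = (
            f"<html><head><title>{taxon}</title></head><body>"
            f"<h1>{taxon}</h1><p>Locality: {locality}</p>"
            "</body></html>"
        )
        (out_dir / page_name).write_text(body, encoding="utf-8")
        index_items.append(f'<li><a href="{page_name}">{taxon}</a></li>')
    # The index page is what gives a crawler a link to every record.
    index_path = out_dir / "index.html"
    index_path.write_text(
        "<html><body><ul>" + "".join(index_items) + "</ul></body></html>",
        encoding="utf-8",
    )
    return index_path


# Hypothetical records, invented purely for illustration.
example_records = [
    {"id": 1, "taxon": "Turritella sp.", "locality": "Example locality A"},
    {"id": 2, "taxon": "Turritella sp.", "locality": "Example locality B"},
]

index_file = build_browse_pages(example_records, tempfile.mkdtemp())
print(index_file.read_text(encoding="utf-8"))
```

The design choice here is simply that every record is reachable by a chain of plain links from a single starting page, which is the property a link-following search engine depends on.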

(3) Where is the literature? My Google search also did not pick up many technical publications (except those listed on individual bibliographies). Where were the indexes to journal titles? To unpublished masters and doctoral dissertations? (The just-launched "Google Scholar" appears to be a major step in this direction.)

(4) Museum collections on-line. The situation with on-line paleontological collections databases is much improved from where it was 10 years ago; for example, almost all major type collections in the U.S. are now on-line. But there are still enormous holes in the on-line information available about existing institutional collections, suggesting that we should be rethinking both our planning and execution. For example:

(a) Non-types. There seems to be little hope for data or images from a significant proportion of non-types in major collections becoming available any time soon. The kind of on-line "browsing" I was doing—looking at specimens in a way similar to what one would do in museum drawers—is therefore currently not possible at all for any major institutional collections. (As anyone who has tried can testify, the major costs of putting collections on-line are currently not technology but labor.) Are there steps short of entering every label of every specimen that could improve this situation? What if the 20 largest paleontological repositories in the U.S. put on-line basic inventory information and a photograph (at an appropriately high resolution) of every drawer in their collections? (It has been pointed out to me that this would also need some sort of image "zoom" mechanism, such as is currently used on the Academy of Natural Sciences prototype website "allcatfish" and probably scans of all labels, but surely this is worth doing if it opens these collections to huge numbers of potential new users.) The Fossil Gallery page of the Paleontology Portal appears to be a good start in this direction, but it needs to be massively enlarged if it is to serve as a genuine research tool.

(b) Connectivity. The often-stated goal of being able to move seamlessly between different institutional databases appears to remain a distant one at best. Despite efforts to develop common standards (e.g., White and Allmon 2000), there is still no single format for on-line data on museum collections in invertebrate paleontology, no single portal for searching the information already available from all collections. The Paleontology Portal has recently launched a first step in this direction, offering access to the linked collections databases of the University of California Berkeley, the Academy of Natural Sciences, the Yale Peabody Museum, and the Florida Museum of Natural History. This could be a very important beginning to solving this problem. 

(5) We aren’t there yet. Google isn’t the only way to get information on a particular kind of fossil any more than a traditional encyclopedia is the only way to get information on Nathan Hale. But the point is that it aspires to be. The Internet itself daily aspires (and sometimes promises) to deliver everything to everyone. And we are all beginning to treat the Web like it already holds everything, at least everything of value. "Google is the ultimate mirror world, reflecting the aggregate brilliance of the World Wide Web, on which is stored everything..." (S. Levy, Newsweek, 2002-12-16). I recently had two NSF program officers tell me that any database of museum invertebrate paleontology collections supported by NSF had to be connectable to both the Paleobiology Database and CHRONOS. But when I queried the principals of these two projects, I learned that they had never considered the issue of connecting to museum collections, and weren’t immediately sure how such connections could or should be made.

Like most new technology, the Web provides access to information we didn’t have before, and opens up exciting new research possibilities. But we are a long way from where we say we want to be—a long way from a true transition from analogue to digital paleontological information (which is being made much more rapidly by other fields, such as genomics). I do not doubt that we will someday get there. But how can we make that happen most quickly and easily? There is no single answer. We clearly must continue the ambitious on-line initiatives already underway and build the "cyberinfrastructure" of paleontology, but we should also think carefully about just what and where paleontological "information" really is, and about how we really want the Web to serve (someday) as our single conduit for all of it. Better search engines are coming, and we can do more to be ready. We can think about the nature of our own data, on- and off-line, and be better prepared for the successor to Google.

If you googled your organism, what would you get?

PE Editorial  Number: 7.2.3E
Copyright: Coquina Press December 2004