The white paper by Norm MacLeod et al. does a good job of outlining issues of quantification and databases in paleontology, but I would like to see consideration given also to another facet of computer applications in our field that might be characterized as "embedded intelligence" or "expertness". This has a bearing also on an issue raised in the white papers on biostratigraphy and systematics - namely, the growing shortage of experts in those fields.

AN EXAMPLE PROGRAM

To set the stage for this memo, I'll briefly describe a program called COREXPERT that we developed at Scripps Institution to help technicians and students with limited experience to make reliable descriptions of our sediment cores. This was an appropriate field for experimentation with an expert program, because Cenozoic pelagic sediment sequences contain only a limited number of fossil groups, mineral species, types of sedimentary structures and bedding contacts, etc.

When data on sediment samples are being entered using this program, mineral and fossil constituents are entered from menus, each item of which has a hypertext link to textual and graphic information to assist in its identification. Additionally, there is information on where and under what conditions that constituent commonly occurs.

As the user inputs percentages of microfossil groups and mineral constituents in a sample, the program checks a set of rules to see whether there are any anomalies. For example, if the user enters an unusually high or low percentage for a constituent, the program displays a warning and explains any special circumstances under which such extreme values can occur. The user can then adjust the estimated percentage to conform to expectations, or keep the original entry.

When all the constituent percentages have been entered, the program checks the input for unexpected and expected associations of constituents.
For example, if sponge spicules have been recorded, the program checks to see whether radiolarians have also been recorded, since the latter almost always accompany the former in pelagic sediments. The user can either modify the entry to conform to the "expert" expectation, or retain the anomalous record - which can then become a target for special investigation.

Besides checking the validity of data entry, the program also allows intelligent inference of information not explicitly stored in the database. For example, heteropods are an uncommon sedimentary constituent not stored in a separate field in our database. But the program can infer that heteropods are probably present in the sediment if the sample contains 15% or more each of foraminifera and coccolithophorids, and the water depth is less than 4,000 meters. This rule can be invoked when the database is searched for occurrences of heteropods.

A non-technical account of this software has been published by Tway and Riedel in the Jan/Feb issue of PC AI. It illustrates how relatively easy it is to embed some "intelligence" into data entry and retrieval software.

EXTENSION TO STRATIGRAPHY AND PALEOENVIRONMENT

It would be a straightforward matter to incorporate a similar level of expertise into a system for entry of taxa into a database to be used for biostratigraphic and paleoenvironmental purposes. The user could have access to textual and graphic aids to species identification. The software could recognize anomalous presences or absences in each assemblage recorded, and it could look at samples in a sequence up and down a sediment column and interpret paleoenvironmental changes and zonal boundaries, explaining itself as it proceeded. In this way, expertise could be passed on from seasoned veterans to inexperienced beginners.
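
To make the three kinds of rule described above more concrete (a range check on a percentage, a check on expected associations, and an inference from stored fields), here is a minimal sketch in Python. It is an illustration only and is not the COREXPERT code. The names Sample, EXPECTED_RANGE, EXPECTED_COMPANIONS and the three functions are hypothetical, and the 0-10% range for sponge spicules is an invented placeholder. The heteropod rule is read here as "at least 15% each of foraminifera and coccolithophorids, and water depth under 4,000 meters", which is an interpretation of the wording in the text.

```python
from dataclasses import dataclass

# Hypothetical expected ranges in percent; real limits would come from an expert.
EXPECTED_RANGE = {"sponge spicules": (0.0, 10.0)}

# (trigger, companion): the companion is expected whenever the trigger is recorded.
EXPECTED_COMPANIONS = [("sponge spicules", "radiolarians")]


@dataclass
class Sample:
    percentages: dict      # constituent name -> estimated percent
    water_depth_m: float   # water depth at the core site, in meters


def range_warnings(sample):
    """Flag constituents whose percentage lies outside the expected range."""
    warnings = []
    for name, pct in sample.percentages.items():
        low, high = EXPECTED_RANGE.get(name, (0.0, 100.0))
        if not low <= pct <= high:
            warnings.append(f"{name}: {pct}% is outside the expected {low}-{high}%")
    return warnings


def association_warnings(sample):
    """Flag a recorded constituent whose usual companion is missing."""
    recorded = {name for name, pct in sample.percentages.items() if pct > 0}
    return [
        f"{trigger} recorded without {companion}"
        for trigger, companion in EXPECTED_COMPANIONS
        if trigger in recorded and companion not in recorded
    ]


def heteropods_probable(sample):
    """Infer probable heteropods from fields that are stored (assumed reading of the rule)."""
    forams = sample.percentages.get("foraminifera", 0.0)
    coccos = sample.percentages.get("coccolithophorids", 0.0)
    return forams >= 15.0 and coccos >= 15.0 and sample.water_depth_m < 4000.0


# Example entry: the user would see the warnings and could keep or change the values.
sample = Sample(
    {"sponge spicules": 25.0, "foraminifera": 40.0, "coccolithophorids": 30.0},
    water_depth_m=3200.0,
)
for message in range_warnings(sample) + association_warnings(sample):
    print("WARNING:", message)
print("heteropods probable:", heteropods_probable(sample))
```

Keeping the rules in plain tables (EXPECTED_RANGE, EXPECTED_COMPANIONS) rather than in the code itself is one way to let an expert add or revise rules without touching the program, and the warnings are advisory so that the user can still retain an anomalous record.
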

AND TO STRATIGRAPHIC SYNTHESES

It is possible to envisage an expert system of a higher level of complexity to make "intelligent" stratigraphic syntheses from large groups of sequences described in terms of fossil occurrences, sedimentological characters, paleomagnetism, isotopic data and so on. It has always bothered me that probabilistic methods of stratigraphic correlation ignore a large amount of information on the RELATIVE RELIABILITY of each earliest and latest occurrence of a taxon in a single sequence, and similarly each interpretation of a magnetic reversal, isotopic shift, etc.

Take, for example, limits of stratigraphic range of a fossil taxon in a sequence. The reliability of such an observed limit can range from very poor to very good, depending on a number of easily determined factors. The reliability will be greater for a taxon that is present as tens of individuals per sample than for one present as one or two individuals per sample. It will be greater for a taxon easily distinguished from all co-occurring taxa than for one that is distinguished with difficulty. It will be greater for a locality well within the area of distribution of the taxon than for a locality near the periphery of the distribution. And there are a number of other factors involved in determining this reliability - see Riedel and Westberg, 1982, DSDP vol. LXVII, p. 289 [Please excuse the egocentrism.].

Rules could be written into a program that calculated an "index of reliability" for each determination of an earliest and latest occurrence in each sequence, on the basis of such factors as these. And an appropriate set of reliability-determining factors could be developed also for non-fossil characters used in stratigraphic correlation. Automatically calculated indices of reliability could greatly improve the quality of correlations between numbers of sequences, by using some "intelligent" weightings rather than applying simple majority rules.

Bill R. W. Riedel
Scripps Institution of Oceanography
UCSD
La Jolla, CA 92093-0220
wriedel@ucsd.edu
phone (619) 534-4386
fax (619) 534-0784

. . . . May the Force be with you . . . .