Hm, I've just checked to confirm that all 600 pages or so of citations in the Foram Catalogue are on the CD-ROMs; they're just not accessible to taxonomic-name searches. They're in TIFF file format, like the taxonomic pages. To get an idea, look at files 25284 to 25483 on Foram Disk #4 (this is the "bibliography volume", vol. 30).

We could post the filenames (page numbers) of these and other bib pages, and those who wish, and who have the CDs (we could loan a few), can sign up for a block to work on.

It turns out that TIFF files can be converted directly to text by some, if not all, OCR programs, without needing to be printed out and re-scanned. In Omnipage 12 (our preference, c. $320 street), this is done via the "Schedule OCR" option. FineReader Pro 6 ($150 from Amazon, $270 at most other retailers) is another excellent choice. Each page will have from 20 to 50 references, which (after proofreading) could be shipped back and added to the rapidly growing database. Said database is to be made available to all on the public-access "Micropaleontology" website, soon to be activated.

Sounds oddly doable.

John Van Couvering

PS - Omnipage 12

At 05:39 PM 1/15/2003 +0000, you wrote:
>I appreciate the difficulties - our Department has much the same problem
>with all types of archive records (many handwritten). OCR software is
>improving but probably still too costly to make it worthwhile.
>
>The benefits of the online catalogue far outweigh the lack of full
>bibliographic citations (species searching for one!) and it wasn't meant
>as a criticism. Which brings me to the point of this post. I believe that
>we could remedy this problem through the Micropalaeontological community.
>If a number of foram. workers signed up for "citation duty" then we could
>gradually input citation details into a database (Endnote for instance).
>We then would be well on our way to an international Foram. reference
>database. Not such an onerous task if the load was spread.
>
>Andy
>
>>Andy - that information is in the bibliography pages at the front of each
>>Catalogue volume. These pages have not been available on the internet, as
>>you point out. Taken together they represent a pretty fair resource - we
>>estimate between 12,000 and 15,000 citations. Volume 30 of the original
>>Catalogue, for example, which was intended to be the stopping point
>>(we're working on v. 105 at present!), contains 270 pages of citations to
>>the works processed for vols. 1-29, including everything d'Orbigny published.
>>
>>It would be relatively easy to scan all the bibliography pages but a
>>back-breaker to OCR them and proof them, so that they could be
>>searchable, downloadable data -- at least with current OCR technology. As
>>an alternative back-breaker, we could manually prepare a simple
>>senior-author-and-date index to the entries on the scanned pages, so that
>>one could at least find the page to look at. I can post the scans as PDF
>>files and anyone is free to download a stack and start in.
Partial index: