Palynology teaching dataset




On the Ecology and Evolution gopher archive on sunsite.unc.edu, I
have put a small "synthetic" (that is, made-up) dataset representing
a sample of pollen of variable morphology.  The dataset is intended
for training in multivariate statistics:  the "pollen" can be sorted
into groups that may represent discrete species.

If you're interested in using the dataset yourself, please download
it and let me know if it is useful to you.  Also, if you have any
similar datasets or other teaching exercises, and would like to 
share them with others, please let me know.  Thanks!

Here is the gopher link information:

Type=1+
Name=Multivariate analysis of pollen grains
Path=1/../.pub/academic/biology/ecology+evolution/teaching/pollen
Host=sunsite.unc.edu
Port=70
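
For readers without a gopher client, the link fields above map onto a
very small network exchange: connect to Host on Port, send the Path
string followed by CR LF, and read until the server closes the
connection.  The following is only a minimal sketch of that exchange,
assuming a gopher server still answers at that host and port (it may
not).  The function names are made up for illustration, and the
Gopher+ extensions implied by "Type=1+" are ignored.

```python
import socket


def gopher_request(path):
    """Build the request for a selector: the Path string plus CR LF."""
    return path.encode("ascii") + b"\r\n"


def parse_menu_line(line):
    """Split one menu line into its fields, or return None if it is not an item.

    A menu line is: type character + display name, then selector, host and
    port, separated by tabs.  A lone "." marks the end of the menu.
    """
    line = line.rstrip("\r\n")
    if not line or line == ".":
        return None
    fields = line[1:].split("\t")
    if len(fields) < 4 or not fields[3].isdigit():
        return None
    return {
        "type": line[0],
        "name": fields[0],
        "selector": fields[1],
        "host": fields[2],
        "port": int(fields[3]),
    }


def fetch_menu(host, port, path, timeout=30.0):
    """Fetch a gopher directory and return its items as dictionaries."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(gopher_request(path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection when it is done
                break
            chunks.append(data)
    text = b"".join(chunks).decode("latin-1")
    items = (parse_menu_line(line) for line in text.splitlines())
    return [item for item in items if item is not None]
```

With the link information above, the call would be
fetch_menu("sunsite.unc.edu", 70, Path), with Path copied exactly from
the Path= line.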

The dataset can be retrieved via anonymous FTP as well.  Below, I have
included an edited version of an actual FTP session to retrieve this
dataset from the archive.  Indented text is produced by the computer;
my comments to you are in square brackets;  the rest I typed to tell
the computer what to do.

	Una Smith
	una.smith@yale.edu

----------------------------------------------------------------------

	doliolum{una} 29:
ftp sunsite.unc.edu
	Connected to sunsite.unc.edu.
	220 calypso-2.oit.unc.edu FTP server (Version wu-2.4(30) ...
	Name (sunsite.unc.edu:una):
anonymous
	331 Guest login ok, send your complete e-mail address as password.
	Password:
una.smith@yale.edu [I could not see this as I typed it]
	230-             WELCOME to UNC and SUN's anonymous ftp server
	230-                       University of North Carolina
	230-                     Office FOR Information Technology
	230-                             SunSITE.unc.edu
	[I've deleted bits here]
	230-
	230-  If you email to info@sunsite.unc.edu you will be sent help 
	[...]
	230-
	230 Guest login ok, access restrictions apply.
	ftp>
pwd [this means "print the working directory", or where am I now?]
	257 "/" is current directory.
	ftp>
cd pub/academic/biology/ecology+evolution [move to this (sub)directory]
	250-
	250-Welcome to the Ecology and Evolution community archive!  The ...
	250-here is on things of interest to research ecologists and evol...
	250-biologists.
	250-
	250-Recent changes/additions as of December 1994:
	250-
	250-* bioguide/  "A Biologist's Guide to Internet Resources" has ...
	250-             directory now, to accommodate copies in various formats.
	250-             Current versions:  1.7 and 1.8a (December 1994 ...
	[...]
	250-
	250 CWD command successful.
	ftp>
cd teaching/pollen [move to this subdirectory of the current directory]
	250-
	250-This dataset is synthetic.  It was generated by David Coleman at
	250-RCA Laboratories in Princeton, N.J.  For convenience, we will
	250-refer to it as the POLLEN DATA.  The first three variables are the
	250-lengths of geometric features observed in sampled pollen grains, in
	250-the x, y, and z dimensions: a "ridge" along x, a "nub" in the y
	250-direction, and a "crack" along the z dimension.  The fourth
	250-variable is pollen grain weight, and the fifth is density.
	250-
	250-There are 3848 observations, in random order (for people whose
	250-software packages cannot handle this much data, it is recommended
	250-that the data be sampled).  The dataset is broken up into eight
	250-pieces, POLLEN1.DAT - POLLEN8.DAT, each with 481 observations.
	250-We will call the variables:
	250-
	250-1. RIDGE
	250-2. NUB
	250-3. CRACK
	250-4. WEIGHT
	250-5. DENSITY
	250-
	250-6. OBSERVATION NUMBER (for convenience)
	250-
	250-The data analyst is advised that there is more than one "feature" to
	250-these data.  Each feature can be observed through various graphical
	250-techniques, but analytic methods, as well, can help "crack" the ...
	250-
	250-
	250 CWD command successful.
	ftp>
dir
	200 PORT command successful.
	150 Opening ASCII mode data connection for /bin/ls.
	total 205
	drwxr-xr-x   3 90       25            512 Mar 30  1994 .
	drwxr-xr-x   6 90       25            512 Apr  1  1994 ..
	drwxr-xr-x   2 90       25            512 Mar 30  1994 .cap
	-rw-r--r--   1 90       25           1035 Dec  5  1993 README
	-rw-r--r--   1 90       25          25012 Dec  5  1993 pollen1.dat
	-rw-r--r--   1 90       25          25012 Dec  5  1993 pollen2.dat
	-rw-r--r--   1 90       25          25012 Dec  5  1993 pollen3.dat
	-rw-r--r--   1 90       25          25012 Dec  5  1993 pollen4.dat
	-rw-r--r--   1 90       25          25012 Dec  5  1993 pollen5.dat
	-rw-r--r--   1 90       25          25012 Dec  5  1993 pollen6.dat
	-rw-r--r--   1 90       25          25012 Dec  5  1993 pollen7.dat
	-rw-r--r--   1 90       25          25012 Dec  5  1993 pollen8.dat
	226 Transfer complete.
	796 bytes received in 1.8 seconds (0.42 Kbytes/s)
	ftp>
get README
	200 PORT command successful.
	150 Opening ASCII mode data connection for README (1035 bytes).
	226 Transfer complete.
	local: README remote: README
	1062 bytes received in 0.37 seconds (2.8 Kbytes/s)
	ftp>
prompt [turn off the per-file confirmation that mget would otherwise ask for]
	Interactive mode off.
	ftp>
mget *.dat [get every file whose name ends in .dat]
	200 PORT command successful.
	150 Opening ASCII mode data connection for pollen1.dat (25012 bytes).
	226 Transfer complete.
	local: pollen1.dat remote: pollen1.dat
	25493 bytes received in 14 seconds (1.7 Kbytes/s)
	200 PORT command successful.
	150 Opening ASCII mode data connection for pollen2.dat (25012 bytes).
	[and so on for all 8 files]
	ftp>
quit
	221 Goodbye.
	doliolum{una} 30:
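
The same session can be scripted.  What follows is a minimal sketch
using Python's standard ftplib module, not a tested recipe for this
particular server.  It repeats the cd and get steps, then reads the
files using the six variables named in the README text.  The file
layout is an assumption: the README does not say how the columns are
separated, so the sketch guesses whitespace-separated numbers, one
observation per line, and skips any line that does not fit.  The
function names and the e-mail address in the usage comment are made up
for illustration.

```python
import random
from ftplib import FTP  # used only in the commented usage at the bottom
from pathlib import Path

REMOTE_DIR = "pub/academic/biology/ecology+evolution/teaching/pollen"
COLUMNS = ("RIDGE", "NUB", "CRACK", "WEIGHT", "DENSITY", "OBSERVATION")


def fetch_pollen(ftp, dest_dir="."):
    """Download README and every *.dat file; return the local paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    ftp.cwd(REMOTE_DIR)
    names = ["README"] + sorted(n for n in ftp.nlst() if n.endswith(".dat"))
    saved = []
    for name in names:
        target = dest / name
        with open(target, "wb") as out:
            # Binary mode keeps the bytes exactly as the server stores them;
            # the session above used ASCII mode instead.
            ftp.retrbinary("RETR " + name, out.write)
        saved.append(target)
    return saved


def parse_pollen(lines):
    """Turn text lines into dictionaries keyed by the README variable names.

    Assumed layout: six whitespace-separated numbers per line.  Lines that
    do not match (blank lines, headers) are skipped rather than guessed at.
    """
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) != len(COLUMNS):
            continue
        try:
            values = [float(field) for field in fields]
        except ValueError:
            continue
        row = dict(zip(COLUMNS, values))
        row["OBSERVATION"] = int(values[-1])
        rows.append(row)
    return rows


def subsample(rows, k, seed=0):
    """Draw k rows at random, for software that cannot handle all the data."""
    return random.Random(seed).sample(rows, min(k, len(rows)))


# Hypothetical usage (not run here; needs a server that still answers):
#
# with FTP("sunsite.unc.edu") as ftp:
#     ftp.login("anonymous", "you@example.edu")
#     paths = fetch_pollen(ftp, "pollen")
# rows = []
# for path in paths[1:]:  # paths[0] is the README
#     with open(path) as handle:
#         rows.extend(parse_pollen(handle))
# sample = subsample(rows, 500)
```

Taking the connection as an argument keeps the download logic separate
from the login step, so the same function works with any object that
offers cwd, nlst and retrbinary.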


Okay?  Note that you don't need the .cap directory;  it is window
dressing for gopher users only.  The README file in each directory is
sent by the FTP server when you enter that directory.  That's where the
text prefixed with "250-" comes from.