1. Introduction: This is an HTML version of a technical report describing the development of a series of stand-alone Fortran 77 applications for the morphometric analysis of microfossils imaged under a binocular microscope in reflected light. The first part of this report deals with the extraction and analysis of two-dimensional outlines of microfossils on a purely Macintosh-based imaging system. The supplements contain calibrations and tests made in combination with the later development of AMOR (Automated Measurement system for the mORphometry of microfossils) on a PC-based system. Supplements No. 7 and 11 in particular contain programs that can be applied to the outline analysis of microfossil images created with the PC-based AMOR, although the analysis itself still takes place on a Macintosh.
Some of the programs are useful for the analysis of any microfossil under the light microscope, while others were specifically adapted to the analysis of globorotalid foraminifera in side view. The collection includes a batch program for outline extraction from digital images (Trace33_batch.out), a simple bivariate data plotting program for quick visual control of extracted outlines (XYPlot2.out), and several programs for various standardization methods of outline data (Sprep53.out, Norm52.out, Homolo31.out, and Averou3.out). Size and simple metric shape parameters can be analyzed with programs KeelWidth93.out or KeelBend7.out, and their statistics are computed using programs Compose31.out, Stat_Prep21.out and Stats62.out. For graphical representation of results, commercial programs such as Excel, Cricket Graph, KaleidaGraph, or any other software capable of plotting data are recommended. For shape analysis a simple Fourier decomposition program, Harman2.out, was written. Program ShapAv2.out calculates average harmonic functions from a series of outlines, and program Recon2.out provides the inverse of the harmonic analysis (i.e. composition of the curve from its harmonic functions). All programs can be downloaded as an archive (MacOS 9.2). They were developed and used during a study of menardiform globorotalias (Knappertsbusch, 2007).
2. Methods:
2.1. Programming environment The software code was written in Fortran 77 using the MPW 3.4.1 development environment for PowerMacintosh by Absoft. Most of the programs do not need specific graphics library code and so are easily portable to other computer platforms, provided that a suitable Fortran compiler is available. Program development was done on the PowerMac series of computers under MacOS 8.1 and 9.2.
2.2. Digital imaging Size and shape data from microfossils were obtained by digital imaging using a Kappa video camera mounted on a binocular microscope and connected to a PowerMacintosh 8200. For the acquisition of outlines, digital images should be in TIFF or raw format, the former with 8 bit grey-level resolution and with dimensions of 640x480 pixels. Here, NIH Image 1.6.1 by Wayne Rasband was used for image acquisition and preprocessing. NIH Image is in the public domain and was developed at the National Institute of Mental Health (NIMH), part of the National Institutes of Health (NIH). It can be downloaded from the NIH Image website of the National Institutes of Health at http://rsb.info.nih.gov/nih-image/Default.html. For image processing Adobe Photoshop was used, mostly for the transformation of TIFF images into raw format (this can also be done with other software). More information about the hardware used by the author for microfossil analysis is given in Knappertsbusch (1998 and 2000), and Knappertsbusch et al. (2006).
3. Overview of programs
The MorphCol collection allows for a variety of analyses of microfossils:
Cycle (I) - Outline extraction to x,y coordinates, visual checks, transformations, derivation of simple size and shape parameters per specimen.
Graphical presentation: Use commercially available software.
Programs: Trace33_batch.out, XYPlot2.out, Sprep53.out, KeelWidth93.out, KeelBend7.out, Norm52.out, Compose31.out, Stat_Prep21.out, Stats62.out, Homolo31.out, Averou3.out, HARMAN.out, Recon2.out, Rename_raw.out, Rename_Tiff.out, Grid2.1.out
3.1. Simple analysis of size and shape
Outline extraction: Program Trace33_batch.out (Cycle I) The first step of any morphometric analysis is outline extraction with program Trace33_batch.out (see Figure 1). The program is calibrated for the Leica MZ6 binocular and allows for output of x,y coordinates either in pixel values or in microns.
Input files: Input files are digital images of 640x480 pixels in raw format, with all grey-level information removed (a LUT change is applied to obtain a binary black-and-white image, see Figures 2 and 3).
All of the area to the left of the shell should be black, especially along the horizontal line at y=240 pixels (i.e. the middle of the image). White areas around the object may occur as long as they are separated from the outline by more than 2 pixels. The file names of the raw images have the following format: xxxxr
where xxxx is a four-digit integer designating the specimen in a particular cell of a microslide, and r indicates that the file is a raw file. A further necessary input file is a text file File_list containing the names of the image files and the microscope magnification. The format is, for example, 0501r,2.00
where the first five characters give the file name of the raw image and the following digits (separated by a comma) give the magnification read at the magnification changer on the microscope. The number of input image files is limited only by the memory of the computer. The program reads the image names and the magnifications from file File_list and then treats every file in sequence (batch mode of operation). If an image named in File_list is missing, a message is sent to the screen and the error is recorded in an external file called missing_images.
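The batch input described above can be illustrated with a minimal Python sketch. It is not the Fortran code of Trace33_batch.out; the function names parse_file_list and split_missing are hypothetical, and the record layout (image name, comma, magnification) is assumed from the description and from the record quoted for Figure 4.

```python
def parse_file_list(text):
    """Return (image_name, magnification) pairs from File_list-style text."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, mag = line.split(",", 1)
        records.append((name.strip(), float(mag)))
    return records


def split_missing(records, available_names):
    """Separate records whose image exists from those to report as missing."""
    found = [rec for rec in records if rec[0] in available_names]
    missing = [rec[0] for rec in records if rec[0] not in available_names]
    return found, missing


# "0501r,2.00" is the record quoted for Figure 4; the second record is made up.
records = parse_file_list("0501r,2.00\n0502r,2.50\n")
found, missing = split_missing(records, {"0501r"})
print(found)    # images that would be traced
print(missing)  # names that would be recorded in missing_images
```

Keeping the list of missing names separate mirrors the behaviour described for the file missing_images: the batch continues and the problem cases are collected for later inspection.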
Output files: For each image an output file abcdefghijkKxxxx_T is written to disk, containing the x,y coordinates of the object outline. The sample name abcdefghijk is 11 characters long and must be entered when the program is launched. K stands for a letter indicating the orientation of the specimen, usually K for keel view, U for umbilical view and S for spiral view. The four digits xxxx refer to the image file name xxxxr, and the suffix _T indicates that the file contains traced data. Other output files are missing_images and Image_errors (see further below). Figure 4 illustrates an outline consisting of 708 points, which was generated from the image shown in Figure 3.
Figure 4. Traced file 502_0100CCK0501_T at magnification 2.00x (i.e. the record in File_list was 0501r,2.00).
Application: After launching Trace33_batch.out the user is prompted for the sample name (11 characters long, see above). After entry, the program asks whether outline coordinates are desired in pixel values or in micrometers. Enter 1 for pixel values and 2 for micrometers. Thereafter, the name of the list containing the image names and magnifications is requested. Enter the name of the file containing these data (maximum 20 characters).
Calibration: Trace33_batch.out has been calibrated for an imaging system consisting of a Kappa CF 11/2 color camera, a Leica MZ6 stereo microscope and a PowerMacintosh (G3) under system 9. Conversion of x-pixels and y-pixels into micrometer values is performed with the two equations
where MAG is the magnification read from the File_list.
Pixel to micrometer conversion is done by multiplying the x,y values (in pixels) by the inverse of Xprec or yprec, respectively. If a different camera, a different type of computer, or another microscope is used, this calibration must first be verified and the system recalibrated if necessary.
Extraction algorithm: Because each pixel of an image is 8 bit in depth, a particular pixel and its grey value is recognized by one ascii character, which is also 8 bit long. The grey level is recognized by the standard character to integer conversion function ichar(ch), where ch stands for one character. The method of outline extraction is by checking all grey-levels in the neighbouring eight pixels. If a certain level is met the center point is stored and its x,y coordinate (in pixels) can be calculated via the record number of each pixel (direct access). The algorithm is described in more detail in Knappertsbusch (1998).
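The two ingredients named above (recovering x,y from the record number and inspecting the eight neighbours of a pixel) can be sketched in Python as follows. This is a simplified illustration, not the tracing algorithm of Trace33_batch.out, which follows the outline point by point: the sketch only flags object pixels that touch the background. The function names and the grey-level threshold of 128 are assumptions.

```python
WIDTH, HEIGHT = 640, 480  # image size stated in the report


def record_to_xy(record, width=WIDTH):
    """Convert a 1-based direct-access record number to (x, y) in pixels."""
    row, col = divmod(record - 1, width)
    return col, row


def is_outline_pixel(pixels, x, y, width, height, threshold=128):
    """True if (x, y) is a bright object pixel with a dark 8-neighbour.

    `pixels` is a bytes-like object with one byte per pixel, so indexing
    it yields the grey level directly (the role of ichar(ch) in Fortran).
    """
    if pixels[y * width + x] < threshold:
        return False  # background pixel
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height:
                if pixels[ny * width + nx] < threshold:
                    return True
    return False


# Tiny 5x5 test image: a white 3x3 square on a black background.
w = h = 5
img = bytearray(w * h)
for yy in range(1, 4):
    for xx in range(1, 4):
        img[yy * w + xx] = 255

outline = [(x, y) for y in range(h) for x in range(w)
           if is_outline_pixel(img, x, y, w, h)]
print(outline)
```

In the test image every white pixel except the central one borders the background, so the central pixel is the only object pixel not flagged.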
Known difficulties: The program works well for almost every outline. Nevertheless, outline coordinates should be checked before using them in statistical analyses (this can be done with program XYPlot2.out, see section 3.2.1 further below). Problems have been encountered if the (white) object has embayments that are 1 pixel wide. In that case the program continues calculating but does not advance to the next point in the outline. Because of this difficulty, an interrupt is implemented which stops the outlining of that file, and the program advances to the next image. In that case an error is reported to the screen and a message naming the failed image is written to the external file Image_Errors. Other problems (such as odd outliers in the outline) cannot be entirely excluded. In addition, if the left margin of an input image contains non-black pixels at the y coordinate of 240 pixels, a program stop is enforced and only a runtime error is reported (a text warning that directly identifies this particular problem would be better).
Other versions: There are two other versions: Trace34_batch.out, which has been adapted to Mac OS X to handle the file extensions necessary under that system, and Trace32_batch.out, which has an older calibration for the conversion of pixels into micrometers at various magnifications.
3.2. Statistical analysis of size and simple shape parameters.
3.2.1. Checking outlines: Program XYPlot2.out (Cycle I) This program was written to run a quick visual batch check of whether outline extraction has been performed correctly for each image before the statistical analysis is started (a check for possible strange outliers, see Figure 5). Like Trace33_batch.out, XYPlot2.out runs in batch mode and plots x,y coordinates from a series of files to the screen of a Macintosh computer of the G3 or G4 series. Although simple, XYPlot2.out uses Macintosh-specific graphics libraries, which means that the code cannot be ported directly to other operating systems without adapting the graphics output.
Input: There are two types of input files: 1.) A file (file name at most 20 characters long) containing a list with the names of all traced data files. Example entry: 502_0100CCK0501_T (the traced file shown in Figure 4).
2.) The traced files themselves, which contain the Cartesian x,y coordinates of the outlines. Each file name is 17 characters long.
Output: The program draws a window on the screen and plots the outlines in series of five curves. After each series, the operator needs to press return, and the program plots the next curves until all traced files are plotted.
Figure 5.
Known difficulties: No difficulties encountered so far. XYPlot2.out has been adapted to plot outlines of menardiform globorotalias into MPW's output window. If other coordinates are required the program must be adapted accordingly and recompiled.
3.2.2. Preparation of outline coordinates for size and shape analysis: Program Sprep53.out (Cycle I)
Sprep53.out prepares - in batch mode - the traced outlines from Trace33_batch.out for further analysis: it determines the number of outline points in the traced files, the minima and maxima of the x and y coordinates, the coordinates of the centroid, and the area enclosed by the outline. It then reduces the number of outline points by linear interpolation at equiangular distances about the centroid, proceeding in anti-clockwise direction. The desired number of new outline points can be defined by the operator. The x and y coordinates are then transformed into a new coordinate system u,v, so that the centroid of the transformed outline becomes the new origin. The new outlines are then written to external files.
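The processing chain of this step can be sketched in Python under some stated assumptions. The sketch is not the code of Sprep53.out: it assumes that the centroid is the area centroid of the outline polygon, that the enclosed area is obtained with the shoelace formula, and that the outline is star-shaped about the centroid (one radius per angle), so that the radius can be interpolated linearly in the angle. All function names are hypothetical.

```python
import math


def polygon_area_centroid(pts):
    """Signed area and area centroid of a closed polygon (shoelace formula)."""
    a = cx = cy = 0.0
    n = len(pts)
    for i in range(n):
        x0, y0 = pts[i]
        x1, y1 = pts[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        a += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    a *= 0.5
    return a, cx / (6.0 * a), cy / (6.0 * a)


def resample_equiangular(pts, n_new):
    """Return (area, centroid, points) with n_new points in u,v coordinates.

    The new points lie at equal angular steps about the centroid, in
    anti-clockwise order, with the centroid as the new origin.
    """
    area, cx, cy = polygon_area_centroid(pts)
    two_pi = 2.0 * math.pi
    # Polar coordinates about the centroid, sorted by increasing angle.
    polar = sorted(
        (math.atan2(y - cy, x - cx) % two_pi, math.hypot(x - cx, y - cy))
        for x, y in pts
    )
    # Wrap-around copies so that every target angle is bracketed.
    ext = ([(polar[-1][0] - two_pi, polar[-1][1])] + polar
           + [(polar[0][0] + two_pi, polar[0][1])])
    out = []
    j = 0
    for i in range(n_new):
        theta = two_pi * i / n_new
        while ext[j + 1][0] < theta:
            j += 1
        (t0, r0), (t1, r1) = ext[j], ext[j + 1]
        r = r0 if t1 == t0 else r0 + (r1 - r0) * (theta - t0) / (t1 - t0)
        out.append((r * math.cos(theta), r * math.sin(theta)))
    return abs(area), (cx, cy), out


# Synthetic test outline: 360 points on a circle of radius 5 around (10, 20).
circle = [(10 + 5 * math.cos(math.radians(d)), 20 + 5 * math.sin(math.radians(d)))
          for d in range(360)]
area, centroid, uv = resample_equiangular(circle, 250)
print(round(area, 3), centroid, len(uv))
```

For outlines with deep embayments the radius is no longer a single-valued function of the angle, and a sketch of this kind would need an additional rule for choosing among several intersections.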
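Input files: 1.) A file containing the list of the names of all traced files (requested by the program at launch).
2.) The traced files from program Trace33_batch.out, with the X,Y coordinates of the outlines. X and Y are separated by a comma.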
Output files: 1.) Files with the suffix _POL, which contain the polar coordinates of the outline. All angular values (THETA) are in Radians.
2.) Files with the suffix _INT contain a reduced set of X,Y outline points, which are interpolated at equiangular distances (pixels or micrometers). The values are cartesian coordinates. Figure 6 illustrates the reduced (now 250 points) and translated outline shown in Figure 4. The origin of the coordinate system coincides now with the centroid of the outline. The dimensions of the outline are the same as in Figure 4. Figure 7 illustrates the shift from traced outline to the interpolated outline.
3.) A file called "Errors_during_Sprep" recording errors during processing.
4.) A file with the name "MEASUREMENTS", that contains individual size and shape parameters for each file. The format of MEASUREMENTS is as follows: The first line contains the sample age (in million years). The second line contains the header. The third line contains the sample name (17 chars long), the age in Ma, the original number of outline points, the difference between Xmax and Xmin (=the spiral height in µm), the difference between Ymax and Ymin (the axial diameter in µm), and the area in millimeter squared. File MEASUREMENTS will be further used by other programs.
Filename convention: Sample file names are 15 characters long, and suffixes are appended with an underscore:
_T: traced files
_INT: files with interpolated Cartesian x,y coordinates
_POL: files with interpolated coordinates in polar form
Operation: After launching Sprep53.out the program asks for the sample age. Enter the age (5 characters long with two decimal places). Thereafter it asks for the new number of points to which the traced input files shall be reduced. For G. menardii, 250 points were used as standard. Finally the program asks for the list containing the names of all traced files. After entry the program treats all samples in a batch and writes the files with the interpolated _INT and _POL data, MEASUREMENTS, and Errors_during_Sprep to disk.
Known difficulties: The code of Sprep53.out contains a patch to calculate the maximum diameter of the outline. This portion does not work correctly and has been commented out. Note that decimals in the input file must be separated by a point (not by a comma).
3.2.3. Analysis of size and simple shape parameters: 3.2.3.1. Program KeelWidth93.out (Cycle I)
Given the x,y coordinates of the silhouette of a globorotalia in keel view, this program determines the width of the keel D10 and D90 at the lower and upper edge of the profile, respectively (see Figure 8).
Figure 8: Parameters measured with KeelWidth93.out
The lower keelwidth of the profile is determined at 10 % of the vertical extension of the shell in keel view, and the upper keelwidth is determined at 90 % of the vertical extension of the shell in keel view. The basis (=0% of the vertical extension of the shell in keel view) is set at the lower keel.
The keelwidth is defined as the horizontal (x) distance between two points with identical (or nearly identical) y-coordinates at 10% or 90% of the vertical (y) extension of the shell in keel view. Y10 and Y90 are relative parameters with respect to the shell size, which means that the outlines need not be normalized beforehand and the x,y coordinates can be taken directly from the traced (_T) files.
In addition, the program calculates the "keel-angles" Phi1 and Phi2, and the "residual keel areas", i.e. the difference between the keel view area and the area of the inner polygon defined by the minima and maxima of the x and y coordinates (see Figure 8).
Input files: Two types of files are needed as input: 1.) One file called "MEASUREMENTS", which contains the names of the specimens to be treated and some of the morphometric parameters. "MEASUREMENTS" must first be generated with program Sprep53.out (see above). As noted above, it is not necessary to normalize the traced files.
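The definition of the keelwidth can be made concrete with a short Python sketch. It assumes that the outline is cut by a horizontal line at the requested fraction of the vertical extension and that crossing points are found by linear interpolation along the outline segments; KeelWidth93.out may instead pick the nearest outline points. The sketch returns absolute widths in the units of the input coordinates, whereas the result file lists D10 and D90 with a % sign. The function name is hypothetical.

```python
def width_at_fraction(pts, frac):
    """Horizontal width of a closed outline at a fraction of its y extension.

    frac = 0.10 gives the lower and frac = 0.90 the upper keelwidth level;
    0 corresponds to the lowest point (the lower keel).
    """
    ys = [y for _, y in pts]
    ymin, ymax = min(ys), max(ys)
    level = ymin + frac * (ymax - ymin)
    crossings = []
    n = len(pts)
    for i in range(n):
        (x0, y0), (x1, y1) = pts[i], pts[(i + 1) % n]
        # Segment crosses (or touches) the level and is not horizontal.
        if (y0 - level) * (y1 - level) <= 0 and y0 != y1:
            t = (level - y0) / (y1 - y0)
            crossings.append(x0 + t * (x1 - x0))
    return max(crossings) - min(crossings)


# Made-up test outline: a triangle standing on its tip.
triangle = [(0.0, 0.0), (10.0, 20.0), (-10.0, 20.0)]
d10 = width_at_fraction(triangle, 0.10)
d90 = width_at_fraction(triangle, 0.90)
print(d10, d90)
```

Because the level is computed from the minimum and maximum y values of each outline, the measurement scales with the specimen, which is why no prior normalization is needed.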
2.) The other input files are the traced (_T) files containing the individual x,y coordinates for each specimen. From "MEASUREMENTS" and from the corresponding _T files the program determines the extremes in x and y coordinates, and then calculates the upper Phi1 and lower Phi2 keel angles, as well as the residual area.
Output files: Two output files are generated: one that contains the results, and a second file called Errors_during_KeelWidth that contains an error report. The name of the result file is composed from the sample name (encoded in the _T files), the species name, the number of chambers in the final whorl and the coiling direction (all entered by the operator during the program run), which means that the files must be sorted accordingly beforehand. The name of the result file has the general form RESKW_sample_species_chambers_coiling.
Example: reading the specimen 502A106125K0001_T from "MEASUREMENTS" and entering menardA as species code, 6 chambers in the last whorl and a sinistral coiling direction, the output filename will be RESKW_502A106125K_6Ch_s.
The output file has the following format: Specimen, Age, #pts, Species, #Ch, Coil, X (µm), Y (µm), R=X/Y, Ar (mm2), D10 %, D90 %, D9010%, Dmin%, Dmax%, Phi1°, Phi2°, ARESLO (mm2), ARESUP (mm2), DEV (in %)
Operation: Upon starting the program, the operator is prompted to enter the species name (a name code consisting of 7 characters, see Table 1 below), the number of chambers in the final whorl (2 characters), and the coiling direction (s for sinistral or d for dextral), each separated by a comma. The program then writes the error report and the result file to disk.
Known difficulties: A problem has been encountered with the calculation of the keel residual areas: The keel view area has been calculated previously by program Sprep53.out and is written to the input file MEASUREMENTS. In program KeelWidth93.out the sum of the partial areas under the curve (ABM1, M2BC, DM2C, and AM1D) should add up to the value given in MEASUREMENTS. However, there is a deviation of up to 5%, and the reason has not yet been found in subroutine ARSEG (that is why the variable DEV was introduced in the results file of KeelWidth93.out). Note that decimals in the input file must be separated by a point (not by a comma).
Table 1: Name convention for species names in KeelWidth93.out and KeelBend7.out. The names are 7 characters long (followed by the chamber number and coiling direction, separated by commas).
3.2.3.2. Program KeelBend7.out (Cycle I) This is an alternative program for the investigation of globorotalia keel profiles, which determines the bending of the keel around the upper or lower keel region. The program runs in batch mode and can handle many specimens in a series. The method approximates a selected segment around the keel by polynomial regression (the degree can be selected between 3 and 5) and then calculates the bending of this function at the position closest to the minimum and maximum y value of the outline curve (see Figure 9).
Figure 9. In-circles to characterize the bending of the keel regions.
Experiments with menardiform shells have shown, however, that this method of keel characterization has some problems, especially if the keel region has irregularities or shows embayments. In such cases two or more in-circles need to be defined per keel region, which is impractical. In addition, the radius of the in-circle is influenced by the length of the curve segment that approximates one of the two keel regions. For example, if the cubic spline approximation was determined from a smaller number of points, the radius of the in-circle will tend to be smaller; if a larger number of points was used, the in-circle gets larger because more variation is included. If a small shell is analyzed, only a few points can be used to characterize the keel region, whereas a large shell provides more points on the corresponding segment, which leads to a non-comparable situation. The method thus depends critically on the selected length of the approximation and on shell size, which is not ideal for our purposes. For this reason program KeelBend7.out was not used in any further analysis of globorotalid outlines, but is presented here for completeness.
Input files: Two types of input files are needed: 1.) The outline files containing the x,y coordinates. The program can be run either on traced (_T), interpolated (_INT), or normalized (_NORM or _NORD) files.
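The relation between a fitted polynomial and the radius of its in-circle can be illustrated with the standard radius-of-curvature formula R = (1 + f'(x)^2)^(3/2) / |f''(x)|. The Python sketch below assumes that this formula is what is meant by the "bending" of the function; the regression step itself is omitted, and the coefficients in the example are made up.

```python
import math


def radius_of_curvature(coeffs, x):
    """Radius of curvature of the polynomial sum(c_i * x**i) at position x.

    `coeffs` lists the coefficients in ascending order of the power of x.
    """
    f1 = sum(i * c * x ** (i - 1) for i, c in enumerate(coeffs) if i >= 1)
    f2 = sum(i * (i - 1) * c * x ** (i - 2)
             for i, c in enumerate(coeffs) if i >= 2)
    if f2 == 0:
        return math.inf  # locally straight: no finite in-circle
    return (1.0 + f1 * f1) ** 1.5 / abs(f2)


# Made-up cubic y = 0.05 x^2 + 0.001 x^3, evaluated at its apex x = 0.
r = radius_of_curvature([0.0, 0.0, 0.05, 0.001], 0.0)
print(r)
```

Since the second derivative depends on how many points enter the fit, the sensitivity to segment length described above follows directly from this formula.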
2.) The file MEASUREMENTS, which was generated by program Sprep53.out. The program reads the specimen names from this file.
Output files: The program generates several types of output files: - For every input curve two files are written, which contain the x,y coordinates of the segments of the upper and lower keel regions. The names of these files are 20 characters long, of which the first 15 characters designate the specimen and the last 5 characters are either _MinK or _MaxK. The suffix _MinK designates the lower keel region, the suffix _MaxK designates the upper keel region.
- The results are stored in one file called RES_sample_species_chambers_coiling. It contains the following parameters for every specimen: Sample (specimen name); Age (Ma) (age in million years); #_Tpts (original number of points in the traced outline); #_Acpts (number of points in the interpolated or normalized outline; allows for detection of errors in interpolated outlines); Species (a code for the species name); #Ch (the number of chambers in the final whorl); Coil (the coiling direction); X (µm) (the spiral height of the shell in keel view); Y (µm) (the axial diameter of the shell in keel view); R=X/Y (inflation); Ar (mm2) (the area in keel view in millimeters squared); Rupkeel (the radius of the in-circle of the upper keel region, in µm); Rlokeel (the radius of the in-circle of the lower keel region, in µm).
- File Missing_files is an error report and records files, that are in FILE_LIST, but that were not found during program run.
Known difficulties: None known. Note, that decimals in the input file must be separated by a point (not by a comma).
3.2.4 Normalizing outline data: Program Norm52.out (Cycle I) Several analyses may require a normalization of the outline data, for example for the comparison of specimens or for Fourier analysis. Normalization is performed with program Norm52.out, in which two selectable normalization methods are implemented:
1.) Mean Radius Normalization 2.) Maximum Radius Normalization
The Mean Radius Normalization method calculates the mean of all radii and then divides all radii by that mean. The effect is that every normalized outline has a mean radius of 1, while its maximum extension can still differ from outline to outline. This method is applied when outlines are prepared for Fourier analysis: The zeroth harmonic function of the normalized outline will be a circle around the origin with radius 1 (larger outline in Figure 10).
Figure 10.
The Maximum Radius Normalization method determines the length of the longest radius and then divides all radii by that length. As a result, the maximum radius of the normalized outline is equal to 1. This method is applied if the shapes of different outlines are to be compared directly (for example by graphically superposing the outlines; see the smaller outline in Figure 10).
Normalized outlines from the same outline illustrated in Figure 6 (250 points). The larger outline was generated using Mean Radius Normalization as method (output file 502_0100CCK0501_NORM, 250 points), and the smaller outline was generated using Maximum Radius Normalization (output file 502_0100CCK0501_NORD, 250 points). The Maximum Radius Normalization has its lower edge at -1. Normalized coordinates are without dimensions.
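Both normalization methods amount to dividing all coordinates by one scale factor, as the following Python sketch shows. It assumes that the input points are already expressed relative to the centroid (as in the _INT files); the function name normalize_outline and the test points are made up, and the method codes 1 and 2 follow the menu of Norm52.out.

```python
import math


def normalize_outline(points, method):
    """Scale centroid-centred (u, v) points by the mean or maximum radius.

    method 1: Mean Radius Normalization (mean radius becomes 1)
    method 2: Maximum Radius Normalization (longest radius becomes 1)
    """
    radii = [math.hypot(u, v) for u, v in points]
    if method == 1:
        scale = sum(radii) / len(radii)
    elif method == 2:
        scale = max(radii)
    else:
        raise ValueError("method must be 1 or 2")
    return [(u / scale, v / scale) for u, v in points]


pts = [(3.0, 4.0), (0.0, 10.0), (-6.0, -8.0)]  # radii 5, 10, 10
norm_mean = normalize_outline(pts, 1)
norm_max = normalize_outline(pts, 2)
print(norm_mean)
print(norm_max)
```

Because the mean radius is never larger than the maximum radius, the mean-radius version of an outline is always at least as large as its maximum-radius version, in agreement with the two curves of Figure 10.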
Input files: There are two types of input files: 1.) A file containing the list of names of all traced or interpolated files to be normalized.
2.) The files with the x,y coordinates of the outlines (one file per outline). Traced (_T) or interpolated (_INT) outlines can be normalized, depending on the requirements. Usually, interpolated files (ending with _INT) are used.
Output files: There are two types of output files: 1.) One file per input file with the x,y coordinates of the normalized curve. The normalization can be either the Mean Radius Normalization method, in that case the suffix of the files is _NORM. Or, the normalization method can be the Maximum Radius Normalization method; in that case the suffix of the output files is _NORD.
2.) An error report "Errors_during_Norm" is written to disk indicating the outline files, where an error was encountered during operation (an error is reported, if a file, that is listed in LIST_INT_FILES does not exist as input file).
Operation: After starting the program, Norm52.out first asks for the method of normalization. Enter 1 for the "Mean Radius Normalization" method, or 2 for the "Maximum Radius Normalization" method.
Thereafter, the name of the file containing the list of filenames of the individual outlines must be entered (for example LIST_INT_FILES). The name can have up to 20 characters.
Known problems: No problems known. Note, that decimals in the input file must be separated by a point (not by a comma).
3.2.5. Preparation for statistical analysis: Program Compose31.out (Cycle II) Program KeelWidth93.out generates result files of the general form RESKW_sampleK_nCh_c, which contain the measurements of simple morphological parameters for a particular sample, per species, per chamber number and per coiling direction.
Given a number of such result files, program Compose31.out composes (in batch mode) one single file "Composed_files" containing the measurements from all files. "Composed_files" can be imported in a commercial data plotting program (Cricket Graph, Excel, other programs) in order to generate scatter plots from the individual variables.
Input files: There are two types of input files: 1.) A file FILE_LIST, which contains the names of all individual files to be treated. The name of FILE_LIST can have up to 20 characters.
2.) The individual files with the measurements of each specimen. The files that contain the measurements have the general name "RESKW_sample_species_nChamb_COIL". They were generated with program KeelWidth93.out, and contain the following variables in this order:
SPECIMEN,AGE,N,SPECIES,NCHAMB,COIL,DX,DY,R,AREA,D10,D90,D9010,DMIN,DMAX,PHI1,PHI2,ARESLO,ARESUP,DEV
Output files: One file with the name "Composed_files" is generated. From each file "RESKW_sample_species_nChamb_COIL" the program strips off the header information and writes the morphometric data into "Composed_files". This file can be read with a commercial data plotting program (Cricket Graph, Excel) to generate scatter plots.
Operation: After starting the program the user is prompted for the name of the file containing a list of all result files to be treated. One file called "Composed_files" is then written to disk.
Known problems: No problems were encountered so far. Note, that decimals in the input file must be separated by a point (not by a comma).
3.2.6. "Splitweighting" - correction for different splitting within size fractions: Program Stat_Prep21.out (Cycle II) Under some circumstances treatment of uneven sample splits becomes necessary (see Knappertsbusch, 2007). This is the case, if for example, a sample has been sieved in two size fractions "small" and "large", and if the number of measured specimens from the smaller size fraction heavily exceeds the number of specimens in the larger size fraction. If, for practical reasons, there are too many specimens in the smaller size fraction, their number must be set into relation to the specimens of the larger size fraction if mean values are to be determined for a given morphological variable in that sample. Such uneven sampling is compensated for with program Stat_Prep21.out and must be done prior to statistical calculations (see Figure 1).
Stat_Prep21.out takes into account the degree of splitting and the number of picked specimens in each size fraction. (Example: Only 1 specimen was found and measured in the entire larger size fraction, but 100 specimens were found in a 1/4 split of the smaller size fraction. In that particular case, each data row in the input matrix representing the morphometric measurements of an individual shell from the smaller size fraction was replaced by 4 identical rows prior to the calculation of statistics, while the number of data rows from the larger size fraction remained one. In this case the split factor is 4.) If subsplits of the two size fractions contained similar frequencies of menardiform globorotalias, the splits could be picked quantitatively and no such corrections were necessary.
Input files: Input is a single result file, which was previously generated with program KeelWidth93.out and which has been slightly modified to the following format, i.e. with two new variables at the beginning of each record. The first variable indicates the split factor (2 characters), and the second variable indicates the corresponding size fraction (10 characters). Each record is repeated in the new file as many times as indicated by the split factor. The following variables remain unchanged.
04, 100-500µm, 502A211057K0602_NORM, 3.39, 585, menardB, 5, d, 268.8 . . .
02,500-1000µm, 502A211057K0701_NORM, 3.50, 800, menardi, 6, s, 900.8 . . .
Output files: One output file is generated, which contains the split-weighted sample and which can then be submitted to program Stats61.out or Stats62.out. With the above example of input data, the output with the split-weighted sample looks as follows:
502A211057K0602_NORM, 3.39, 585, menardB, 5, d, 268.8 . . .
502A211057K0602_NORM, 3.39, 585, menardB, 5, d, 268.8 . . .
502A211057K0602_NORM, 3.39, 585, menardB, 5, d, 268.8 . . .
502A211057K0602_NORM, 3.39, 585, menardB, 5, d, 268.8 . . .
502A211057K0701_NORM, 3.50, 800, menardi, 6, s, 900.8 . . .
502A211057K0701_NORM, 3.50, 800, menardi, 6, s, 900.8 . . .
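The split-weighting step shown in this example can be sketched in a few lines of Python. The sketch assumes comma-separated records whose first two fields are the split factor and the size fraction; the function name split_weight is hypothetical, and the records are shortened to the fields shown in the example above.

```python
def split_weight(lines):
    """Repeat each record according to its split factor.

    The two leading fields (split factor, size fraction) are removed and
    the remaining fields are written out unchanged, once per repetition.
    """
    weighted = []
    for line in lines:
        factor, _fraction, rest = line.split(",", 2)
        weighted.extend([rest.strip()] * int(factor))
    return weighted


# Records from the example above, without the elided trailing fields.
sample = [
    "04, 100-500µm, 502A211057K0602_NORM, 3.39, 585, menardB, 5, d, 268.8",
    "02,500-1000µm, 502A211057K0701_NORM, 3.50, 800, menardi, 6, s, 900.8",
]
weighted = split_weight(sample)
for row in weighted:
    print(row)
```

Repeating rows in this way lets any subsequent program compute weighted means without knowing about the split factors.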
Operation: Stat_Prep21.out was written for single-file processing only (i.e. there is no batch mode), and the program is adapted to run in combination with KeelWidth93.out, Stats61.out, and Stats62.out. After starting, the program prompts for the name of the sample file that needs to be split-weighted. Enter that name (maximum 31 characters long) and enter a name for the output file. The program then writes the split-weighted file to disk.
Known problems: No problems were encountered. Note, that decimals in the input file must be separated by a point (not by a comma).
3.2.7. Statistical analysis: Program Stats62.out (Cycle II) Program Stats62.out calculates in batch mode the per-sample means and 95% confidence intervals about the means for all morphological parameters that were obtained from KeelWidth93.out and that are listed in the result files (RESKW_sample_species_nChamb_COIL). These values can then be plotted, together with the measurements of individual specimens (from program Compose31.out), using a commercial plotting program (see Figure 11 as an example).
Figure 11. Example showing keel view area measurements of Globorotalia menardii shells from single specimens in each sample (smallest dots, generated with program Compose31.out) and means and ±95% confidence intervals about the means (generated with Stats62.out) for each sample and per species.
Input: There are two types of input files for Stats62.out: 1.) A file containing the names of all individual files RESKW_sample_species_nChamb_COIL to be treated. The individual RESKW_sample_species_nChamb_COIL files were first generated with program KeelWidth93.out and then, if necessary, recomposed for chamber number and coiling direction per species before being fed to Stats62.out.
2.) One or several files with the generalized name RESKW_sample_species_nChamb_COIL. The names of these RESKW_sample_species_nChamb_COIL files are always 31 chars long. They contain the age, specimen- and species names, number of points in the originally traced outlines, number of chambers in the last whorl, coiling direction, and the morphological measurements (X, Y, R, Area, D10, D90, D9010, DMIN, DMAX, PHI1, PHI2, ARESLO, ARESUP, DEV).
Output: Output is a single file called "Means + 95%ci" containing the age and names of each sample, the species names, chamber #, coiling direction, and the means and ±95% confidence intervals about the means for each of the numeric morphological characters. The extrema, variances and standard deviations, and upper and lower 95% confidence level are calculated but not written to the output file (this can be modified if needed).
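The per-sample statistic described above can be illustrated with a minimal Python sketch. It is not the Fortran code of Stats62.out: the function name `mean_ci95`, the example values, and the use of the normal-approximation factor 1.96 are assumptions made here for illustration (the report does not state which critical value the program uses; for small samples a Student-t value would be more appropriate).

```python
import math
import statistics


def mean_ci95(values, critical=1.96):
    """Return (mean, half-width of the 95% confidence interval about the mean).

    critical=1.96 is the normal-approximation value and is an assumption
    of this sketch, not a documented setting of Stats62.out.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measurements per sample")
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)              # sample standard deviation (n - 1)
    half_width = critical * sd / math.sqrt(n)  # standard error times critical value
    return mean, half_width


# Made-up area measurements for one sample (illustrative only).
areas = [268.8, 301.2, 255.4, 290.0, 310.6]
mean, half_width = mean_ci95(areas)
print(f"mean = {mean:.2f}, 95% ci = +/- {half_width:.2f}")
```

The same function would be applied to each numeric morphological character of each sample in turn.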
Known problems: No problems were encountered so far. Note that decimals in the input file must be separated by a point (not by a comma).
3.3. Polar Fourier analysis of outlines Polar Fourier analysis is a method for analyzing the shape of a curve, in our case closed, plane curves. The shape of foraminiferal tests is estimated by expanding the periphery as a function of the angle θ about the test's centroid (center of gravity) in a Fourier series. The radius R at angle θ (polar coordinates) is then given by the formula (Davis, 1973)
where θ denotes the polar angle from an arbitrary reference line, R0 is the average radius of a circle that best fits the outline, k is the harmonic number, and the Ak's and Bk's are the amplitudes of the trigonometric terms of the harmonic function with the number k (the harmonic function is the sum between the angular brackets).
Given N radii Rj at equiangular distances about the test's center of gravity, the amplitudes of each harmonic function can be determined by
Figure 12. Polar coordinates of point Pj of a normalized outline (mean radius normalization method).
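For reference, the series and the coefficient estimates described above can be sketched in their standard textbook form, with θ the polar angle and θj the angle of the j-th of N equiangular radii Rj. This is a reconstruction from the symbol definitions in the text; the normalization factors 1/N and 2/N are the conventional ones and are an assumption here, not taken from the program listings.

```latex
R(\theta) = R_0 + \sum_{k=1}^{\infty} \left[ A_k \cos(k\theta) + B_k \sin(k\theta) \right]

R_0 = \frac{1}{N} \sum_{j=1}^{N} R_j, \qquad
A_k = \frac{2}{N} \sum_{j=1}^{N} R_j \cos(k\theta_j), \qquad
B_k = \frac{2}{N} \sum_{j=1}^{N} R_j \sin(k\theta_j)
```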
Prior to harmonic analysis the original outline must be transformed with respect to the center of gravity, the points must be at equal angular distances, and the outline should be normalized using the method of mean radius normalization. These steps are identical to the first steps with Trace_33.batch.out and Sprep53.out described in sections 3.1 through 3.2.2 above.
New steps in Fourier analysis are described in the following: In order to avoid rotational bias when comparing phases of power spectra between different curves, all outlines need to be in homologous position and should therefore have identical starting points and an identical directional sense of outline extraction. While the directional sense is already given in program Trace_33.batch.out, the starting point is kept constant by manually placing the apex of the foraminiferal shell always on the horizontal line at 240 pixels of the imaging frame (i.e. placing the apex in the middle of the image). Maximum homologous position between a series of outlines is achieved with program Homolo31.out (in batch mode), which rotates a given outline around the center of gravity against a predefined reference outline until maximum correlation is attained. The homologized outlines can then either be averaged to a single curve (using program Averou3.out) or directly fed to program Harman2.out, which then calculates the coefficients of the harmonic functions for each outline. If necessary, the outline can be reconstructed from its Fourier coefficients by inverse Fourier transformation using program Recon1.out (see flow scheme illustrated in Figure 13).
Figure 13. Flow diagram for Fourier shape analysis.
3.3.1. Preparation for harmonic analysis - rotation of outlines into homologous positions: Program Homolo31.out Given a series of normalized outlines with (usually 250) points at equiangular distances, the program Homolo31.out compares each outline with a reference outline (also normalized and with an identical number of points) and then rotates the specimen outline about its centroid (=origin) until it matches best with the reference outline. For each increment of rotation the fit is estimated by the linear correlation coefficient between corresponding points of the specimen outline and the reference outline, and the position with the maximum coefficient is taken as the best fit. Because the program compares the radii at corresponding angular positions, it is important that the directional sense and the starting point during outline extraction were identical for the reference curve and the specimen outlines. If the rotation angle exceeds 30° in clockwise or anticlockwise direction without finding a best fit position, the program is interrupted and the curves need to be checked visually. The program was written for usage in batch mode.
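The rotation search can be illustrated with a short Python sketch. It is a simplification and not the Fortran implementation of Homolo31.out: because the points are equiangular, a rotation by one angular step is modelled here as a circular shift of the list of radii, and all shifts within ±30° are tried exhaustively. The function names `pearson` and `best_rotation` are hypothetical, and the sketch assumes the outlines are not perfect circles (otherwise the correlation is undefined).

```python
import math


def pearson(a, b):
    """Linear (Pearson) correlation coefficient of two equally long lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def best_rotation(ref_radii, spec_radii, max_angle_deg=30.0):
    """Return (rotation in degrees, correlation, rotated radii) of the best fit.

    Both lists hold radii at the same N equiangular positions, with the same
    starting point and directional sense.
    """
    n = len(ref_radii)
    if len(spec_radii) != n:
        raise ValueError("reference and specimen need the same number of points")
    step_deg = 360.0 / n
    max_steps = int(max_angle_deg / step_deg)
    best_shift, best_r = 0, -2.0
    for shift in range(-max_steps, max_steps + 1):
        # rotated[i] == spec_radii[(i + shift) % n]
        rotated = spec_radii[shift:] + spec_radii[:shift]
        r = pearson(ref_radii, rotated)
        if r > best_r:
            best_shift, best_r = shift, r
    rotated = spec_radii[best_shift:] + spec_radii[:best_shift]
    return best_shift * step_deg, best_r, rotated
```

A correlation profile like the one written to the _CORR files could be obtained by storing `r` for every `shift` instead of keeping only the maximum.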
Input files: There are two input files: 1.) A file called Ref_char_nchamber_coil, which contains cartesian coordinates of the normalized reference outline. The reference outline must have the same number of points as the specimen outlines. If the number of points is not identical or if a particular file is missing, a message is written to the error report. The name of the file containing the reference outline can be up to 21 characters long.
A collection of reference outlines is given in the directory Reference outlines. Reference outlines are named according to the following convention: The first three characters are Ref in order to indicate a reference outline. An abbreviation of the species name is given from position 5 to 11 in the name, a K, U or S indicates Keel, Umbilical or Spiral position of the reference specimen, and the last character indicates the coiling direction (s for sinistral, d for dextral coiling). Example: Ref_GPERTEN_K_6Ch_s (see index for the encoding of species names and the corresponding sample from which the reference outline was taken).
2.) The second input file is a list with name FILE_LIST containing the names of all files with the individual specimen outlines (INPUT). FILE_LIST is a character variable of length 20.
Output files: For each specimen outline two output files are written to disk, one ending with the suffix _HOM and the other ending with the suffix _CORR. The _HOM file contains the cartesian x,y coordinates of the rotated and homologized specimen in the position of maximum correlation with respect to the reference specimen. The names of the _HOM files are composed from the name of the corresponding INPUT file. The files ending with _CORR contain the linear correlation coefficients between the rotated outlines of the specimen and the reference specimen as a function of the rotation angle. Given are the direction of the last rotational operation, the total angle of rotation (in degrees) and the correlation coefficient of the rotated specimen with respect to the reference outline for every position. The names of these files are also composed from the name of the corresponding INPUT file. In addition, one file called "Errors_during_Homolo" is written to disk containing runtime error reports.
Operation: After launching the program, the operator is prompted for the name of the reference outline (a name with a maximum of 21 characters). Thereafter the name of the list of interpolated and normalized outlines (file_list) must be entered. The program calculates the rotated outlines and the angular steps with the corresponding correlation coefficients.
Known problems: None known. Note that decimals in the input file must be separated by a point (not by a comma).
3.3.2. Preparation for harmonic analysis - calculate an average outline: Program Averou3.out (in batch mode) At this level of preparation the analyst may decide whether to calculate harmonic functions from an average outline of all homologized curves, or whether to determine the harmonic functions from each individual homologized curve and then to calculate average harmonic functions. For the first case program Averou3.out was written, which calculates an average outline from the previously homologized curves. This is performed by calculating the mean radial length r at the corresponding angular values θ of the homologized curves. As for program Homolo31.out, the number of points, the directional sense and the starting point during outline extraction of the homologized curves must all be identical.
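The averaging step can be sketched in Python as follows. This is an illustration of simple (unweighted) averaging only, not the code of Averou3.out; the function name `average_outline` is hypothetical, and the sketch assumes that point j of every outline lies at the same polar angle, which is what homologization and equiangular interpolation are meant to guarantee.

```python
import math


def average_outline(outlines):
    """Average several homologized outlines given as lists of (x, y) points.

    The angle of point j is taken from the first outline; the mean radius at
    that angle is computed over all outlines and converted back to x, y.
    """
    n = len(outlines[0])
    if any(len(outline) != n for outline in outlines):
        raise ValueError("all outlines need the same number of points")
    averaged = []
    for j in range(n):
        x0, y0 = outlines[0][j]
        theta = math.atan2(y0, x0)
        mean_r = sum(math.hypot(o[j][0], o[j][1]) for o in outlines) / len(outlines)
        averaged.append((mean_r * math.cos(theta), mean_r * math.sin(theta)))
    return averaged
```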
Input files: There are two kinds of input files: One file (=FILE_LIST) containing a list of the file names (=INPUT) with the cartesian X and Y coordinates of the individual homologized outlines, from which the average outline is calculated. FILE_LIST is a character variable 20 characters long. The other input files are the individual files with the X,Y (cartesian) coordinates of the homologized outlines. The names of these files are interpreted through the character variable INPUT. The content of INPUT is used to compose the name of the output file (suffix _AVOUTLIN).
Output files: Output is a single file with the suffix _AVOUTLIN containing the cartesian X,Y coordinates of the desired mean outline.
Operation: After starting Averou3.out the program first asks whether all outlines are in homologous position. Enter y for yes or n for no. If the answer was no, the program stops. If the answer was yes, the program continues and offers two options: Option 1 (=simple averages) applies if no split-weighting is needed (e.g. if samples have comparable specimen numbers in different size fractions or if there was only one size fraction). With option 1 the input file simply contains the names of the outline files (19 characters long). The program then prompts for the name of the list containing all homologized outlines.
If split-weighting needs to be applied (=option 2), a split factor and a size fraction must be provided, and input files are slightly different: The file file_list contains - in that sequence - the split factor (20 characters), the size fraction (10 characters), and the name of the homologized outline (19 characters long). The program prompts for the name of that modified file list. The other files are again the homologized outlines.
After entry of the file list the outlines are read and averaged with the respective method. In case 1 one file name_AVOUTLIN is written to disk containing the cartesian coordinates of the averaged outline. In case 2 two files are written to disk: name_AVOUTLIN (averaged outline) and FILE_LIST_spw, which is a list of the homologized outline file names in which each name is repeated as many times as indicated by its split factor.
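How a split-weighted list of this kind could be built is sketched below in Python. The fixed column widths (20, 10 and 19 characters) follow the description of the option 2 file list; everything else is an assumption of this sketch: the function name `expand_split_weighted` is hypothetical, the split factor is assumed to be a whole number, and the example size fraction labels are made up.

```python
def expand_split_weighted(lines):
    """Repeat each outline name according to its split factor.

    Each line is assumed to hold, in fixed columns: the split factor
    (20 characters), the size fraction (10 characters) and the name of the
    homologized outline (19 characters).
    """
    names = []
    for line in lines:
        if not line.strip():
            continue                                  # skip empty lines
        split_factor = int(round(float(line[0:20])))  # assumed to be integral
        name = line[30:49].strip()                    # columns 31 to 49
        names.extend([name] * split_factor)
    return names


# Two made-up records in the assumed fixed-column layout.
records = [
    f"{'4':>20}{'fractionA':>10}{'502A211057K0602_HOM':<19}",
    f"{'2':>20}{'fractionB':>10}{'502A211057K0701_HOM':<19}",
]
for name in expand_split_weighted(records):
    print(name)
```

Averaging the outlines named in the expanded list with equal weights is then equivalent to a weighted average with the split factors as weights.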
Known difficulties: No problems were encountered so far. Note that decimals in the input file must be separated by a point (not by a comma).
3.3.3. Harmonic analysis of an outline: Program HARMAN.out (single mode only) Given a single closed plane curve consisting of N discrete points, the program HARMAN.out calculates the Fourier coefficients Ak and Bk of the first k harmonic functions, the corresponding radial coefficients rk and phase angles Φk of the polar form of the Fourier series, and the power POWER(k) of the discrete spectrum of the curve as a function of k (see Figure 14a-c).
Input files: Input is a single file (name maximum 10 characters long) containing the cartesian x,y coordinates in normalized form (mean diameter normalization method).
Output files: One single output file "HARMONICS" is written to disk, containing in that sequence the harmonic number k, ak, bk, rk, Φk, and Power(k).
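A minimal Python sketch of such a harmonic analysis is given below. It is not the Fortran code of HARMAN.out. Assumptions of this sketch: the outline is centred on its centroid with equiangular points, the coefficients use the conventional factors 1/N (for k=0) and 2/N, the phase is taken as atan2(bk, ak), and the power is taken as (ak² + bk²)/2. The report does not state which conventions the program actually uses, and the function name `harmonic_analysis` is hypothetical.

```python
import math


def harmonic_analysis(points, n_harmonics=None):
    """Return rows (k, a_k, b_k, r_k, phase_k, power_k) for a closed outline.

    points: list of (x, y) coordinates relative to the centroid, at
    equiangular positions. By default N // 2 harmonics are computed
    (k = 0 .. N // 2 - 1), e.g. 125 harmonics for 250 points.
    """
    n = len(points)
    radii = [math.hypot(x, y) for x, y in points]
    thetas = [math.atan2(y, x) for x, y in points]
    if n_harmonics is None:
        n_harmonics = n // 2
    rows = []
    for k in range(n_harmonics):
        scale = 1.0 / n if k == 0 else 2.0 / n        # k = 0 gives the mean radius
        a = scale * sum(r * math.cos(k * t) for r, t in zip(radii, thetas))
        b = scale * sum(r * math.sin(k * t) for r, t in zip(radii, thetas))
        amplitude = math.hypot(a, b)                  # r_k of the polar form
        phase = math.atan2(b, a)                      # phase angle of harmonic k
        power = amplitude * amplitude / 2.0           # assumed power convention
        rows.append((k, a, b, amplitude, phase, power))
    return rows
```

For a test outline with radius 1 + 0.2 cos(3θ), the row for k=3 carries an amplitude of 0.2 and all other harmonics above k=0 vanish up to rounding.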
Figure 14. Power spectra (logarithmic scale) in normal (Figure 14a) and polar (Figure 14b) form, and phase relationships (Figure 14c) of the normalized outline of specimen 502_0100CCK0501 in keel view (all from the same specimen shown in Figures 2 through 4, 6, 7 and 10). The normalized outline consists of 250 points, the maximum number of harmonic functions is therefore 125 (k=0 through 124).
Operation: After launching, the program prompts for the name of the file with the outline, and then writes the file HARMONICS to disk.
Problems: None. Note that decimals in the input file must be separated by a point (not by a comma).
3.3.4. Inverse Fourier Transform: Program Recon2.out (single mode only) Inverse Fourier transformation allows the original discretized outline to be reconstructed from the coefficients of its discrete Fourier approximation (method of Ehrlich and Weinberg, 1970). If only a partial sum of the harmonic functions (see Figure 15) or even a single harmonic function (Figure 16) is inversely Fourier transformed, the contribution of the respective component to the entire form of the outline is reconstructed. Given a set of harmonic functions with their numbers k, program Recon2.out calculates the x,y coordinates of either the original outline or the contribution of a respective component or sum of components.
Input files: Two input files are necessary: 1.) One file containing the angular values of each radius of the outline (just enter the name of the file containing the coordinates of the interpolated and normalized outline). 2.) The file containing the coefficients of the harmonic functions to be included in the reconstruction (enter the file HARMONICS).
Output files: One file ending with _REC (for reconstructed) is written to disk containing the cartesian coordinates of the outline, the partial sum or the harmonic function of order k.
Operation: After starting the program the operator is first prompted for a file containing the angular values of each radius of the outline. It is possible to enter the file containing the cartesian x,y coordinates of the interpolated and normalized outline (name_NORM). Thereafter the program prompts for the name of the file containing the harmonic coefficients (HARMONICS). If the original outline is to be reconstructed approximately, HARMONICS must contain all harmonic functions. If the outline is to be reconstructed only up to a certain degree, HARMONICS must contain the coefficients for k=0 up to the chosen maximum k. If only the harmonic function of order k is to be reconstructed, HARMONICS must contain the coefficients for that particular value of k.
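The three cases (all harmonics, a partial sum, a single harmonic) differ only in which coefficients are passed to the reconstruction, as the following Python sketch illustrates. It is not the code of Recon2.out; the function name `reconstruct` is hypothetical, and the sketch assumes that the k=0 entry holds the mean radius as a0 (with b0 = 0) and that the remaining coefficients follow the conventional series R(θ) = Σ [ak cos(kθ) + bk sin(kθ)].

```python
import math


def reconstruct(thetas, coefficients):
    """Rebuild x, y coordinates from harmonic coefficients.

    thetas: polar angles (radians) of the points to reconstruct.
    coefficients: iterable of (k, a_k, b_k); pass all harmonics for the full
    outline, k = 0..K for a partial sum, or a single entry for one harmonic.
    """
    coefficients = list(coefficients)
    points = []
    for theta in thetas:
        r = sum(a * math.cos(k * theta) + b * math.sin(k * theta)
                for k, a, b in coefficients)
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


# Mean radius 1 plus a second harmonic of amplitude 0.5 (made-up values).
outline = reconstruct([0.0, math.pi / 2], [(0, 1.0, 0.0), (2, 0.5, 0.0)])
```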
Figure 17: Correlation of x and y coordinates of the original, normalized outline and the outline that was reconstructed by inverse Fourier transformation (from specimen 502_0100CCK0501 illustrated in Figures 15 and 16).
Known problems: None, except that in its present version (Recon2.out) the last point (point number 250) sometimes plots oddly (the output of the last point in the program code needs to be re-checked). The program is made for single mode operation only. Note that decimals in the input file must be separated by a point (not by a comma).
3.4. Auxiliary programs
3.4.1. Renaming a series of images in batch mode: Programs Rename_raw.out and Rename_Tiff.out Analysis of size and shape from populations of shells often involves handling large numbers of image files, which, for whatever reason, sometimes need to be renamed. Although other solutions can be developed for this purpose (such as AppleScript programming), it was found easier to program applications directly in Fortran. The two applications Rename_raw.out and Rename_Tiff.out presented here allow batch renaming of an unlimited number of image files in raw and Tiff format, respectively. The images have a size of 640x480 pixels and are grey-level images (8 bit or 256 grey-levels). Both programs take advantage of the fact that the Tiff header of these images is 768 bytes long, which divides evenly into the size of both a Tiff image (768 + 640 * 480 = 307968 bytes) and a raw image (640 * 480 = 307200 bytes). Both programs read an image file by direct access with a record length of 384 bytes, which is half the length of the Tiff header (this record length was chosen arbitrarily in order not to make the records too big; using the full Tiff header length as record length, however, would speed up the programs by a factor of two). The programs are not made for arbitrary input file names and need to be adapted and recompiled from case to case if other file name lengths are required.
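The record arithmetic and the record-wise copy to a new name can be illustrated with a short Python sketch. It mimics the idea of fixed-length direct-access records but is not the Fortran code of Rename_raw.out or Rename_Tiff.out; the function name `copy_by_records` is hypothetical.

```python
import os
import tempfile

HEADER_BYTES = 768                       # Tiff header length stated in the text
RECORD_BYTES = HEADER_BYTES // 2         # 384-byte records
RAW_BYTES = 640 * 480                    # 8-bit grey-level raw image
TIFF_BYTES = HEADER_BYTES + RAW_BYTES    # Tiff image including header


def copy_by_records(src, dst, record_bytes=RECORD_BYTES):
    """Copy src to dst in fixed-length records and return the record count."""
    if os.path.getsize(src) % record_bytes != 0:
        raise ValueError("file size is not a multiple of the record length")
    count = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            record = fin.read(record_bytes)
            if not record:
                break
            fout.write(record)
            count += 1
    return count


# Both image sizes are whole multiples of the record length.
print(RAW_BYTES // RECORD_BYTES, TIFF_BYTES // RECORD_BYTES)   # prints: 800 802
```

With a record length of 768 bytes the same files would need 400 and 401 records, which is the factor of two mentioned above.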
Input files: There are two types of input files: Firstly, a list (file_list) containing the file names to be renamed. And secondly, the individual image files. In the present form program Rename_Tiff.out reads Tiff files with a 4 character long name and adds the extension .TIFF to them. Program Rename_raw.out reads raw files with a 5 character long name and adds the extension .RAW to each file.
Output files: For each raw or tiff input file one file with the extension .RAW or .TIFF is written to disk, respectively.
Operation: After launching the programs an input file dialog window appears on the screen. Double-click the file containing the list with all input files. Thereafter the program writes the renamed files to disk.
Known problems: None so far.
3.4.2. Generating bivariate frequency distributions from x,y data: Program Grid2.1.out Given a number of files containing bivariate measurements (for example x,y measurements), the program Grid2.1.out performs a gridding in batch mode using a grid-cell size of Δx in x direction and Δy in y direction and counting all x,y points that fall within a particular grid-cell. The size of Δx and Δy can be varied. Output is a frequency matrix per input file consisting of categories of Δx in x direction and Δy in y direction. These matrices can be read by other programs, such as Surface III 2.6 plus from the Kansas Geological Survey, which can import z-value matrices of an xyz data set and plot them as contoured diagrams. This program has been used to generate the contour plots illustrated in Knappertsbusch (2000).
Input files: Two types of input files are needed for Grid2.1.out: First, a list (List, file name maximum 20 characters long) containing the names of all data files. Secondly, the individual data files holding the bivariate x,y data. The names of the data files must be 20 characters long.
Figure 18: x,y data of a sample input.
Output files: For each input data file one file with the gridded data is written to disk. The filenames have the extension .grid.
Sample output from scatter data shown in Figure 18:
0-cmxxxxxxxxxxxxxxxx.grid
DeltaX, DeltaY, Number of X-intervals, Number Y-intervals: 1.00000, 2.00000, 15, 20
.0- 1.0 1.0- 2.0 2.0- 3.0 3.0- 4.0 4.0- 5.0 5.0- 6.0 6.0- 7.0 7.0- 8.0 8.0- 9.0 9.0-10.0 10.0-11.0 11.0-12.0 12.0-13.0 13.0-14.0 14.0-15.0
.0- 2.0 : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2.0- 4.0 : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4.0- 6.0 : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6.0- 8.0 : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
8.0-10.0 : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
10.0-12.0 : 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
12.0-14.0 : 0 0 0 1 2 0 0 0 0 0 0 0 0 0 0
14.0-16.0 : 0 0 0 2 8 4 0 0 0 0 0 0 0 0 0
16.0-18.0 : 0 0 0 2 10 15 1 0 0 0 0 0 0 0 0
18.0-20.0 : 0 0 0 4 7 30 4 1 0 0 0 0 0 0 0
20.0-22.0 : 0 0 0 1 2 18 5 2 0 0 0 0 0 0 0
22.0-24.0 : 0 0 0 0 2 11 12 2 0 0 0 0 0 0 0
24.0-26.0 : 0 0 0 0 0 3 4 2 0 0 0 0 0 0 0
26.0-28.0 : 0 0 0 0 0 0 1 7 1 0 0 0 0 0 0
28.0-30.0 : 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0
30.0-32.0 : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
32.0-34.0 : 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
34.0-36.0 : 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
36.0-38.0 : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
38.0-40.0 : 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Figure 19: Contoured xyz data from distribution shown above.
Operation: Upon launching Grid2.1.out the program prompts for the upper limits of x and y. Enter values equal to or higher than the maximum x and y values in the input file(s). Thereafter the program asks for the bin widths Δx and Δy. Finally, the name of the list with the input data files must be entered. The program then writes the xyz matrices for each input data file.
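The gridding itself amounts to a two-dimensional histogram, which can be sketched in Python as follows. This is an illustration and not the code of Grid2.1.out: the function name `grid_counts` is hypothetical, and the treatment of bin boundaries (lower edge inclusive, upper edge exclusive, points outside the limits ignored) is an assumption, since the report does not specify it.

```python
import math


def grid_counts(points, x_max, y_max, dx, dy):
    """Count x, y points per grid cell.

    Returns a matrix counts[iy][ix] with ceil(y_max / dy) rows and
    ceil(x_max / dx) columns; the lower limits are taken as zero.
    """
    nx = math.ceil(x_max / dx)
    ny = math.ceil(y_max / dy)
    counts = [[0] * nx for _ in range(ny)]
    for x, y in points:
        ix = int(x // dx)            # column index of the cell containing x
        iy = int(y // dy)            # row index of the cell containing y
        if 0 <= ix < nx and 0 <= iy < ny:
            counts[iy][ix] += 1
    return counts


# Three made-up points on a grid with 15 x-intervals and 20 y-intervals.
matrix = grid_counts([(4.5, 11.0), (4.2, 13.9), (3.1, 12.5)], 15, 40, 1, 2)
```

With upper limits of 15 and 40 and bin widths of 1 and 2, the matrix has the same 15 by 20 layout as the sample output of the program.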
Versions: Grid2.1.out: batch mode operation. Grid1.2.out: single mode operation. Grid1.1.out: single mode operation.
Known problems: None. To find reasonable Δx and Δy values and a good range for the z-value matrix the operator needs to experiment a bit with different values. Depending on the contour program, numerical labels must be manually adjusted (scaled) on the x and y axes. Note that decimals in the input file must be separated by a point (not by a comma).
4. Acknowledgements Program development and tests occurred in the course of several projects funded by the Swiss National Foundation for Scientific Research (Grants No. 2000-043058.95/1, 2000-050558.97/1, 2000-056875.99/1 and 2100-067970/1). I also wish to thank the Natural History Museum Basel and the City of Basel for supporting this research.
5. Literature Davis, J.C. 1973. Statistics and data analysis in geology. John Wiley & Sons, 2nd edition, 646 p.
Ehrlich, R., and Weinberg, B. 1970. An exact method for characterization of grain shape. Journal of Sedimentary Petrology, vol. 40, No. 1, pp. 205-212.
Knappertsbusch, M. 1998. A simple Fortran 77 program for the outline detection of digitized microfossils. Computers & Geosciences, vol. 24, No. 9, pp. 897-900.
Knappertsbusch, M. 2000. 3D-animated views of the morphological evolution of the coccolithophorid Calcidiscus leptoporus from the Early Miocene to Recent. 8th INA Conference Bremen 2000, Programme and Abstracts. Journal of Nannoplankton Research, vol. 22, No. 2, 2000, p. 117.
Knappertsbusch, M., Brown, K., and Rüegg, H.-R. 2006. Positioning and enhanced stereographic imaging of microfossils in reflected light. Palaeontologia Electronica, vol. 9, issue 2; 8A:10p, 30.1 MB. URL http://palaeo-electronica.org/2006_2/reflect/index.html
Knappertsbusch, M. 2007. Morphological variability of Globorotalia menardii (planktonic foraminifera) in two DSDP cores from the Caribbean Sea and the Eastern Equatorial Pacific. Carnets de Géologie / Notebooks on Geology, Brest, Article 2007/04 (CG2007_A04). URL http://paleopolis.rediris.es/cg/CG2007_A04/index.html
6. Appendix - Program listings
7. Appendix - Applications (Mac only) (download archive PROGS_MAC.sea.hqx, 1.7 MB)