
R Tools for Paleontology:
VAVREK


LOADING YOUR DATA IN R

Large databases used in palaeoecological studies are often simply tables, whether plain text files or Excel spreadsheets, in which every row is a single observation, usually of a species at some location in space and time. However, the species, locations and times in these lists are rarely unique, so the data usually need to be consolidated into usable matrices of species versus locality. Two functions in the fossil package convert such lists into the two types of matrices referred to throughout the remainder of the paper. The first, create.matrix(), takes a list of species and their occurrences and converts it to a matrix of species (rows) by localities (columns). With the commands

>data(fdata.list)
>create.matrix(fdata.list,tax.name="species",locality="locality")

we can create an occurrence matrix from the fdata.list example data set. To create an abundance matrix instead, we use virtually the same command but set the option abund = TRUE and pass the name of the abundance column (in this case, 'abundance') to the abund.col option. The result is an abundance matrix identical to fdata.mat.

>data(fdata.list)
>create.matrix(fdata.list,tax.name="species",locality="locality",
+abund=TRUE,abund.col="abundance")
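To make the conversion concrete, the following is a minimal base R sketch of the same idea: collapsing a list with one row per observation into species-by-locality abundance and occurrence matrices. It is not the implementation used by create.matrix(); the data frame occ and its values are hypothetical and stand in for a list such as fdata.list.

```r
# Hypothetical toy list: one row per observation (NOT the fdata.list data set)
occ <- data.frame(
  species   = c("sp.a", "sp.b", "sp.a", "sp.c", "sp.a"),
  locality  = c("loc1", "loc1", "loc2", "loc2", "loc1"),
  abundance = c(3, 1, 2, 5, 4),
  stringsAsFactors = FALSE
)

# Abundance matrix: sum the abundance column for every species/locality pair,
# with species as rows and localities as columns
abund.mat <- tapply(occ$abundance, list(occ$species, occ$locality), sum)

# Pairs that never occur come back as NA; treat them as zero abundance
abund.mat[is.na(abund.mat)] <- 0

# Occurrence (presence/absence) matrix: 1 wherever the abundance is non-zero
occ.mat <- (abund.mat > 0) * 1

abund.mat
occ.mat
```

Duplicate observations of the same species at the same locality (here, sp.a at loc1) are merged into a single cell, which is the consolidation step described above.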

The fossil package follows the convention of species as rows and localities as columns. Data that are already in matrix format but have species as columns and localities as rows can be transposed with the t() command.
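As a short illustration of the transposition, the sketch below builds a small matrix with localities as rows and flips it into the species-by-locality layout. The matrix sites.by.species and its values are hypothetical.

```r
# Hypothetical matrix with localities as rows and species as columns
sites.by.species <- matrix(
  c(7, 1, 0,
    2, 0, 5),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("loc1", "loc2"), c("sp.a", "sp.b", "sp.c"))
)

# t() swaps rows and columns and carries the row and column names along
species.by.sites <- t(sites.by.species)

species.by.sites
```

Note that t() applied to a data frame returns a matrix, so mixed-type columns would be coerced to a common type; it is safest to transpose purely numeric tables.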

Similarly, much palaeontological data comes with spatial information about its provenance integrated with the occurrence data, so the locality coordinates are duplicated for every species recorded at a site. To simplify plotting georeferenced data, the create.lats() function extracts the site coordinates from a list and eliminates the duplicate entries.

>data(fdata.list)
>create.lats(fdata.list,loc="locality",long="longitude",
+lat="latitude")
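The de-duplication step can be sketched in base R as follows. This is only an illustration of the idea, not the code behind create.lats(), and its output format is not claimed to match that function's; the data frame occ and its coordinates are hypothetical.

```r
# Hypothetical toy list in which each locality's coordinates repeat per species
occ <- data.frame(
  species   = c("sp.a", "sp.b", "sp.a", "sp.c"),
  locality  = c("loc1", "loc1", "loc2", "loc2"),
  longitude = c(-75.1, -75.1, -80.4, -80.4),
  latitude  = c(45.2, 45.2, 51.0, 51.0),
  stringsAsFactors = FALSE
)

# Keep the first row seen for each locality, and only the coordinate columns
sites <- occ[!duplicated(occ$locality), c("locality", "longitude", "latitude")]

# Reset the row names so they run 1..n rather than keeping the original indices
rownames(sites) <- NULL

sites
```

This sketch assumes that every record of a locality carries the same coordinates; if they differ, only the first pair encountered is kept.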

 
