When using the crest.get_modern_data() function, crestr may raise a series of notes and warnings that must be assessed. However, they do not need to be ‘fixed’ for the analysis to continue. They are markers of small deviations from a perfect dataset, and such deviations may be expected. These notes can be accessed with PSE_log(). Their origins are detailed here and practical solutions are proposed.

1. One or more taxa were are not in the proxy-species equivalence table and have been ignored. Use PSE_log() with the output of this function for details.

This warning is raised to indicate that some taxa reported in the data file (df) were not found in the proxy-species equivalency table (PSE). Possible cause(s):

  • The names are not spelled the same in the two files (check for leading and trailing spaces, e.g. ‘Poaceae ’ instead of ‘Poaceae’). If this is the case, warning #5 should also be raised.
  • The df dataset has not been properly cleaned and contains extra columns, such as ‘total sum’, age ranges, …
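Both causes can be checked before calling crest.get_modern_data(). The following is a minimal sketch in base R, not part of crestr. It assumes that the taxa are stored as the column names of df (after a first column of sample identifiers) and that the PSE contains a ProxyName column, as in the createPSE() template. The toy data are hypothetical.

```r
# Hypothetical toy data: first column = sample id, other columns = taxa.
df <- data.frame(Sample = 1:2, check.names = FALSE)
df[["Poaceae "]]  <- c(10, 20)   # note the trailing space
df[["Artemisia"]] <- c(5, 0)
df[["total sum"]] <- c(15, 20)   # leftover column that is not a taxon

# Assumed PSE layout: only the ProxyName column matters for this check.
PSE <- data.frame(ProxyName = c("Poaceae", "Artemisia"),
                  stringsAsFactors = FALSE)

# Names in df that have no match in the PSE, before any cleaning.
missing_raw <- setdiff(colnames(df)[-1], PSE$ProxyName)

# Strip leading and trailing spaces, then compare again.
colnames(df) <- trimws(colnames(df))
missing_clean <- setdiff(colnames(df)[-1], PSE$ProxyName)

print(missing_raw)
print(missing_clean)
```

Names that disappear from the list after trimming were whitespace problems. Names that remain are either extra columns to drop from df or taxa that still need a row in the PSE.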

2. One or more taxa were are not in the selectedTaxa table. They have been added but are not selected for any variable. Use PSE_log() with the output of this function for details.

This warning is raised when some taxon names found in df or PSE are not found in the selectedTaxa input file. These taxa are added automatically with a default value of 0 (i.e. the taxa will not be used for the reconstruction of any variable).

3. One or more taxa were not associated with species. Use PSE_log() with the output of this function for details.

This warning is raised when some taxon names found in the PSE file are not properly classified (i.e. no family, genus or species name is provided). If you use the template generated by createPSE(), the first column should contain the value ‘4’ for these taxa. Note: You do not have to classify all the taxa, especially those that are so broad that they will be excluded from the reconstructions.

4. The percentages of one or more taxa were always 0 and have been removed accordingly. Use PSE_log() with the output of this function for details.

This warning is self-explanatory. The sum of each column of the df dataset is calculated, and every taxon that is not recorded in any sample is excluded.
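To see in advance which taxa would be removed, the same check can be reproduced by hand. This is a sketch in base R under the assumption that the first column of df holds the sample identifiers and the remaining columns hold the percentages. It is not the code used internally by crestr, and the toy data are hypothetical.

```r
# Hypothetical toy data: three samples, three taxa.
df <- data.frame(Sample    = 1:3,
                 Poaceae   = c(10, 20, 5),
                 Artemisia = c(0, 0, 0),   # never recorded
                 Erica     = c(0, 2, 0))

# Sum each taxon column (the first column is skipped).
taxa_sums <- colSums(df[, -1, drop = FALSE], na.rm = TRUE)

# Taxa whose percentages are 0 in every sample.
empty_taxa <- names(taxa_sums)[taxa_sums == 0]

# Optional: drop them before passing df to crestr.
df_clean <- df[, !(colnames(df) %in% empty_taxa), drop = FALSE]

print(empty_taxa)
```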

5. One or more taxa were are not recorded in the data file. Use PSE_log() with the output of this function for details.

This warning is raised to indicate that some taxa reported in the proxy-species equivalency table (PSE) were not found in the data file (df). Possible cause(s):

  • The names are not spelled the same in the two files (check for leading and trailing spaces, e.g. ‘Poaceae ’ instead of ‘Poaceae’). If this is the case, warning #1 should also be raised.
  • The PSE dataset has been recycled from another study and contains taxa observed in other records.
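When the PSE has been recycled from another study, one possible way to silence this warning is to keep only the rows whose proxy name appears in df. The sketch below uses base R only. The column names (Level, Family, Genus, Species, ProxyName) are assumed to follow the createPSE() template, and the example rows and taxon list are hypothetical.

```r
# Hypothetical PSE recycled from another study.
PSE <- data.frame(Level     = c(2, 1, 2),
                  Family    = c("Asteraceae", "Poaceae", "Ericaceae"),
                  Genus     = c("Artemisia", "", "Erica"),
                  Species   = c("", "", ""),
                  ProxyName = c("Artemisia", "Poaceae", "Erica"),
                  stringsAsFactors = FALSE)

# Taxon names taken from the column names of the data file (hypothetical).
taxa_df <- c("Poaceae", "Artemisia")

# Proxy names listed in the PSE but absent from the data file.
unused <- setdiff(trimws(PSE$ProxyName), trimws(taxa_df))

# Keep only the PSE rows that match a taxon in the data file.
PSE_subset <- PSE[trimws(PSE$ProxyName) %in% trimws(taxa_df), , drop = FALSE]

print(unused)
```

Subsetting is optional, since the warning does not prevent the analysis from continuing.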

6. The classification of one or more taxa into species was not successful. Use PSE_log() with the output of this function for details.

This warning is raised if no species correspond to the classification proposed in the PSE file. Unclassified taxa will not be used for any reconstruction. Possible cause(s):

  • The names are spelled incorrectly (check for leading and trailing spaces, e.g. ‘Poaceae ’ instead of ‘Poaceae’).
  • A species name includes the genus, e.g. family: ASTERACEAE genus: Artemisia species: Artemisia afra
  • The name of the taxon has changed. For instance, the pollen type ‘Acacia-type’ observed in Africa should no longer be linked to Acacia species, but to Senegalia and Vachellia species. You can use the www.gbif.org website to check whether new names exist.

7. For one or more taxa, no species remained associated with the proxy name at the end of the classification. Use PSE_log() with the output of this function for details.

This warning is raised if, at the end of the classification process, a taxon name has been depleted of all its constituent species. This is different from warning #6: in this case, some species were first found, but were subsequently reclassified into a different, better-resolved category. Unclassified taxa will not be used for any reconstruction.

8. No data were available within the study area for one or more taxa. Use PSE_log() with the output of this function for details.

This warning is raised if the classification is successful, but no data correspond to the species within the study region. If the taxon is important, consider expanding the study area to include its native area, or make sure that the name of the targeted group of species has not changed (i.e. the species can be found in other regions, but the regional subgroup has been reclassified as something else).

9. An insufficient amount of calibration data points was available within the study area for one or more taxa. Consider reducing ‘minGridCells’. Use PSE_log() with the output of this function for details.

This warning is raised if the regional classification is successful but there are not enough data to fit the PDFs. If the taxon is important, consider lowering the minGridCells parameter to include species with fewer data points, or increasing the extent of the study area.