Vegetation Measurement and Monitoring

4.1: Monitoring Implementation, Quality Control and Best Practices



Learning Guide

Introduction

Why consider best management practices for data management?

We collect data to provide information about how the resource has responded to management, and to inform future management decisions. This information needs to be correct and reliable!

There are many stages along the path from data collection to analysis and interpretation:

  • Field data collection
  • Data entry
  • Data analysis
  • Interpretation

Each of these stages has multiple steps.  In any multi-step process, there are many potential weak points along the path where non-sampling errors can occur. Recall that non-sampling errors affect accuracy, and are not detectable statistically. But they are avoidable, and it is essential that we do all that we can to eliminate them.

When you report the findings of a sampling effort, you put your name on that report, and people are relying on the information that you provide to them.  We build our professional reputation through the work that we produce, and we generally try hard to protect our reputations and credibility. We are all familiar with the saying “Garbage In, Garbage Out”, and we certainly don’t want people thinking of the work that we do as unreliable or suspect.

The following recommendations include ways that we can avoid making non-sampling errors that tend to occur as we collect and analyze data. Certainly it is not an exhaustive list, but a good start!


 

Field Data Collection

1. Ensure that all individuals collecting data are trained and calibrated.

a. Establish clearly defined ground rules for data collection

i. Review methods for taking measurements

1. Consider the possible ways that measurements can be taken

2. Select the protocol that best meets sampling objectives

ii. Review how to record data

1. Abbreviations

2. Number formats

3. How to indicate “repeated information”

It is helpful to have an established protocol for the data recorder to mark “repeated information” on data sheets efficiently. Leaving blank spaces on paper data sheets (Figure 1a) can be misleading and is easily misinterpreted during data entry. Establish a protocol for clearly marking cells that contain repeated values (Figure 1b).

Figure 1. Examples of data sheets in which repeated observations were marked. a) Cells left blank below the entry (red ovals) cannot be distinguished from missing data and are NOT recommended. b) Vertical lines through cells (blue ovals) indicate cell values that are the same as the previous entry.

b. Practice data collection and data recording

i. Training and calibrating ensures consistency

ii. Train observers and recorders together – helps to identify problems

iii. Enables you to identify and address problems EARLY

c. Document all data collection methods and ground rules.

Information about methods and ground rules should be documented in the monitoring plan or sampling protocol, but it’s equally important to document any decisions that were implemented in the field. This step is important because you can’t count on remembering details, and even if you do, you may not be the person who needs this information in the future!

2. Use well-designed data sheets.

Although data may be collected using electronic recording devices, paper data sheets are probably the most common way to record data in the field. Data sheets need to facilitate and streamline data recording. A well-designed data sheet should be organized to efficiently collect information specific to the attribute or attributes being measured, and always includes fields for “metadata” – the where, when, who, and what describing the information that was collected. Characteristics of data sheets are covered in Chapter 9 in Measuring and Monitoring Plant Populations by Elzinga et al. (1998).

Fortunately, data sheets have been developed and tested for most types of data collection, and they are readily available from different sources. Some agencies may require that you use specific data sheets. Otherwise, you can find templates of data sheets in Measuring and Monitoring Plant Populations, the Monitoring Manual: Volume 1, the Monitoring Manual: Volume 2, and Sampling Vegetation Attributes. Depending on your sampling project, you may need to adapt existing data sheets to meet your specific purpose. In that case, be sure to include the essential fields for metadata, ask colleagues to review the new format, and test it out to ensure that it serves your purpose efficiently.

If electronic tablets or laptops are used for recording data in the field, be sure that the spreadsheets are organized for efficient data-recording, retrieval, and analysis. Also, in cases where data have been recorded electronically in the past, ensure that the spreadsheet is organized so that the data can be readily compared with data collected in the past. Ensure that if changes are made in spreadsheet organization, these changes are done deliberately, with the intention to improve the process.

3. Always record metadata on the data sheets.

Metadata is the information included on data sheets that links the recorded data to the location, sampling unit, date, and personnel involved in data collection. Without this information, the data are merely unidentifiable numbers, letters, and words.

Recording metadata is essential: without the identifying information (i.e., metadata) the data sheet is unidentifiable and the effort to collect the data was wasted. Filling out metadata can be repetitive, and it may be tempting to take shortcuts by recording just one piece of identifying information, such as transect number. Avoid this temptation!  You can save time by pre-printing certain fields before photocopying, or filling in fields during “downtimes” such as while traveling between sampling locations (assuming you are not the driver!).

4. Take photos.

Photos document actual conditions and tie those conditions to the recorded data (Figure 2). This is an easy step, and provides very helpful information to individuals who were not present during data collection.

Figure 2. Example of a photopoint of a transect, with an ID board providing basic identifying information.

The photograph documents conditions at the sampling unit location, and information on the ID board documents essential identifying metadata, such as site ID, sampling unit ID, and date.

5. Check the quality of data regularly as it is being collected!

This is a form of quality control that is essential to avoid potential errors that occur if someone uses “creative” methods that are not aligned with the pre-determined methodology.

a. Identify and correct data collection errors.

This includes errors by both the observer and the recorder. For example, if one set of observers identifies and records plant species using common names while others use species codes based on scientific names, this creates a data entry nightmare that could easily be avoided. Similarly, if an observer consistently guides a pin flag to the ground instead of dropping it, that is a potential source of bias that needs to be addressed.

b. Ensure that recorders write legibly!

No, we don’t need to have the perfect handwriting that our first grade teachers used, but we do need to write to communicate clearly. The person recording data may not ultimately be responsible for entering the data, and if their handwriting is illegible or difficult to decipher, it can easily be misread (Figure 3).

Figure 3. Poor handwriting example from canopy gap data sheet. Values (in centimeters) represent the start and end points of gaps on a transect. a) The entries in row 4 appear to be 1286 and 1300, which would produce a negative measurement (-14 cm). b) The recording error is avoided because the numbers are written legibly.

In some cases, it may be necessary to insist that some numbers are written a certain way. In other cases, a recorder may need to make a notation on each sheet, such as:

“This is a FOUR: [handwritten 4] and this is a NINE: [handwritten 9]”.

6. Avoid worker fatigue, boredom, and physical stress

Taking care of workers is both ethical and good data management practice because it helps you avoid both data collection errors and lost time for medical reasons!

a. When working in pairs, trade jobs periodically. Observing often involves repetitive motions such as bending over to identify species, determine the ground cover reading, etc. Recording can be monotonous and boring. Pay attention to physical well-being and keep your mind engaged.

b. Make sure crew members are prepared to work in the “elements”.  Be prepared for sun, wind, rain, cold, insects, and other potential hazards. As always, practice “Safety First!”

c. Crews need to take breaks to rest, eat, and stay hydrated.



Data Entry

After all the work in the field, you want to be sure that the recorded information is accurately transferred into electronic format.

1. Structure of electronic data sheet

Set up the electronic data sheet so that it is easy to transfer data from the written sheet into the computer. Electronic data sheets are usually created in a spreadsheet program, such as Microsoft Excel, to facilitate data manipulation after the data have been entered.

a. Sometimes the electronic data sheet may exactly resemble the written sheet. This makes crosschecking the data easier, and it ensures that the metadata stay with the recorded information. The data can be re-arranged to facilitate analysis after it has been entered.

b. If you want to organize the electronic data sheet to include information from multiple field data sheets, be sure to organize the electronic sheet so that every observation can easily be crosschecked, or tied back to the original paper copy. Usually this involves adding columns for site ID, sampling unit ID, observation date, and observer ID. Since the electronic data sheet doesn’t resemble the paper copy, be sure that each column has a “header” that identifies the contents of the column.

c. Record the measurement units and metadata IN the electronic data file. For example, are the measurement units in centimeters, meters, inches, or feet? Be sure to document:

i. measurement units

ii. the transect and quadrat dimensions

iii. spacing between measurements

iv. species codes

v. if nested quadrats were used, be sure to indicate the dimensions of the nested quadrats and which species were measured in which quadrat.

Some of this information, such as sampling unit dimensions and measurement units, can be included at the top of the electronic data sheet. When there is a substantial amount of information that needs to be documented, it helps to create one or more separate worksheets that have informative names such as “Notes” or “Species Codes”. It is essential that these documentation worksheets are included IN the same electronic file as the data.
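The layout described above, where observations from several field data sheets share one electronic sheet, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names (site_id, transect_id, obs_date, observer_id, point_m, species_code) and the helper functions are hypothetical, and your own protocol may call for different fields.

```python
import csv
import io

# Hypothetical column names; adapt them to your own sampling protocol.
# The unit is written into the header ("point_m" = position in meters).
HEADER = ["site_id", "transect_id", "obs_date", "observer_id",
          "point_m", "species_code"]


def sheet_to_rows(metadata, observations):
    """Attach one paper sheet's metadata to every observation on that sheet."""
    rows = []
    for point_m, species_code in observations:
        rows.append({
            "site_id": metadata["site_id"],
            "transect_id": metadata["transect_id"],
            "obs_date": metadata["obs_date"],
            "observer_id": metadata["observer_id"],
            "point_m": point_m,
            "species_code": species_code,
        })
    return rows


def rows_to_csv_text(rows):
    """Write the rows as CSV text with a header line identifying each column."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=HEADER, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up example values for a single field data sheet.
sheet_metadata = {"site_id": "A1", "transect_id": "T3",
                  "obs_date": "2024-06-01", "observer_id": "JD"}
sheet_observations = [(0.5, "BOGR2"), (1.0, "BARE")]
example_rows = sheet_to_rows(sheet_metadata, sheet_observations)
print(rows_to_csv_text(example_rows))
```

Because every row carries its own site, sampling unit, date, and observer, any single value can be traced back to the original paper copy even after the data are sorted or rearranged.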

2. How to handle missing data

Sometimes you will find a blank cell or line on the field data sheet. Unfortunately, missing data happen. Be sure you know what to do when data are missing. DO NOT insert a zero in place of missing data!

A Zen master of data management once famously said:

“Zeroes mean something – they mean nothing!”

When a piece of data is missing, you have no information about the actual measurement value. By inserting a zero for missing data, you are artificially selecting a value for that measurement.

How you handle missing data depends on the statistical software that you’ll be using to analyze the data. If the data will be analyzed in MS Excel, leave the cell that corresponds to the missing data blank. If you are using more sophisticated software such as SAS or SPSS, you may want to enter a period “.” in place of the missing data, or you may need to define a special value, such as “-99” and use that only in place of missing data.  The point here is to know what to do when you encounter missing data during data entry, and ensure that everyone who enters data follows the same protocol.
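To see why a zero is not a neutral placeholder, the short Python sketch below compares the mean of a small, made-up set of values when the missing entry is skipped and when it is replaced with zero. The set of missing-data codes (a blank, a period, and “-99”) is an assumption based on the conventions mentioned above; use whatever codes your own protocol defines.

```python
from statistics import mean

# Assumed missing-data codes; use the ones defined in your own protocol.
MISSING_CODES = {"", ".", "-99"}


def parse_value(cell):
    """Return the cell as a float, or None if it holds a missing-data code."""
    text = str(cell).strip()
    if text in MISSING_CODES:
        return None
    return float(text)


def mean_ignoring_missing(cells):
    """Average only the values that were actually measured."""
    values = [v for v in (parse_value(c) for c in cells) if v is not None]
    if not values:
        return None  # nothing was measured, so there is no mean to report
    return mean(values)


# Made-up example: four quadrats, one of them with no recorded value.
cells = ["40", "50", "", "60"]

print(mean_ignoring_missing(cells))  # 50.0

# Replacing the missing value with zero invents a measurement
# and pulls the mean down.
zero_filled = [parse_value(c) or 0.0 for c in cells]
print(mean(zero_filled))  # 37.5
```

The zero-filled mean is lower only because a value was invented for the quadrat that was never measured.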

3. Verify the data for correctness

Once the data have been entered electronically, you need to compare the data from the field data sheets to the electronic data to verify the accuracy of the electronic data.  Although this is time consuming, it is a very important step. Otherwise, how will you know that the data have been entered without mistakes? There are several ways to detect data entry errors:

a. Crosscheck data in the electronic version to the field data sheets.

i. Have one person read the electronic data out loud while another person reads the data on the corresponding field data sheet. This is the preferred approach, but it involves more time because two people are needed.

ii. If you don’t have the personnel available for the “team approach”, one person should visually compare the electronic data to the field data sheets, looking for mismatches between the two versions.

b. Next, sort or filter the data to check for numerical abnormalities in the electronic data.

A Word of Caution! Before you sort or filter your data, be sure to save the file! This protects your data (and the hard work you did to enter it) in case you accidentally rearrange or move a subset of the data while you are sorting.

Look for the following values:

i. Outliers: unusually large or small values compared to the other values in the data set. Flag or highlight the outliers and verify them in the original data sheets. For example, if most of the values range between 30 and 80, odd values such as 3 or 425 need to be verified for accuracy.

ii. Unusual formats, relative to the other entered values. This includes negative values, decimal points when the recorded values weren’t collected with decimal points, too many digits after the decimal point, two decimal points, alphabetical or alphanumeric data where numerical data are expected and vice versa, etc. Look for anything that could be related to a typo, and refer to the original data sheets to verify the correct value.

The take-home message: Take time to ensure that the electronic data accurately reflect the information that was recorded during field data collection.
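The sorting and filtering checks described above can also be scripted. The following Python sketch is one possible approach, not a standard tool: the function name and the reason labels are hypothetical, and the expected range of 30 to 80 is taken from the outlier example. It only flags entries for review; the correct value still has to come from the original data sheet.

```python
def flag_suspect_entries(cells, low, high, decimals_allowed=0):
    """Return (row_index, cell, reason) for entries that should be verified."""
    flagged = []
    for index, cell in enumerate(cells):
        text = str(cell).strip()
        if text == "":
            continue  # blank cells are missing data and are handled separately
        try:
            value = float(text)
        except ValueError:
            # Letters mixed with digits, two decimal points, and so on.
            flagged.append((index, cell, "not a number"))
            continue
        if value < 0:
            flagged.append((index, cell, "negative value"))
        elif "." in text and len(text.split(".")[1]) > decimals_allowed:
            flagged.append((index, cell, "unexpected decimals"))
        elif not low <= value <= high:
            flagged.append((index, cell, "outside expected range"))
    return flagged


# Made-up entries; most values are expected to fall between 30 and 80.
entries = ["45", "3", "425", "4.5.2", "-12", "6O", "52.75"]
for row_index, cell, reason in flag_suspect_entries(entries, low=30, high=80):
    print(row_index, cell, reason)
```

Here “6O” contains the letter O rather than a zero, which is the kind of typo that is easy to miss when reading a column by eye.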

4. Protect the data!

After all the work of collecting data in the field and entering and checking the electronic data, be sure that you don’t “lose” it!   Photocopy or scan the hard copies of field data sheets (scanning saves paper), organize the original hard copies, and store them in a safe location. Similarly, be sure that electronic files are given appropriate and informative filenames to facilitate rapid retrieval, and make backup copies of electronic files that are stored on separate computers or servers to ensure that the files can be retrieved despite computer failures and crashes.
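Making a dated backup copy of an electronic file can be automated. The sketch below is a minimal illustration using only the Python standard library; the function names and the filename pattern (original name plus the date) are assumptions, and the backup folder should point to a separate computer, server, or drive in real use.

```python
import hashlib
import shutil
from datetime import date
from pathlib import Path


def sha256_of(path):
    """Return a checksum of the file contents, used to confirm the copy."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def back_up(source, backup_dir, today=None):
    """Copy a data file into backup_dir with the date added to its name."""
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical naming pattern: <original name>_<YYYY-MM-DD><extension>
    stamp = (today or date.today()).isoformat()
    target = backup_dir / f"{source.stem}_{stamp}{source.suffix}"

    shutil.copy2(source, target)  # copy2 also preserves file timestamps

    # Confirm that the backup is identical to the original file.
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"Backup of {source} does not match the original")
    return target
```

For example, back_up("site_A1_cover.csv", "backups") would leave the original file in place and create a dated copy inside a folder named “backups”; both names here are placeholders.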



Data Analysis

Data analysis includes data conversion and manipulation, calculation of descriptive statistics and inferential statistical analyses.

A note of caution! You’ve gone to a lot of trouble to ensure that the electronic data accurately reflect the data that were collected in the field. The data are “raw”, or un-manipulated, at this point. You will probably need to manipulate the data somehow to prepare them for analysis, and this often involves multiple steps. Rather than working directly on the “raw” data, you may want to copy the worksheet and rename it “working” data before you start to do your data conversions. As you manipulate your data, paste the converted data into new cells rather than pasting into the cells where the data are currently located. This protects your work in case you make a mistake, manipulate only a subset of the data, or a cat runs across your keyboard while your back is turned!

1. Data conversion

Some data need to be converted using mathematical formulas.

For example, line-point measurements are collected as COUNTS along a line, and need to be converted to units of PERCENT. Biomass data collected in units of grams/m² may need to be converted to units of kg/ha. Whenever you use a mathematical formula to manipulate data, verify that the formula is correct. Don’t just trust to luck or “wing it”, because it is easy to make calculation errors.

Follow this approach to avoid errors:

a. Write the equation by hand, including all of the units in your equation.

b. Cancel units by hand to ensure that the end result is in the desired units (Figure 4).

Figure 4. Example showing how to write a conversion formula by hand to ensure that units cancel properly before using the equation in an electronic spreadsheet.

c. Test your equation with a known quantity.

d. Once you are certain that the equation is correct, enter it as a formula in the spreadsheet.
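The same approach carries over if the conversion is scripted instead of entered as a spreadsheet formula. The Python sketch below writes the unit cancellation as a comment and then tests each function with a known quantity before it is used on real data. The function names are hypothetical, and the two conversions are the ones mentioned above (grams/m² to kg/ha, and line-point counts to percent).

```python
G_PER_KG = 1000.0      # 1 kg = 1000 g
M2_PER_HA = 10_000.0   # 1 ha = 10,000 m2


def g_per_m2_to_kg_per_ha(biomass_g_per_m2):
    """Convert biomass from grams per square meter to kilograms per hectare."""
    # (g / m2) * (10,000 m2 / 1 ha) * (1 kg / 1000 g) = kg / ha
    # The g and m2 units cancel, leaving kg in the numerator and ha below.
    return biomass_g_per_m2 * M2_PER_HA / G_PER_KG


def hits_to_percent_cover(hits, total_points):
    """Convert a line-point count into percent cover."""
    if total_points <= 0:
        raise ValueError("total_points must be greater than zero")
    if not 0 <= hits <= total_points:
        raise ValueError("hits must be between 0 and total_points")
    return 100.0 * hits / total_points


# Test each equation with a known quantity before using it on real data.
assert g_per_m2_to_kg_per_ha(1.0) == 10.0      # 1 g/m2 equals 10 kg/ha
assert hits_to_percent_cover(32, 50) == 64.0   # 32 hits out of 50 points
```

If either check fails, the script stops immediately, which is the scripted equivalent of catching a mistake in the handwritten equation before it reaches the spreadsheet.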

2. Data Manipulation

Depending on how you plan to analyze the data, you may need to rearrange the data to get it ready for statistical analysis. Again, take care while copying and moving data to ensure that you don’t accidentally omit or “double-copy” values from the data field.

3. Data Summarization

Calculate descriptive statistics and/or inferential statistics.

4. Store and Protect the Analyzed Data

Data analysis takes considerable time and effort, and this effort can be lost if the electronic spreadsheets are not well-managed.   Follow the guidelines for electronic file management provided above: be sure that electronic files are given appropriate and informative filenames to facilitate rapid retrieval, and make backup copies of electronic files that are stored on separate computers or servers to ensure that the files can be retrieved despite computer failures and crashes.



Interpretation

You have gone to a lot of effort to get to the point where you can interpret your results! Before you get too far with interpreting the results, be sure to do a reality check.

1. Carefully review the results/output for red flags.

Crosscheck the output to the input. Does the count or number of observations in the output match the count or number of observations in the input?

2. Does the output make sense?

This is your time for a reality check.

You should have a ballpark idea, based on your experience at the site (if you collected the data) or based on photos from the site, of what a reasonable result would be. This last reality check is important. If the output suggests a condition that does not seem reasonable based on your ballpark estimate, then check the data to find out why. Does the density estimate seem unrealistically high? Then figure out which sites or sampling units are responsible for the high estimate, and make sure that this reflects REALITY, not a data entry error.

For example, the students in Figure 5 are standing in a desert grassland site that was severely impacted by drought.

Figure 5. This photo was taken during a year when precipitation was below average, resulting in high spatial heterogeneity and low values of ground cover.

The vegetation is quite patchy, but there is obviously a high percentage of bare ground. Given the patchiness of the vegetation, you could develop a ballpark estimate of 65%-90% bare ground. If the data analysis indicates that there is 25% bare ground, this would raise a red flag because it is a substantial departure from your ballpark estimate.  Before interpreting this result, you should check the input, look for sampling units with low bare ground percentages, and check the electronic data and original data sheets to make sure that the results reflect REALITY, not a data entry error.
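Both reality checks described in this section can be expressed as a short Python sketch. The function names and the per-transect values are hypothetical; only the 65% to 90% ballpark range comes from the example above. The code points to the sampling units that deserve a second look but cannot decide whether a value is wrong; that still requires checking the electronic data against the original data sheets.

```python
def counts_match(input_rows, output_n):
    """Check that the analysis used as many observations as were entered."""
    return len(input_rows) == output_n


def units_outside_ballpark(percent_by_unit, low, high):
    """Return the sampling units whose values fall outside the ballpark range."""
    return {unit: value
            for unit, value in percent_by_unit.items()
            if not low <= value <= high}


# Made-up percent bare ground values for three transects.
bare_ground = {"T1": 72.0, "T2": 8.0, "T3": 81.0}

# The input has three sampling units, so the output should report n = 3.
print(counts_match(list(bare_ground), output_n=3))  # True

# Ballpark estimate from the site photo: 65% to 90% bare ground.
print(units_outside_ballpark(bare_ground, low=65.0, high=90.0))  # {'T2': 8.0}
```

In this made-up case, transect T2 is the unit to trace back to the paper data sheet before interpreting the overall result.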


Self-Check

This activity is designed to test your knowledge and understanding of Best Data Management Practices. This self-check is provided for you to evaluate your own learning. Answers are not recorded.


1. Which quadrat in the image below contains a suspicious data entry?

2. Which quadrat in the image below contains a data entry error?

3. There are multiple ways to check entered data for errors. Which of the following methods is not preferred for checking data that has been entered into a data file?

4. There are activities researchers can do before, during and after they go to the field to improve the quality and correctness of data collected. Which of the activities below could be done before going to the field in order to improve data quality? Mark all that apply.

5. Metadata is an essential part of data collection. Which of the following actions could happen if we are unable to identify where data were collected from, when they were collected, what data set it is, and to what project the data are related?

6. It is important to establish good protocols (i.e., ground rules) to ensure that… (mark all that apply)