Data Wrangling With eBird – Part 2

In my previous post I made a tutorial how to get data out of eBird and into a spreadsheet.  With the intent of using eBird to plan a trip to the Verde Valley Birding Festival in Cottonwood, AZ during the last week of April.  This tutorial will give the first steps in how to find out what species to expect on the upcoming trip.  Let’s get started.

Look at #1.  The number shown is 375.  But if you remember when we were downloading this information it only had 342 species, right?  Where did the extra species come from?  Ah, I am glad you asked.  Look at #2.  See the “(Domestic type)”?  Yeah.  That counts as “taxa”, but not as a species.  First on our list will be cleaning up our spreadsheet and filtering out the non species.

01 spreadsheet

Let’s add some column titles in the blank row #15 like this:

02 spreadsheet add column titles

After you have entered all the column titles in row #15, highlight all the data from row #15 down to the bottom of the spreadsheet.

03 spreadsheet highlight all cells from new col titles to end of spreadsheet

Select Data > Filter > Auto Filter

04 spreadsheet set up auto filter

When AutoFilter is turned “on” you will see little arrows at the right side of each cell in row #15.  Click on the dropdown arrow at the right side of the cell, A15.  Select “Standard Filter” and click “Ok”.

05 spreadsheet filter non species

The “Standard Filter” dialog box will appear.  We are going to filter out all the non species from our list.  Select the field name “Species”.  Choose “Contains” in the Conditions box.  In the Value box type in the characters “(Domestic type)”.  Click “Ok” to complete the dialog box.

06 spreadsheet filter non species 2

Three rows contain the string “(Domestic type)”.  Highlight the species column of the 3 rows.

07 spreadsheet highlight non species rows

Select Edit > Delete Cells… from the main menu of the spreadsheet.

08 spreadsheet delete cells

The “Delete Cells” dialog box appears.  Choose “Delete entire row(s)”.  Click on the “Ok” button to continue.

09 spreadsheet delete entire rows

Now you must carefully do the same thing to repeat deleting the non species rows.  Create a Special Filter for each of the following conditions to remove the rest of the non species:

  1. “/”
  2. “(hybrid)”
  3. “sp.”

Once you have deleted the rows containing “/” and the rows containing “(hybrid)” and the rows containing “sp.” then you are now read to do some fun stuff.  Be sure to turn off the AutoFilter before going to the next step (otherwise it will look like you deleted all the rows).

Wanna calculate the total number of species recorded for each week of the year in Yavapai County, Arizona?  Cool.  Here we go.

Go to the very bottom left of your spreadsheet and type in a description into cell A359 like “species count”.  In cell B359 type the following formula:  =countif(b16:b357,”>0″)  This formula makes use of the COUNTIF function.  We want to count all the rows in Column B where the value is greater than “0” (zero).  This will represent the total number of species ever recorded for that particular week of the year at the specified location.

10 spreadsheet add total species count

Satisfying, isn’t it?  This is our first cool calculation.  Go ahead and copy that formula to the rest of the cells so you have species counts for every week of the year.  Now go back and look at your totals.  Can you find the highest number?  It is 246 and it is column Q.  If you peek at the top of the spreadsheet you will find out that this is the 4th week of April.  Hey!  That is the week I am planning to go to the Valley Verde Birding Festival!

11 spreadsheet find highest species total

But before we get too optimistic we should think about what this number represents.  This is the species total for that week of the year for all checklists submitted to eBird since 1900.  I will be there for only one week.  There is no way that I should anticipate seeing all 246 species of birds recorded over the last 100+ years.  So how many should anyone expect to see?  That is the question of the hour.  Fortunately for you I am going to give you a good starting place to measure expectations–2 percent.  It’s a good round number.  A well planned trip by a birder with excellent identification skills (including birding by ear) may approach the 1% threshold.  A non-resident birder on a first time visit with no inside knowledge and less experience may only expect 3% threshold or higher.  But in my experience, 2% is a reasonable place to start.  You can tweak it once you have planned trips this way and given the list a real world test.

Go back to the bottom left of your spreadsheet.  Type in the formula =countif(b16:b357,”>=0.02)  Again we are using the powerful COUNTIF function for data in column B.  This formula will count all the rows that have a Frequency of Checklists value greater than or equal to 2% (0.02).  eBird uses Frequency of Checklists for their abundance charts.  It represents the percentage of checklists for which a species is recorded.

12 spreadsheet add expected species count

Now copy that formula for the remainder of the columns.  Go back and find the highest number of expected species.  It is actually column P, which is April week 3.  So even though more species have been recorded during April week 4, it is the 3rd week of April that has a greater number of expected species.

13 spreadsheet find highest expected species total

Currently, the only way to see which species this filter identifies is by running through the data manually and finding any values over 0.02.  Not very easy or pretty.  Let’s fix that.

Let’s create a new style that will highlight our target expected species in yellow.  Click on Format > Styles and Formatting on the spreadsheet main menu.

14 spreadsheet select styles and formatting

The Styles and Formatting dialog box will appear.  Click on the arrow at the bottom and select “Custom Styles”.  Up in the middle of the blank area of the main box right click and select “New…”.

15 spreadsheet create a new custom style

Next a “Cell Style” window will pop up.  In the “Organizer” tab enter the “Name” of your new style.  I have chosen to call this “yellow highlight”.

16 spreadsheet name new style

Do not click “Ok” yet.  We have to make the background color yellow.  Find the “Background” tab and click on it.  Select the “yellow” color and click on “Ok” to continue.

17 spreadsheet select yellow background for style

Close the styles and formatting pop up.  Now highlight the species data for April Week 4 in Column Q, rows 16 through 357.

18 spreadsheet highlight Apr_wk4

Once the data is highlighted, select Format > Conditional Formatting > Condition… from the spreadsheet main menu.

19 spreadsheet conditional formatting

Now it is time to set up our condition for the yellow highlight.  Select “greater than or equal to” and “0.02”.  For “Apply Style” select “yellow highlight”.  Click “Ok” to continue.

20 spreadsheet conditional formatting 2

Go to cell “B16” and single click there so that you see the border around it (it has focus). Then go to the spreadsheet main menu and select Window > Freeze.  This will freeze our custom column titles (with month_weeknum) on the top and Species names to the left when we scroll through the spreadsheet.

21 spreadsheet freeze panes

Now you should be able to scroll to the right and down without losing the references for species name or month and week of the year.  And all of your expected species should now be highlighted in yellow.

22 spreadsheet final view with highlighted data

You did an awesome job following along!  Give yourself a pat on the back.  Now you can do this for any area you want to travel to and plan your own trip using eBird to help you out.

In the next tutorial we will spend more time with the expected species list.  Wanna know what your top 25 most common species are?  Tune in next time for another fine adventure in eBird Geekdom.

Leave a Reply

Your email address will not be published. Required fields are marked *