Forest plots in Excel

A forest plot is an efficient figure for presenting several effect sizes and their confidence intervals (and when used in the context of a meta-analysis, the overall effect size) (.pdf). They can be created in a variety of tools, including R and meta-analytic software. Here I will describe how to create these plots using Excel.* Note: It is very possible (if not likely) that this entire task is easier with R or with other meta-analytic software. The purpose here is to show how this can be done with a tool that many of us are familiar with (if fact, I will assume that you have working knowledge about Excel).

Some people offer special Excel meta-analysis plugins to create forest plots, but these are not necessary. Instead, I will build on a short peer-reviewed paper that describes how to make forest plots in Excel (see their template). In short, I borrow their method and expand it to show how it can be very customizable. I used Excel 2007 and I know that Excel 2010 is very similar. I don’t know how either newer or older versions of Excel will work. [Update: These instructions seem to work well enough with Excel 2013 too.]

The key insight for making forest plots in Excel is that scatter plots do not need to be used to make scatter plots. Instead, they can be treated as a blank canvas where you place things using X-Y coordinates. I will make a forest plot of the association between religious fundamentalism and the feeling thermometers for 20+ groups from the 2012 ANES (higher numbers = more fundamentalism and more positive feelings). 

You can find the Excel file to use as a template here.

1. Setup your Excel spreadsheet like the figure below. The first column is the list of the target groups I looked at in the ANES. The second column is the order that I want them in for the forest plot. They are ordered so that in the forest plot they range from most positive at the top to the most negative at the bottom. The next two columns are the low and high values of the 95% confidence intervals of the correlation.** And the last two columns are the CIs that are used when making the figure (distance between the CI and the r).

setupForestPlotExcel

2. Create a scatter plot using the second column as your Y-axis values and the third column (with the correlations) as your X-axis values. Make several adjustments: (a) Adjust the y-axis to range from 0 to 24 (in the case of the current data) and remove all tick marks and labels. (b) Get rid of the gridlines. (c) Adjust the x-axis to a range that makes sense (for this example, -.80 to +.80 works well) with a major unit that makes sense (for this example, .2). (d) Make the data points look good. I prefer black squares, adjust the size as it makes sense of your application. This is the start.

firstlookForestPlotExcel

3. Now add in custom horizontal confidence intervals using the last two columns. (more info on adding custom error bars in Excel) This, more or less, gets you to the place where previous examples will get you, typically with an added suggestion to use text boxes to add labels. Don’t use text boxes. There is a better way…

4. Now you need to add a new series to the scatterplot that you will use to place the labels (in this case, the names of the groups). I added a new column with the entry -.58 for all of the groups (this number can be modified based on aesthetics). Then create the new series with this new column as the x-values and the same y-values as before. This will make a vertical line of markers on the left side of the figure and these new data points will be used to anchor your labels. Now, make those invisible by removing their fill and their border.

5. Right click on the now invisible markers and click “add data labels”. Right click again and click “format data labels”. Select “left” for label position.

6. Now select the “1” label, put an “=” in the formula box, and then click on the label for the 1 position (Atheists). Do this for each of the labels and soon you’ll have labels for each of the groups. (here is another explanation for how to add custom labels with clearer step-by-step instructions for this step)

customlabelsForestPlotExcel

7. Now, you can finish things up. Most concretely: Add a label at the bottom.

finishedForestPlotExcel

But you can also go further if you’d like. You can add vertical lines to indicate heuristic cutoffs for small, medium, and large effects. You can text information about the size of the effects. You can change the color or shape of the labels to highlight different comparisons (e.g., effects that are p < .05 different from zero). See below for an example with some of these additions.

fancyForestPlotExcel

You can find the Excel file for both of these forest plots here.

*Note that Cumming’s Excel files for his book can also help with forest plots, but they are not always as customizable and flexible as I’d like.
**The sample size here is >5000, so these CIs are preeeeetttttty small. 

21 thoughts on “Forest plots in Excel

    1. You just need to add another data series to the figure, like the series already used for the vertical lines (see the files that you can download in the post). You can adjust the location of the lines by adjusting the x-axis values.

    1. You can’t do it in a smooth and easy way, but you can do it.

      If you know what the size differences need to be, you first select an individual data point and then you change the size of that individual data point. You do this for each data point. It isn’t ideal because you’ll need to calculate the sizing of each point yourself, but it should do what you want.

      The nice thing about this variation on forest plots is that you can edit it the same ways you can edit any scatter plot in Excel.

      1. Thanks MB, the information in te post really helped!

        For reference, re: effect sizes, what I did was use the standard error to rank all the studies into deciles. The deciles could then be used as a marker for data point size (e.g. 1st decile = point size 1, 9th decile = point size 9, etc).

      1. Hi Dani — I’m sorry, but I don’t think I quite understand your question? Can you try asking it in a different way?

      2. I do not understand which numbers in the Table you used for the calculation of D12 and E12. Thanks for replies!

      3. Alternatively, you could also use Bias-corrected accelerated CI (calculated in SPSS via the bootstrap function)

    1. You can just replace the correlation coefficients with the OR. And then the upper and lower bounds of the OR in the 95% CI columns.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s