Guide to good graphs with Excel
Drawing graphs is an important part of presenting the results of your research. Here I describe the features of clear, effective graphs, and I outline techniques for generating good graphs using Excel (there's a similar page on generating good graphs with Calc, part of the free OpenOffice.org suite of programs). There are other, specialized graphing programs you can buy; you can even draw graphs with SAS, if you're a crazy person. But with a little tweaking, you can generate excellent, publication-quality graphs with the spreadsheet program you probably already have.
I don't like some of the default conditions in Excel, but you can get it to produce graphs that are good enough for publication or presentations. If you don't have Excel, you may want to consider OpenOffice.org, a free, open-source suite of programs that includes Calc, which does most of what Excel does. It takes even more tweaking to draw good graphs with Calc, but it is possible.
General tips for all graphs
- Don't clutter up your graph with unnecessary junk. Grid lines, background patterns, 3-D effects, unnecessary legends, excessive tick marks, etc. all distract from the message of your graph.
- Do include all necessary information. Both axes of your graph should be clearly labelled, including measurement units if appropriate. Symbols and patterns should be identified in a legend on the graph, or in the caption. If the graph has "error bars," the caption should explain whether they're 95 percent confidence interval, standard error, standard deviation, comparison interval, or something else.
- Don't use color in graphs for publication. If your paper is a success, many people will be reading photocopies or will print it on a black-and-white printer. If the caption of a graph says "Red bars are mean HDL levels for patients taking 2000 mg niacin/day, while blue bars are patients taking the placebo," some of your readers will just see gray bars and will be confused and angry. For bars, use solid black, empty, gray, cross-hatching, vertical stripes, horizontal stripes, etc. Don't use different shades of gray, they may be hard to distinguish in photocopies. There are enough different symbols that you shouldn't need to use colors.
- Do use color in graphs for presentations. It's pretty, and it makes it easier to distinguish different categories of bars or symbols. But don't use red type on a blue background (or vice-versa), as the eye has a hard time focusing on both colors at once and it creates a distracting 3-D effect. And don't use both red and green bars or symbols on the same graph; from 5 to 10 percent of the males in your audience (and less than 1 percent of the females) have red-green colorblindness and can't distinguish red from green.
Exporting graphs to other formats
Once you've drawn a graph, you'll probably want to export it to some other program. For a print document or a presentation, you should be able to copy the graph and paste it directly into the file. For other uses, copy the graph into a graphics program (such as Adobe Illustrator, Adobe Photoshop, GIMP, or OpenOffice.org Draw) and save the file in .gif format (for use on the web) or .eps or .tiff formats (which some journals require for submitted manuscripts).
Choosing the right kind of graph
There are many kinds of graphs--bubble graphs, pie graphs, doughnut graphs, radar graphs--and each may be the best for some kinds of data. By far the most common graphs in scientific publications are scatter graphs and bar graphs.
A scatter graph (also known as an X-Y graph) is used for graphing data sets consisting of pairs of numbers. These could be measurement variables, or they could be nominal variables summarized as percentages. The independent variable is plotted on the X-axis (the horizontal axis), and the dependent variable is plotted on the Y-axis.
The independent variable is the one that you manipulate, and the dependent variable is the one that you observe. For example, you might manipulate salt content in the diet and observe the effect this has on blood pressure. Sometimes you don't really manipulate either variable, you observe them both. In that case, if you are testing the hypothesis that changes in one variable cause changes in the other, put the variable that you think causes the changes on the X-axis. For example, you might plot "height, in cm" on the X-axis and "number of head-bumps per week" on the Y-axis if you are investigating whether being tall causes people to bump their heads more often. Finally, there are times when there is no cause-and-effect relationship, in which case you can plot either variable on the X-axis; an example would be a graph showing the correlation between arm length and leg length.
There are a few situations where it is common to put the independent variable on the Y-axis. For example, in oceanography it is traditional to put "distance below the surface of the ocean" on the Y-axis, with the top of the ocean at the top of the graph, and the dependent variable (such as chlorophyll concentration, salinity, fish abundance, etc.) on the X-axis. Don't do this unless you're really sure that it's a strong tradition in your field.
A bar graph is used for plotting means or percentages for different values of a nominal variable, such as mean blood pressure for people on four different diets. Usually, the mean or percentage is on the Y-axis, and the different values of the nominal variable are on the X-axis, yielding vertical bars.
Sometimes it is not clear whether the variable on the X-axis is a measurement or nominal variable, and thus whether the graph should be a scatter graph or a bar graph. This is most common with measurements taken at different times. In this case, I think a good rule is that if you could have had additional data points in between the values on your X-axis, then you should use a scatter graph; if you couldn't have additional data points, a bar graph is appropriate. For example, if you sample the pollen content of the air on January 15, February 15, March 15, etc., you should use a scatter graph, with "day of the year" on the X-axis. Each point represents the pollen content on a single day, and you could have sampled on other days. When you look at the points for January 15 and February 15, you connect them with a line (even if there isn't a line on the graph, you mentally connect them), and that implies that on days in between January 15 and February 15, the pollen content was intermediate between the values on those days. However, if you sampled the pollen every day of the year and then calculated the mean pollen content for each month, you should plot a bar graph, with a separate bar for each month. This is because the mental connect-the-dots of a scatter graph of these data would imply that the months in between January and February would have intermediate pollen levels, and of course there are no months between January and February.
Drawing scatter graphs with Excel
- Put your independent variable in one column, with the dependent variable in the column to its right. You can have more than one dependent variable, each in its own column; each will be plotted with a different symbol.
- If you are plotting 95 percent confidence intervals, standard errors, or some other kind of error bar, put the values in the next column. These should be confidence intervals, not confidence limits; thus if your first data point has an X-value of 7 and a Y-value of 4 ±1.5, you'd have 7 in the first column, 4 in the second column, and 1.5 in the third column. For confidence limits that are asymmetrical, such as the confidence limits on a binomial percentage, you'll need two columns, one for the difference between the percentage and the lower confidence limit, and one for the difference between the percentage and the upper confidence limit.

An Excel spreadsheet set up for a scatter graph including confidence intervals. - Select the cells that have the data in them. Don't select the cells that contain the confidence intervals.
- From the "Insert" menu, choose "Chart" (or click on the little picture of a graph in the task bar). Choose "XY (Scatter)" as your chart type. Do not choose "Line"; the little picture may look like a scatter graph, but it isn't.
- The next screen shows the "Data range," the cells that contain your data; you shouldn't need to change anything here.
- On the "Titles" tab of the "Chart Options" screen, enter titles for the X axis and Y axis, including the units. A chart title is essential for a graph used in a presentation, but optional in a graph used for a publication (since it will have a detailed caption).
- On the "Gridlines" tab of the "Chart Options" screen, get rid of the gridlines; they're ugly and unnecessary.
- On the "Legend" tab of the "Chart Options" screen, get rid of the legend if you only have one set of Y values. If you have more than one set of Y values, get rid of the legend if you're going to explain the different symbols in the figure caption; leave the legend on if you think that's the most effective way to explain the symbols.
- Click the "Finish" button, but you're far from done. Click on the white area outside the graph to select the whole image, then drag the sides or corners to make the graph the size you want.
- Click in the gray area inside the graph, choose "Selected Plot Area" from the "Format" menu, and then choose "None" under "Area." This will get rid of the ugly gray background. Under "Border," make the color of the border black instead of gray.
- Click on the Y-axis, choose "Selected Axis" from the "Format" menu, and make modifications to the tick marks, font and number format. Most publications recommend sans-serif fonts (such as Arial, Geneva, or Helvetica) for figures. On the "Font" tab, unselect "Auto scale," otherwise the font size will change when you change the graph size. On the "Scale" tab, set the minimum and maximum values of Y. The maximum should be a nice round number, somewhat larger than the highest point on the graph. If you're plotting a binomial percentage, don't make the Y-scale greater than 100 percent. If you're going to be adding error bars, the maximum Y should be high enough to include them. The minimum value on the Y scale should usually be zero, unless your observed values vary over a fairly narrow range. A good rule of thumb (that I just made up, so don't take it too seriously) is that if your maximum observed Y is more than twice as large as your minimum observed Y, your Y scale should go down to zero. If you're plotting multiple graphs of similar data, they should all have the same scales for easier comparison.
- Format your X-axis the same way you formatted your Y-axis.
- Click on the Y-axis title, choose "Selected Axis Title" from the "Format" menu, and adjust the font. Unselect "Auto scale" so it won't change the font size if you adjust the size of the graph. Do the same for the X-axis title.
- Pick one of the symbols, click on it, and choose "Selected Data Series" from the "Format" menu. On the "Patterns" tab, choose the symbol you want (choose "No Color" for the background to get a better view of the symbol choices). If you want to connect the dots with a line, choose that.
- If you want to add error bars, go to the "Y Error Bars" tab and choose "Both" under "Display." Ignore the buttons for "standard deviation" and "standard error," even if they sound like what you want. Instead, choose "Custom." Click on the icon next to the box labelled "+" and then select the cells in your spreadsheet containing the upper confidence interval. Do the same for the box labelled "-" and the lower confidence interval (you'll be selecting the same cells, unless you have asymmetrical confidence intervals).


Adding error bars to a graph. Repeat steps 3, 4 and 5 for the box labelled "-". - Repeat the above for each set of symbols.
- If you've added error bars, click on one of them and choose "Selected Error Bars" from the "Format" menu. On the "Patterns" tab, you can adjust the look of the error bars.
- If you want to add a regression line, click on one of the symbols and choose "Add Trendline" from the "Chart" menu. Choose which kind you want (choose "Linear" unless you really know what you're doing).
- Click in the graph area, outside the graph, to select the whole box that includes the graph and the labels. Choose "Selected Chart Area" from the "Format" menu. On the "Patterns" tab, you'll probably want to make the border be "None." On the "Properties" tab, choose "Don't move or size with cells," so the graph won't change size if you adjust the column widths of the spreadsheet.
- You should now have a beautiful, beautiful graph. You can click once on the graph area (in the blank area outside the actual graph), copy it, and paste it into a word processing document, graphics program or presentation.

The number of bird species observed in the Christmas Bird Count vs. latitude at seven locations in Delaware. Data points are the mean number of species for the counts in 2001 through 2006, with 95 percent confidence intervals.
Back-transformed axis labels
If you have transformed your data, don't plot the untransformed data; instead, plot the transformed data. For example, if your Y-variable ranges from 1 to 1000 and you've log-transformed it, you would plot the logs on the Y-axis, which would range from 0 to 3 (if you're using base-10 logs). If you square-root transformed those data, you'd plot the square roots, which would range from 1 to about 32. However, you should put the back-transformed numbers (1 to 1000, in this case) on the axes, to keep your readers from having to do squaring or exponentiation in their heads.
I've put together three spreadsheets with graphs that you can use as templates: a spreadsheet graph with log-transformed or square-root transformed X values, a spreadsheet graph with log-transformed or square-root transformed Y values, or a spreadsheet graph with log-transformed or square-root transformed X and Y values. While they're set up for log-transformed or square-root transformed data, it should be pretty obvious how to modify them for any other transformation.
![]() |
| Abundance of the longnose dace, in number of fish per 75 linear meters of stream, versus nitrate concentration. Fish abundance was square-root transformed for the linear regression. |
Drawing bar graphs with Excel
- Put the values of the independent variable (the nominal variable) in one column, with the dependent variable in the column to its right. The first column will be used to label the bars or clusters of bars. You can have more than one dependent variable, each in its own column; each will be plotted with a different pattern of bar.
- If you are plotting 95 percent confidence intervals or some other kind of error bar, put the values in the next column. These should be confidence intervals, not confidence limits; thus if your first data point has an X-value of 7 and a Y-value of 4 ±1.5, you'd have 7 in the first column, 4 in the second column, and 1.5 in the third column. For confidence limits that are asymmetrical, such as the confidence limits on a binomial percentage, you'll need two columns, one for the difference between the percentage and the lower confidence limit, and one for the difference between the percentage and the upper confidence limit.

An Excel spreadsheet set up for a bar graph including confidence intervals. - Select the cells that have the data in them. Do include the first column, with the values of the nominal variable, but don't select cells that contain confidence intervals.
- From the "Insert" menu, choose "Chart" (or click on the little picture of a graph in the task bar). Choose "Column" as your chart type, and the picture of bars next to each other (not on top of each other) as the "Chart sub-type." Do not choose the three-dimensional bars, as they just add a bunch of clutter to your graph without conveying any additional information.
- The next screen shows the "Data range," the cells that contain your data; you shouldn't need to change anything here.
- On the "Titles" tab of the "Chart Options" screen, enter titles for the X-axis and Y-axis, including the units for the Y-axis. A chart title is essential for a graph used in a presentation, but optional in a graph used for a publication (since it will have a detailed caption). Because each bar or cluster of bars will be labelled on the X-axis, you may not need an X-axis title.
- On the "Gridlines" tab of the "Chart Options" screen, get rid of the gridlines; they're ugly and unnecessary.
- On the "Legend" tab of the "Chart Options" screen, get rid of the legend if you only have one set of Y values. If you have more than one set of Y values, get rid of the legend if you're going to explain the different bar patterns in the figure caption; leave the legend on if you think that's the most effective way to explain the bar patterns.
- Click the "Finish" button, but you're not done yet. Click on the white area outside the graph to select the whole image, then drag the sides or corners to make the graph the size you want.
- Click in the gray area inside the graph, choose "Selected Plot Area" from the "Format" menu, and then choose "None" under "Area." This will get rid of the ugly gray background. Under "Border," make the color of the border black instead of gray.
- Click on the Y-axis, choose "Selected Axis" from the "Format" menu, and make modifications to the tick marks, font, and number format. Most publications recommend sans-serif fonts (such as Arial, Geneva, or Helvetica) for figures. On the "Font" tab, unselect "Auto scale," otherwise the font size will change when you change the graph size. On the "Scale" tab, set the minimum and maximum values of Y. The maximum should be a nice round number, somewhat larger than the highest point on the graph. If you're plotting a binomial percentage, don't make the Y-scale greater than 100 percent. If you're going to be adding error bars, the maximum Y should be high enough to include them. The minimum value on the Y scale should usually be zero, unless your observed values vary over a fairly narrow range. A good rule of thumb (that I just made up, so don't take it too seriously) is that if your maximum observed Y is more than twice as large as your minimum observed Y, your Y scale should go down to zero. If you're plotting multiple graphs of similar data, they should all have the same scales for easier comparison.
- Format your X-axis the same way you formatted your Y-axis. It doesn't have a scale, of course. You may want to get rid of the tick marks, they don't really serve a purpose.
- Click on the Y-axis title, choose "Selected Axis Title" from the "Format" menu, and adjust the font. Unselect "Auto scale" so it won't change the font size if you adjust the size of the graph. Do the same for the X-axis title.
- Pick one of the bars, click on it, and choose "Selected Data Series" from the "Format" menu. On the "Patterns" tab, choose the color you want. Click on "Fill effects," then the "Pattern" tab to get halftone grays (little black dots), hatching, and other patterns that work well in black-and-white. On the "Options" tab, you can adjust the width of the bars by changing the "Gap width."
- If you want to add error bars, go to the "Y Error Bars" tab and choose "Both" under "Display." Ignore the buttons for "standard deviation" and "standard error," even if they sound like what you want. Instead, choose "Custom." Click on the icon next to the box labelled "+" and then select the cells in your spreadsheet containing the upper confidence interval. Do the same for the box labelled "-" and the lower confidence interval (you'll be selecting the same cells, unless you have asymmetrical confidence intervals).
- Repeat the above for each set of bars.
- If you've added error bars, click on one of them and choose "Selected Error Bars" from the "Format" menu. On the "Patterns" tab, you can adjust the look of the error bars.
- Click in the graph area, outside the graph, to select the whole box that includes the graph and the labels. Choose "Selected Chart Area" from the "Format" menu. On the "Patterns" tab, you'll probably want to make the border be "None." On the "Properties" tab, choose "Don't move or size with cells," so the graph won't change size if you adjust the column widths of the spreadsheet.
- You should now have a beautiful, beautiful graph. You can click once on the graph area (in the blank area outside the actual graph), copy it, and paste it into a word processing document, graphics program or presentation.

The number of bird species observed in the Christmas Bird Count at seven locations in Delaware. Data points are the mean number of species for the counts in 2001 through 2006, with 95 percent confidence intervals.
Back-transformed axis labels in bar graphs
If you have transformed your data, don't plot the untransformed data; instead, plot the transformed data. For example, if you've done an anova of log-transformed data, the bars should represent the means of the log-transformed values. Excel has an option to make the X-axis of a bar graph on a log scale, but it's pretty useless, as it only labels the tick marks at 1, 10, 100, 1000…. The only way I know of to get the labels at the right place is to format the axis to not have labels or tick marks, then use the drawing and text tools to put tick marks and labels at the right positions. Get the graph formatted and sized the way you want it, then put in dummy values for the first bar to help you position the tick marks. For example, if you've log-transformed the data and want to have 10, 20, 50, 100, 200, on the Y-axis, give the first bar a value of LOG(10), then use the drawing tools to draw a tick mark even with the top of the bar, then use the text tool to label it "10". Change the dummy value to LOG(20), draw another tick mark, and so on.
⇐ Previous topic | Next topic ⇒
This page was last revised August 13, 2008. Its address is http://udel.edu/~mcdonald/statgraph.html.
©2007-2008 by John H. McDonald. You can probably do what you want with this content; see the permissions page at http://udel.edu/~mcdonald/statpermissions.html for details.
