Beer Evaluation and the Judging Process

by Edward W. Wolfe

This information was written before the Judging Procedures Manual was developed. Much of the information here has been superseded by that document. There are also other judging references available on the BJCP website.

Beer Evaluation

Product evaluation is an important part of brewing, whether performed informally or formally and whether the product is from a commercial or home brewery. Formal beer evaluation serves three primary purposes in the context of brewing competitions. First, the beer evaluations provide feedback to the brewer concerning how well an individual recipe represents its intended beer style. This feedback can be useful as recipes are fine-tuned and attempts are made to improve the beer. Second, beer evaluations may provide brewers with troubleshooting advice. These diagnostic suggestions are particularly helpful when the brewer cannot identify the source of off-flavors or aromas. A knowledgeable beer evaluator can provide the brewer with suggestions for changing procedures and equipment that can help eliminate undesirable flavor and aroma components. Third, beer evaluation provides a fairly unbiased method for selecting and recognizing outstanding beers in brewing competitions.

Environment

One important condition that is necessary for accurate beer evaluation is the establishment of a suitable environment. The environment should be well-lit and odor-free, and distractions should be minimized. Natural, diffuse lighting is best, with incandescent lighting preferred over fluorescent lighting. Tablecloths and walls should be free of patterns that might obscure visual inspection of the beer, and light-colored or white walls and tablecloths are ideal. The room in which evaluation takes place should be as free of odors as possible. Restaurants and breweries can be particularly troublesome locations for evaluating beers because food and brewing odors are likely to interfere with a beer judge’s ability to smell the beers being evaluated. Smoking and perfumes should also be eliminated as much as possible. In addition, the evaluation environment should be as free from other distractions as possible. Noise should be kept to a minimum, and privacy should be preserved to the greatest extent possible. Every effort should be made to make the beer judges comfortable by carefully selecting chairs and tables, monitoring the temperature of the evaluation room, and providing assistance to judges during the evaluation process (e.g., stewards).

Equipment

A second important condition that is necessary for effective beer evaluation is suitable equipment. Judges need sharp mechanical pencils with erasers: mechanical so that the aroma of wood does not interfere with detecting beer aromas, and erasers so that comments and scores can be changed. Beer judges also need suitable cups for sampling the beer: impeccably clean plastic or glass, odor-free, and clear. Judges also need access to style guidelines. Tables should be equipped with water and bread or crackers for palate cleansing, buckets and towels for cleaning spills or gushes, bottle openers and corkscrews, and coolers and temporary caps for temporary storage of opened bottles.

Presentation

As for the presentation of beers, two methods are common, each with positive and negative points. One method of presentation permits judges to open and pour the beer into their own cups. A second method of presentation requires stewards to pour beer into pitchers, and the beer is transferred from the pitcher into judges’ cups. When judges are allowed to pour their own beers, there is some danger that moving bottles to the evaluation table will stir up yeast and that judges’ opinions of a beer’s quality will be influenced by the appearance of the bottles that it comes in. On the other hand, when judges transfer beer from a pitcher, it is more difficult to capture many of the fleeting aromas that might dissipate between the time the bottle is opened and the time that judges are presented with the beer. Another problem with using pitchers is that it is more difficult to temporarily store beer samples so that judges can taste them again at a later time.

The Judging Process

Decision Strategies

There are two general decision making strategies that judges use when evaluating a beer. In a top-down decision making strategy, the judge forms an overall impression about the quality of the beer, decides what overall score to assign that beer, and deducts points for each deficient characteristic of the beer based on the overall impression. The problem with this top-down approach to beer evaluation is that it is difficult to ensure that the points allocated to each subcategory (e.g., aroma, appearance, flavor, body) agree with the comments that were made about that feature of the beer. In a bottom-up decision making strategy, the judge scores each subcategory of the beer, deducting points for each deficient characteristic. The overall score is determined by summing the points for each subcategory. The problem with this bottom-up approach to beer evaluation is that it is easy to arrive at an overall score for the beer that does not agree with the overall impression of the beer. In short, judges who use a top-down approach to judging beers may “miss the trees for the forest,” while judges who use a bottom-up approach to judging beers may “miss the forest for the trees.”

Most judges use a combination of these two extremes. Regardless of which approach seems more comfortable to an individual beer judge, there are several general guidelines that judges should follow when assigning scores to beers. In the current BJCP scoring system, each beer is evaluated on a 50-point scale, allocating 12 points for Aroma, 3 for Appearance, 20 for Flavor, 5 for Mouthfeel and 10 for Overall Impression. In addition, there are sliding scales in the bottom right-hand corner of the scoresheet for rating the stylistic accuracy, technical merit and intangibles of each beer.
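The point allocation above can be written down as a small check. The following is a minimal Python sketch, not part of any BJCP tooling: the names MAX_POINTS and total_score are hypothetical, and the lower bound of zero for each section is an assumption made only for illustration. It sums section scores in the bottom-up manner and rejects any section score that exceeds its allotted maximum.

```python
# Maximum points per scoresheet section, taken from the allocation above.
MAX_POINTS = {
    "aroma": 12,
    "appearance": 3,
    "flavor": 20,
    "mouthfeel": 5,
    "overall_impression": 10,
}


def total_score(section_scores):
    """Sum section scores bottom-up, checking each against its maximum.

    Assumption: a section score may be any number from 0 up to its maximum.
    """
    if set(section_scores) != set(MAX_POINTS):
        raise ValueError("scores must cover exactly the five sections")
    for section, points in section_scores.items():
        if not 0 <= points <= MAX_POINTS[section]:
            raise ValueError(
                f"{section} must be between 0 and {MAX_POINTS[section]}"
            )
    return sum(section_scores.values())


# A made-up scoresheet, for illustration only.
example = {
    "aroma": 9,
    "appearance": 3,
    "flavor": 15,
    "mouthfeel": 4,
    "overall_impression": 7,
}
print(total_score(example))  # 9 + 3 + 15 + 4 + 7 = 38
```

A judge who works top-down could use the same check in reverse: decide on the total first, then confirm that the section scores written on the sheet add up to it.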

Overall scores should conform to the descriptions given at the bottom of each scoresheet. Excellent ratings (38-44) should be assigned to beers that are excellent representations of the style. Very Good ratings (30-37) should be assigned to good representations of the style that have only minor flaws. Good ratings (21-29) should be assigned to good representations of the style that have significant flaws. Drinkable ratings (14-20) should be assigned to beers that do not adequately represent the style because of serious flaws. A problem rating (13 or lower) is typically assigned to beers that contain flaws that are so serious that the beer is rendered undrinkable. The scoresheet reserves the 45-50 range for outstanding beers that are truly world-class.
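The ranges above amount to a simple lookup from a total score to a descriptive rating. The sketch below is one way to express that lookup in Python; SCORE_BANDS and describe_score are hypothetical names used only for illustration, and the label strings are shorthand for the descriptions in the preceding paragraph.

```python
# (lowest score in the band, label), listed from highest band to lowest
# so that the first band whose floor is reached is the right one.
SCORE_BANDS = [
    (45, "Outstanding"),
    (38, "Excellent"),
    (30, "Very Good"),
    (21, "Good"),
    (14, "Drinkable"),
    (0, "Problem"),
]


def describe_score(total):
    """Return the descriptive rating for a total score on the 50-point scale."""
    if not 0 <= total <= 50:
        raise ValueError("total score must be between 0 and 50")
    for floor, label in SCORE_BANDS:
        if total >= floor:
            return label


for score in (46, 38, 30, 21, 14, 13):
    print(score, describe_score(score))
```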

In general, the best beers at a competition should be assigned scores in the 40+ range, with real evaluations of the beer identifying some characteristics of the beer that make it non-perfect. A beer receiving a perfect score of 50 must indeed be perfect; it must have absolutely no flaws, exemplify the style as well as or better than the best commercial examples, be perfectly brewery-fresh, and be well-handled and presented. These conditions might not all be under the brewer’s control, so achieving a perfect beer at the point of presentation to judges is extremely rare.

When providing feedback about very good beers, it is important to identify ways in which the beer can be improved and mention these characteristics on the scoresheet. Any serious flaw or missing aspect of a particular beer style (such as lack of clove character in a Bavarian weizen) generally results in a maximum score around 30. Also, note that the cut-off score of 21 determines whether a beer adequately represents a particular style.

A beer that is strongly infected or that contains a flaw so severe that it makes the beer undrinkable can be assigned a score of 13. However, this is simply a guideline. If the flaws are so bad that even a 13 is generous, judges can score lower. Simply justify your score using a bottom-up method; assign points for positive attributes that are present. Give the benefit of the doubt for low-scoring beers. A score of 13 makes the point that the beer is essentially undrinkable; lower scores can be taken as spiteful. If you do score lower than 13, strive to make as many useful comments as possible on how the brewer can improve the beer. Always look for positive comments to make about a beer, and then let the brewer know what aspects of the beer need attention and how to correct any flaws.

Procedure

Beers should be evaluated using the following procedure:

  • Prepare a scoresheet. Write the entry number, style category and subcategory names and numbers, your name, and any other necessary information (e.g., judge rank, your phone number or e-mail, etc.) on a scoresheet or apply a pre-printed label.
  • Visually inspect the bottle (if given the bottle). Check the bottle for fill level, clarity, sediment, and signs of problems (e.g., a ring around the neck of the bottle). Identification of such characteristics may be helpful in describing flaws that are discovered during the formal evaluation process. However, be careful not to prejudge the beer based on a visual inspection of the bottle.
  • Pour the beer into a clean sampling cup, making an effort to agitate the beer enough to produce a generous head (but not enough to produce a head large enough to interfere with drinking the beer). For highly carbonated beers, this may require pouring carefully into a tilted cup. For beers with low carbonation, this may require pouring directly into the center of the cup, with a 6-inch drop from the bottle. Pour each entry in a manner that gives it its optimum appearance, keeping in mind that some entries may be over- or under-carbonated.
  • Smell the beer. As soon as the beer is poured, swirl the cup, bring it to your nose, and inhale the beer’s aroma several times. When a beer is cold, it may be necessary to swirl the beer in the cup, warm the beer by holding it between your hands, or put your hand on the top of the cup to allow the volatiles to accumulate in a great enough concentration to be detected. Write your impressions of the beer’s aromas. In particular, note any off aromas that you detect. Do not assign scores for aroma yet.
  • Visually inspect the beer. Give your nose a rest, and score the appearance of the beer. Tilt the cup, and examine it through backlighting. For darker beers, it may be necessary to use a small flashlight to adequately illuminate the beer. Examine the beer’s color, clarity, and head retention. Write comments about the degree to which the color, clarity, and head retention are appropriate for the intended style and record a score. Score the beer for appearance, allocating a maximum of two points (one on the new scoresheet) for each of these characteristics.
  • Smell the beer again. Again, swirl the cup, bring it to your nose, and inhale the beer’s aromas several times. Note how the beer’s aroma changes as the beer warms and the volatiles begin to dissipate. Write your impressions of the beer’s aromas, noting particularly the appropriateness of the malt, hops, yeast, and fermentation byproduct aromas. Also, note any lingering off aromas. Do not assign scores for aroma yet.
  • Taste the beer. Take about 1 ounce of beer into your mouth, and coat the inside of your mouth with it. Be sure to allow the beer to make contact with your lips, gums, teeth, palate, and the top, bottom, and sides of your tongue. Swallow the beer, and exhale through your nose. Write down your impressions of the initial flavors of the beer (malt, hops, alcohol, sweetness), intermediate flavors (additional hop/malt flavor, fruitiness, diacetyl, sourness), aftertaste (hop bitterness, oxidation, astringency), and conditioning (appropriateness of level for style). Do not assign scores for flavor yet.
  • Score the beer for body (mouthfeel on the new scoresheet). Take another mouthful of beer and note the appropriateness of the beer’s viscosity for the intended style. Write comments concerning your impression and assign between 2 and 5 points, with higher numbers reflecting appropriate mouthfeel and lower numbers indicating increasing levels of lightness or heaviness for the intended style.
  • Evaluate the beer for overall impression. Relax. Take a deep breath. Smell the beer again, and taste it again. Pause to consider where the beer belongs in the overall range of scores (e.g., excellent, very good, good, drinkable, problem) and where similar beers are ranked within the judging flight. If you use a top-down decision making strategy, assign an overall score to the beer, then mentally subtract points from the remaining subcategories (i.e., aroma and flavor), consistent with your impressions of how the beer is deficient. Use the overall impression category to adjust your final score to the level you feel is appropriate for this beer. If you use a bottom-up decision making strategy, assign scores to each of the remaining subcategories (i.e., aroma and flavor), and assign a score for overall impression. Finally, write prescriptive suggestions for improving the beer in light of any deficiencies you noted in your evaluation. Also, check any boxes on the left side of the scoresheet that are consistent with your comments.
  • Check your scoresheet. Add your category scores. If you use a bottom-up approach, double check to make sure you added correctly. If you use a top-down approach, make sure that your subcategory scores sum to equal your overall score. When the other judges have finished scoring the beer, discuss the technical and stylistic merits of the beer and arrive at a consensus score. Be prepared to adjust your scores to make them fall within 7 points of the other judges at your table.
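The final step of the procedure asks judges at a table to bring their scores within 7 points of one another. A minimal Python sketch of that check follows. It interprets "within 7 points" as the gap between the highest and lowest score at the table, which is an assumption about how the rule is applied, and the function names are hypothetical.

```python
MAX_SPREAD = 7  # largest allowed gap between judges' scores for one beer


def score_spread(scores):
    """Gap between the highest and lowest score given to one beer."""
    if not scores:
        raise ValueError("at least one score is required")
    return max(scores) - min(scores)


def needs_adjustment(scores, max_spread=MAX_SPREAD):
    """True if the judges' scores are further apart than the allowed spread."""
    return score_spread(scores) > max_spread


# Made-up scores from three judges for two different beers.
print(needs_adjustment([34, 29, 38]))  # spread of 9, so True
print(needs_adjustment([34, 31, 36]))  # spread of 5, so False
```

The check only flags that discussion is needed; which scores move, and by how much, is left to the judges' conversation about the beer.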

Smelling Beer

When a beer judge smells a beer, the judge is literally inhaling small particles of the beer. The sense of smell works by detecting molecules that are diffused into the air. These molecules are inhaled into the sinus cavity where receptors (olfactory cells) detect and translate the chemical information contained in the molecules into information that the brain can interpret. Several things influence a judge’s ability to detect the variety of aromas in beer.

  • First, there are different densities of the receptors found in different people. Hence, some judges may simply be more sensitive to odors than are other judges.
  • Second, the receptor cells can be damaged through exposure to strong substances (e.g., ammonia, nasal drugs), and this damage may take several weeks to heal.
  • Third, changes in the thickness of the mucus that lines the nasal cavity may influence a judge’s sensitivity. Any molecules that are detected by the olfactory cells must pass through a mucus lining, so daily changes in the thickness of that lining influence our sensitivity from day to day. The thickness of the lining can be influenced by sickness (e.g., colds) or by exposure to a variety of allergens or irritants (e.g., pet dander, dust, smoke, perfume, spicy foods). Therefore, judges need to take into account their current levels of sensitivity, given their health and exposure to substances that could interfere with their sense of smell.
  • Finally, the olfactory cells become desensitized to repeated exposure to the same odors. As a result, a beer judge may be less able to detect subtle aromas as a judging session progresses. One way to remedy this problem is to occasionally take deep inhales of fresh air to flush the nasal cavity. Another way to lessen desensitization to certain odors is to sniff something that has a completely different odor (e.g., sniffing your sleeve) (Eby, 1993; Palamand, 1993).

Regardless of a judge’s ability to detect various odors in beer, that ability is useless if the judge cannot use accurate descriptive terms to communicate information to the brewer. Hence, it is important for beer judges to build a vocabulary for describing the variety of odors (and knowledge of the source of those odors). Meilgaard (1993) presents a useful taxonomy of beer-related odors. His organizational scheme categorizes 33 aromas into 9 overall categories (oxidized, sulfury, fatty, phenolic, caramelized, cereal, resinous, aromatic, and sour). Beer judges should make efforts to expand their scent recognition and vocabulary.

Tasting Beer

The sense of taste is very similar to the sense of smell. Taste is the sense through which the chemical constituents of a substance are detected and information about them is transmitted to the brain. The molecules are detected by five types of taste buds that are on the tongue and in the throat; some areas of the tongue are more sensitive to certain basic flavors than others, but the commonly referenced Tongue Taste Map has been debunked. For example, you can taste bitterness more towards the back of your tongue, but the entire tongue can taste it. The five basic tastes detected by the tongue are sweetness, sourness, saltiness, bitterness and umami (savoriness).

Since all of these flavors are present in beer, it is important that beer judges completely coat the inside of their mouths with beer when evaluating it and that the beer be swallowed. As is true for the scent receptors in the nose, different people have different densities of taste buds and, thus, have different sensitivities to various flavors. Also, taste buds can be damaged (e.g., by being burnt by hot food or through exposure to irritants like spicy foods, smoking, or other chemicals), so a judge’s sensitivity may be diminished until the taste buds can regenerate (about 10 days). Judges need to be aware of their own sensitivities and take into account recent potential sources of damage when evaluating beers. In addition, taste buds can be desensitized to certain flavors because of residual traces of other substances in the mouth. Therefore, it is best for judges to rinse their mouths between beers and to cleanse their palates with bread or salt-free crackers (Eby, 1993; Palamand, 1993).

Of course, as is true for the sense of smell, a judge’s ability to taste substances in beer is useless unless that judge can accurately identify the substance and use appropriate vocabulary to communicate that information to a brewer. Meilgaard’s (1993) categorization system for beer flavors includes 6 general categories (fullness, mouthfeel, bitter, salt, sweet, and sour) consisting of 14 flavors that may be present in beer. Judges should continually improve their abilities to detect flavors that are in beer, their abilities to use appropriate words to describe those perceptions, and their knowledge of the sources of those flavors so that brewers can be provided with accurate and informative feedback concerning how to improve recipes and brewing procedures.

Making Comments About Beer

There are five things to keep in mind as you write comments about the beers you judge:

  • First, your comments should be as positive as possible. Acknowledge the good aspects of the beer rather than focusing only on the negative characteristics. Not only does this help make any negative comments easier to take as a brewer, but it gives your evaluation more credibility.
  • Second, and related, be polite in everything that you write about a beer. Sarcastic and deprecating remarks should never be made on a scoresheet.
  • Third, be descriptive and avoid using ambiguous terms like “nice.” Instead, use words to describe the aroma, appearance, and flavors of the beer.
  • Fourth, be diagnostic. Provide the brewer with possible causes for undesirable characteristics, and describe how the recipe or brewing procedure could be adjusted to eliminate those characteristics.
  • Finally, be humble. Do not speculate about things that you do not know (e.g., whether the beer is extract or all-grain), and apologize if you cannot adequately describe (or diagnose) characteristics of the beer that are undesirable.

Other Considerations

Before the Event

Before a judging event, you should take steps to mentally and physically prepare yourself. Thoroughly familiarize yourself with the style(s) that you will judge if you know what those styles are ahead of time. Sample a few commercial examples and review the style guidelines and brewing procedures for those styles. Come to the event prepared to judge. Bring a mechanical pencil, a bottle opener, a flashlight, and any references that you might need to evaluate the beers. Also, make sure to come to the event in the right frame of mind. Get adequate rest the night before; shower; avoid heavily scented soaps, shampoos, and perfumes; avoid eating spicy foods and drinking excessively; and avoid taking medication that might influence your ability to judge (e.g., decongestants). You can also prepare your stomach for a day of beer drinking by drinking plenty of water, eating a dinner that includes fatty foods the night before the event, and eating extra sugar the morning of the event (e.g., donuts) (Harper, 1997).

Fatigue & Errors

During a judging flight, it is important to keep in mind that errors can creep into your judging decisions as a result of fatigue (palate or physical), distractions, or the order in which beers are presented. More specifically, judges may tend to assign scores in a much narrower range as time progresses (central scoring) simply because palate fatigue causes the beers to taste more and more similar over time. Conversely, judges may assign one or two beers much higher scores than other beers simply because they stand out as being much more flavorful (extreme scoring). In addition, as judges become tired (and possibly intoxicated) during long flights, they may allow impressions of some very noticeable characteristics of particular beers to overly influence their perceptions (and scores) of other characteristics of the beers (halo effect). For example, a weizen that is too dark may (falsely) also seem too heavy and caramel-flavored. Also during long flights, judges need to be mindful of the fact that proximity errors (e.g., assigning scores that are too high to a beer that follows a poor example of the style) and drift (e.g., assigning progressively lower or higher scores to beers as time progresses) may influence the validity of the scores that they assign (Wolfe, 1996; Wolfe & Wolfe, 1997).
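Two of the patterns described above, drift and central scoring, leave traces in the sequence of scores that a judge assigns. The Python sketch below shows one rough way to look for them after a flight; it is an illustration under stated assumptions, not a method taken from the cited studies, and the function names are hypothetical. A trend in the scores may simply reflect real differences between the beers, so a result here is a prompt to retaste, not evidence of an error.

```python
from statistics import mean, pstdev


def drift_per_beer(scores):
    """Least-squares slope of score against presentation order.

    A negative value means scores tended to fall as the flight progressed;
    a positive value means they tended to rise.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are required")
    positions = range(n)
    x_bar = mean(positions)
    y_bar = mean(scores)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(positions, scores))
    denominator = sum((x - x_bar) ** 2 for x in positions)
    return numerator / denominator


def spread_by_half(scores):
    """Standard deviation of the first and second halves of the flight.

    A much smaller value for the second half is consistent with scores
    being squeezed into a narrower range over time.
    """
    if len(scores) < 2:
        raise ValueError("at least two scores are required")
    half = len(scores) // 2
    return pstdev(scores[:half]), pstdev(scores[half:])


# Made-up scores in the order the beers were presented.
flight = [36, 34, 33, 31, 30, 28]
print(round(drift_per_beer(flight), 2))
print(spread_by_half(flight))
```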

Unfortunately, it is nearly impossible to know when errors such as these have crept into your judgments. Therefore, it is extremely important to retaste all of the beers in a flight, especially the ones in the top half of the flight. In general, most flights should contain fewer than 12 beers, so this would entail retasting at least the 6 that receive the highest scores. Each beer should be carefully reevaluated to make sure that the rank ordering of the assigned scores reflects your overall impression of the actual quality of the beers. Only after retasting and discussion of these impressions should awards be assigned to beers within the flight. Note that the competition coordinator may request that you readjust your scores to reflect any discrepancies between the ordering of awards and the ordering of assigned scores.
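The retasting step can be sketched in a few lines of Python. The example below picks the top half of a flight by score and then checks whether a proposed award order agrees with the assigned scores. The entry numbers and scores are invented, the function names are hypothetical, and rounding the top half up for odd-sized flights is an assumption.

```python
import math


def beers_to_retaste(scores_by_entry):
    """Return the entries in the top half of the flight, highest score first."""
    ranked = sorted(scores_by_entry, key=scores_by_entry.get, reverse=True)
    return ranked[: math.ceil(len(ranked) / 2)]


def awards_match_scores(award_order, scores_by_entry):
    """True if each award winner scored at least as high as the next one."""
    award_scores = [scores_by_entry[entry] for entry in award_order]
    return all(a >= b for a, b in zip(award_scores, award_scores[1:]))


# Made-up flight: entry number -> consensus score.
flight = {"101": 33, "102": 41, "103": 27, "104": 38, "105": 30}

print(beers_to_retaste(flight))                            # ['102', '104', '101']
print(awards_match_scores(["102", "104", "101"], flight))  # True
print(awards_match_scores(["104", "102", "101"], flight))  # False
```

When the second check fails, either the award order or the scores need another look, which is the situation in which a coordinator might ask for scores to be readjusted.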

Finishing a Flight

When you have finished judging a flight of beers, make sure that your scoresheets are complete, that the scoresheets have been organized in a way that the competition organizer can identify the scores and the awards that you assigned, and that the table at which you judged is ready for another judging flight or that (following the final flight of the day) it is cleaned. Most importantly, avoid causing distractions to other judges who have not yet finished judging their flights (e.g., loud conversations, interrupting judges who are still making decisions, etc.). In fact, this would be a good time to leave the judging area for a beer or a breath of fresh air. Also, be conscientious in what you say to others about the beers that you judged. It is often tempting to tell others about the worst beer in your flight or to make remarks about the overall poor quality of entries that you judged. Not only are comments such as these in poor taste, but since you do not know who entered the beers that you judged, you may offend the person to whom you are talking (or judges who are still judging).

Practicing

Of course, one of the best (and most enjoyable) things that you can do to maintain your judging skills is to continually practice by sampling a variety of beers and brewing your own beers. In addition to visiting pubs and microbreweries, you can sample homebrew regularly by attending homebrew club meetings. Entering beers in competitions is also a practical way to compare your flavor perception and troubleshooting skills with those of experienced judges. You can also brush up on your judging skills by coordinating tasting sessions and mini-competitions with other judges or by sampling beers that have been “doctored” to simulate common flavors and flaws in beer (Wolfe & Leith, 1997). Dr. Beer® is a commercial example of such a program, but several authors have described methods for preparing doctored beers using readily available ingredients (Guinard & Robertson, 1993; Papazian & Noonan, 1993; Papazian, 1993). Guidelines for a doctored beer session are contained within this guide.

The BJCP provides sensory kits through the BJCP Education and Training program. Please see their references for the current details of this resource.

References and Additional Reading

  • Eby, D.W., “Sensory aspects of zymological evaluation” in Evaluating Beer (Brewers Publications, Boulder, CO, 1993), pp. 39-54.
  • Guinard, J.X. and Robertson, I., “Sensory evaluation for brewers” in Evaluating Beer (Brewers Publications, Boulder, CO, 1993), pp. 55-74.
  • Harper, T., “Scrutinize. Swirl. Sniff. Sip. Swallow. Scribble.: The Six Habits of Highly Effective Great American Beer Festival Judges,” Sky (September, 29-31, 1997).
  • Konis, T., “Origins of normal and abnormal flavors” in Evaluating Beer (Brewers Publications, Boulder, CO, 1993), pp. 91-104.
  • Meilgaard, M.C., “The flavor of beer” in Evaluating Beer (Brewers Publications, Boulder, CO, 1993), pp. 15-38.
  • Palamand, R., “Training ourselves in flavor perception and tasting” in Evaluating Beer (Brewers Publications, Boulder, CO, 1993), pp. 115-131.
  • Papazian, C., “Evaluating beer” in Evaluating Beer (Brewers Publications, Boulder, CO, 1993), pp. 3-14.
  • Papazian, C., “Testing yourself” in Evaluating Beer (Brewers Publications, Boulder, CO, 1993), pp. 215-223.
  • Papazian, C. and Noonan, G., “Aroma identification” in Evaluating Beer (Brewers Publications, Boulder, CO, 1993), pp. 199-214.
  • Wolfe, E.W., “Unbeknownst to the right honourable judge: Or how common judging errors creep into organized beer evaluations,” Brewing Techniques, v. 4(2), 56-59 (1996).
  • Wolfe, E.W. and Wolfe, C.L., “Questioning order in the court of beer judging: A study of the effect of presentation order in beer competitions,” Brewing Techniques, v. 5(2), 44-49 (1997).
  • Wolfe, E.W. and Leith, T., “Calibrating judges at remote locations: The Palate Calibration Project,” submitted to Brewing Techniques (1997).