The Institute for Perception

Mountain Road, VA, United States


Ennis J.M.,The Institute for Perception | Jesionka V.,O. P. and P. Product Research B.V.
Journal of Sensory Studies | Year: 2011

"The power of sensory discrimination methods" (PSDM) was published in this journal in 1993. PSDM clarified the need for power considerations in the interpretation of testing results while providing a series of sample size tables. Despite the fact that the data considered in PSDM were binomially distributed, a normal approximation was used that both overestimated power and underestimated sample sizes. Although exact power functions have been examined in the sensory literature, the unusual behavior of these functions has not been embraced; the fact that increasing sample size can decrease power has not yet been incorporated into stable sample size recommendations. In this paper, we provide sample size recommendations with the property that any larger sample sizes also have the desired level of power. These recommendations are given in the form of tables updating those found in PSDM. In addition, a relatively new discrimination testing method known as the tetrad test has grown in popularity recently and this test now needs to be examined from a power perspective. We show that the tetrad test is remarkably powerful for an unspecified test and in some cases only requires one third the sample size as that required by the triangle test. PRACTICAL APPLICATIONS: This paper contains three main practical applications. First, we provide sample size recommendations, including tables, based on the exact power function as determined by the binomial distribution. In particular, this paper is the first to provide exact sample size recommendations such that all larger sample sizes continue to have the desired level of power. Next, we use the exact power analysis to recommend that only the 2-alternative forced choice (AFC); instead of, for example, the 3-AFC or the specified tetrad test be used for forced choice testing in which an attribute of interest is specified to distinguish the samples. Finally, we provide a power analysis of the unspecified tetrad test for the first time in the sensory literature and show that in some cases, the tetrad test only requires one third the sample size as the triangle test. This last point could lead to both significant resource savings and improved confidence for researchers throughout sensory science. © 2011 Wiley Periodicals, Inc.


Jesionka V.,Skim Inc | Rousseau B.,The Institute for Perception | Ennis J.M.,The Institute for Perception
Food Quality and Preference | Year: 2014

A commonly used approach for quantifying effect sizes in sensory difference testing is the so-called "Proportion of Discriminators" or "Proportion of Distinguishers" model. Such effect sizes are quantified by determining the proportion of discriminators in the population via a transformation of the proportion of correct responses in the difference test. This model has intuitive appeal as it promises researchers the ability to gauge the meaningfulness of results - experiments that yield a high Proportion of Discriminators are supposed to reflect meaningful sensory differences. Despite the intuitive appeal of this approach, in this article we highlight that the Proportion of Discriminators model has several limitations and, in fact, does not actually have many of the properties that a measure of underlying effect size should have. Moreover, we show that these limitations can lead to important errors - as a result, we conclude that the Proportion of Discriminators model should not be used. As an alternative, we recommend Thurstonian analysis, and we show how Thurstonian analysis offers many of the intuitive properties that Proportion of Discriminators lacks. Nonetheless, communication challenges remain, especially between researchers and management. To address these challenges we provide suggestions and tables to help guide the transition away from Proportion of Discriminators towards a Thurstonian perspective. This transition, once complete, will reward sensory researchers with more reliable and meaningful information from their difference testing programs. © 2013 Elsevier Ltd.
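
A minimal sketch of one limitation discussed above: applying the usual transformation pd = (pc - pg)/(1 - pg) to two methods run on the same products gives different "proportions of discriminators", even though the underlying Thurstonian difference δ is identical. The 2-AFC link pc = Φ(δ/√2) is standard; the triangle proportion correct is approximated here by Monte Carlo under the usual "most distant sample" strategy, and the δ value is illustrative.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
delta = 1.0                                   # illustrative Thurstonian effect size

# 2-AFC proportion correct has a closed form under the Thurstonian model.
pc_2afc = norm.cdf(delta / np.sqrt(2))

# Triangle: simulate the "pick the most distant sample" decision rule.
n_sim = 200_000
x = np.hstack([rng.normal(0.0, 1.0, (n_sim, 2)),      # two percepts of product A
               rng.normal(delta, 1.0, (n_sim, 1))])   # one percept of product B
dist = np.abs(x[:, :, None] - x[:, None, :]).sum(axis=1)   # each sample's total distance to the others
pc_tri = np.mean(dist.argmax(axis=1) == 2)

def prop_discriminators(pc, p_guess):
    return (pc - p_guess) / (1 - p_guess)

print(prop_discriminators(pc_2afc, 1/2))   # "discriminators" implied by the 2-AFC
print(prop_discriminators(pc_tri, 1/3))    # "discriminators" implied by the Triangle (differs)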


Ennis D.M.,The Institute for Perception | Ennis J.M.,The Institute for Perception
Food Quality and Preference | Year: 2010

In statistical applications, such as a comparison of two items, it is useful to know whether one item is equivalent to another. Similarly, it is often desirable to know whether one item can act as a substitute for another. Applications of the concept of equivalence include blend and flavor modifications of products, substitution of generic drugs for brand-name drugs, modifications of products in response to government regulations, and component substitutions with more healthful or lower cost components. In addition, some companies develop products that are direct substitutes for those of their competitors and make advertising claims concerning their equivalence. In a recent paper, Ennis and Ennis [Ennis, D. M., & Ennis, J. M. (2009). Hypothesis testing for equivalence based on symmetric open intervals. Communications in Statistics - Theory and Methods, 38(11), 1792-1803] used an open interval to define equivalence and provided exact and approximate methods for testing a null hypothesis of nonequivalence. In this paper, a discussion of this newly developed theory of equivalence testing is presented along with a comparison to existing methods such as the "two one-sided tests" (TOST) method. We provide numerical examples to illustrate this new theory and we demonstrate that although the TOST is a convenient approximation, it is fundamentally inconsistent with the specification of the null hypothesis. © 2009.
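
For context, the conventional TOST procedure mentioned above can be sketched as two one-sided exact binomial tests against the ends of an equivalence interval; equivalence is concluded only if both reject. The margins and counts below are illustrative assumptions, and the exact open-interval method of Ennis and Ennis (2009) is not reproduced here.

from scipy.stats import binomtest

def tost_binomial(k, n, p_low, p_high, alpha=0.05):
    # (1) H0: p <= p_low  vs  H1: p > p_low
    # (2) H0: p >= p_high vs  H1: p < p_high
    p1 = binomtest(k, n, p_low, alternative='greater').pvalue
    p2 = binomtest(k, n, p_high, alternative='less').pvalue
    # Equivalence is concluded only if both one-sided tests reject.
    return max(p1, p2) < alpha, (p1, p2)

# Illustrative data: 100 "correct" responses out of 200, equivalence margins 0.40-0.60.
print(tost_binomial(100, 200, 0.40, 0.60))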


Ennis J.M.,The Institute for Perception | Ennis D.M.,The Institute for Perception
Journal of Sensory Studies | Year: 2012

The treatment of no preference votes continues to be an issue in sensory science, especially as the proper treatment of these votes has recently gained importance in advertising claims support. There are currently three main methods in common use: dropping the no preference votes, splitting the votes equally and splitting the votes proportionally according to the results among those who expressed a preference. The analyses then proceed as if the data were binomially distributed. In this paper, we compare these methods with respect to power and type I error. We show that proportional splitting returns more false alarms than expected and hence should not be used. We then discuss the meaningful interpretation of statistical significance in the presence of large numbers of no preference votes before providing general recommendations and indicating a promising direction of future research in this area. © 2012 Wiley Periodicals, Inc.
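
A minimal Monte Carlo sketch of the kind of comparison described above, assuming identical products (the null hypothesis is true), an illustrative no-preference rate, and the common normal-approximation test of a 50/50 split. It only illustrates how the false-alarm rates of the three treatments of no preference votes can be compared; it does not reproduce the paper's results.

import numpy as np

rng = np.random.default_rng(7)
n, p_nopref, n_sim, z_crit = 200, 0.3, 50_000, 1.96

def significant(x, n_eff):
    # Two-sided normal-approximation test of H0: p = 0.5.
    if n_eff == 0:
        return False
    z = (x - n_eff / 2) / np.sqrt(n_eff / 4)
    return abs(z) > z_crit

false_alarms = {"drop": 0, "equal split": 0, "proportional split": 0}
for _ in range(n_sim):
    # Identical products: preferences split 50/50 among respondents who express one.
    votes = rng.choice(["A", "B", "none"], size=n,
                       p=[(1 - p_nopref) / 2, (1 - p_nopref) / 2, p_nopref])
    a, b, t = (votes == "A").sum(), (votes == "B").sum(), (votes == "none").sum()
    false_alarms["drop"] += significant(a, a + b)
    false_alarms["equal split"] += significant(a + t / 2, n)
    false_alarms["proportional split"] += significant(a + t * a / max(a + b, 1), n)

for rule, count in false_alarms.items():
    print(rule, count / n_sim)   # empirical type I error under each rule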


Ennis J.M.,The Institute for Perception
Journal of Sensory Studies | Year: 2012

Tetrad testing is theoretically more powerful than Triangle testing, yet the addition of a fourth stimulus raises a question: the extra stimulus may place enough additional demand on subjects that the theoretical advantage of the Tetrad test is lost in practice. In this paper, we provide a guideline for comparing Tetrad and Triangle results. Specifically, it is roughly correct to say that as long as the effect sizes do not drop by more than one third for the same stimuli, the Tetrad test remains more powerful than the Triangle test. We explain this guideline in terms of perceptual noise, illustrate its use in several examples and discuss the statistical considerations that accompany its use. To assist with statistical evaluation, we provide a table for finding the variance in the Tetrad-based measurement of the effect size. Finally, we show how the Thurstonian framework helps us to improve discrimination testing efficiency even when we do not seek additional power. © 2012 Wiley Periodicals, Inc.
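
The rough guideline above can be checked numerically with a minimal Monte Carlo sketch, assuming unidimensional Thurstonian percepts and the usual comparison-of-distances strategies (the odd sample is the most distant one for the Triangle; the four percepts are sorted and the two lowest grouped together for the Tetrad). The δ value is illustrative; since both methods have a guessing probability of 1/3, comparable proportions correct imply comparable power.

import numpy as np

rng = np.random.default_rng(3)
n_sim = 200_000

def pc_triangle(delta):
    # Two percepts of product A, one of product B; respond with the most distant sample.
    x = np.hstack([rng.normal(0, 1, (n_sim, 2)), rng.normal(delta, 1, (n_sim, 1))])
    dist = np.abs(x[:, :, None] - x[:, None, :]).sum(axis=1)
    return np.mean(dist.argmax(axis=1) == 2)

def pc_tetrad(delta):
    # Two percepts of each product; group the two lowest and the two highest together.
    a = rng.normal(0, 1, (n_sim, 2))
    b = rng.normal(delta, 1, (n_sim, 2))
    # The grouping is correct iff both A percepts fall below both B percepts, or vice versa.
    return np.mean((a.max(axis=1) < b.min(axis=1)) | (b.max(axis=1) < a.min(axis=1)))

delta = 1.5                                  # illustrative effect size in the Triangle task
print(pc_triangle(delta))                    # Triangle proportion correct at delta
print(pc_tetrad(delta * (1 - 1/3)))          # Tetrad proportion correct if delta drops by one third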


Ennis J.M.,The Institute for Perception | Christensen R.H.B.,Technical University of Denmark
Food Quality and Preference | Year: 2014

Interest in the Tetrad test has increased recently as it has become apparent that this methodology can be a more powerful alternative to the Triangle test within the standard difference testing paradigm. But when products are tested following an ingredient or process change, a pressing question is whether a sensory difference is large enough to be meaningful. To this end, in this paper we examine the precision of measurement offered by the Tetrad test as compared to two other standard forced-choice discrimination testing procedures - the Triangle and 2-AFC tests. This comparison is made from a Thurstonian perspective. In particular, for all three methods we compare: (1) The variances in the maximum-likelihood estimates of the Thurstonian measure of sensory difference, (2) The expected widths of the corresponding likelihood-based confidence intervals, and (3) The power of the tests when used for equivalence testing. We find that the Tetrad test is consistently more precise than the Triangle test and is sometimes even more precise than the 2-AFC. As a result of this precision, we discover that the Tetrad test is typically more powerful than the Triangle test for equivalence testing purposes and can, under certain conditions, even be more powerful than the 2-AFC. © 2013 Elsevier Ltd.
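
As a pointer to the kind of precision calculation involved, the sketch below gives the maximum-likelihood estimate of δ for the 2-AFC, pc = Φ(δ/√2), together with its approximate delta-method standard error. The data are illustrative; the paper's likelihood-based intervals and the corresponding Triangle and Tetrad calculations are not reproduced here.

import numpy as np
from scipy.stats import norm

def dprime_2afc(correct, n):
    # ML estimate of delta for the 2-AFC and its approximate (delta-method) standard error.
    pc = correct / n
    d = np.sqrt(2) * norm.ppf(pc)
    se_pc = np.sqrt(pc * (1 - pc) / n)
    se_d = np.sqrt(2) * se_pc / norm.pdf(norm.ppf(pc))   # d(delta)/d(pc) = sqrt(2)/phi(Phi^-1(pc))
    return d, se_d

print(dprime_2afc(70, 100))   # illustrative data: 70 correct responses out of 100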


Ennis D.M.,The Institute for Perception | Ennis J.M.,The Institute for Perception
Food Quality and Preference | Year: 2012

The analysis of choice data in which no difference/preference responses, or ties, occur is considered in this paper. A key issue addressed in the paper is the need for "identicality norms" for difference and preference tests. These norms reflect the researcher's expectation for the number of ties that would have occurred in the experiment had the products tested been putatively identical. Without these norms, the issue of how to account for ties can never be fully resolved. After this idea is developed, some methods from the statistics literature to account for ties are reviewed and the Thurstonian 2-AC (2-Alternative Choice) model is discussed. Common practices of equal or proportional redistribution of ties are noted to be either conservative or liberal, respectively, when the binomial distribution is used to evaluate results. In particular, the exact probability function for the equal allocation method is given as a particular type of mixing distribution, known as a convolution, of binomial probability functions. Regardless of which statistical method is used to test tied data, however, none of the current methods of analysis can account for segmentation or the effect of heterogeneity in individual assessors. To study the possible effect of heterogeneity, the data could first be tested against an identicality norm. Thus, this research clarifies the assumptions that are made when conducting tests on paired comparison data with ties and provides guidance on the choice of analytic method. © 2011 Elsevier Ltd.
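
For orientation, the Thurstonian 2-AC model mentioned above can be sketched as follows, under the common assumption that the perceived difference between the two products is normal with variance 2 (the difference of two unit-variance percepts) and that a symmetric threshold τ defines the no-preference region. The numerical values are illustrative and the full maximum-likelihood fitting of δ and τ is not shown.

import numpy as np
from scipy.stats import norm

def two_ac_probabilities(delta, tau):
    # Perceived difference D ~ N(delta, 2); respond "prefer B" if D > tau,
    # "no preference" if |D| <= tau, and "prefer A" if D < -tau.
    s = np.sqrt(2)
    p_b = 1 - norm.cdf((tau - delta) / s)
    p_none = norm.cdf((tau - delta) / s) - norm.cdf((-tau - delta) / s)
    p_a = norm.cdf((-tau - delta) / s)
    return p_a, p_none, p_b   # sums to 1

print(two_ac_probabilities(delta=0.5, tau=0.6))   # illustrative values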


Ennis J.M.,The Institute for Perception | Christensen R.,Technical University of Denmark
Food Quality and Preference | Year: 2015

The recurring need to assess product reformulations has kept difference testing at the forefront of sensory science. Within the realm of difference testing, the Tetrad test has risen in popularity recently as its superiority over the Triangle test has been demonstrated both in theory and in practice. However, a detailed comparison of the Tetrad test with other commonly used testing methods, such as the Degree of Difference (DOD) test, has been lacking. In this paper, we provide such a comparison by considering, from a theoretical perspective, both the power and the precision of the Tetrad and DOD tests. In particular we show that, theoretically and for the range of sensory effect sizes likely to be of interest in consumer research, the Tetrad test is more powerful and more precise than the DOD test. Even so, if there is substantially more perceptual noise in the Tetrad test from the two additional stimuli, it is possible that the performance of the DOD test could surpass that of the Tetrad test in practice. To investigate this last statement, we quantify the additional noise required to negate the theoretical advantage of the Tetrad test. © 2014 Elsevier Ltd.


Ennis D.M.,The Institute for Perception | Ennis J.M.,The Institute for Perception
Journal of Sensory Studies | Year: 2013

Check-all-that-apply (CATA) lists are commonly used in both survey research and sensory science. A related technique, referred to in this paper as applicability scoring, requires respondents to respond positively or negatively to each item of interest, whereas CATA only requires a check when the item applies to the object being scored. Both hypothesis testing and scale estimation for applicability scoring of sequentially tested products are considered in this paper. For the former, we demonstrate the use of McNemar's test and for the latter, we present a Thurstonian model. Using applicability scores for scale estimation is important because a connection can then be made to other methods through a common framework, allowing cross-comparison and validation. In addition, applicability scoring provides a sensitive method for assessing product differences and may be particularly useful when an attribute cannot be conveniently expressed in a rating or 2-alternative forced choice (2-AFC) format. Practical Applications: The first use of applicability scoring in survey research was by Sudman and Bradbury, and its first use in sensory science was by Loh and Ennis. In survey research, the method is more appropriate than CATA lists in telephone surveys and may lead to deeper processing of the items. In sensory science, the method offers a convenient way of collecting and analyzing data on product differences for attributes that may not be easily expressed as ratings or in a 2-AFC format. Since the method is used sequentially, it can be used for more than two products. When the attribute is liking (the item scored is "I like this product"), the method allows the separation of "like both" from "like neither", which is not provided by a preference question with a no-preference option. This capability therefore provides more information about the acceptability of both products than can be obtained from a preference test. © 2013 Wiley Periodicals, Inc.
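
A minimal sketch of the hypothesis test mentioned above: for two products scored sequentially by the same respondents on a single yes/no applicability item, McNemar's exact test reduces to a binomial test on the discordant pairs. The counts below are illustrative assumptions.

from scipy.stats import binomtest

def mcnemar_exact(yes_no, no_yes):
    # Exact McNemar test: under H0 the two kinds of discordant pairs
    # (applicable to A but not B, and vice versa) are equally likely.
    n_discordant = yes_no + no_yes
    return binomtest(yes_no, n_discordant, 0.5, alternative='two-sided').pvalue

# Illustrative paired counts: 18 respondents endorsed the item for A but not B,
# 7 endorsed it for B but not A (concordant pairs do not enter the test).
print(mcnemar_exact(18, 7))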


Ennis J.M.,The Institute for Perception
Journal of Sensory Studies | Year: 2013

The Two-Out-of-Five test is a method of unspecified difference testing. Although its low guessing probability (1/10) gives promise that it might have high power, the theoretical underpinnings of the method have not yet been investigated. In this article, we offer the first such investigation, via Thurstonian analysis. This investigation reveals that the standard form of the Two-Out-of-Five test is more statistically powerful than the Triangle test, but not as powerful as the Tetrad test. We then propose a new way of scoring Two-Out-of-Five data that yields a test with higher power and lower sample size requirements than the Tetrad test, under the assumption that there is no additional noise from the evaluation of an additional stimulus. This last result is achieved without any experimental modification of the Two-Out-of-Five protocol. Tables are given for estimating the Thurstonian measure of sensory effect size, δ, for calculating the error in such estimates, and for recommended sample sizes. Finally, caution is given against incorrect instructions in the Two-Out-of-Five test - if respondents are asked simply to identify the two most similar samples, the resulting test has almost no power. © 2013 Wiley Periodicals, Inc.
