What Is the Psychology of Concepts a Psychology of?
There is a methodological pitfall masquerading as an advantage that accounts for some part of the problem I have described. In order to study concepts, all I have to do is to make up two sets of entities (which can be anything, although I will usually call them objects because that is usually what they are) and persuade subjects to give one response to one set and a different response to the other set. The problem with this is that I can make up any arbitrary sets, use any procedure to try to get subjects to learn, and use any response that is at all distinctive.
Hull (1920), who was faced with the task of developing a methodology for studying concepts experimentally, listed a number of desiderata for studying concepts. The desiderata included the use of distinct classes, each receiving different responses. However, they also included constraints on the concepts themselves, namely that each concept should contain an element that is unique to it. This desideratum reflected Hull’s assumption about the structure of categories in the world, what has come to be known as the classical view of categories (Smith & Medin, 1981).
Let us imagine for a moment that Hull had been right about categories, that each category has a unique element or some set of defining features that determine category membership. What would we then think about the vast majority of modern experiments on concepts, which lack such defining features? These experiments might be interesting as studies of abstract learning, but they would simply not be about how people learn concepts. A study of how people learn nonlinearly separable (NLS) categories might have some interest regarding the nature of memory and learning in general, but it would tell us little about how people learn real categories because nonlinearly separable categories by definition do not have definitions (sic).
Studies of family resemblance concepts (Rosch & Mervis, 1975), in which category members tend to share features but have no feature common to the whole category, would also not be telling us how people learn real categories. These studies would be uninformative because the requirements involved in learning a well-defined category are different from those involved in learning family resemblance or NLS categories.
Indeed, the study of logically defined concepts that was ushered in by Bruner, Goodnow, and Austin (1956) was essentially dropped when Rosch published her studies of the structure of natural categories (e.g., Rosch, 1973, 1975). The studies of concept attainment that Bruner et al. and many others carried out are now viewed as studies of a particular kind of reasoning or problem solving rather than studies of concept learning, precisely because we believe that real concepts are not like Bruner et al.’s concepts.
In order to answer how people learn categories and form concepts, we cannot operate in a vacuum of knowledge about the real structure of categories. For Hull, it would have been pointless to study how people learn family resemblance categories because this could not tell you how people learn “real” categories. We are less certain now, perhaps, what the real categories are and therefore are less willing to reject any particular experiment as being irrelevant. But perhaps we have erred on the side of liberality and acceptingness. Perhaps some of the categories we have studied do not tell us about how people learn real categories, just as the studies of Bruner et al. do not tell us how people learn family resemblance categories.
Ecological Validity and the Study of Concepts
A. The Logic of Hypothesis Testing in Categorization Research
Although categorization experiments themselves form a family-resemblance category, there are some characteristics that are widespread throughout the domain.
In particular, the experimental logic described in many articles is of the following sort: (a) Two or more theories of concepts are reviewed. (b) The theories turn out to make very similar predictions for simple categories. (c) However, there is a categorical structure that distinguishes the two theories. In particular, one theory says that the structure should be fairly easy, whereas the other says that it should be difficult. (More generally, there is a variable that one theory claims is important but the other does not.) (d) Therefore, the article presents experiments that test the critical structure or variable to see which theory is correct. This logic, which seems perfectly straightforward as a form of scientific hypothesis testing, can be found in studies such as comparisons of exemplar and prototype theory, studies of feature frequency, examinations of knowledge effects and causal structure, and others. Clearly, if one theory predicts the structural effect (or the effect of the tested variable) correctly and the other does not, then strong support is given to the first theory.
The difficulty with this logic arises from the considerations raised in the previous section. What happens if the critical structure is one that is not present (or rarely present) in nature or if the variable is something that does not really vary in the domain of most concept learning? I call this the problem of unconstrained concept construction. The problem is that anyone can make up any old set of things and call it a concept. This concept can then serve as the critical test of one’s theory. For example, my theory, let us say, predicts that concept (1) below should be easier to learn than concept (2), whereas your theory makes no such prediction:
(1) a horse, the Mona Lisa, Bill Clinton, a red telephone, and a pile of quartz
(2) three roaches, a hair dryer, a postcard of Dayton, Ohio, and a retirement party
Gregory L. Murphy
Suppose that I run the experiment and find that in fact (1) is easier to learn than (2). How likely are you to exchange your theory for mine? Unless you are remarkably easygoing, I would guess that my experiment will have little effect on your theorizing. And although I would take the opportunity to lambaste you and your theory in the usual outlets, I think you would be well justified in suggesting that this comparison is weird and unnatural and that its results simply cannot tell us much about how people learn concepts such as mammals, ball games, or pencils. The ability of a theory to distinguish these two categories simply does not give useful information about its ability to describe normal concepts.
This example is obviously exaggerated for purposes of illustration. However, the same question truly does arise in less exaggerated form in other cases. Suppose, for the sake of argument, that people almost always have some general knowledge of the domain of categorization when they learn a new concept. That is, after very early childhood, people seldom learn about an animal without already knowing some similar animals and some facts about animal behavior and biology; they seldom learn about a new sport without already knowing what sports are like; and they seldom learn about an electronic device without knowing a lot of consumer electronic products. If that is the case, then do studies of concept learning in which learners have no knowledge of the domain whatsoever tell us about real concept learning?
Or consider the linear separability debate. Medin and Schwanenflugel (1981) found no marked difference between people’s learning of linearly separable (LS) and NLS categories, which was contrary to the prediction of prototype theory. However, an examination of the categories used in their experiments (see Murphy, 2002; Smith et al., 1997) raises various concerns with them.
For example, in every study of NLS categories that I know of, each category contains two objects that are exact opposites of one another. This is the simplest way to ensure that no independent weighting of features can correctly categorize all the items in the category. One category might contain an item with a single small blue triangle and another item with two large red circles. If each dimension has only two values (blue–red, circle–triangle, etc.), then these two items are true opposites. But what kind of category contains items that have no features in common whatsoever? It is as if we included trout in the category of birds and bluejays in the category of fish, keeping everything else the same. However, such opposites are put into NLS categories without apology.
Smith et al. focus on the low degree of overall category differentiation in many experiments making these contrasts, arguing that the experimental categories are much less coherent than real categories. Furthermore, they claim that some subjects use prototypes when there is considerable category differentiation, even for NLS categories. Thus, past findings strongly supporting exemplar theory may apply only to categories that are poorly structured.
A very similar problem comes about in interpreting the results of one of the most classic of all category-learning experiments, that of Shepard, Hovland, and Jenkins (1961). Shepard et al. developed six different categorization problems, each dividing eight stimuli into two categories of four items. These problems comprised all the logical possibilities of dividing up eight items based on three binary stimulus dimensions. The categories ranged from a simple single-dimensional categorization (e.g., separating large from small items) to a two-dimensional conjunctive rule to a categorization that used all three dimensions orthogonally. Figure 1 illustrates the easiest and hardest category structures. Shepard et al.
and a large number of subsequent researchers (e.g., Kruschke, 1992) found a reliable ordering of learning difficulty of these six types, with type I easier than type II, types III–V being about the same, and type VI the hardest. A detailed analysis of their results led Shepard et al. (1961, p. 33) to the important conclusion that subjects were focusing attention on different stimulus dimensions, forming hypotheses about what rule separated the two categories.
Fig. 1. Logical structures of Shepard et al.’s type I (easiest) category problem (left) and the type VI (hardest) problem (right).
I have no problem with Shepard et al.’s (1961) analysis of their experiment as a critique of stimulus-response (S-R) learning theories. Indeed, such theories make the claim that learning is an unconstrained process of S-R associations, and so testing them on arbitrarily constructed categories is well within the rights of the investigator. The issue I would like to raise is the treatment of Shepard et al.’s data in the subsequent literature. It has become an important criterion in recent models of category learning that they reproduce the Shepard et al. data in some detail. For example, Kruschke (1992) contrasted the ability of his model, ALCOVE, to produce the correct ordering of Shepard et al.’s conditions with that of Gluck and Bower’s (1988) configural cue model. Other researchers have also attempted to account for the relative difficulty of these six problems (e.g., Estes, 1994; Nosofsky, 1984).
The question raised by my earlier comments is what weight we should give to the ordering of Shepard et al.’s (1961) conditions. Compared to object and event categories, the type I category is grossly simplistic: There are no real object categories that are defined by a single stimulus value. (There are, of course, adjectival categories, such as red things or large things, although they are usually somewhat more complex than a single stimulus value.
But I am talking about categories of whole objects or events, such as dog, funeral, zip disk drive, and movie.) Shepard et al.’s higher-level category types seem overly difficult and arbitrary. For example, the type VI problem contains a number of very different object pairs: A large black square and a small white square are in one category, and a large white square and a small black square are in the other. There is absolutely no family resemblance in the type VI categories: all the properties are equally frequent in the two categories (e.g., half the items are white and half are black in both categories). In the type V categories, two of the three dimensions are completely nondiagnostic and the other follows a three-out-of-four rule. For example, a large black heart, a small black heart, a large black square, and a small white square might all be in one category. Why the small white square (instead of the obvious small black square) is in the category is, of course, completely unclear to the subjects; it is just the arbitrary requirement of the experimental design.
Indeed, I think one could argue that only one of Shepard et al.’s (1961) rules is likely to correspond to the structure of natural categories: Bob Rehder pointed out to me that rule IV is essentially a family-resemblance category, in which each dimension is predictive of category membership. (Family-resemblance structure is actually a bit difficult to detect with only four items in a category.)
The question, then, is to what degree we should use the differences in learning such categories as a criterion for evaluating theories of concepts. When I asked rhetorically whether the difference between categories (1) and (2) given earlier (the ones with the Mona Lisa, Bill Clinton, some quartz, etc.) should inform our theories of categorization, the reader answered rhetorically, “No, they are too weird.” But why shouldn’t the reader give the same answer for categories like the Shepard et al. set?
If real-life categories are not orthogonal variations of stimulus dimensions, or unidimensional splits, then why has the relative difficulty of learning such categories been of such importance to model testing in the field? Note that I am not saying that Shepard et al.’s results are not important in a number of respects, such as telling us about selective attention and certain learning processes. What I am saying is that the ability of a theory to distinguish different category structures that do not actually exist in real life may not be an appropriate test of a model of concepts.
B. Defensive Replies
Let me quickly address three defensive replies to this sort of argument that I have heard from researchers, often after a drink or two at a conference poster session.
One reply is something like, “That category structure [whichever one I am criticizing as unnatural] is extremely important. It has been studied in a dozen labs. How can you just ignore all those data?” However, the fact that something has been studied in the laboratory does not mean that it is relevant to a particular issue. If the problem (which I will expand on later) is how people learn and represent real categories, then the number of times a structure or paradigm is used in the laboratory simply does not speak to the question of whether the structure or paradigm tells us about real-life category learning.
A second reply is the same as the first, but with an emphasis on the fact that there are data out there, and every theory must account for published data. So, the finding that NLS and LS categories are learned equally easily (in certain circumstances) simply must be accounted for by any adequate theory of concepts because it is a documented finding. Although this reply is more reasonable than the first, I find it to be unconvincing as well.
After all, my hypothetical finding that category (1) is easier to learn than category (2) is also a datum, and why shouldn’t that be used to evaluate theories of concepts? If people’s concepts do not include categories of the sort that are tested in these experiments, then it is simply hard to see how the theory’s success within those unrealistic categories is a test of its account of real category learning. The question is not whether theories should have to account for data, but rather which data are relevant.
A third reply is to make a distinction between acquisition of everyday concepts and perceptual classification. I am not sure whether this distinction has been proposed explicitly, but a number of researchers working on mathematical models of categorization seem to be calling their topic “perceptual classification” (e.g., Cohen, Nosofsky, & Zaki, 2001; Lamberts, 2000; Maddox & Bohil, 2000; Nosofsky & Johansen, 2000). One could therefore interpret them as suggesting that there is a separate psychological process of perceptual classification, which may or may not be the same process as that used to learn about real objects in knowledge-rich domains. Perhaps perceptual classification is a fairly low-level process by which items are associated to responses, which applies across a number of different domains, and which must be very flexible so that any possible distinction can be learned. Thus, criticisms of the sort I have been making based on word learning or the apparent structure of natural categories would not apply to the study of perceptual categorization because word meanings and object categories are not formed from (or only from) the perceptual classification process. In short, although this argument necessarily limits the interest of studies of perceptual classification (if they are not studying the mechanisms of real category learning), it also insulates them from ecological validity arguments.
To repeat, I am not sure that anyone has made this argument explicitly. However, it is certainly an option available to those who do experiments on very simple stimuli, with category structures that are far removed from those of everyday life.
This reply has two problems, however. The first is that one cannot simply say, “I am studying perceptual classification and not object concepts,” without some empirical evidence that there is a distinction between the two. By the same token, one could say, “I am working on dot patterns, whereas your categories are geometric shapes, and so my theory cannot be expected to explain your results.” Is there evidence that object concepts do not involve perceptual associative learning? Unless the distinction between perceptual concepts and object concepts is proposed explicitly (not assumed) and supported empirically, use of the distinction to isolate perceptual classification from my criticisms is ad hoc.
Second, if there is such a process of perceptual classification, it must receive its own justification as a topic of study. If it is not the process involved in children’s learning of word meanings, in adults’ learning of novel concepts in familiar domains, and so on, then why should one study this instead of real object learning? The reply to my objections seems to condemn the topic to irrelevance. A better strategy, in my opinion, would be to attempt to incorporate the perceptual learning processes into a broader theory of concept acquisition, which can apply to complex concepts, in knowledge-rich domains, and so on.
C. Summary
Let me summarize the argument so far. The problem of unconstrained concept construction is that one can make up anything and call it a concept, test subjects on it, and then use the results to evaluate theories of concepts. This can lead (and in fact has led) to the construction of some very peculiar categories that are then used to discriminate theories of concepts.
My argument is that when these categories are outside the domain of natural categories, the logic of hypothesis testing breaks down. Yes, we want a critical test in which theories make different predictions. But if a theory is of behavior in a certain domain, then people’s behavior in a different domain may not be an adequate test of it.
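As a concrete aside, the structural claims made in this section about Shepard et al.’s type I and type VI problems, and about NLS categories built from exact opposites, can be checked with a few lines of code. The following is only a minimal sketch under stated assumptions, not a reconstruction of any experiment: the 0/1 coding of the three binary dimensions, the function names, the parity rule used to realize type VI, and the two-item “opposites” categories are all illustrative choices. The separability check searches only a small grid of integer weights, so a negative result means no solution was found on that grid.

```python
from itertools import product

# All eight stimuli built from three binary dimensions (for instance size,
# color, shape), coded 0/1. This coding is an assumption for illustration.
STIMULI = list(product((0, 1), repeat=3))


def type_i(stim):
    """Type I: membership depends on the first dimension only."""
    return stim[0]


def type_vi(stim):
    """Type VI: membership is the parity of all three dimensions."""
    return sum(stim) % 2


def split(rule):
    """Divide the eight stimuli into two categories of four items."""
    cat_a = [s for s in STIMULI if rule(s) == 0]
    cat_b = [s for s in STIMULI if rule(s) == 1]
    return cat_a, cat_b


def feature_counts(category):
    """Count, per dimension, how many members have the value 1."""
    return tuple(sum(s[d] for s in category) for d in range(3))


def dot(weights, stim):
    """Independent (additive) weighting of the three features."""
    return sum(w * x for w, x in zip(weights, stim))


def linearly_separable(cat_a, cat_b, max_weight=3):
    """Grid search for weights w and threshold t with w.s < t for every
    item in cat_a and w.s > t for every item in cat_b. A False result
    only means that no solution exists on this small grid."""
    weights = range(-max_weight, max_weight + 1)
    limit = 3 * max_weight + 1
    thresholds = [k + 0.5 for k in range(-limit, limit)]
    for w in product(weights, repeat=3):
        for t in thresholds:
            if all(dot(w, s) < t for s in cat_a) and all(dot(w, s) > t for s in cat_b):
                return True
    return False


a1, b1 = split(type_i)
a6, b6 = split(type_vi)

print(feature_counts(a1), feature_counts(b1))  # (0, 2, 2) (4, 2, 2)
print(feature_counts(a6), feature_counts(b6))  # (2, 2, 2) (2, 2, 2)
print(linearly_separable(a1, b1))              # True
print(linearly_separable(a6, b6))              # False

# Hypothetical NLS set: each category holds a pair of exact opposites.
opp_a = [(0, 0, 0), (1, 1, 1)]
opp_b = [(0, 0, 1), (1, 1, 0)]
print(linearly_separable(opp_a, opp_b))        # False
```

The identical counts for the two type VI categories correspond to the observation that every property is equally frequent in both, so no single feature carries information about membership. In the opposites example, the two members of each category sum to the same feature vector, which is why no independent weighting of features can place both pairs on the correct sides of a threshold.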