SECTION IV LEARNING FROM SAMPLE DATA

CHAPTER 7 An Overview of Statistical Inference—Learning from Data

Preview
Chapter Learning Objectives
7.1 Statistical Inference—What You Can Learn from Data
7.2 Selecting an Appropriate Method—Four Key Questions
7.3 A Five-Step Process for Statistical Inference
Chapter Activities
Are You Ready to Move On? Chapter 7 Review Exercises

PREVIEW

Whether data are collected by sampling from populations or result from an experiment to compare treatments, the ultimate goal is to learn from the data. This chapter introduces the inferential process shared by all of the methods for learning from data that are covered in the chapters that follow. You will also see how the answers to four key questions guide the selection of an appropriate inference method.

CHAPTER LEARNING OBJECTIVES

Conceptual Understanding
After completing this chapter, you should be able to
C1 Understand the difference between questions that can be answered by using sample data to estimate population characteristics and those that can be answered by testing hypotheses about population characteristics.
C2 Understand that there is risk involved in drawing conclusions from sample data: when generalizing from sample data, the sample may not always provide an accurate picture of the population.
C3 Understand that there is risk involved in drawing conclusions from experiment data: when generalizing from experiment data, the observed difference in treatment effects may sometimes be due to variability in the response variable and the random assignment to treatments.
C4 Understand that data type distinguishes between questions that involve proportions and those that involve means.
C5 Know that different methods are used to draw conclusions based on categorical data and to draw conclusions based on numerical data.
C6 Know the four key questions that help identify an appropriate inferential method.
C7 Know the five-step process for estimation problems.
C8 Know the five-step process for hypothesis testing problems.

Mastering the Mechanics
After completing this chapter, you should be able to
M1 Distinguish between estimation problems and hypothesis testing problems.
M2 Distinguish between problems that involve proportions and those that involve means.

Putting It into Practice
After completing this chapter, you should be able to
P1 Given a scenario, answer the four key questions that help identify an appropriate inference method.

PREVIEW EXAMPLE Deception in Online Dating Profiles

With the increasing popularity of online dating services, the truthfulness of information in the personal profiles provided by users is a topic of interest. The authors of the paper “Self-Presentation in Online Personals: The Role of Anticipated Future Interaction, Self-Disclosure, and Perceived Success in Internet Dating” (Communication Research [2006]: 152–177) designed a statistical study to investigate misrepresentation of personal characteristics. The researchers hoped to answer three questions:

1. What proportion of online daters believe they have misrepresented themselves in an online profile?
2. What proportion of online daters believe that others frequently misrepresent themselves?
3. Are people who place a greater importance on developing a long-term, face-to-face relationship more honest in their online profiles?

What did the researchers learn? Based on the data, they estimated that only about 6% of online daters believe that they have intentionally misrepresented themselves in online profiles. In spite of the fact that most users believe themselves to be honest, about 86% believed that others frequently misrepresented characteristics such as physical appearance in the online profile.
The researchers also found that the data supported the claim that those who placed greater importance on developing a long-term, face-to-face relationship were more honest in their online profiles.

How were these researchers able to reach these conclusions? The estimates (the 6% and the 86% in the previous statements) are based on sample data. Do these estimates provide an accurate picture of the entire population of online daters? The researchers concluded that the data supported the claim that those who placed greater importance on developing a long-term, face-to-face relationship were more honest in the way they represented themselves online, but how did they reach this conclusion, and should you be convinced? These are important questions. In this chapter and those that follow, you will see how questions like these can be answered.

SECTION 7.1 Statistical Inference—What You Can Learn from Data

Statistical inference is all about learning from data. You can begin by considering learning from data that arise when a sample is selected from a population of interest.

Learning from Sample Data

When you obtain information from a sample selected from some population, it is usually because

1. You want to learn something about characteristics of the population. This results in an estimation problem. It involves using sample data to estimate population characteristics.
OR
2. You want to use the sample data to decide whether there is support for some claim or statement about the population. This results in a hypothesis testing problem. It involves testing a claim (hypothesis) about the population.

Example 7.1 Deception in Online Dating Profiles Revisited

Let’s revisit the online dating example of the chapter preview. In that example, the population of interest was all online daters. Three questions about this population were identified:

1. What proportion of online daters believe they have misrepresented themselves in an online profile?
2. What proportion of online daters believe that others frequently misrepresent themselves?
3. Are people who place a greater importance on developing a long-term, face-to-face relationship more honest in their online profiles?

The first two of these questions are estimation problems because they involve using sample data to learn something about a population characteristic. The population characteristic of interest in the first question is the proportion of all online daters who believe they have misrepresented themselves online. In the second question, the population characteristic of interest is the proportion of all online daters who believe that others frequently misrepresent themselves. The third question is a hypothesis testing problem because it involves determining if sample data support a claim about the population of online daters.

An estimation problem involves using sample data to estimate the value of a population characteristic.
A hypothesis testing problem involves using sample data to test a claim about a population.

Methods for estimation and hypothesis testing are called statistical inference methods because they involve generalizing (making an inference) from a sample to the population from which the sample was selected.

Statistical inference involves generalizing from a sample to a population.

Example 7.2 Whose Reality?

The article “Who’s Afraid of Reality Shows?” (Communication Research [2008]: 382–397) considers social concern over reality television shows. Researchers conducted telephone interviews with 606 individuals in a sample designed to represent the adult population of Israel.
One of the things that the researchers hoped to learn was whether the data supported the theory that a majority of Israeli adults believed they were much less affected by reality shows than other people. They concluded that the sample data did provide support for this theory. This study involves generalizing from the sample to the population of Israeli adults, and it is a hypothesis testing problem because it uses sample data to test a claim (that a majority consider themselves less affected than others).

Learning from Data When There Are Two or More Populations

Sometimes sample data are obtained from two or more populations of interest, and the goal is to learn about differences between the populations. Consider the following two examples.

Example 7.3 Tuned-in Babies

The director of the Kaiser Family Foundation’s Program for the Study of Entertainment Media and Health said, “It’s not just teenagers who are wired up and tuned in, it’s babies in diapers as well.” A study by the Kaiser Family Foundation provided one of the first looks at media use among the very youngest children, those from 6 months to 6 years of age (Kaiser Family Foundation, 2003, www.kff.org). Because previous research indicated that children who have a TV in their bedroom spend less time reading than other children, the authors of the Foundation study were interested in learning about the proportion of kids who have a TV in their bedroom. They collected data from two samples of parents. One sample consisted of parents of children 6 months to 3 years of age. The second sample consisted of parents of children 3 to 6 years of age. Based on the resulting data, they were able to estimate the proportion of children who had a TV in their bedroom for each of the two age groups (0.30 or 30% for children in the younger group and 0.43 or 43% for children in the older group). From this information, they also estimated the difference in the proportions for the two populations (the two age groups).
The proportion with TVs in the bedroom is 0.13 or 13 percentage points higher for children in the older group. This study illustrates statistical inference because it involves generalizing from samples to corresponding populations. It is an estimation problem because sample data were used to estimate the values of population characteristics. Because there were two samples, one from each of two different populations, it was also possible to learn something about how the two populations differ with respect to a population characteristic.

Example 7.4 Do Facebook Members Spend More Time Online?

College students spend a lot of time online, but do members of Facebook spend more time online than non-members? The authors of the paper “Spatially Bounded Online Social Networks and Social Capital: The Role of Facebook” (Annual Conference of the International Communication Association, 2006) collected data from two samples of college students. One sample consisted of Facebook members and the other consisted of non-members. One of the variables studied was the amount of time spent on the Internet in a typical day. Based on the resulting data, the authors concluded that there was no support for the claim that the mean time spent online for Facebook members was greater than the mean time for non-members. This study involves generalizing from samples, and it is a hypothesis testing problem because it involves testing a claim about the difference between the two groups (the claim that the mean time spent online is greater for Facebook members than for non-members).

Learning from Experiment Data

Statistical inference methods are also used to learn from experiment data. When data are obtained from an experiment, it is usually because

1. You want to learn about the effect of the different experimental conditions (treatments) on the measured response. This is an estimation problem because it involves using sample data to estimate a characteristic of the treatments, such as the mean response for a treatment or the difference in mean response for two treatments.
OR
2. You want to determine if experiment data provide support for a claim about how the effects of two or more treatments differ. This is a hypothesis testing problem because it involves testing a claim (hypothesis) about treatment effects.

The following two examples illustrate an estimation problem and a hypothesis testing problem in the context of learning from experiment data.

Example 7.5 Do U Smoke After Txt?

Researchers in New Zealand investigated whether mobile phone text messaging could be used to help people stop smoking. The article “Do U Smoke After Txt? Results of a Randomized Trial of Smoking Cessation Using Mobile Phone Text Messaging” (Tobacco Control [2005]: 255–261) describes an experiment designed to compare two experimental conditions (treatments). Subjects for the experiment were 1,705 smokers who were older than 15 years, owned a mobile phone, and wanted to quit smoking. The subjects were assigned at random to one of two groups. People in the first group received personalized text messages providing support and advice on stopping smoking. The second group was a control group, and people in this group did not receive any of these text messages. After 6 weeks, each person participating in the study was contacted and asked if he or she had smoked during the previous week. Data from the experiment were used to estimate the difference in the proportion who had quit for those who received the text messages and those who did not. Using statistical inference methods that you will learn in Chapters 11 and 14, the researchers estimated that the proportion of those who successfully quit smoking was higher by 0.15 (15 percentage points) for those who received the text messages. This is an example of an estimation problem.
It involves generalizing from experiment data to treatment characteristics (in this case, the difference in the proportion of favorable responses for the two treatments).

Example 7.6 Cell Phones Can Slow You Down

The previous example illustrated a positive use of cell phone technology, but many believe that using a cell phone while driving isn’t a good idea. In many states it is now illegal to use cell phones while driving. Is this justified? Researchers at the University of Utah designed an experiment to test the claim that people talking on a cell phone respond more slowly to a traffic signal change (“Driven to Distraction,” Psychological Science [2001]: ...
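
To make the estimation idea from Example 7.3 concrete, the short Python sketch below does two things. First, it reproduces the difference in sample proportions reported there (0.43 for the older group and 0.30 for the younger group). Second, it simulates repeated random samples to show why a sample proportion may not match the population value exactly. The function names, the population proportion used in the simulation, the sample size, and the seed are all hypothetical choices made for illustration; the text does not report the sample sizes used in the study, and this is not the method the researchers used.

```python
import random

# Sample proportions reported in Example 7.3 (children with a TV in the bedroom).
P_YOUNGER = 0.30  # children 6 months to 3 years of age
P_OLDER = 0.43    # children 3 to 6 years of age


def difference_in_proportions(p1, p2):
    """Point estimate of p1 - p2, rounded to suppress floating-point noise."""
    return round(p1 - p2, 4)


def simulate_sample_proportions(true_p, sample_size, repetitions, seed=0):
    """Draw repeated random samples and return the sample proportion from each.

    true_p, sample_size, repetitions, and seed are hypothetical inputs chosen
    for illustration; none of them come from the studies described in the text.
    """
    rng = random.Random(seed)
    proportions = []
    for _ in range(repetitions):
        # Each individual in the sample has the characteristic with probability true_p.
        successes = sum(1 for _ in range(sample_size) if rng.random() < true_p)
        proportions.append(successes / sample_size)
    return proportions


if __name__ == "__main__":
    # Difference between the older and younger groups, as in Example 7.3.
    print(difference_in_proportions(P_OLDER, P_YOUNGER))  # 0.13

    # Five simulated samples of 200 from a population whose true proportion is
    # assumed to be 0.43; the estimates vary from sample to sample.
    print(simulate_sample_proportions(true_p=0.43, sample_size=200, repetitions=5))
```

The simulated proportions differ from one sample to the next even though the assumed population proportion is fixed. This is the risk involved in generalizing from a sample to a population.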