The testing effect refers to the benefits to retention that result from structuring learning activities in the form of a test. As educators consider implementing test-enhanced learning paradigms in real classroom environments, we think it is critical to consider how an array of factors that affect test-enhanced learning in laboratory studies bears on test-enhanced learning in real-world classrooms. As such, this review discusses the degree to which test feedback, test format (of formative tests), number of tests, level of the test questions, timing of tests (relative to initial learning), and retention duration matter for testing effects in ecologically valid contexts (e.g., classroom studies). Attention is also devoted to characteristics of much laboratory testing-effect research that may limit translation to classroom environments, such as the complexity of the material being learned, the value of testing relative to other generative learning activities in classrooms, an educational orientation that favors criterial tests focused on transfer of learning, and online instructional modalities. We also consider how student-centric variables present in the classroom (e.g., cognitive abilities, motivation) may bear on the effects of testing-effect techniques implemented there. We conclude that the testing effect is a robust phenomenon that benefits a wide variety of learners across a broad array of learning domains. Still, studies are needed to compare the benefit of testing to that of other learning strategies, to further characterize how individual differences relate to testing benefits, and to examine whether testing benefits learners at advanced levels.
Due to recent increases in their performance, machine learning and deep learning models are being increasingly adopted across many domains for visual processing tasks. One such domain is international nuclear safeguards, which seeks to verify the peaceful use of commercial nuclear energy across the globe. Despite recent impressive performance results, machine learning and deep learning algorithms always exhibit at least some small level of error. Given the significant consequences of international nuclear safeguards conclusions, we sought to characterize how incorrect responses in a machine or deep learning-assisted visual search task would cognitively impact users. We found that not only do some types of model errors have larger negative impacts on human performance than others, but the scale of those impacts changes depending on the accuracy of the model with which participants are presented, and the impacts persist in scenarios of evenly distributed errors and single-error presentations. Further, we found that experiments conducted using a common visual search dataset from the psychology community have similar implications to a safeguards-relevant dataset of images containing hyperboloid cooling towers when the cooling tower images are presented to expert participants. While novice performance was considerably different (and worse) on the cooling tower task, we saw increased novice reliance on the model for the most challenging cooling tower images compared to experts. These findings are relevant not just to the cognitive science community, but also to developers of machine and deep learning models that will be deployed across multiple domains. For safeguards, this research provides key insights into how machine and deep learning projects should be implemented, given the domain's special requirement that information not be missed.
As the ability to collect and store data grows, so does the need to analyze that data efficiently. As human-machine teams that use machine learning (ML) algorithms to inform human decision-making grow in popularity, it becomes increasingly critical to understand the optimal methods of implementing algorithm-assisted search. To better understand how algorithm confidence values associated with object identification can influence participant accuracy and response times during a visual search task, we compared models that provided appropriate confidence, random confidence, and no confidence, as well as a model biased toward overconfidence and a model biased toward underconfidence. Results indicate that randomized confidence is likely harmful to performance, while non-random confidence values are likely better than no confidence value for maintaining accuracy over time. Providing participants with appropriate confidence values did not seem to benefit performance any more than providing them with under- or overconfident models.
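As a rough illustration of the five confidence-display conditions compared above (a minimal sketch, not the study's actual implementation; the function name, BIAS offset, and condition labels are all hypothetical):

```python
import random

BIAS = 0.25  # illustrative offset for the biased conditions (assumed value)

def displayed_confidence(p_true, condition):
    """Confidence value shown to the participant for one detection.

    p_true is the model's own (calibrated) confidence in [0, 1];
    returns None when no confidence value is displayed.
    """
    if condition == "appropriate":     # show the model's calibrated confidence
        return p_true
    if condition == "random":          # uninformative value, uniform on [0, 1]
        return random.random()
    if condition == "overconfident":   # shifted toward certainty
        return min(1.0, p_true + BIAS)
    if condition == "underconfident":  # shifted toward doubt
        return max(0.0, p_true - BIAS)
    if condition == "none":            # no confidence shown
        return None
    raise ValueError(f"unknown condition: {condition!r}")
```

Under this sketch, the "appropriate" condition passes the model's confidence through unchanged, while the biased conditions shift every displayed value by a fixed offset, which is one simple way to operationalize systematic over- or underconfidence.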
Cross Reality (XR) immersive environments offer challenges and opportunities in designing for cognitive aspects (e.g., learning, memory, attention) of information design and interaction. Information design is a multidisciplinary endeavor involving data science, communication science, cognitive science, media, and technology. In the present paper, the holodeck metaphor is extended to illustrate how information design practices and some of the qualities of this imaginary computationally augmented environment (a.k.a. the holodeck) may be achieved in XR environments to support information-rich storytelling and real-life, face-to-face, and virtual collaborative interactions. The Simulation Experience Design Framework & Method is introduced to organize challenges and opportunities in the design of information for XR. The notion of carefully blending real and virtual spaces to achieve total immersion is discussed as the reader moves through the elements of the cyclical framework. A solution space leveraging cognitive science, information design, and transmedia learning highlights key challenges facing contemporary XR designers. These challenges include, but are not limited to, interleaving information, technology, and media into the human storytelling process and supporting narratives in a way that is memorable, robust, and extendable.
The Tularosa study was designed to understand how defensive deception, both cyber and psychological, affects cyber attackers. Over 130 red teamers participated in a two-day network penetration task in which we controlled both the presence and the explicit mention of deceptive defensive techniques. To our knowledge, this represents the largest study of its kind ever conducted on a professional red team population. The study included a battery of questionnaires (e.g., experience, personality) and cognitive tasks (e.g., fluid intelligence, working memory), allowing for the characterization of a “typical” red teamer, as well as physiological measures (e.g., galvanic skin response, heart rate) to be correlated with the cyber events. This paper describes the study design, implementation, resulting data, and population characteristics, and begins to examine preliminary results.