Dynamic Information-Theoretic Measures for Security Informatics
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
This paper has three goals. The first is to review Shannon's theory of information and the subsequent advances leading to today's statistics-based text analysis algorithms, showing that the semantics of the text is neglected. The second goal is to propose an extension of Shannon's original model that can take into account semantics, where the 'semantics' of a message is understood in terms of the intended or actual changes on the recipient of a message. The third goal is to propose several lines of research that naturally fall out of the proposed model. Each computational approach to solving some problem rests on an underlying model or set of models that describe how key phenomena in the real world are represented and how they are manipulated. These models are both liberating and constraining. They are liberating in that they suggest a path of development for new tools and algorithms. They are constraining in that they intentionally ignore other potential paths of development. Modern statistical-based text analysis algorithms have a specific intellectual history and set of underlying models rooted in Shannon's theory of communication. For Shannon, language is treated as a stochastic generator of symbol sequences. Shannon himself, subsequently Weaver, and at least one of his predecessors are all explicit in their decision to exclude semantics from their models. This rejection of semantics as 'irrelevant to the engineering problem' is elegant and combined with developments particularly by Salton and subsequently by Latent Semantic Analysis, has led to a whole collection of powerful algorithms and an industry for data mining technologies. However, the kinds of problems currently facing us go beyond what can be accounted for by this stochastic model. Today's problems increasingly focus on the semantics of specific pieces of information. And although progress is being made with the old models, it seems natural to develop or extend information theory to account for semantics. By developing such theory, we can improve the quality of the next generation analytical tools. Far from being a mere intellectual curiosity, a new theory can provide the means for us to take into account information that has been to date ignored by the algorithms and technologies we develop. This paper will begin with an examination of Shannon's theory of communication, discussing the contributions and the limitations of the theory and how that theory gets expanded into today's statistical text analysis algorithms. Next, we will expand Shannon's model. We'll suggest a transactional definition of semantics that focuses on the intended and actual change that messages are intended to have on the recipient. Finally, we will examine implications of the model for algorithm development.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
This report describes the Licensing Support Network (LSN) Assistant--a set of tools for categorizing e-mail messages and documents, and investigating and correcting existing archives of categorized e-mail messages and documents. The two main tools in the LSN Assistant are the LSN Archive Assistant (LSNAA) tool for recategorizing manually labeled e-mail messages and documents and the LSN Realtime Assistant (LSNRA) tool for categorizing new e-mail messages and documents. This report focuses on the LSNAA tool. There are two main components of the LSNAA tool. The first is the Sandia Categorization Framework, which is responsible for providing categorizations for documents in an archive and storing them in an appropriate Categorization Database. The second is the actual user interface, which primarily interacts with the Categorization Database, providing a way for finding and correcting categorizations errors in the database. A procedure for applying the LSNAA tool and an example use case of the LSNAA tool applied to a set of e-mail messages are provided. Performance results of the categorization model designed for this example use case are presented.
Abstract not provided.
Abstract not provided.
An experiment was conducted comparing the effectiveness of individual versus group electronic brainstorming in order to address difficult, real world challenges. While industrial reliance on electronic communications has become ubiquitous, empirical and theoretical understanding of the bounds of its effectiveness have been limited. Previous research using short-term, laboratory experiments have engaged small groups of students in answering questions irrelevant to an industrial setting. The current experiment extends current findings beyond the laboratory to larger groups of real-world employees addressing organization-relevant challenges over the course of four days. Findings are twofold. First, the data demonstrate that (for this design) individuals perform at least as well as groups in producing quantity of electronic ideas, regardless of brainstorming duration. However, when judged with respect to quality along three dimensions (originality, feasibility, and effectiveness), the individuals significantly (p<0.05) out performed the group working together. The theoretical and applied (e.g., cost effectiveness) implications of this finding are discussed. Second, the current experiment yielded several viable solutions to the wickedly difficult problem that was posed.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.
Abstract not provided.