Browsing by Author "Ting Yu, Committee Member"
- Efficient Algorithms for Querying Large-Scale Data in Relational, XML, and Graph-Structured Data Repositories(2008-08-18) Gou, Gang; Xiaohui Gu, Committee Member; Ting Yu, Committee Member; Jon Doyle, Committee Member; Rada Chirkova, Committee Chair
We live in an information age, and data are ubiquitous today. Various applications, ranging from scientific computing, medical research, and bioinformatics to administrative management, commercial sales, and financial marketing, generate and utilize data every day. Many of these applications are data intensive, with the amount of data involved potentially reaching hundreds of thousands of gigabytes. Further, different applications store data using different data models. For example, applications could store and manage structured data using a flat (relational) model, semi-structured data using a hierarchical (XML) model, and less-structured data using a more general and flexible graph model. In this thesis, I report my research results on efficiently querying large-scale data in relational, XML, and graph-structured data repositories. Specifically, this thesis covers three research projects, which I have been invited to present at the ACM SIGMOD conference in 2006, 2007, and 2008, respectively. The first project concerns efficient querying of relational data using materialized views and introduces our efficient view-based query-optimization algorithms that support a large and practically important subset of SQL queries. The second project focuses on efficiently querying XML data and presents efficient algorithms for evaluating XPath queries over XML streams, which are the first to achieve O(|D||Q|) time performance, where |D| is the XML data size and |Q| is the XPath query size. Meanwhile, our algorithm EQ also achieves optimal space performance.
The third project addresses efficient querying of graph-structured data, by introducing efficient algorithms for retrieving top-ranked tree-pattern matches from large graphs. While a tree-pattern query could have an extremely large, potentially exponential, number of answer matches in a graph, our algorithms exhibit time and space performance that is linear or sub-linear in the size of the input data. Our algorithms are the first ones that have this excellent performance property.
- Efficient Subsequence Matching with LCS(2007-10-08) Han, Tae Sik; Jaewoo Kang, Committee Co-Chair; Rada Chirkova, Committee Co-Chair; Xiaosong Ma, Committee Member; Ting Yu, Committee Member
- EMFS: An Email Based Distributed File System.(2010-08-17) Srinivasan, Jagannath; Xiaosong Ma, Committee Chair; Ting Yu, Committee Member; Vincent Freeh, Committee Member
- Evidence-Based Trust in Distributed Agent Systems(2009-01-21) Wang, Yonghong; Greg Byrd, Committee Member; Ting Yu, Committee Member; Dennis Bahler, Committee Member; Munindar Singh, Committee Chair
Trust is a crucial basis for interactions among parties in large, open distributed systems. Yet, the scale and dynamism of such systems make it infeasible for each party to have a direct basis for trusting another party. For this reason, the participants in an open system must share information about trust. Traditional models of trust employ simple heuristics and ad hoc formulas, without adequate mathematical justification. These models fail to properly address the challenges of combining trust from conflicting sources, dealing with malicious agents, and updating trust. This dissertation understands an agent Alice's trust in an agent Bob in terms of Alice's certainty in her belief that Bob is trustworthy. Unlike previous approaches, this dissertation formulates certainty in terms of a statistical measure defined over a probability distribution of the probability of positive outcomes. Specifically, this dissertation makes the following contributions: (1) it develops a mathematically well-formulated approach for an evidence-based account of trust, proves desirable properties of certainty, and establishes a bijection between evidence and trust; (2) it defines a concatenation, an aggregation, and a selection operator to propagate trust, and proves desirable properties of these operators; (3) it develops trust update mechanisms and formally analyzes their properties; and (4) it extends the definition of certainty from binary events to multivalued events, establishes a bijection between Dempster-Shafer belief space and evidence space, and defines a novel combination operator, which is commutative and associative. In contrast with traditional combination operators, which ignore conflict and sometimes yield counterintuitive results, the proposed operator treats conflict naturally.
- Grid Service Data Needed for Estimation of Reliability in Scientific Workflow Systems(2004-06-11) Colonnese, Daniel; Mladen Vouk, Committee Chair; Xiaosong Ma, Committee Member; Ting Yu, Committee Member; Gregor von Laszewski, Committee Member
The emerging technologies of grid computing, web services, and service-oriented workflows will enable scientific projects to be conducted on a larger scale than ever before. Scientific workflows are often complex and highly dynamic in both structure and persistency; they may involve extensive interaction and very large data flows, and may change on very short notice. Given modern information technology tools, they can be constructed by combining dispersed network-accessible services into virtual organizations. Within a scientific workflow environment, metadata or grid service data is necessary for service consumer applications to discover services and for services to publish their properties. Software reliability engineering within service-oriented workflow systems is still a largely unexplored area. Emerging standards such as WS-Reliability and WS-Agreement are not yet practical in a grid service environment. Therefore, methods of leveraging reliability service data with the existing aggregation mechanisms in Globus Toolkit 3 need further exploration. The Open Grid Service Architecture (OGSA) allows registries of services. Service data information in the registries can be used for soft-state management, keeping track of metadata for service instances created from application factories, and so on. A service aggregation registry subscribes to services produced from a number of factories. Service data is expressed through a proposed XSD namespace shared vocabulary. Discovery policy is expressed in XPath queries.
This thesis proposes one of the first architectures and implementations of a grid service information registry specifically intended to support fault tolerance and service replica selection in the scientific workflow application domain. The registry uses the Globus Toolkit 3 and is made available as an OGSI-compliant grid service.
- Information Integration: The Semantic-Model Approach(2008-09-15) Chen, Dongfeng; Ting Yu, Committee Member; Rudra Dutta, Committee Member; Rada Chirkova, Committee Chair; Fereidoon Fred Sadri, Committee Member
- Integrating Multiple Information Resource to Analyze Intrusion Alerts(2006-12-22) Zhai, Yan; Ting Yu, Committee Member; Peng Ning, Committee Chair; Douglas Reeves, Committee Member; Purushothaman Iyer, Committee Member
Intrusion detection systems (IDSs) are important components of network security. However, it is well known that current IDSs generate large numbers of alerts, including both true and false alerts. Rather than proposing new techniques to detect intrusions without such problems, this thesis presents work we have done to improve the study of IDS alerts by incorporating other sources of relevant information. In particular, the work covers four issues. The first issue is to integrate and reason about IDS alerts as well as reports by system monitoring or vulnerability scanning tools (discussed in Chapter 3). To facilitate the modeling of intrusion evidence, this approach classifies intrusion evidence into either event-based evidence or state-based evidence. Event-based evidence refers to observations (or detections) of intrusive actions (e.g., IDS alerts), while state-based evidence refers to observations of the effects of intrusions on system states. Based on the interdependency between event-based and state-based evidence, we developed techniques to automatically integrate complementary evidence into Bayesian networks, and reason about uncertain or unknown intrusion evidence based on verified evidence. The second issue is the study of the robustness of the Bayesian analysis framework toward inaccuracies in the assignments of prior confidence, using sensitivity analysis and qualitative analysis (discussed in Chapter 4). By performing sensitivity analysis and qualitative analysis on the Bayesian networks used to reason about intrusion evidence, we can measure or approximate the influence of individual evidence on the reasoning results. Such a study of the framework's robustness properties can provide guidelines for evidence collection and analysis.
The third issue is to improve alert correlation by integrating alert correlation techniques with OS-level object dependency tracking (discussed in Chapter 5). With the support of more detailed and precise information from OS-level event logs, higher accuracy in alert correlation can be achieved. The chapter also discusses the application of such integration in making hypotheses about possibly missed attacks. The fourth issue is to correlate intrusion alerts and other security event information from multiple heterogeneous sources while protecting the privacy of each participating party (discussed in Chapter 6). Based on a sanitization scheme utilizing both generalization and randomization, we proposed several techniques to flexibly balance privacy protection against the analysis capability of the sanitized data. We also studied the various analyses supported by the sharing framework and its security against several types of attacks. Finally, the dissertation concludes and points out future work.
- Mitigating Voice over IP Spam Using Computational Puzzles(2008-08-31) Zhou, Yuzheng; S. Purushothaman Iyer, Committee Member; Peng Ning, Committee Chair; Ting Yu, Committee Member
- Modeling Service Engagements in Dynamic Organizations: Multiagent Model and Architecture for Policy-Based Governance(2008-03-28) Udupi, Yathiraj Bhat; Munindar P. Singh, Committee Chair; Gregory T. Byrd, Committee Member; James C. Lester, Committee Member; Ting Yu, Committee Member
Service engagements arise commonly in business and scientific computing. A service engagement is characterized by autonomous parties coming together in a contractual arrangement to share resources or carry out tasks for one another. The autonomy of the participants is key, meaning that there is no unique locus for policy application. Yet, autonomy is not properly treated by current approaches for designing service engagements, which typically take the perspective of one of the participants. We provide an agent-based conceptual model for specifying service engagements as arising within dynamic service organizations or Orgs. The atoms of a service engagement are formalized as commitments among the participants, to be created and manipulated as the engagement progresses. An Org scopes the commitments formed in an engagement. Orgs and their members are modeled as agents. A service contract provides a natural arm's-length abstraction for modeling service engagements and is formalized as comprising a set of commitments among the contracting agents. An institution is a kind of Org that acts as a social and legal context, within which Orgs arise modeling different service engagements. We provide a policy-based architecture for the governance of service engagements. We propose innovations in policy-based architecture to model and govern complex Orgs. Traditional policy-based frameworks emphasize reactive behaviors wherein an external request causes a policy engine to compute a response. However, business service settings require richer policies and call for proactive behaviors.
A business must not only respond to explicit requests, but also monitor its environment, collate events, and potentially act in anticipation of events in order to ensure that its policies are satisfied. The core of this research is an approach to formalize service engagements based on commitments and to study their dynamics in the presence of policies specified by each of the participants. We describe a methodology for service engagement design using a set of design patterns that capture commonly occurring elements of service engagements. We have implemented MAVOS, a prototype of a policy-based multiagent system to model Orgs that demonstrates and evaluates our approach on realistic scenarios involving Orgs.
- Network and Host Based Countermeasures against Large-scale Networked Compromised Systems or Malicious Software.(2010-07-14) Park, Young Hee; Douglas Reeves, Committee Chair; Peng Ning, Committee Member; Ting Yu, Committee Member; Xuxian Jiang, Committee Member
- PaRaM: Path-Sensitive Monitoring of Web Applications against SQL Injection Attacks.(2010-07-07) Marri, Madhuri; Tao Xie, Committee Chair; Ting Yu, Committee Member; Laurie Williams, Committee Member
- Requirements-Based Access Control Analysis and Policy Specification(2005-08-15) He, Qingfeng; Ting Yu, Committee Member; Laurie Williams, Committee Member; Julie Earp, Committee Member; Annie I. Anton, Committee Chair
Access control is a mechanism for achieving confidentiality and integrity in software systems. Access control policies (ACPs) define how access is managed and the high-level rules of who can access what information under certain conditions. Traditionally, access control policies have been specified in an ad hoc manner, leaving systems vulnerable to security breaches. ACP specification is often isolated from requirements analysis, resulting in policies that are not in compliance with system requirements. This dissertation introduces the Requirements-based Access Control Analysis and Policy Specification (ReCAPS) method for deriving access control policies from various sources, including software requirements specifications (SRS), software designs, and high-level security/privacy policies. The ReCAPS method is essentially an analysis method supported by a set of heuristics and a software tool: the Security and Privacy Requirements Analysis Tool (SPRAT). The method was developed in two formative case studies and validated in two summative case studies. All four case studies involved operational systems, and ReCAPS evolved as a result of the lessons learned from applying the method to these case studies. Further validation of the method was performed via an empirical study to evaluate the usefulness and effectiveness of the approach. Results from these evaluations indicate that the process and heuristics provided by the ReCAPS method are useful for specifying database-level and application-level ACPs. Additionally, ReCAPS integrates policy specification into software development, thus providing a basic framework for ensuring compliance between different levels of policies, system requirements and software design.
The method also improves the quality of requirements specifications and system designs by clarifying ambiguities and resolving conflicts across these artifacts.
- Resilient Data Aggregation in Wireless Sensor Networks(2005-07-29) Anantharaju, Srinath; Peng Ning, Committee Chair; Douglas Reeves, Committee Member; Ting Yu, Committee Member
Sensor nodes are low-cost and low-power devices that are prone to node compromises, communication failures and malfunctioning of sensing hardware. As a result, some nodes may report outlying data values, introducing significant deviations in the aggregated sensor readings. This thesis presents a practical resilient outlier detection technique to filter out the influence of the outlying data reported by faulty or compromised nodes. The proposed outlier detection algorithm is based on event localization using minimum mean squared error (MMSE) estimation combined with threshold-based consistency checking to detect outliers. Data aggregation is one of the key techniques commonly used to develop lightweight communication protocols applicable to wireless sensor networks. The proposed approach handles localization of multiple events by grouping the sensor readings into spatially correlated clusters and performing an event-centric detection of outliers. In the entire process of data aggregation, the outlier detection technique fits as a preprocessing stage for reducing the effect of outliers on the aggregated result. Suitable extensions to the basic outlier detection algorithm are proposed to effectively apply the algorithm to both centralized and decentralized sensor network architectures. This thesis further includes studies that test the effectiveness of the proposed approach, including the detection rate, the false positive rate, degree of damage and the resilience to malicious readings introduced by the attackers. The experimental results show that on average the proposed approach detects as many as 80-90% of the outliers while incurring a 5-15% false positive rate when the network consists of 40-45% outliers.
The experiments also show that the extent of damage on the aggregated result is below 50% due to the elimination of outliers before aggregation. Finally, the resilient data aggregation process has modest computational and memory requirements, with zero communication overhead in the centralized case and about 20% overhead in the decentralized settings.
- Security Mechanisms for Protecting Foundational Services in Wireless Sensor Networks.(2010-08-09) Liu, An; Xiaogang Wang, Committee Chair; Peng Ning, Committee Chair; Douglas Reeves, Committee Member; Ting Yu, Committee Member; Xuxian Jiang, Committee Member
- Towards the Preservation of Privacy and Legal Compliance in Healthcare Systems(2006-05-04) Vail, Matthew; Annie Antón, Committee Chair; Julia Earp, Committee Member; Ting Yu, Committee Member
Given the introduction of United States legislation that governs the collection, use, and disclosure of sensitive patient information, there is a need for mechanisms to preserve the privacy of sensitive information in software systems and to ensure these systems comply with law. One such piece of legislation is the Health and Human Services' (HHS) Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule. The introduction of such legislation poses many challenges to organizations seeking to comply with the law, and thereby avoid severe penalties. A study was conducted by Antón et al., prior to the enactment of the HIPAA (pre-HIPAA), to examine the content of online privacy policies. This thesis expands upon this work by replicating the analysis, after the enactment of the HIPAA (post-HIPAA), in order to evaluate the evolution of privacy policies in the presence of legislation. We discovered that since the introduction of HIPAA, the privacy policies of healthcare organizations have evolved significantly. One of the most noteworthy discoveries made during this post-HIPAA study was the lack of clarity and readability of healthcare enterprises' privacy policies. To address the need for clearer and more concise privacy policies, we conducted an experiment using an empirical survey instrument that we developed to investigate user perception and comprehension of alternatives to natural language privacy policies. Some of the more compelling observations we made were:
• Users felt more secure and protected by natural language privacy policies.
• Users comprehend alternatives to natural language policies better than the original natural language privacy policies.
• User perception and comprehension of privacy policies are not in alignment with one another.
• Human Computer Interaction (HCI) factors play a significant role in the perception and comprehension of privacy policies.
In addition to evaluating how privacy policies evolve with the introduction of legislation, we attempted to explore whether organizations were actually in compliance with legislation. We developed a methodology for extracting rights and obligations from regulatory texts in order to determine stakeholder obligations. This information can be used to perform a comparative analysis by the organization to ensure compliance, or by external parties to detect potential non-compliance.
- Trust and Reputation in Multiagent Systems: Strategies and Dynamics with Reference to Electronic Commerce.(2010-05-04) Hazard, Christopher; Munindar Singh, Committee Chair; Robert Young, Committee Member; Jon Doyle, Committee Member; Ting Yu, Committee Member