In this paper, M-estimation and inference methods are developed for spatial dynamic panel data models with correlated random effects, based on short panels. The unobserved individual-specific effects are assumed to be correlated with the observed time-varying regressors linearly or in a linearizable way, giving the so-called correlated random effects model, which allows the estimation of effects of time-invariant regressors. The unbiased estimating functions are obtained by adjusting the conditional quasi-scores given the initial observations, leading to M-estimators that are consistent, asymptotically normal, and free from the initial conditions except for the process starting time. By decomposing the estimating functions into sums of terms uncorrelated given the idiosyncratic errors, a hybrid method is developed for consistently estimating the variance-covariance matrix of the M-estimators, which again depends only on the process starting time. Monte Carlo results demonstrate that the proposed methods perform well in finite samples.
Essays on Empirical Asset Pricing
The dissertation consists of four chapters on empirical asset pricing. The first chapter reexamines the existence of time-series momentum. Time-series momentum (TSM) refers to the ability of the past 12-month return to predict the next one-month return. Using the same data set as Moskowitz, Ooi, and Pedersen (2012) (MOP, henceforth), we show that asset-by-asset time-series regressions reveal little evidence of TSM, both in- and out-of-sample. While the t-statistic in a pooled regression appears large, it is not statistically reliable, as it is less than the critical values of parametric and nonparametric bootstraps. From an investment perspective, the performance of the TSM strategy is virtually the same as that of a similar strategy that is based on the historical sample mean and does not require predictability. Overall, the evidence on TSM is weak, particularly for the large cross-section of assets.
The second chapter focuses on disagreement, which has been regarded as the most promising avenue ("the best horse") for behavioral finance to deliver insights comparable to those of classical asset pricing theories. Existing disagreement measures are known to predict cross-sectional stock returns but fail to predict market returns. We propose a disagreement index that aggregates information across individual measures using the partial least squares (PLS) method. This index significantly predicts market returns both in- and out-of-sample. Consistent with the theory in Atmaz and Basak (2018), the disagreement index asymmetrically predicts market returns with greater power in high-sentiment periods, is positively associated with investor expectations of market returns, predicts market returns through a cash flow channel, and can explain the positive volume-volatility relationship.
The third and fourth chapters investigate the impacts of political uncertainty. We focus on one type of political uncertainty, partisan conflict, which arises from disputes or disagreements among party members or policymakers. Chapter 3 finds that partisan conflict positively predicts stock market returns, controlling for economic predictors and proxies for uncertainty, disagreement, geopolitical risk, and political sentiment. A one-standard-deviation increase in partisan conflict is associated with a 0.54% increase in next month's market return. The forecasting power is symmetric across political cycles and operates via a discount rate channel. Increased partisan conflict is associated with increased fiscal policy and healthcare policy uncertainties, and leads investors to switch their investments from equities to bonds.
Chapter 4 shows that intensified partisan conflict widens corporate credit spreads. A one-standard-deviation increase in partisan conflict is associated with a 0.91% increase in next-month corporate credit spreads after controlling for bond-issue information, firm characteristics, macroeconomic variables, uncertainty measures, and sentiment measures. The result holds when using instrumental variables to resolve endogeneity concerns. I further find that partisan conflict has a greater impact on corporate credit spreads for firms with higher exposure to government policies, including government spending policy and tax policy, and for firms with higher dependence on external finance. Firms that are actively involved in political activities are also more sensitive to changes in political polarization.
The widespread availability of sensors on personal devices (e.g., smartphones, smartwatches) and other cheap, commoditized IoT devices in the environment has opened up the opportunity to develop applications that capture and enhance various lifestyle-driven daily activities of individuals. Moreover, there is a growing trend of leveraging ubiquitous computing technologies to improve physical health and wellbeing. Many of these lifestyle monitoring applications rely primarily on the capability to recognize contextually relevant human movements, actions and gestures. As such, gesture recognition techniques and gesture-based analytics have emerged as a fundamental component for realizing personalized lifestyle applications.
This thesis explores how this wealth of data sensed from ubiquitously available devices can be utilized to infer fine-grained gestures. Subsequently, it explores how gestures can be used to profile user behavior during daily activities and outlines mechanisms to tackle various real-world challenges. With two daily activities (shopping and exercising) as examples, it then demonstrates that unobtrusive, accurate and robust monitoring of various aspects of these activities is indeed possible with minimal overhead. Such monitoring can, in the future, enable useful applications (e.g., smart reminders in a retail store or a digital personal coach in a gym).
First, this thesis presents the IRIS platform, which explores how appropriate mining of sensors available in personal devices such as a smartphone and a smartwatch can be used to infer micro-gestural activities, and how such activities help reveal latent behavioral attributes of individual consumers inside a retail store. It first investigates how inertial sensor data (e.g., accelerometer, gyroscope) from a smartphone can be used to decompose an entire store visit into a series of modular and hierarchical individual interactions, modeled as a sequence of in-aisle interactions interspersed with non-aisle movement. Further, by combining such sensor data with data from a wrist-worn smartwatch and deriving discriminative features, the IRIS platform demonstrates that different facets of a shopper's interaction with individual items (e.g., picking up an item, putting an item in a trolley), as well as attributes of the overall shopping episode or the store, can be inferred.
This thesis next investigates the possibility of using a wearable-free sensing modality for fine-grained and unobtrusive monitoring of multiple aspects of individuals' gym exercises. It describes the W8-Scope approach, which requires no on-body instrumentation and leverages only simple accelerometer and magnetometer sensors (on a cheap IoT device) attached to the weight stack of an exercise machine to infer various exercise gestures, and thereby identify novel attributes such as the amount of weight lifted and the correctness of exercise execution, as well as identify the user performing the exercise. It also experimentally demonstrates the feasibility of evolving W8-Scope's machine learning classifiers to accommodate medium-timescale (e.g., across weeks or months) changes in an individual's exercise behavior, an issue that has received insufficient attention to date.
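As one concrete illustration of what weight-stack inertial sensing enables (this is not W8-Scope's actual pipeline; the sampling rate and thresholds are illustrative placeholders), repetitions can be counted from the accelerometer magnitude with simple peak detection:

```python
import numpy as np
from scipy.signal import find_peaks

def count_reps(accel_mag, fs=50, min_period_s=1.5):
    """Count repetitions from the magnitude of a weight-stack accelerometer
    signal: each lift-lower cycle appears as one prominent peak. Sampling
    rate and thresholds here are illustrative, not W8-Scope's values."""
    sig = accel_mag - np.mean(accel_mag)
    peaks, _ = find_peaks(sig, distance=int(min_period_s * fs),
                          prominence=0.5 * np.std(sig))
    return len(peaks)

# Toy usage: a 3-second lift-lower cycle repeated for 30 seconds -> 10 reps
t = np.arange(0, 30, 1 / 50.0)
print(count_reps(np.sin(2 * np.pi * t / 3.0), fs=50))
```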
Finally, this thesis explores the possibility of accurately inferring complex activities and gestures performed concurrently by multiple individuals in an indoor gym environment. It introduces a system that utilizes a hybrid architecture, combining sensor data from 'earables' with non-personal IoT sensors attached to gym equipment, for individual-specific fine-grained monitoring of weight-based exercises in a gym. Using real-world studies conducted with multiple concurrent gym-goers, this thesis validates that accurate association of "user-equipment" pairings is indeed possible for a majority of common exercises, in spite of the significant signal dampening on the earable. Moreover, it demonstrates how features from the earable and IoT sensors can be combined to significantly increase the accuracy and robustness of exercise recognition. In the future, the real-time exercise analytics capabilities developed in this thesis can be used to enable targeted and individualized real-time feedback on user dynamics and increase user engagement.
Online Spatio-Temporal Demand-Supply Matching
This dissertation consists of three papers that contribute to the estimation and inference theory of heterogeneous large panel data models. The first chapter studies a panel threshold model with interactive fixed effects. The least-squares estimators in the shrinking-threshold-effect framework are explored. The inference theory for both the slope coefficients and the threshold parameter is derived, and a test for the presence of the threshold effect is proposed. The second chapter considers the least-squares estimation of a panel structure threshold regression (PSTR) model, where parameters may exhibit latent group structures. Under some regularity conditions, the latent group structure can be correctly estimated with probability approaching one. A likelihood-ratio-based test on the group-specific threshold parameters is studied. Two specification tests are proposed: one tests whether the threshold parameters are homogeneous across groups, and the other tests whether the threshold effects are present. The third chapter studies high-dimensional vector autoregressions (VARs) augmented with common factors. An L1-nuclear-norm regularized estimator is considered. A singular value thresholding procedure is used to determine the correct number of factors with probability approaching one. Both a LASSO estimator and a conservative LASSO estimator are employed to improve estimation. The conservative LASSO estimates of the non-zero coefficients are shown to be asymptotically equivalent to the oracle least squares estimates. Monte Carlo studies are conducted to check the finite sample performance of the proposed tests and estimators. Empirical applications are conducted in each chapter to illustrate the usefulness of the proposed methods.
Exploiting Approximation, Caching and Specialization to Accelerate Vision Sensing Applications
Over the past few years, deep learning has emerged as the state-of-the-art solution for many challenging computer vision tasks such as face recognition and object detection. Despite their outstanding performance, deep neural networks (DNNs) are computationally intensive, which prevents them from being widely adopted on billions of mobile and embedded devices with scarce resources. To address that limitation, we focus on building systems and optimization algorithms to accelerate those models, making them more computationally efficient.
First, this thesis explores the computational capabilities of different existing processors (or co-processors) on modern mobile devices. It recognizes that by leveraging mobile Graphics Processing Units (mGPUs), we can reduce the time consumed in the deep learning inference pipeline by an order of magnitude. We further investigate a variety of optimizations that work on mGPUs for further acceleration and build the DeepSense framework to demonstrate their use.
Second, we discovered that video streams often contain invariant regions (e.g., background, static objects) across multiple video frames. Processing those regions from frame to frame wastes computational power. We propose a convolutional caching technique and build the DeepMon framework, which quickly determines the static regions and intelligently skips the computations on those regions during the deep neural network processing pipeline.
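To make the convolutional-caching idea concrete, here is a minimal block-wise sketch. It is not DeepMon's actual implementation: it ignores receptive-field overlap at block borders (which a real system must handle), and the block size, change threshold, and stand-in convolution are illustrative.

```python
import numpy as np
from scipy.signal import convolve2d

def conv_block(block, kernel):
    """Stand-in for an expensive convolution over one image block."""
    return convolve2d(block, kernel, mode="same")

class ConvCache:
    """Reuse convolution outputs for image blocks that are unchanged
    between consecutive frames (the convolutional-caching idea)."""
    def __init__(self, block=32, tol=1.0):
        self.block, self.tol = block, tol
        self.prev_frame, self.prev_out = None, None

    def forward(self, frame, kernel):
        B = self.block
        out = np.empty(frame.shape, dtype=float)
        for i in range(0, frame.shape[0], B):
            for j in range(0, frame.shape[1], B):
                blk = frame[i:i + B, j:j + B]
                unchanged = (self.prev_frame is not None and
                             np.abs(blk - self.prev_frame[i:i + B, j:j + B]).max() < self.tol)
                if unchanged:
                    out[i:i + B, j:j + B] = self.prev_out[i:i + B, j:j + B]  # cache hit
                else:
                    out[i:i + B, j:j + B] = conv_block(blk, kernel)          # recompute
        self.prev_frame, self.prev_out = frame.astype(float).copy(), out
        return out
```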
The thesis also explores how to make deep learning models more computationally efficient by pruning unnecessary parameters. Many studies have shown that most of the computation occurs within convolutional layers, which are widely used in convolutional neural networks (CNNs) for many computer vision tasks. We design a novel D-Pruner algorithm that scores the parameters based on how important they are to the final performance. Parameters with little impact are removed, yielding smaller, faster and more computationally efficient models.
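D-Pruner's learned importance scores are not reproduced here; the following generic sketch uses a simple L1-norm importance proxy merely to show the score-then-remove structure of filter pruning:

```python
import numpy as np

def prune_filters(conv_weights, keep_ratio=0.7):
    """conv_weights: array of shape (out_channels, in_channels, kH, kW).
    Scores each output filter by its L1 norm (a common importance proxy;
    D-Pruner itself learns importance scores during training) and keeps
    the top keep_ratio fraction of filters."""
    scores = np.abs(conv_weights).sum(axis=(1, 2, 3))   # one score per filter
    k = max(1, int(keep_ratio * len(scores)))
    keep = np.sort(np.argsort(scores)[::-1][:k])        # indices of filters kept
    return conv_weights[keep], keep

# Toy usage: a layer with 64 filters pruned to 44
w = np.random.randn(64, 32, 3, 3)
w_pruned, kept = prune_filters(w, keep_ratio=0.7)
print(w_pruned.shape)   # (44, 32, 3, 3): the 20 lowest-scoring filters removed
```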
Finally, we investigate the feasibility of using multi-exit models (MXNs), which consist of multiple neural networks with shared layers, as an efficient implementation to accelerate many existing computer vision tasks. We show that applying techniques such as aggregating results across exits and threshold-based early exiting with MXNs can significantly reduce inference latency in indexed video querying and face recognition systems.
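A minimal sketch of threshold-based early exiting in a multi-exit network (the exit functions and confidence threshold are placeholders, not the thesis's configuration):

```python
import numpy as np

def predict_with_early_exit(x, exits, threshold=0.9):
    """exits: list of callables, each mapping the input to a probability
    vector over classes (earlier exits are cheaper, later ones stronger).
    Returns the first prediction whose max probability clears the
    threshold, falling back to the final exit otherwise."""
    for exit_fn in exits[:-1]:
        probs = exit_fn(x)
        if probs.max() >= threshold:      # confident enough: stop early
            return int(np.argmax(probs)), probs
    probs = exits[-1](x)                  # the last exit always answers
    return int(np.argmax(probs)), probs
```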
Sensing systems for monitoring physiological and psychological states have been studied extensively in both academic and industry research for applications across various domains. However, most of these studies have been conducted in lab environments with controlled and complicated sensor setups, which are only suitable for serious healthcare applications in which obtrusiveness and immobility can be tolerated in a trade-off for accurate clinical screening or diagnosis. The recent substantial development of mobile devices with embedded miniaturized sensors now allows new opportunities to adapt and develop such sensing systems in the mobile context. The ability to sense physiological and psychological states using mobile (and wearable) sensors would make such applications much more feasible and accessible for daily use in domains such as healthcare, education, security, media and entertainment. Still, several research challenges remain in developing mobile sensing systems that can monitor users' physiological signals and psychological conditions accurately and effectively.
This thesis addresses three key aspects of realizing multimodal mobile sensing systems for physiological and psychological state assessment. First, as mobile embedded sensors are not designed exclusively for physiological sensing purposes, we attempt to improve the sensing capabilities of mobile devices to acquire vital physiological signals. Specifically, we study the feasibility of using mobile sensors to measure a set of vital physiological signals, in particular cardiovascular metrics including blood volume, heartbeat-to-heartbeat interval, heart rate, and heart rate variability. The changes in these physiological signals are essential in detecting many psychological states. Second, we validate the importance of assessing physiological and psychological states in mobile contexts across various domains. Lastly, we develop and evaluate a multimodal sensing system to measure the engagement level of mobile gamers. While the focus of our study was on the mobile gaming scenario, we believe the concept of such a sensing system is applicable to improving user experience in other mobile activities, such as watching advertisements or studying on mobile devices.
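As a simplified illustration of the kind of cardiovascular measurement involved (not the thesis's actual method), heart rate can be estimated from a PPG-like signal, such as fingertip video brightness, by locating the dominant spectral peak:

```python
import numpy as np

def heart_rate_bpm(ppg, fs=30.0):
    """Estimate heart rate from a PPG-like signal by locating the dominant
    spectral peak in the physiologically plausible 0.7-3.5 Hz band
    (42-210 bpm). A simplified illustration."""
    sig = ppg - np.mean(ppg)
    freqs = np.fft.rfftfreq(len(sig), d=1.0 / fs)
    spec = np.abs(np.fft.rfft(sig))
    band = (freqs >= 0.7) & (freqs <= 3.5)
    return 60.0 * freqs[band][np.argmax(spec[band])]

# Toy usage: a clean 1.2 Hz pulse over 20 seconds -> about 72 bpm
t = np.arange(0, 20, 1 / 30.0)
print(heart_rate_bpm(np.sin(2 * np.pi * 1.2 * t)))
```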
Innovative Business Models in Online Retailing
The Internet has opened the door for e-commerce and created a new business avenue: online retailing. E-commerce now shapes the manner in which consumers shop for products. Online retailing markets have grown by 56% during the past five years, while traditional retailing markets have grown by only 2% over the same period. The noticeable growth of online retailing creates numerous opportunities as well as challenges in the context of operations management.
Extensive literature in this domain focuses on conventional inventory management and pricing problems, as in traditional retailing. However, the rapid development of information technology threatens established business models and creates opportunities for new ones. Companies may find it increasingly difficult to make strategic decisions, such as how to deal with the challenges associated with online retailing and how to adapt to the new retailing environment. This thesis aims to investigate innovative business models in online retailing, to capture under-studied emerging phenomena, and to provide managerial insights.
The first chapter focuses on dealing with the logistics challenge caused by booming e-commerce activities. An urban consolidation center (UCC) or a peer-to-peer platform may alleviate the resulting economic, social and environmental pressures on well-being. We compare the performance of these two business models to guide a consolidator in making efficient operational decisions. The second chapter focuses on the channel management decisions of a retailer who operates an offline (brick-and-mortar) channel and an online channel. The two channels are either operated separately or integrated. We explore how the retailer can profitably integrate her offline and online channels from the perspective of product descriptions and consumer reviews. The last chapter focuses on a seller's decisions in the process of entering the online market through online marketplaces. In addition to pure-play marketplaces, some marketplaces also sell their own products in direct competition with sellers, which creates a new form of channel conflict. We analyze the optimal decisions for both the seller and the marketplaces to characterize the system equilibrium.
Although creative ideation requires deviating sufficiently from conventional thoughts, people tend to fixate on highly salient and accessible concepts when responding to idea generation tasks. Surmounting such a default tendency, then, is crucial to generating creative ideas. Bridging creative cognition with self-regulation research, I hypothesized that inhibitory control over such a default response may require self-regulatory resources. This would suggest that interventions that increase people's self-regulatory resources may also boost their creativity. However, results from Study 1 did not support this hypothesis. Specifically, there was no significant difference between ego-depleted versus non-depleted participants in terms of inhibitory control over salient concepts (assessed by the newly developed Concept Inhibition Task; CIT) or creative performance. Interestingly, post-hoc findings suggest that ego-depletion status moderates the relationship between inhibitory control and creativity, such that higher inhibitory control was associated with increased creativity only for non-depleted participants; the association was otherwise null for depleted participants. Study 2 replicated the null findings of Study 1 and did not support the utility of glucose consumption – an established ego-replenishing intervention – in increasing the creative performance of ego-depleted individuals. Study 3 examined the effectiveness of mindfulness meditation – an established self-regulation boosting intervention – in elevating people's creativity. Results revealed no significant difference in inhibitory control and creativity between participants who meditated versus those who listened to music (a comparable control group) after a ten-day intervention period. Although improvements in both inhibitory control and creativity were found when comparing baseline to post-intervention levels, such improvements were not unique to those who meditated. Interestingly, Study 3 showed that inhibitory control was positively associated with creativity at both pre- and post-intervention assessments, whereas the association was null in Study 2, where most participants were subjected to ego-depletion. Together, these three studies suggest that self-regulatory resources may not exert a direct impact on inhibitory control over salient concepts or on generating creative ideas. Instead, self-regulatory resource levels may modulate the relationship between inhibitory control and creativity, such that only non-depleted individuals may reap creative benefits from inhibiting salient concepts. For ego-depleted individuals, inhibitory control over salient concepts appears to be inconsequential to their creative performance. This post-hoc finding is explained by considering the dual pathway theory of creative idea generation (Nijstad et al., 2010). Implications and future directions are discussed.
Essays on Corporate Finance
This dissertation has two essays on corporate finance. In the first chapter, I investigate the dual-class structure, which is often regarded as poor corporate governance and a source of agency problems. However, I find that, for companies with high information asymmetry and long investment horizons, the dual-class structure delivers higher operating performance and valuation ratios. These outperforming dual-class companies tend to have higher investment in intangibles, more innovation, lower payout, and lower CEO compensation. The findings suggest that the dual-class structure can be optimal in empowering information-advantaged inside shareholders and ensuring corporate long-term goals.
In the second chapter of my dissertation, we study how air pollution influences firm performance. Air pollution is a growing hazard to human health. This study examines whether air pollution affects the formation of corporate human capital and thereby firm performance. We find that people exhibit an intention to look for jobs in less polluted areas on days when air pollution occurs where they are located, suggesting that individuals sort in response to air pollution. Consistent with this sorting prediction, we find that firms' levels of skilled executives and employees drop significantly when pollution information becomes accessible in real time and when the pollution level in their locations increases, especially in places where health concerns are more sensitive to air pollution. Moreover, parallel reductions in firm productivity and value are found, and they become more salient when firms depend more heavily on human capital.
Efficient sequential matching of supply and demand is a problem of interest in many online-to-offline services: for instance, Uber, Lyft and Grab match taxis to customers, while UberEats, Deliveroo and FoodPanda match restaurants to customers. In these systems, a centralized entity (e.g., Uber) aggregates supply and assigns it to demand so as to optimize a central metric such as profit, number of requests served, or delay. However, the individuals in the system (e.g., drivers, delivery workers) are self-interested and try to maximize their own long-term profit. The central entity has a full view of the system and can learn policies that maximize the overall payoff and suggest them to the individuals. However, due to the selfish nature of the individuals, they might not be interested in following the suggestion. Hence, in my thesis, I develop approaches that learn to guide these individuals such that their long-term revenue is maximized.

There are three key characteristics of aggregation systems that make them distinct from other multi-agent systems. First, there are thousands or tens of thousands of individuals in the system. Second, the outcome of an interaction is anonymous, i.e., it depends only on the number of agents and not on their identities. And third, there is a centralized entity with a full view of the system, but its objective does not align with the objectives of the individuals. These characteristics make the use of existing Multi-Agent Reinforcement Learning (MARL) methods challenging, as they are either meant for just a few agents or assume some prior belief about others. A natural question to ask is whether individuals can utilize these features and learn efficient policies to maximize their own long-term payoffs. My thesis research focuses on answering this question and providing scalable reinforcement learning methods for aggregation systems.

Utilizing the presence of a centralized entity for decentralized learning in a non-cooperative setting is not new, and existing MARL methods can be classified based on how much extra information about the environment state and joint action is provided to the individual learners. However, the presence of a self-interested centralized entity adds a new dimension to the learning problem: the centralized entity can learn from the overall experiences of the individuals and might want to reveal only the information that helps achieve its own objective. Therefore, in my work I propose approaches considering multiple combinations of levels of information sharing and levels of learning done by the centralized entity. My first contribution assumes that the individuals do not receive any extra information and learn from their local observations. It is a fully decentralized learning method in which independent agents learn from offline trajectories by assuming that others follow stationary policies. In my next work, the individuals utilize the anonymity feature of the domain and consider the number of other agents present in their local observation to improve their learning. Increasing the level of learning done by the centralized entity, my next contribution provides an equilibrium learning method in which the centralized entity suggests a variance-minimization policy learned from the action values estimated by the individuals.
By further increasing the level of information shared and the level of learning done by the centralized entity, I next provide a learning method in which the centralized entity acts as a correlation agent. In this method, the centralized entity learns a social-welfare-maximizing policy directly from the experiences of the individuals and suggests it to the individual agents, who in turn learn a best-response policy to the suggested policy. In my last contribution, I propose an incentive-based learning approach in which the central agent provides incentives to the individuals such that their learning converges to a policy that maximizes overall system performance. Experimental results on real-world and multiple synthetic data sets demonstrate that these approaches outperform other state-of-the-art approaches both in terms of individual payoffs and the overall social welfare of the system.
How Does Status Affect Performance and Learning from Failure? Evidence from Online Communities
This dissertation is composed of two essays. In the first essay, I investigate the factors that can alleviate the detrimental effect of hierarchy on team performance. I first show that hierarchy negatively impacts team performance, which is consistent with recent meta-analytic evidence. One mechanism that drives this negative effect is that hierarchy prevents low-ranking members from voicing their potentially valuable insights. I then propose that team familiarity is one factor that can encourage low-ranking team members to speak up. I contend that team familiarity can be established either by team members' prior experience in working with one another or by their prior experience in working in hierarchical teams, such that they are familiar with hierarchical working relationships. Using data collected from an online crowdsourcing contest community, I find that team members' familiarity with each other and their familiarity with hierarchical working relationships can both alleviate the detrimental effect of hierarchy on team performance. By illuminating the moderating effect of team familiarity on the hierarchy-performance relationship, this study advances current understanding of how to reduce the detrimental effect of hierarchy on performance and offers insights into how teams should be organized to improve performance.
In the second essay, I examine what factors drive learning from failure. In answering this question, I bring status theory into the literature on learning from failure and propose that status can drive people's learning from their failures. I propose that failure feedback given by a higher-status source is more likely to drive a focal individual to learn from her failures than failure feedback given by a lower-status source. This is because people pay more attention to, and are more engaged with, failure feedback given by a higher-status source. Data collected from an online programming contest community provide support for my prediction that failure feedback given by higher-status peers has a stronger effect in driving learning from failure than failure feedback given by lower-status peers. By demonstrating that status is a driver of learning from failure, I expand experiential learning theories by incorporating status theory.
Chapter 1: How institutions enhance mindfulness: interactions between external regulators and front-line operators around safety rules (with Ravi S. Kudesia and Jochen Reb) How is it that some organizations can maintain nearly error-free performance, despite trying conditions? Within research on such high-reliability organizations, mindful organizing has been offered as a key explanation. It entails interaction patterns among front-line operators that keep them attentive to potential failures—and relies on them having the expertise and autonomy to address any such failures. In this study, we extend the mindful organizing literature, which emphasizes local interactions among operators, by considering the broader institutional context in which it occurs. Through interview, observational, and archival data of a high-reliability explosive demolitions firm in China, we find that external regulators can crucially enhance the mindful organizing of front-line operators as regulators and operators interact around safety rules. These interactions go beyond those emphasized in institutional theory, whereby regulators help operators internalize the content of rules and follow them in practice. Rather, regulator interactions also help ensure the salience of rules, which enriches and distributes operator attention throughout the firm. We also find evidence of regulator learning, as interactions with operators help regulators improve rule content and the techniques by which rules remain salient. These findings expand our understanding of mindful organizing and the interactional dynamics of institutions. They also speak particularly to the debate over whether and how rules can enhance safety. Namely, through distinct practices that impact the content and salience of rules, regulators can increase standardization without diminishing operator autonomy.
Chapter 2: Entrainment and the temporal structuring of attention: insights from a high-reliability explosive demolitions firm (with Ravi S. Kudesia and Jochen Reb) Attention has always been central to organization theory. What has remained implicit is that attention is a temporal phenomenon. Attention accumulates and dissipates at multiple timescales: it oscillates wavelike within a performance episode, decays gradually over the course of a performance episode, and withdraws in a step-like manner across multiple performance episodes. Organizations attempt to regulate the attention of front-line employees. But to the extent that attention has been examined as a stable phenomenon, rather than a temporal one, metacognitive practices that stabilize attention remain unexamined in organization theory. And to the extent that fluctuations in attention on the front lines generate systemic risks, these unexamined stabilizing practices constitute a core part of organizational reliability. In this case study, we examine a high-reliability explosive demolitions firm. Going beyond past work that identifies best practices shared across organizations, we instead uncover the logic of how several practices are bundled together in a single organization to stabilize front-line attention across these timescales. We uncover distinct bundles of attention regulation practices designed to proactively encourage attention and discourage inattention and to reactively learn from problems, including problems resulting from inattention. We theorize that these practices are bundled according to a logic of entrainment. Practices that proactively regulate the fluctuations of attention over time are mapped onto existing work routines that repeat cyclically across concentrically nested timescales—and reactive practices enhance learning by extracting lessons from mindless behaviors and feeding them back into entrained practice.
Research and Development (R&D) is time-consuming, expensive and risky, yet product life cycles are shortening and competition is fierce. Therefore, R&D often requires the collaboration and input of multiple stakeholders. This dissertation studies how collaborations involving multiple stakeholders can effectively make R&D project portfolio selection decisions to create optimal social welfare. The two essays in the dissertation build stylized analytical models to examine R&D project portfolio selection in two different settings: academia and industry, respectively. The models explicitly acknowledge the different information, goals and operational decisions of the stakeholders. In the first essay, we study a two-stage funding process for university research project selection, with bridge funding by the university first, followed by government funding. We consider different project selection mechanisms by the university corresponding to different strategic missions. We focus on the impact of the university-level selection on government funding and project success and provide recommendations for university funding in terms of policies, objectives and coverage. In the second essay, we look at strategic R&D alliances between two profit-maximizing firms. Specifically, we study how the payment structure and the contract timing affect the project selection decisions of the stakeholders in a strategic alliance, in the presence of an R&D budget constraint, market interactions, and varying levels of bargaining power. We provide recommendations for the effective formation of strategic alliances.
Passwords are a prevalent means of user authentication in pervasive computing environments since they are simple to deploy and convenient to use. However, the use of passwords has intrinsic problems due to the involvement of keystrokes. Keystroke behaviors may emit various side-channel information, including timing, acoustic, and visual information, which can be easily collected by an adversary and leveraged for keystroke inference. On the other hand, such keystroke-related information can also be used to protect a user's credentials via two-factor authentication and biometric authentication schemes. This dissertation focuses on investigating PIN inference from side-channel information disclosure and exploring the design of a new two-factor authentication system.
The first work in this dissertation proposes a user-independent inter-keystroke timing attack on PINs. Our attack method is based on an inter-keystroke timing dictionary built from a human cognitive model whose parameters can be determined from a small amount of training data on any users. Our attacks can thus potentially be launched at large scale in real-world settings. We investigate inter-keystroke timing attacks in different online attack settings and evaluate their performance on PINs at different strength levels. Our experimental results show that the proposed attack performs significantly better than random guessing attacks. We further demonstrate that our attacks pose a serious threat to real-world applications and propose various ways to mitigate the threat.
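To illustrate the dictionary-based ranking idea (this is not the dissertation's fitted cognitive model), the sketch below scores every 4-digit PIN against observed inter-keystroke gaps; the timing model passed in, and the noise scale, are placeholders:

```python
import numpy as np
from itertools import product

def rank_pins(observed_gaps, expected_gap, sigma=60.0, top=10):
    """Rank all 4-digit PINs by how well a candidate's predicted
    inter-keystroke gaps (in ms) match the observed ones under a
    Gaussian noise assumption. expected_gap(a, b) -> predicted gap for
    the key pair (a, b); in the dissertation this comes from a human
    cognitive model, here it is a placeholder supplied by the caller."""
    scores = []
    for pin in product(range(10), repeat=4):
        pred = [expected_gap(a, b) for a, b in zip(pin, pin[1:])]
        loglik = -np.sum((np.array(observed_gaps) - pred) ** 2) / (2 * sigma**2)
        scores.append((loglik, "".join(map(str, pin))))
    scores.sort(reverse=True)
    return [p for _, p in scores[:top]]

# Toy model: gap grows with key distance on the PIN pad (pure illustration)
toy_gap = lambda a, b: 250 + 40 * abs(a - b)
print(rank_pins([290, 330, 250], toy_gap))
```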
In the second work, we propose UltraPIN, a more accurate and practical PIN attack based on ultrasound. It can be launched from commodity smartphones. As a target user enters a PIN on a PIN-based user authentication system, an attacker may use UltraPIN to infer the PIN from a short distance without line of sight. In this process, UltraPIN leverages smartphone speakers to issue human-inaudible ultrasound signals and uses smartphone microphones to keep recording acoustic signals. It applies a series of signal processing techniques to extract high-quality feature vectors from low-energy and high-noise signals. Taking the extracted feature vectors as input, UltraPIN applies a combination of machine learning models to classify finger movement patterns during PIN entry and generates a ranked list of highly probable PINs as the result. Rigorous experiments show that UltraPIN is highly effective in PIN inference and robust across different attack settings.
Keystroke timing information and keystroke typing sounds can also be used to protect users' accounts. In the third work, we propose Typing-Proof, a usable, secure and low-cost two-factor authentication mechanism. Typing-Proof is similar to software-token-based 2FA in the sense that it uses a password as the first factor and a registered phone to prove the second factor. During the second-factor authentication procedure, it requires a user to type a random code on the login computer and authenticates the user by comparing the keystroke timing sequence of the random code recorded by the login computer with the typing sounds recorded by the user's registered phone. Typing-Proof achieves good performance in most settings and requires zero user-phone interaction in most cases. It is secure and immune to existing attacks against recent 2FA mechanisms. In addition, Typing-Proof enables significant cost savings for both service providers and users.
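A sketch of the core matching step as described, with audio onset detection abstracted away and an illustrative tolerance (Typing-Proof's actual decision procedure is more involved):

```python
import numpy as np

def inter_key_gaps(times):
    """Inter-keystroke intervals from a sorted sequence of event times."""
    return np.diff(np.sort(np.asarray(times, dtype=float)))

def timing_match(computer_times, phone_sound_times, tol_ms=50.0):
    """Accept the login only if the inter-keystroke intervals recorded by
    the login computer agree with those recovered from the phone's
    microphone. Comparing gaps rather than absolute timestamps removes
    the clock offset between the two devices."""
    g1 = inter_key_gaps(computer_times)
    g2 = inter_key_gaps(phone_sound_times)
    if len(g1) != len(g2):
        return False
    return bool(np.max(np.abs(g1 - g2)) <= tol_ms)

# Toy usage: the same typing rhythm observed on both devices (times in ms)
print(timing_match([0, 310, 640, 1020], [5000, 5312, 5638, 6021]))  # True
```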
This dissertation contributes to understanding the potential risks of side-channel information leaked by keystroke behaviors and to designing a secure, usable and low-cost two-factor authentication system. On the one hand, our proposed side-channel attacks make use of a human cognitive model and ultrasound, providing useful insights into combining cognitive psychology and the Doppler effect with human-behavior-related insecurity. On the other hand, our proposed two-factor authentication system eliminates user-phone interaction in most cases and can effectively defend against existing attacks on recent 2FA mechanisms.
Three Essays on International Trade Policies
This dissertation studies the empirical and quantitative implications of trade policies. The first chapter examines the effects of trade policies on quality specialization across cities within a country. Specifically, we complement the quality specialization literature in international trade and study how larger cities within a country produce goods of higher quality. We first establish three stylized facts on how product quality is related to agglomeration, firm productivity, and worker skills. We then rationalize these facts in a spatial equilibrium model in which all the elements mentioned above are present and firms are free to choose their locations. Using firm-level data from China, we structurally estimate the model and find that agglomeration and the spatial sorting of firms each account for about 50% of the spatial variation in quality specialization. A counterfactual that relaxes land use regulation in housing production raises product quality in big cities by 5.5% and the indirect welfare of individuals by 6.2%. The second chapter zooms into distributional issues and studies the implications of rising income inequality for product price dispersion. Using big data on a broad set of goods sold in the US (Nielsen Retail Scanner Data) from 2006 to 2017, we find a general "missing middle" phenomenon, whereby the product price distribution loses mass in the middle of its support. In addition, we find that this pattern is more pronounced in densely populated metropolitan areas. We further link this observation to changes in income inequality, measured from a panel of US households from 2006 to 2017 (IPUMS ACS). The results support our conjecture that demand-side demographics have a significant influence on the missing middle phenomenon. The third chapter examines the transition dynamics of trade liberalization. In particular, we develop a multi-country, multi-sector quantitative trade model with dynamic Roy elements such as occupational choice and occupation-specific human capital accumulation. Given an abrupt trade liberalization, a country that is relatively more productive in some sectors may not have a comparative advantage initially, as it takes time to accumulate occupation-specific human capital, which increases occupational skill supply endogenously. We quantify these transition dynamics and their distributional consequences by calibrating the model to a North-South setup.
Three Essays on Econometrics
The dissertation includes three chapters on econometrics. The first chapter is about treatment effects and their application in randomized controlled trials. The second chapter is about specification testing. The third chapter is about panel data models with fixed effects.
In the first chapter, we study the estimation and inference of the quantile treatment effect under covariate-adaptive randomization. We propose two estimation methods: (1) simple quantile regression and (2) inverse propensity score weighted quantile regression. For the two estimators, we derive their asymptotic distributions uniformly over a compact set of quantile indexes and show that, when the treatment assignment rule does not achieve strong balance, the inverse propensity score weighted estimator has a smaller asymptotic variance than the simple quantile regression estimator. For method (1), we show that the Wald test using a weighted bootstrap standard error under-rejects, whereas for method (2) its asymptotic size equals the nominal level. We also show that, for both methods, the asymptotic size of the Wald test using a covariate-adaptive bootstrap standard error equals the nominal level. We illustrate the finite sample performance of the new estimation and inference methods using both simulated and real datasets.
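A schematic sketch of the second estimator's core idea, under simple random assignment with a known propensity p; the chapter's actual estimator handles covariate-adaptive randomization, covariates, and uniform inference over quantile ranges, all of which this toy omits. Function names are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

def weighted_quantile(y, w, tau):
    """tau-quantile via minimization of the weighted check loss
    rho_tau(u) = u * (tau - 1{u < 0})."""
    obj = lambda q: np.mean(w * (y - q) * (tau - (y - q < 0)))
    return minimize(obj, x0=np.median(y), method="Nelder-Mead").x[0]

def qte_ipw(y, d, p, tau=0.5):
    """Inverse-propensity-weighted quantile treatment effect at quantile
    tau: estimate the tau-quantile of each potential outcome with IPW
    weights, then take the difference. d is the treatment indicator."""
    q1 = weighted_quantile(y, d / p, tau)
    q0 = weighted_quantile(y, (1 - d) / (1 - p), tau)
    return q1 - q0

# Toy usage: constant treatment effect of 1.0, so the median QTE is ~1.0
rng = np.random.default_rng(0)
d = rng.binomial(1, 0.5, 2000)
y = 1.0 * d + rng.standard_normal(2000)
print(qte_ipw(y, d, 0.5))
```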
In the second chapter, we propose a novel consistent model specification test based on the martingale difference divergence (MDD) of the error term given the covariates. The MDD equals zero if and only if the error term is conditionally mean independent of the covariates. Our MDD test does not require any nonparametric estimation under the null or the alternative, and it is applicable even with many covariates in the regression model. We establish the asymptotic distributions of our test statistic under the null and under a sequence of Pitman local alternatives converging to the null at the usual parametric rate. We conduct simulations to evaluate the finite sample performance of our test and compare it with its competitors. We find that our MDD test has superb performance in terms of both size and power and generally dominates its competitors. In particular, it is the only test that maintains well-controlled size in the presence of many covariates and reasonable power against high-frequency alternatives. We apply our test to check the correct specification of functional forms in gravity equations for four datasets. For all the datasets, we coherently reject the log and level models at the 10% significance level, whereas the competitors show mixed testing results across datasets. The findings reveal the advantages of our test.
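The sample MDD has a simple closed form (the standard reference is Shao and Zhang, 2014). Below is a NumPy sketch of the raw sample measure; the chapter's test statistic is a scaled version with its own asymptotic critical values, which this sketch does not reproduce.

```python
import numpy as np

def mdd_sq(e, X):
    """Sample (squared) martingale difference divergence of e given X:
    MDD_n^2 = -(1/n^2) * sum_{i,j} (e_i - ebar)(e_j - ebar) ||X_i - X_j||.
    It is zero in population iff E[e | X] = E[e] almost surely, which is
    what the specification test checks for regression residuals."""
    e = np.asarray(e, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(e), -1)
    ec = e - e.mean()
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # ||X_i - X_j||
    return -(ec[:, None] * ec[None, :] * D).sum() / len(e) ** 2

# Toy check: errors independent of the covariates give a value near zero;
# conditional-mean dependence (misspecification) inflates it.
rng = np.random.default_rng(0)
x = rng.standard_normal((500, 2))
e = rng.standard_normal(500)
print(mdd_sq(e, x))
```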
In the third chapter, we consider the Nickell bias problem in dynamic fixed effects multilevel panel data models with various kinds of multi-way error components. For some specifications of the error components, there exist many different forms of within estimators, which are shown to have possibly different asymptotic properties. The forms of the estimators in our framework are given explicitly. We apply the split-sample jackknife approach to eliminate the bias. In practice, our results can be easily extended to multilevel panel data models with higher dimensions.
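The split-sample (half-panel) jackknife, in the spirit of Dhaene and Jochmans (2015), has a simple generic form; here is a sketch with the estimation routine abstracted as a callable (the chapter's multilevel setting adds structure this sketch does not capture):

```python
def half_panel_jackknife(estimate, data_by_time):
    """Split-sample jackknife bias correction for dynamic panel estimators.
    estimate: callable mapping a list of time periods to a parameter
    estimate. Since the Nickell bias is O(1/T), the combination
    2*theta_full - 0.5*(theta_half1 + theta_half2) removes the leading
    bias term."""
    T = len(data_by_time)
    theta_full = estimate(data_by_time)
    theta_1 = estimate(data_by_time[: T // 2])
    theta_2 = estimate(data_by_time[T // 2:])
    return 2.0 * theta_full - 0.5 * (theta_1 + theta_2)
```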
Deep Learning for Real-world Object Detection
Despite significant progress, most existing detectors are designed to detect objects in academic settings and pay little attention to real-world scenarios. In real-world applications, the scale variance of objects can be significantly higher than in academic benchmarks. In addition, existing methods are designed for localization with relatively low precision, whereas more precise localization is demanded in real-world scenarios. Moreover, existing methods are optimized with huge amounts of annotated data, but in certain real-world scenarios only a few samples are available. In this dissertation, we aim to explore novel techniques that address these research challenges and make object detection algorithms practical for real-world applications.
The first problem is scale-invariant detection. Detecting objects at multiple scales is covered in existing detection benchmarks. However, in real-world applications the scale variance of objects is extremely high, which requires more discriminative features. Face detection is a suitable benchmark for evaluating scale-invariant detection due to the vastly different scales of faces. In this dissertation, we propose a novel framework, ``Feature Agglomeration Networks'' (FAN), to build a new single-stage face detector. A novel feature agglomeration block is proposed to enhance low-level feature representation, and the model is optimized in a hierarchical manner. FAN achieves state-of-the-art results on real-world face detection benchmarks with real-time inference speed.
The second problem is high-quality detection. This challenge requires detectors to predict more precise localization. In this dissertation, we propose two novel detection frameworks for high-quality detection: ``Bidirectional Pyramid Networks'' (BPN) and ``KPNet''. In BPN, a Bidirectional Feature Pyramid structure is proposed for robust feature representations, and a Cascade Anchor Refinement is proposed to gradually refine the quality of pre-designed anchors. To eliminate the initial anchor design step in BPN, KPNet is proposed, which automatically learns to optimize a dynamic set of high-quality keypoints without heuristic anchor design. Both BPN and KPNet show significant improvements over existing methods on the MS COCO dataset, especially in high-quality detection settings.
The third problem is few-shot detection, where only a few training samples are available.
Inspired by the principles of meta-learning, we propose two novel meta-learning-based few-shot detectors: ``Meta-RCNN'' and ``Meta Contrastive Detector'' (MCD). Meta-RCNN learns a binary object detector in an episodic learning paradigm on the training data with a class-aware attention module, and it can be meta-optimized end-to-end. Building on Meta-RCNN, MCD follows the principle of contrastive learning to enhance the feature representation for few-shot detection, and a new hard negative sampling strategy is proposed to address the imbalance of training samples. We demonstrate the effectiveness of Meta-RCNN and MCD on few-shot detection on the Pascal VOC dataset and obtain promising results.
The proposed techniques address the problems discussed and show significant improvements in real-world utility.
Essays on Nonstationary Econometrics
My dissertation consists of three essays that contribute new theoretical results to robust inference procedures and machine learning algorithms in nonstationary models.
Chapter 2 compares OLS and GLS in autoregressions with integrated noise terms. Grenander and Rosenblatt (2008) gave sufficient conditions for the asymptotic equivalence of GLS and OLS in deterministic trend extraction. However, when extending to the univariate autoregression model y_t = ρ_n y_{t−1} + u_t with ρ_n = 1 + c/n^α and u_t = u_{t−1} + ε_t, where ε_t is an iid disturbance with zero mean and variance σ², the asymptotic equivalence no longer holds. In the mildly explosive (c > 0, α ∈ (0, 1)) and purely explosive (c > 0, α = 0) cases, the limiting distributions of the OLS and GLS estimates are both standard Cauchy, but the OLS estimate has a slower convergence rate. In the mildly stationary (c < 0, α ∈ (0, 1)) case, the limiting distribution of OLS is degenerate, centered at −c, while the GLS estimate is Gaussian. In the local-to-unity (α = 1) case, when c ≥ c*, the mean and variance of the asymptotic distribution of the OLS estimate are smaller than those of the GLS estimate, showing efficiency gains for OLS.
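A small simulation sketch of this data-generating process, reporting the OLS estimate of ρ_n (parameter values are illustrative; the GLS comparison is omitted):

```python
import numpy as np

def simulate_ols_rho(n=2000, c=1.0, alpha=0.7, sigma=1.0, seed=0):
    """Simulate y_t = rho_n * y_{t-1} + u_t with rho_n = 1 + c / n**alpha
    and integrated noise u_t = u_{t-1} + eps_t, then return the OLS
    estimate of rho_n alongside its true value."""
    rng = np.random.default_rng(seed)
    rho_n = 1.0 + c / n**alpha          # mildly explosive for c > 0
    eps = sigma * rng.standard_normal(n)
    u = np.cumsum(eps)                  # u_t is a random walk
    y = np.empty(n)
    y[0] = u[0]
    for t in range(1, n):
        y[t] = rho_n * y[t - 1] + u[t]
    ols = (y[1:] @ y[:-1]) / (y[:-1] @ y[:-1])
    return ols, rho_n

print(simulate_ols_rho())   # OLS estimate vs true rho_n = 1 + c/n^alpha
```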
Chapter 3 proposes novel mechanisms for identifying explosive bubbles in panel autoregressions with a latent group structure. Two post-classification panel data approaches are employed to test for explosiveness in time-series data. The first approach applies a recursive k-means clustering algorithm to explosive panel autoregressions. The second approach uses a modified k-means clustering algorithm for mixed-root panel autoregressions. We establish the uniform consistency of both clustering algorithms. These k-means procedures achieve the oracle property, so the post-classification estimators are asymptotically equivalent to the infeasible estimators that use the true group identities. Two right-tailed t-statistics, based on the post-classification estimators, are introduced to detect explosiveness. A panel recursive procedure is proposed to estimate the origination date of explosiveness. Asymptotic theory is developed for the concentration inequalities, clustering algorithms, and right-tailed t-tests based on mixed-root panels. Extensive Monte Carlo simulations provide strong evidence that the proposed panel approaches lead to substantial power gains compared with the time-series approach.
Chapter 4 explores predictive regression models with stochastic unit root (STUR) components and robust inference procedures that encompass a wide class of persistent and time-varying stochastically nonstationary regressors. The paper extends the mechanism of endogenously generated instrumentation known as IVX, showing that these methods remain valid for short- and long-horizon predictive regressions in which the predictors have STUR and local STUR (LSTUR) generating mechanisms. Both mean regression and quantile regression methods are considered. The asymptotic distributions of the IVX estimators are new compared to previous work but again lead to pivotal limit distributions for Wald testing procedures that remain robust for both single and multiple regressors with various degrees of persistence and with stochastic and fixed local departures from unity. Numerical experiments corroborate the asymptotic theory, and IVX testing shows good power and size control. The new methods are illustrated in an empirical application to evaluate the predictive capability of economic fundamentals in forecasting excess returns in the Dow Jones Industrial Average index.
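For a single predictor, the basic IVX construction (before the chapter's STUR/LSTUR extensions) can be sketched as follows; the tuning constants c_z and delta are illustrative, and the intercept is handled crudely by demeaning:

```python
import numpy as np

def ivx_beta(y, x, c_z=-1.0, delta=0.95):
    """Sketch of the IVX estimator for y_t = mu + beta * x_{t-1} + e_t.
    The self-generated instrument z_t = sum_{j<=t} rho_z^{t-j} * dx_j,
    with rho_z = 1 + c_z / n**delta and c_z < 0, filters the (possibly
    nonstationary) regressor into a mildly integrated one."""
    n = len(y)
    rho_z = 1.0 + c_z / n**delta
    dx = np.diff(x, prepend=x[0])
    z = np.zeros(n)
    for t in range(1, n):
        z[t] = rho_z * z[t - 1] + dx[t]
    yd, xd = y - y.mean(), x - x.mean()    # demean to absorb the intercept
    return (z[:-1] @ yd[1:]) / (z[:-1] @ xd[:-1])

# Toy usage: a near-unit-root predictor with true beta = 0.1
rng = np.random.default_rng(0)
x = np.cumsum(rng.standard_normal(1000))
y = np.empty(1000)
y[0] = 0.0
y[1:] = 0.1 * x[:-1] + rng.standard_normal(999)
print(ivx_beta(y, x))    # roughly 0.1
```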
Essays on Time Series and Financial Econometrics
This dissertation contains four essays in financial econometrics. In the first essay, some asymptotic results are derived for a first-order autoregression with a root moderately deviating from unity and a nonzero drift. It is shown that the drift drastically changes the large-sample properties of the least-squares (LS) estimator. The second essay is concerned with the joint test of predictability and stability in the context of predictive regression. The null hypothesis under investigation is that the potential predictors exhibit no predictability and incur no structural break during the sample period. We first show that the IVX estimator provides better finite sample performance than LS when they are used to test for a structural break in the slope coefficient. We then consider a new test that combines the IVX and sup-Wald statistics. The third essay considers the impact of level shifts in the predicted variable on the performance of the conventional test for predictability when highly persistent predictors are used. It is shown that the limiting distribution of the conventional t-statistic depends on the magnitude of the break size. When the breaks are ignored, the t-statistic yields a type-I error that is too large. To alleviate this problem, we propose basing the inference on a sample-splitting procedure. Applications to the prediction of stock return volatility and the housing price index are conducted. In the last essay, we consider a new multivariate stochastic volatility (MSV) model that applies a fully flexible parameterization of the correlation matrix, generalizing Fisher's z-transformation to the high-dimensional case. In the new model, we can separately model the dynamics of volatilities and correlations. To conduct statistical inference for the proposed model, we propose the Particle Gibbs with Ancestor Sampling (PGAS) method. Extensive simulation studies show that the proposed method works well.
This dissertation comprises three papers that separately study different nonstationary time series models.
The first paper, titled "The Grid Bootstrap for Continuous Time Models", is joint work with Professor Jun Yu and Professor Weilin Xiao. It considers the grid bootstrap for constructing confidence intervals for the persistence parameter in a class of continuous-time models driven by a Lévy process. Its asymptotic validity is discussed under the assumption that the sampling interval (h) shrinks to zero, the time span (N) goes to infinity, or both. Its improvement over the in-fill asymptotic theory is achieved by expanding the coefficient-based statistic around its in-fill asymptotic distribution, which is non-pivotal and depends on the initial condition. Monte Carlo studies show that the grid bootstrap method performs better than the in-fill asymptotic theory and much better than the long-span asymptotic theory. Empirical applications to U.S. interest rate data and volatility data suggest significant differences between the bootstrap confidence intervals and the confidence intervals obtained from the in-fill and long-span asymptotic distributions.
The second paper, "Mildly Explosive Autoregression with Anti-persistent Errors" is another joint work with Professor Yu and Professor Xiao. It studies a mildly explosive autoregression model with Anti-persistent Errors. An asymptotic distribution is derived for the least squares (LS) estimate of a first-order autoregression with a mildly explosive root and anti-persistent errors. While the sample moments depend on the Hurst parameter asymptotically, the Cauchy limiting distribution theory remains valid for the LS estimates in the model without intercept and a model with an asymptotically negligible intercept. Monte Carlo studies are designed to check the precision of the Cauchy distribution in finite samples. An empirical study based on the monthly NASDAQ index highlights the usefulness of the model and the new limiting distribution.
The third paper "Testing for Rational Bubbles under Strongly Dependent Errors" considers testing procedures for rational bubbles under strongly dependent errors. A heteroskedasticity and autocorrelation robust (HAR) test statistic is proposed to detect the presence of rational bubbles in financial assets when errors are strongly dependent. The asymptotic theory of the test statistic is developed. Unlike conventional test statistics that lead to a too large type I error under strongly dependent errors, the new test does not suffer from the same size problem. In addition, it can consistently timestamp the origination and termination dates of a rational bubble. Monte Carlo studies are conducted to check the finite sample performance of the proposed test and estimators. An empirical application to the S&P 500 index highlights the usefulness of the proposed test statistic and estimators.
Followers’ Reactions to Leader Differentiation
Leaders generally differentiate their relationships with followers, for example, by providing some with more respect, trust, support, or information than others (Liden & Graen, 1980). However, the effects of such leader differentiation on followers remain inconclusive, with research suggesting that leader differentiation may have negative, positive, or null effects on favorable employee work-related outcomes (for a recent review, see Martin et al., 2018). To better understand the effects of leader differentiation, utilizing leader-member exchange (LMX) theory, I considered three inherently connected properties in the leader differentiation process – LMX differentiation, LMX quality and LMX social comparison (Martin et al., 2018). I theorized that the three properties interact to influence followers' supervisory interactional justice perceptions and subsequently their discretionary behaviors toward their leaders. Results from three studies, with different research designs and conducted in different cultures, largely supported my hypothesized conditional moderated mediation model. When LMX quality and LMX social comparison were both high, the negative impact of LMX differentiation on followers' supervisory interactional justice perceptions was the weakest. In addition, when LMX quality and LMX social comparison were both high, LMX differentiation's positive indirect effect on followers' supervisor-directed deviance and its negative indirect effect on followers' supervisor-directed organizational citizenship behaviors via followers' supervisory interactional justice perceptions were the weakest.
This dissertation studies fixed effects (FE) spatial panel data (SPD) models with temporal heterogeneity (TH), where the regression coefficients and spatial coefficients are allowed to change over time. Time-varying coefficients render the usual transformation method for eliminating the fixed effects inapplicable, so an adjusted quasi score (AQS) method is proposed, which adjusts the concentrated quasi score function after the fixed effects are concentrated out. AQS tests for the lack of TH in slope and spatial parameters are first proposed. Then, a set of AQS estimation and inference methods for the FE-SPD model with TH is developed for use when the AQS tests reject the hypothesis of temporal homogeneity. Finally, these methodologies are extended to allow the idiosyncratic errors of the model to be heteroskedastic along the cross-section dimension, where a method called outer-product-of-martingale-differences is proposed to estimate the variance of the AQS functions, which in turn gives a robust estimator of the variance-covariance matrix of the AQS estimators.
Asymptotic properties of the AQS tests are examined. Consistency and asymptotic normality of the AQS estimators are established under both homoskedastic and heteroskedastic errors. Extensive Monte Carlo experiments show excellent finite sample performance of the proposed AQS tests, the proposed AQS estimators of the full model, and the corresponding estimates of the standard errors. Empirical illustrations are provided.
Using Knowledge Bases for Question Answering
A knowledge base (KB) is a well-structured database containing a large number of entities and their relations. With the fast development of large-scale knowledge bases such as Freebase, DBpedia, and YAGO, KBs have become an important resource serving many applications, such as dialogue systems, textual entailment, and question answering, which play significant roles in real-world industry.
In this dissertation, we explore the entailment information and the more general entity-relation information in KBs. Recognizing textual entailment (RTE) is the task of inferring entailment relations between sentences: we must decide whether a hypothesis can be inferred from a premise based on the text of the two sentences. Such entailment relations are potentially useful in applications like information retrieval and commonsense reasoning, so it is necessary to develop automatic techniques for this problem. Another task is knowledge base question answering (KBQA), which aims to automatically find answers to factoid questions in a knowledge base, where answers are usually entities in the KB. KBQA has gained much attention in recent years and has shown promise on real-world problems. In this dissertation, we study the applications of knowledge bases in textual entailment and question answering:
We propose a general neural-network-based framework that injects lexical entailment relations into RTE, together with a novel model for embedding lexical entailment relations. The experimental results show that our method benefits general textual entailment models. We design a KBQA method based on an existing reading comprehension model; this model achieves competitive results on several popular KBQA datasets. In addition, we make full use of the contextual relations of entities in the KB, and this enriched information helps our model attain state-of-the-art performance. We propose topic unit linking, where topic units cover a wider range of units of a KB, and use a generation-and-scoring approach to gradually refine the set of topic units. Furthermore, we use reinforcement learning to jointly learn the parameters for topic unit linking and answer candidate ranking in an end-to-end manner. Experiments on three commonly used benchmark datasets show that our method consistently works well and outperforms the previous state of the art on two datasets. We further investigate the multi-hop KBQA task, i.e., question answering over a KB where questions involve multiple hops of relations, and develop a novel model that solves such questions in an iterative and efficient way. The results demonstrate that our method consistently outperforms several multi-hop KBQA baselines.

Over the last few decades, the way software is developed has changed drastically. From an activity performed by developers working individually to build standalone programs, it has transformed into a highly collaborative and cooperative activity. Software development today can be considered a participatory culture, in which developers coordinate and engage together to develop software while continuously learning from one another and creating knowledge.
To support their communication and collaboration needs, software developers often use a variety of social media channels. These channels help developers connect with like-minded peers and explore collaborations on software projects of interest. However, developers face several challenges when using these channels. Because the volume of content produced on social media is huge, developers often face information overload. Creating and maintaining a relevant network among a huge number of possible connections is also challenging. The work in this dissertation addresses these challenges with respect to Twitter, a social media platform popular among developers for getting the latest technology updates and connecting with other developers. The first three works deal with understanding the software engineering content produced on Twitter and how it can be harnessed for automatic mining of software-engineering-related knowledge. The last work aims at understanding what kinds of accounts software developers follow on Twitter, and then proposes an approach that can help developers find software experts on Twitter. The following paragraphs briefly describe the works completed as part of this dissertation and how they address the aforementioned challenges.
In the first work performed as part of the dissertation, an exploratory study was conducted to understand what kinds of software engineering content are popular among developers on Twitter. The insights from this work help characterize the content developers prefer on Twitter and can guide future techniques or tools that aim to extract information or knowledge from software engineering content produced on Twitter. In the second work, a technique was developed to automatically differentiate content related to software development on Twitter from other, non-software content. This technique can help create a repository of software-related content extracted from Twitter, which can in turn support downstream tools for tasks such as mining opinions about APIs, surfacing best practices, and recommending relevant links to read. In the third work, Twitter was leveraged to automatically find URLs related to a particular domain, as Twitter makes it possible to infer the network and popularity information of users who tweet a given URL. Fourteen features were proposed to characterize each URL, considering the contents of the webpage it points to, the popularity and content of tweets mentioning it, and the popularity of the users who shared it on Twitter.
In the final work of this dissertation, an approach is proposed to address the challenge developers face in finding relevant developers to follow on Twitter. Based on the analysis of a survey of developers, an approach was proposed to identify software experts on Twitter for a given software engineering domain. The approach extracts 32 features related to Twitter users, belonging to categories such as Content, Network, Profile, and GitHub. These features are then used to build a classifier that identifies whether a Twitter user is a software expert in a given domain. The results show that the approach achieves F-measure scores of 0.522-0.820 on the task of identifying software experts, an improvement of at least 7.63% over the baselines.
The rapid advances of the Web have changed the ways information is distributed and exchanged among individuals and organizations. Content from various domains is generated daily through users' activities, such as posting messages on a microblog platform or collaborating on a question-and-answer site. To deal with such a tremendous volume of user-generated content, approaches are needed that can handle the massive amount of available data and extract the knowledge hidden within it. This dissertation attempts to make sense of the generated content in three concrete tasks.
In the first work performed as part of the dissertation, a machine learning approach was proposed to predict a customer's feedback behavior based on her first feedback tweet. First, categories of customers were observed based on their feedback frequency and the sentiment of the feedback, and three main categories were identified: spiteful, one-off, and kind. Using the Twitter API, user profile and content features were extracted. Next, a model was built to predict the category of a customer given his or her first feedback. The experimental results show that the prediction model performs better than a baseline approach in terms of precision, recall, and F-measure. In the second work, a method was proposed to predict the distribution of reader emotions evoked by a news article. The approach analyzed affective annotations provided by readers of news articles taken from a non-English online news site. A new corpus was created from the annotated articles, and a domain-specific emotion lexicon was constructed along with word embedding features. Finally, a multi-target regression model was built from a set of features extracted from online news articles. By combining lexicon and word embedding features, the regression model is able to predict the emotion distribution with RMSE scores between 0.067 and 0.232. For the final work of this dissertation, an approach was proposed to improve the effectiveness of knowledge extraction tasks by performing cross-platform analysis. The approach is based on transfer representation learning and word embeddings, leveraging information extracted from a source platform rich in domain-related content to solve tasks on a target platform with less domain-related content. We first build a word embedding model as a representation learned from the source platform, and then use the model to improve the performance of knowledge extraction tasks on the target platform. We experiment with Software Engineering Stack Exchange and Stack Overflow as source platforms, and two different target platforms, i.e., Twitter and YouTube. Our experiments show that our approach improves the performance of existing work on the tasks of finding software-related tweets and filtering informative YouTube comments.
The rising prevalence of severe stress and depression is of increasing concern worldwide. Often conflated because of similarities in symptoms, severe stress can take a toll on a person's productivity and can result in depression if left unmanaged; depression, however, can also occur without any feelings of stress. With depression growing into a leading cause of disability and lost economic productivity, there has been a sharp rise in mental health initiatives to improve stress and depression management. To offer such services conveniently and discreetly, recent efforts have focused on mobile technologies. However, these initiatives usually require users to install dedicated apps or use a variety of sensors, making such solutions hard to scale. Moreover, they emphasise sensing individual factors and overlook physical social interaction, which plays a significant role in influencing stress and depression. This thesis presents StressMon, a monitoring system that can easily scale across entire campuses by passively sensing location information directly from the WiFi infrastructure.
This dissertation explores how, using only single-attribute location information, mobility features can be extracted comprehensively enough to represent individual behaviours and detect stress and depression accurately, without requiring explicit user actions or software installation on client devices. To compensate for the low-dimensional data, StressMon additionally infers physical group interaction patterns from a group detector system. First, I investigate how mobility features can be exploited to better capture the dynamism of natural human behaviours indicative of stress and depression. Then, I present the framework for detecting stress and depression accurately, albeit separately. In a supplementary effort, I demonstrate how optimising StressMon with group-based mobility features greatly enhances the performance of stress detection and, conversely, how individual-based features improve depression detection. To validate the system extensively, I conducted three semester-long longitudinal studies with different groups of undergraduate students at separate times, totalling 108 participants. Finally, this dissertation documents the differences learned in understanding stress and depression from a qualitative perspective.
Static analysis is a common program analysis technique used extensively in the software security field. Widely used static analysis tools for Android, e.g., Amandroid and FlowDroid, perform whole-app analysis, which is comprehensive but incurs huge overheads. In this dissertation, we make a first attempt to explore a novel on-demand analysis that creatively leverages bytecode search to guide inter-procedural analysis on the fly, or just in time, and we develop this on-the-fly analysis into a tool for Android apps called BackDroid. We further explore how the core technique of on-the-fly static analysis in BackDroid can enable different vulnerability studies on Android and their corresponding new findings. To this end, we select three vulnerability analysis problems on Android as representatives, since they require different extents of BackDroid customization in their methodology.
First, we explore how BackDroid can be applied to detect crypto and SSL/TLS misconfigurations in modern Android apps, and compare it with the state-of-the-art Amandroid tool. Second, we explore how an enhanced version of BackDroid and on-device crowdsourcing can facilitate a systematic security study of open ports in Android apps. Third, we explore how a lightweight version of BackDroid with SDK conditional statement checking can benefit an SDK-API inconsistency study that involves control-flow analysis of multiple sink APIs. Through these works, this dissertation shows that on-the-fly Android static analysis guided by bytecode search can efficiently and effectively analyze the security of modern apps.
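The toy sketch below illustrates the search-guided, on-demand idea: locate sink call sites by searching the "bytecode" first, then walk backwards through callers just in time, instead of building a whole-app call graph up front. The dictionary-based app representation is hypothetical and stands in for real Dalvik bytecode; BackDroid's actual machinery is far richer.

    # Hypothetical app: method name -> list of instruction strings.
    APP = {
        "A.foo": ["const-string s", "call B.bar"],
        "B.bar": ["load p0", "call javax.crypto.Cipher.getInstance"],
        "C.baz": ["nop"],
    }

    def find_call_sites(app, sink):
        """'Bytecode search': scan all methods for direct calls to the sink."""
        return [(m, i) for m, ins in app.items()
                       for i, op in enumerate(ins) if op == f"call {sink}"]

    def callers_of(app, method):
        return [m for m, ins in app.items() if f"call {method}" in ins]

    def on_demand_backward(app, sink):
        """Analyze only methods reachable backwards from the sink call sites."""
        frontier = [m for m, _ in find_call_sites(app, sink)]
        visited = set()
        while frontier:
            m = frontier.pop()
            if m in visited:
                continue
            visited.add(m)
            frontier += callers_of(app, m)   # expand callers just in time
        return visited

    # Only A.foo and B.bar are ever touched; C.baz is never analyzed.
    print(on_demand_backward(APP, "javax.crypto.Cipher.getInstance"))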
The dissertation explores the role of human capital, education, and political institutions in the process of economic and political development. The first chapter shows that the level of economic development during the democratization period, proxied by measures such as secondary school enrollment rates, exerts long-lasting effects on growth, possibly by leaving permanent birthmarks on newly minted democratic institutions. Specifically, democracies born under weak development tend to have weak institutions and slow growth, while those with adequate development at the time of political transition establish strong institutions and achieve faster growth. The second chapter explores the effect of curriculum control in schooling on national innovation and individual creativity. The evidence suggests that more centralized curriculum control, as indicated by more centralized official curriculum design together with more frequent high-stakes achievement exams, tends to reduce individual creativity and weaken national innovation. The third chapter studies how state capacity affects investment in human capital, economic growth, and democratization. It shows that autocracy may not necessarily inhibit economic growth when a country is poor but state capacity is strong, while democracy facilitates growth more when a country is rich. In particular, the relationship between state development and democratization follows an inverted U-shape.
In this thesis, we study reinforcement learning algorithms that collectively optimize decentralized policies in a large population of autonomous agents. One of the main bottlenecks in large multi-agent systems is the size of the joint trajectory of agents, which quickly increases with the number of participating agents. Furthermore, the noise of actions concurrently executed by different agents in a large system makes it difficult for each agent to estimate the value of its own actions, which is well known as the multi-agent credit assignment problem. We propose a compact representation for multi-agent systems using aggregate counts to address the high complexity of the joint state-action space, and novel reinforcement learning algorithms based on value function decomposition to address the multi-agent credit assignment problem, as follows:

1. Collective representation: In many real-world systems such as urban traffic networks, the joint reward and environment dynamics depend only on the numbers of agents (the counts) involved in interactions rather than on agent identity. We formulate this sub-class of multi-agent systems as a Collective Decentralized Partially Observable Markov Decision Process (CDEC-POMDP). We show that in a CDEC-POMDP, the transition counts, which summarize the numbers of agents taking different local actions and transiting from their current local states to new local states, are sufficient statistics for learning and optimizing the decentralized policy. Furthermore, the dimensions of the count variables are not affected by the population size. This allows us to transform the original planning problem over the complex joint agent trajectory into one over compact count variables. In addition, samples of the counts can be obtained efficiently from multinomial distributions, which provides a faster way to simulate the multi-agent system and evaluate the planning policy (see the sketch after this list).

2. Collective multi-agent reinforcement learning (MRL): Firstly, to address the multi-agent credit assignment problem in CDEC-POMDPs, we propose the collective decomposition principle for designing value function approximations and decentralized policy updates. Under this principle, the decentralized policy of each agent is updated using an individualized value instead of a joint global value. We formulate a joint update for the policies of all agents using the counts, which is much more scalable than independent policy updates with joint trajectories. Secondly, based on the collective decomposition principle, we design two classes of MRL algorithms, for domains with local rewards and for domains with global rewards respectively. i) When the reward is decomposable into local rewards among agents, we exploit exchangeability in CDEC-POMDPs to estimate the individual value function from sampled values of the counts and average individual rewards. We use this count-based individual value function to derive a new actor-critic algorithm, fAfC, to learn effective individual policies for agents. ii) When the reward is non-decomposable, the system performance is evaluated by a single global value function instead of individual value functions. To follow the decomposition principle, we show how to estimate the individual contribution values of agents using partial derivatives of the joint value function with respect to the state-action counts. On this basis we develop two algorithms, MCAC and CCAC, to optimize individual policies in non-decomposable reward domains.
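As a minimal illustration of the count-based simulation step in item 1, the sketch below samples transition counts with multinomials; the shapes and names are mine, and real CDEC-POMDP domains add observations, rewards, and time indices.

    import numpy as np

    def sample_counts(n_s, policy, P, rng):
        """One simulation step in count space.
        n_s:    (S,) numbers of agents currently in each local state
        policy: (S, A) decentralized policy pi(a|s), shared by all agents
        P:      (S, A, S) local transition probabilities
        returns (S, A, S) transition counts n(s, a, s')."""
        S, A = policy.shape
        counts = np.zeros((S, A, S), dtype=int)
        for s in range(S):
            n_sa = rng.multinomial(n_s[s], policy[s])             # split agents over actions
            for a in range(A):
                counts[s, a] = rng.multinomial(n_sa[a], P[s, a])  # then over next states
        return counts

    rng = np.random.default_rng(0)
    S, A, n = 3, 2, 8000
    policy = np.full((S, A), 1 / A)
    P = rng.dirichlet(np.ones(S), size=(S, A))
    n_s = rng.multinomial(n, np.ones(S) / S)
    counts = sample_counts(n_s, policy, P, rng)
    assert counts.sum() == n   # the (S, A, S) counts summarize all 8000 agents

Note that the count array has fixed dimension S x A x S regardless of the population size, which is exactly why the representation scales.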
Experimentally, we show the superiority of our proposed collective MRL algorithms in various testing domains: a real-world taxi supply-demand matching domain, a police patrolling game, and a synthetic robot navigation domain, with population sizes up to 8000. They converge faster and provide better solutions than other algorithms in the literature, i.e., average-flow-based algorithms and the standard actor-critic algorithm.
Top-K recommendation is a typical task in recommender systems. Traditional approaches mainly rely on modeling user-item associations, which emphasize the user-specific factor, or personalization. Here, we investigate another direction that models item-item associations, especially with the notions of sequence-aware and basket-level adoptions. Sequences are created by sorting item adoptions chronologically. The associations between items along sequences, referred to as "sequential associations", indicate the influence of preceding adoptions on following adoptions. Considering a basket of items consumed at the same time step (e.g., a session or a day), "basket-oriented associations" imply correlative dependencies among these items. In this dissertation, we present research on modeling sequential and basket-oriented associations independently and jointly for the Top-K recommendation task.
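As a toy illustration of the two notions, the sketch below counts sequential associations (across consecutive baskets) and basket-oriented associations (within a basket) from hypothetical data; the dissertation's models are far richer than raw co-occurrence counts.

    from collections import Counter
    from itertools import combinations

    # Hypothetical per-user sequences of baskets, ordered by time step.
    baskets = [[{"milk", "bread"}, {"butter"}],
               [{"milk"}, {"bread", "butter"}]]

    seq, bas = Counter(), Counter()
    for user in baskets:
        for prev, nxt in zip(user, user[1:]):
            seq.update((a, b) for a in prev for b in nxt)   # sequential: prev -> next
        for basket in user:
            bas.update(frozenset(p)                          # basket: unordered pairs
                       for p in combinations(sorted(basket), 2))

    print(seq.most_common(2))   # directed item-to-item influence
    print(bas.most_common(2))   # symmetric within-basket dependency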
Three Essays on Credit Default Swaps
Chapter 1: Credit Default Swaps Pricing Errors and Related Stock Returns
This article investigates the impact of Credit Default Swaps (CDS) pricing errors on related stock returns. Using a parsimonious CDS valuation model, which produces an above-average adjusted R2 of 90%, I find that its pricing errors significantly predict cross-sectional stock returns. Further investigation reveals that the cross-market return predictability operates via Merton (1974)'s structural prediction and primary dealers' capital risk. This paper provides a novel view of the complex interactions of capital markets and offers insights into relative market efficiency.
Chapter 2: CDS Markets Informativeness and Related Hard-to-Value Stock Returns
This research investigates whether the Credit Default Swaps (CDS) market is informed relative to the equity market. To do so, we examine the impact of CDS price changes on stock returns computed from transaction prices over various trading intervals within the daily close-to-close window. We find that stock returns overreact to credit news during trading hours and partially reverse after the market closes. The predictive effect of CDS news concentrates in "hard-to-value" stocks with high credit spreads. The reversal happens mainly because overconfident investors over-bet on credit news. Limits to arbitrage, such as stock illiquidity and short-sale constraints, cannot fully explain the predictive results. Overall, our empirical evidence suggests that informed CDS traders step into hard-to-value stocks with high credit spread levels.
Chapter 3: The Effect of CDS on Earnings Quality: The Role of CDS Information
This paper investigates whether the initiation of trading in credit default swaps (CDSs) on a borrowing firm's outstanding debt is associated with a decline in that firm's earnings quality. Using a difference-in-differences approach, we find that after CDS trade initiation, there is a significant reduction in intentional earnings manipulation by the underlying borrowing firms. The reduction in earnings management activities is channeled through trade credit exposures and corporate cash holdings. Further, we show that CDS prices convey distress risk information about firms with poor earnings quality and help improve their risk fundamentals through conservative liquidity management strategies such as holding more cash, enhancing future operating cash flow, and increasing net working capital. Overall, our evidence suggests that the external monitoring role of CDS markets can reduce earnings management activities and mitigate the information asymmetry between corporate insiders and outsiders.
Advances in artificial intelligence are leading to many revolutions in robotics. How will the arrival of robots affect the growth of the economy, workers' wages, consumption, and lifetime welfare? This dissertation attempts to answer this question with a standard neoclassical growth model featuring two different kinds of robots, reflecting two ways that robots can transform the labor market. The first chapter introduces additive robots, a perfect substitute for human labor, while the second chapter employs multiplicative robots, a type of robot that augments human labor. The main prevailing result is that, even with no population growth and no technical progress, the adoption of robots is enough to create long-term economic growth. Nevertheless, the behavior of the real wage differs: with additive robots alone, the wage jumps down and then stays constant, whereas multiplicative robots alone raise productivity, so the real wage increases rapidly over time.
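One plausible way to formalize the two robot types in a neoclassical production function, in my notation rather than necessarily the dissertation's, is

    \[
      Y_t = K_t^{\alpha}\,(L_t + X_t)^{1-\alpha}
      \qquad \text{(additive robots: perfect substitutes for labor)},
    \]
    \[
      Y_t = K_t^{\alpha}\,\bigl((1+\phi X_t)\,L_t\bigr)^{1-\alpha}
      \qquad \text{(multiplicative robots: labor-augmenting)},
    \]

where \(X_t\) is the robot stock. Under the additive form, labor and robots earn the same marginal product, anchoring the wage; under the multiplicative form, robot accumulation raises effective labor, so the wage grows with \(X_t\), consistent with the contrast described above.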
In the last chapter, both types of robots are introduced into an economy with a shrinking population, motivated by Japan. Under a perfectly homogeneous labor market, workers shift from jobs that can be substituted by additive robots to jobs that can be supported by multiplicative robots. This enables Japan to continue to enjoy perpetual growth in real wages, consumption, and wealth even after the labor market has finished adjusting. However, as the interest rate slowly decreases in proportion to the decline in population, there will come a point at which it is no longer profitable to adopt robots, although it would take a long time for the economy to face that issue.
Over the past few decades, supervised machine learning has been one of the most important methodologies in the Natural Language Processing (NLP) community. Although various supervised learning methods have been proposed and achieve state-of-the-art performance across most NLP tasks, their bottleneck lies in the heavy reliance on large amounts of manually annotated data, which are not always available in the desired target domain/task. To alleviate data sparsity in the target domain/task, an attractive solution is to find sufficient labeled data in a related source domain/task. However, for most NLP applications, due to the discrepancy between the distributions of the two domains/tasks, directly training supervised models only on labeled source data usually results in poor performance on the target. Therefore, it is necessary to develop effective transfer learning techniques that leverage rich annotations in the source domain/task to improve model performance in the target domain/task.
There are generally two settings of transfer learning. We use supervised transfer learning to refer to the setting in which a small amount of labeled target data is available during training; when no such data is available, we call it unsupervised transfer learning. In this thesis, we focus on proposing novel transfer learning methods for different NLP tasks in both settings, with the goal of inducing an invariant latent feature space across domains or tasks, in which the knowledge gained from the source domain/task can easily be adapted to the target domain/task.
In the unsupervised transfer learning setting, we first propose a simple yet effective domain adaptation method that derives shared representations from instance similarity features; it can be applied generally across NLP tasks, and empirical evaluation on several NLP tasks shows that it performs comparably to or even better than a widely used domain adaptation method. Furthermore, we target a specific NLP task, sentiment classification, and propose a neural domain adaptation framework that jointly learns the actual sentiment classification task and several manually designed, domain-independent auxiliary tasks to produce shared representations across domains. Extensive experiments on both sentence-level and document-level sentiment classification demonstrate that the proposed framework achieves promising results.
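One plausible reading of the instance-similarity construction, as a minimal sketch (the thesis's exact features may differ): each instance is re-described by its similarities to a fixed set of exemplar instances drawn from both domains, so that source and target data live in the same, domain-neutral coordinates.

    import numpy as np

    def similarity_features(X, exemplars):
        """Map instances to cosine similarities against a shared exemplar set."""
        Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
        En = exemplars / np.linalg.norm(exemplars, axis=1, keepdims=True)
        return Xn @ En.T                      # (n_instances, n_exemplars)

    rng = np.random.default_rng(0)
    source = rng.normal(size=(100, 50))       # labeled source instances
    target = rng.normal(size=(80, 50))        # unlabeled target instances
    exemplars = np.vstack([source[:10], target[:10]])   # shared anchor set
    Zs = similarity_features(source, exemplars)
    Zt = similarity_features(target, exemplars)         # same feature space as Zs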
In the supervised transfer learning setting, we first propose a neural domain adaptation approach for retrieval-based question answering systems that simultaneously learns shared feature representations and models inter-domain and intra-domain relationships in a unified model; we then conduct both intrinsic and extrinsic evaluations to demonstrate the efficiency and effectiveness of our method. Moreover, we attempt to improve multi-label emotion classification with the help of sentiment classification by proposing a dual attention transfer network, where a shared feature space is employed to capture general sentiment words and another, task-specific space is employed to capture specific emotion words. Experimental results show that our method is able to outperform several highly competitive transfer learning methods.
Although the transfer learning methods proposed in this thesis are originally designed for natural language processing tasks, most of them can potentially be applied to classification tasks in other research communities, such as computer vision and speech processing.
This dissertation consists of three papers in mutual fund governance and market microstructure, analyzing the causal effect of board independence on mutual fund performance and the trading behavior of institutional and informed traders.
Chapter I studies how board independence affects fund performance, in relation to the investment experience of independent directors. Using the SEC amendment in 2001 as an exogenous shock, I find that board independence neither improves nor damages fund performance on average. When a fund board has independent directors with investment experience, however, fund performance improves. I also find that the fund manager is less constrained and the contracted management fee is more closely aligned with fund performance under such a board. My findings suggest that board independence is not always beneficial to mutual fund shareholders; rather, its effectiveness varies with independent directors' investment experience.
Chapter II estimates the daily aggregate order flow of individual stocks from all institutional investors, as well as from hedge funds and other institutions separately. This study is coauthored with my advisor, Prof. Jianfeng Hu. We achieve this by extrapolating the relation between quarterly institutional ownership in 13F filings, aggregate market order imbalance in TAQ, and a representative group of institutional investors' transaction data. We find that the estimated institutional order imbalance positively predicts stock returns on the next day and outperforms other institutional order flow estimates. Institutional order flow from hedge funds creates smaller contemporaneous price pressure and generates greater and more persistent price impact than order flow from all other institutions. We also find that hedge funds trade on well-known anomalies while the other institutions do not. Our findings suggest that the superior trading skill of institutional investors can be largely attributed to hedge funds.
Lastly, in Chapter III, I propose a simple measure of informed trading based on the Kyle (1985) model. This study is also coauthored with my advisor, Prof. Jianfeng Hu. We first compute the implied order imbalance (IOI) as the contemporaneous stock return divided by a low-frequency illiquidity measure. Implied informed trading (IIT) is the residual from regressing IOI on its components (returns and illiquidity). We find that IIT positively predicts short-term future stock returns without subsequent reversals in the cross-section between 1927 and 2016. This predictability is robust in subperiods and strengthens in stocks with high information asymmetry and before corporate events. The predictability survives existing measures of informed trading, including short selling activities, order imbalance, and institutional trading, in recent periods. Finally, IIT has the same predictive ability in G10 equity markets.
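In schematic form, following the description above (the regression specification shown is a paraphrase, not necessarily the chapter's exact equation):

    \[
      \mathrm{IOI}_{i,t} \;=\; \frac{r_{i,t}}{\mathrm{ILLIQ}_{i,t}}, \qquad
      \mathrm{IOI}_{i,t} \;=\; \alpha + \beta_1\, r_{i,t} + \beta_2\, \mathrm{ILLIQ}_{i,t} + \varepsilon_{i,t}, \qquad
      \mathrm{IIT}_{i,t} \;\equiv\; \hat{\varepsilon}_{i,t}.
    \]

The division by illiquidity inverts the Kyle (1985) price-impact relation, in which the price change equals lambda times order flow, so IOI backs out the order imbalance implied by the observed return; residualizing on returns and illiquidity then isolates the component not mechanically driven by either input.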
As cities worldwide invest heavily in smart city infrastructure, opportunities open up for a next wave of urban analytics. Unlike their predecessors, urban analytics applications and services can now be real-time and proactive -- they can (a) leverage situational data from large deployments of connected sensors, (b) capture attributes of a variety of entities that make up the urban fabric (e.g., people and their social relationships, transport nodes, utilities, etc.), and (c) use predictive insights to both proactively optimize urban operations (e.g., HVAC systems in smart buildings, buses in the transportation network, crowd-workers, etc.) and promote smarter policy decisions (e.g., land use decisions pertaining to the positioning of retail establishments, incentives and rebates for businesses).
Individual and collective mobility has long been touted as a key enabler of urban planning studies. With the everyday artefacts that a city's population interacts with increasingly embedded with hardware (e.g., contactless smart fare cards that people tap in and out of buses and the metro), and with the sheer uptake of location-based social media platforms in recent years, a wealth of mobility information is available for both online and offline processing. This thesis makes two principal contributions -- it explores how such abundantly available mobility information can be (a) integrated with other urban data to provide aggregated insights into demand for urban resources, and (b) used to understand relationships among people and predict their movement behavior (including deviations from normal patterns). Additionally, this thesis introduces opportunities, and offers preliminary evidence, for how mobility information can support a more efficient urban sensing infrastructure.
First, the thesis explores how mobility can be combined with other urban data for better policy decisions and resource utilization prediction. It investigates how aggregate mobility data from heterogeneous sources, such as public transportation and social media, can aid in quantifying urban constructs (e.g., customer visitation patterns, mobility dynamics of neighborhoods), and then demonstrates their use, as an example, in predicting the survival chances of individual retailers, a key performance measure of a city's land use decisions.
In the past, studies have relied on the predictability of mobility to generate various urban insights. In a complementary effort, by demonstrating the ability to predict instances of unpredictability sufficiently in advance, this thesis explores opportunities to proactively optimize urban operations by harnessing such unpredictability. First, it looks at individual mobility at campus scale to discover and quantify social ties. It then describes a framework to detect episodes of future anomalous mobility using social-tie-aware mobility information, and uses such early warnings in an exemplar smart campus application: task assignment for workers on a mobility-aware crowd-sourcing platform.
In a final exposition of the emerging possibilities of using mobility for real-time operational optimization, I introduce a paradigm for collaboration between co-located sensors in dense deployments that exploits human mobility at short spatio-temporal scales. As preliminary work, this thesis investigates how associations between densely co-located cameras with partially overlapping views can reinforce one another's inferences for better accuracy, and offers evidence of the feasibility of running adaptive, lightweight deep learning operations that drastically cut processing latencies.
This thesis provides additional examples of real-time, in-situ, mobility-driven urban applications, and concludes with key future directions.
The explosive growth of the ecosystem of personal and ambient computing devices, coupled with the proliferation of high-speed connectivity, has enabled extremely powerful and varied mobile computing applications that are used everywhere. While such applications have tremendous potential to improve the lives of impaired users, most mobile applications have designs too impoverished to be inclusive, lacking support for users with specific disabilities. Mobile app designers today have inadequate support for designing existing classes of apps to serve users with specific disabilities and, even more so, lack support for designing apps that specifically target these users. One way to resolve this is to use an empathetic computing system that lets designer-developers step into the shoes of impaired users and experience the impairment while evaluating the designs of mobile apps.
A key challenge in enabling this is supporting real-time naturalistic interactions in an interaction environment that maintains consistency between the user's tactile, visual, and proprioceptive perceptions with no perceivable discontinuity. This has to be done within an immersive virtual environment, which allows control of visual and auditory artefacts to simulate impairments. Achieving this requires substantial consideration of the interaction experience and coordination between the various system components.
We designed Empath-D, an augmented virtuality system that addresses this challenge. I show in this dissertation that, through the use of naturalistic interaction in augmented virtuality, the immersive simulation of impairments can better support identifying and fixing impairment-specific problems in the design of mobile applications.
The dissertation was validated as follows. I first demonstrate, in a design study, that the concept of immersive evaluation results in lower mental demands for designers. I then show that Empath-D, despite the latencies introduced by creating the augmented virtuality, is usable and has interaction performance closely matching physical interaction, sufficient for most application uses except where rapid interaction is required, such as in games. Next, I show that Empath-D is capable of simulating impairments so as to produce similar interaction performance. Finally, in an extensive user study, I demonstrate that Empath-D identifies more usability problems for specific impairments than state-of-the-art tools.
This thesis is, to the best of my knowledge, the first work of its kind to i) design and examine an augmented virtuality interface that supports naturalistic interaction with a mobile device, and ii) examine the impact of immersive simulations of impairments in evaluating the designs of mobile applications for accessibility.
Flexible Moral Behavior in the Workplace
In my dissertation, I systematically examine what it means to be morally flexible. I develop a scale to capture an individual's willingness to adapt their moral behavior and examine both positive and negative consequences of this type of moral flexibility in the workplace. My dissertation consists of three studies. In Chapter 2, I draw from the personality strength literature and research on within-person variability in moral behavior to introduce the construct of moral adaptability (MA), defined as the willingness to adjust moral behavior depending on the situation. I argue that MA functions in a manner similar to personality strength (but in the moral domain), such that individuals low in MA are most likely to express their moral values, while individuals high in MA are much more susceptible to situational influences and less likely to express their moral preferences. I develop and validate a scale assessing the construct and demonstrate in six independent samples (four samples of undergraduate students at a university in Singapore and two samples of working adults in the United States) that the scale shows good psychometric properties. I also provide initial empirical evidence for how the scale functions: in two independent samples, I illustrate how low MA strengthens the positive relationship between moral character (i.e., the internalization dimension of moral identity, Honesty-Humility) and moral behavior (i.e., charitable behavior) and explains both positive and negative employee outcomes (i.e., constructive deviance, unethical pro-organizational behavior).
Building on these results, in Chapter 3 I draw from feelings-as-information theory and propose that individuals high in MA use situations to justify their past unethical behavior and therefore experience less guilt and shame. An experience-sampling study with 55 undergraduate students in Singapore shows that respondents high in MA experienced less guilt and shame in the aftermath of their unethical behavior than those low in MA, and that MA explained the felt emotions above and beyond a wide array of other traditional moral constructs.
In Chapter 4, I integrate the concept of ethical leadership to examine the implications of MA for workplace interpersonal relationships and leader influence on subordinates. Drawing from research on role modeling and moral self-threat, I hypothesize that subordinates who perceive ethics to be highly relevant to them and work under an ethical leader with low MA are more likely to experience higher self-threat due to the leader and become less likely to perceive their ethical leader as a role model. A two-wave sample of 486 subordinate-supervisor dyads in organizations in India provides partial support that subordinates who perceive ethics to be highly relevant and work under an ethical leader with low MA experience higher self-threat due to the leader.
In each chapter, I discuss the contributions of this dissertation to the organizational ethics literature, practical implications for managers in organizations, limitations, and directions for future research.
In this dissertation, we address the modeling and simulation of agents and their movement decisions in a network environment. We emphasize the development of high-quality agent-based simulation models as a prerequisite to using the model as an evaluation tool for various recommender systems and policies. To achieve this, we propose a methodological framework for the development of agent-based models, combining approaches such as discrete choice models and data-driven modeling.
The discrete choice model is widely used in the field of transportation, with a distinct utility function (e.g., demand- or revenue-driven). Through discrete choice models, the movement decisions of agents depend on a utility function, where every agent chooses a travel option (e.g., travel to a link) out of a finite set. In our work, we not only demonstrate the effectiveness of this model in transportation with a multiagent simulation model and a tiered decision model, but also demonstrate our approach in other domains (i.e., leisure and migration), where the utility function might not be as clear or may involve various qualitative variables.
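For concreteness, here is a minimal multinomial-logit movement choice in the spirit described above; the feature values and taste weights are purely illustrative, not calibrated parameters from the dissertation.

    import numpy as np

    def choose_option(features, beta, rng):
        """Multinomial-logit choice: one agent picks a travel option
        from a finite set with P(i) proportional to exp(V_i), V_i = beta . x_i."""
        v = features @ beta                 # systematic utility of each option
        p = np.exp(v - v.max())             # softmax, stabilized
        p /= p.sum()
        return rng.choice(len(p), p=p)

    rng = np.random.default_rng(1)
    # hypothetical options described by (travel_time, cost)
    features = np.array([[10.0, 2.0], [15.0, 1.0], [8.0, 3.5]])
    beta = np.array([-0.3, -0.8])           # illustrative taste weights
    print(choose_option(features, beta, rng))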
The contribution of this dissertation is therefore two-fold. We first propose a methodological framework for development of agent-based models under the conditions of varying data observability and network model scale. Thereafter, we demonstrate the applicability of the proposed framework through the use of three case studies, each representing a different problem domain.
Information Diffusion and Market Friction
How markets impound information into asset prices is one of the most important concerns of financial economics. Due to behavioural biases and transaction frictions, information can be mispriced in the real world, driving market anomalies and the return predictability of behavioural factors. My dissertation contributes to the literature by investigating how information is quantified, acquired, disseminated, and priced in financial markets in the presence of market frictions.
In Chapter 2, we propose an efficient method based on machine learning and textual analysis to quantify cross-industry news and shed light on how news travels across different industries. The results show that cross-industry news contains valuable information about firm fundamentals that is not fully captured by firms' own news or within-industry peers' news. Stock prices do not promptly incorporate cross-industry news, generating return predictability. Moreover, underreaction to cross-industry news is more pronounced among smaller stocks that are more illiquid, more volatile, and have fewer analysts following. A long-short strategy exploiting cross-industry news yields annual alphas of over 10%.
In Chapter 3, we construct a novel measure of market-wide investor attention by applying social network analysis to aggregate attention spillover effects among stocks co-mentioned in media news. Empirically, we find that the News Network Triggered Attention index (NNTA) negatively predicts market returns, with a monthly in-sample (out-of-sample) R-squared of 5.97% (5.80%). In the cross-section, a long-short portfolio based on news co-occurrence generates a significant monthly alpha of 68 basis points. We further validate the attention spillover effect by showing that news co-mentions increase Google and Bloomberg search volumes significantly more than unconditional news coverage does. The results suggest that attention spillover in a news-based network can lead to significant stock market overvaluation, especially when arbitrage is limited.
Besides behavioural biases, security analysts also appear to contribute to market friction by issuing biased recommendations. In Chapter 4, we find that biased analyst recommendations can be a source of market friction that impedes the efficient correction of mispricing. In particular, analysts tend to make more favourable recommendations for overvalued stocks, which have particularly negative abnormal returns ex post, while analysts whose recommendations are better aligned with anomaly signals are more skilled and elicit stronger recommendation announcement returns.
Comparison Mining from Text
Online product reviews are important factors in consumers' purchase decisions. Reviews pervade more and more spheres of our lives: there are reviews of books, electronics, groceries, entertainment, restaurants, travel experiences, and more. As reported by various consumer surveys, more than 90 percent of consumers read online reviews before they purchase products. This observation suggests that product review information enhances the consumer experience and helps consumers make better-informed purchase decisions. An enormous number of online reviews are posted on e-commerce platforms such as Amazon, Apple, Yelp, and TripAdvisor. They vary in information and may be written from different experiences and preferences.
If online opinions are indeed important in many spheres of our lives, then their systematic analysis is a real-life problem. With an enormous number of opinions scattered across the Web, handcrafted analysis carries an inadmissible cost in time and effort. The alternative is automated or, more appropriately, semi-automated analysis conducted by computers to assist human analysts. Text processing applications have received much attention over the past three decades and have proven successful for language understanding.
Comparison mining addresses opinion mining problems in which multiple entities are present simultaneously. This includes, but is not limited to, deriving similarities and differences between entities and discovering information about entity relations. The entities may be products, individuals, issues, etc. The notion of comparison arrives in the form of joint evaluative statements, such as "I think A is better than B" and "I think A is a good alternative to B", and introduces new research questions, similar to and yet different from traditional opinion mining. How do we find these statements in a review? How do we interpret them? How do we make sense of thousands of such comparisons? In this study, we seek to answer these questions and propose a set of related computational solutions.
First, we investigate the comparison identification problem and cast it as a relation extraction problem. Within the relation extraction setup, we develop a new approach for identifying comparative relations. A formal investigation of the syntactic structure of comparative statements leads us to a kernel-based approach that relies on the dependency structure of sentences. The proposed method shows state-of-the-art results for the comparison identification problem.
Second, we explore intrinsic properties of a comparative corpus to derive a joint model for comparison interpretation and aggregation. At the level of individual comparisons, the model seeks to derive the outcome of a statement, i.e., which entity is preferred by the writer. At the aggregated level, it seeks to recover the overall ranking of the entities in a corpus of comparisons. The proposed model is shown to be superior to approaches that tackle each level separately, and an empirical evaluation demonstrates its effectiveness on real-world datasets.
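To convey the flavor of the aggregation level only, the sketch below ranks entities from pairwise outcomes with a standard Bradley-Terry model fitted by gradient ascent; the joint model in this work is more sophisticated and also infers each comparison's outcome from text.

    import numpy as np

    def bradley_terry(comparisons, n_entities, iters=200, lr=0.1):
        """Score entities from (winner, loser) pairs so that
        P(w beats l) = sigmoid(s[w] - s[l]); higher score = preferred."""
        s = np.zeros(n_entities)
        for _ in range(iters):
            g = np.zeros(n_entities)
            for w, l in comparisons:
                p = 1.0 / (1.0 + np.exp(s[l] - s[w]))   # P(w beats l)
                g[w] += 1 - p                            # gradient of log-likelihood
                g[l] -= 1 - p
            s += lr * g
        return s

    # "A is better than B", "B is better than C", "A is better than C"
    print(bradley_terry([(0, 1), (1, 2), (0, 2)], 3).round(2))  # A > B > C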
Third, we look at the phenomenon of comparison disagreement, i.e., different users may have different preferences over the same set of entities. To capture this diversity, we propose a model for preference clustering and demonstrate its effectiveness and utility.
Fourth, we propose a method for explaining entity comparisons, when entities are identified by their textual representations. CompareLDA, a supervised topic model, is employed to align topics, distributions of co-occurring words, with comparisons, so that the topics are indicative of the "better" and "worse" entities. Through an empirical evaluation, we show that the proposed model is more effective for capturing comparisons than alternative supervised topic models.
All the proposed methods form a substantial contribution to comparison mining research and facilitate a better understanding of the language of opinions.
Innovation and creativity are the engines of social and economic progress. What roles do women play in innovation? Emerging evidence reveals that fewer women than men enter and succeed in innovation-related fields. Tackling gender inequality at work has always been one of the grand societal challenges; however, little is known about gender issues specific to innovation achievements. This dissertation attempts to explain gender gaps in the innovation and creativity context. Innovation typically involves generating multiple novel and useful ideas, selecting the most promising one for implementation, and persistently championing the idea through implementation. I theorize and unpack the gender effect situated in different stages of innovation, specifically in idea selection and idea championing.
I propose that although women are as capable as men of generating highly novel ideas, there is greater "novelty avoidance" among women than men, that is, a greater tendency to refrain from pursuing the most novel ideas one has generated. In a series of studies designed to feature the innovation process (Studies 1-3), I showed the differential influence of gender on idea generation and idea selection. Furthermore, I tested three alternative explanations for the gender difference in novelty avoidance, namely, risk aversion, interdependent self-construal, and fear of social backlash associated with novelty (Study 2). Results suggest that fear of social backlash associated with novelty explains the gendered novelty-avoiding/seeking tendencies. I also proposed and showed that the gender difference in novelty avoidance was alleviated when women were told that their innovation would be judged by other women (Study 3).
For idea championing (Studies 4-6), I theorize that women employees are less likely than their men colleagues to engage in autonomous idea championing, that is, bypassing norms, rules, and established procedures to promote creative ideas. Drawing on the "creative prototype model", I further theorized and showed that the more men employees autonomously champion their creative ideas, the more their supervisors perceive them as creative. In contrast, when women employees engaged in autonomous championing behaviors, they faced backlash, especially from their women supervisors.
I conclude by discussing the implications of these findings and future research to help advance current understanding of the challenges and opportunities that surround women innovators.
Being born into a poorer family is associated with lower socioeconomic attainment even when people are provided with identical educational and job opportunities, a pattern known as the "class ceiling." The class ceiling is generated within organizations, but the specific reasons for this effect are not well understood. I propose that one important explanation for why employees from poorer families do not fare as well as their more fortunate co-workers concerns differences in families themselves. I integrate research from sociology and psychology explaining the challenges faced by families with scarce resources with organizational research on the specific pathways through which families can interfere with employees' work activities. This theoretical integration suggests that higher family demands (in terms of time and values) and lower family resources (instrumental support and behavioral scripts) among workers from poorer backgrounds deplete employees' personal resources, and thus act as a mechanism of disadvantage reproduction after workers join the organization. A large field study of early-career employees in Singapore who managed to obtain a higher education and secure high-potential jobs provides support for the model. I propose and test both institutional and individual solutions to the problem. I show that higher organizational support can compensate for lower family resources, but I also find that, at present, most organizations fail to provide such support. Second, I develop and test a psychological intervention that helps workers from poorer backgrounds cope more effectively with higher family demands. A two-week field experiment utilizing a diary study design provides evidence of the effectiveness of the intervention. Taken together, this research uncovers a fundamental process through which the class ceiling is generated and offers solutions to the identified issues, with implications for socioeconomic mobility, employee wellbeing, organizational effectiveness, and a positive role of organizations in society.
This dissertation addresses the empirical analysis of user-generated data from multiple online social platforms (OSPs) and the modeling of latent user factors in a multiple-OSP setting.
In the first part of this dissertation, we conducted cross-platform empirical studies to better understand users' social and work activities on multiple OSPs. In particular, we proposed new methodologies to analyze users' friendship maintenance and collaborative activities across multiple OSPs. We also applied the proposed methodologies to real-world OSP datasets, and the findings from our empirical studies provide a better understanding of users' social and work activities that was not uncovered in single-OSP studies.
In the second part of this dissertation, we developed user modeling techniques to learn latent user factors in the multiple-OSP setting. In particular, we proposed generative models to learn users' topical interests, topic-specific platform preferences, and influence across multiple OSPs. The proposed models were also applied to real-world OSP datasets to profile user topical interests and identify influential users across multiple OSPs. The designed generative models are generalizable and can be applied to different cross-OSP datasets.
Question answering (QA) is one of the most important applications of natural language processing. With the explosion of text data on the Internet, intelligently finding answers to questions will help humans collect useful information more efficiently. My research in this thesis focuses on solving the question answering problem with textual sequence matching models, which build vectorized representations of pairs of text sequences to enable better reasoning. The thesis consists of three major parts.
In Part I, we propose two general models for building vectorized representations over a pair of sentences, which can be directly used to solve tasks such as answer selection and natural language inference. In Chapter 3, we propose a model named ``match-LSTM'', which performs word-by-word matching followed by an LSTM to place more emphasis on important word-level matching representations. On the Stanford Natural Language Inference (SNLI) corpus, our model achieved state-of-the-art performance. In Chapter 4, we present a general ``compare-aggregate'' framework that performs word-level matching followed by aggregation using convolutional neural networks. We explore six different comparison functions for word-level matching and find that some simple comparison functions based on element-wise operations work better than comparisons based on standard neural networks and neural tensor networks.
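To make the element-wise comparison idea concrete, below is a minimal Python sketch; the function names, vector dimensions, and random inputs are illustrative assumptions, not the thesis implementation.

```python
# Sketch of element-wise comparison functions in the "compare-aggregate"
# spirit: each compares an attended question vector h with a passage
# word vector a of the same dimension (shapes are assumed).
import numpy as np

def compare_sub(a, h):
    # SUB-style: squared element-wise difference (distance-like signal)
    return (a - h) * (a - h)

def compare_mult(a, h):
    # MULT-style: element-wise product (similarity-like signal)
    return a * h

def compare_submult(a, h):
    # Concatenate both signals; a small feed-forward layer would
    # typically follow to mix them before CNN aggregation
    return np.concatenate([compare_sub(a, h), compare_mult(a, h)])

a = np.random.rand(100)    # hypothetical 100-d word representation
h = np.random.rand(100)    # hypothetical attended counterpart
t = compare_submult(a, h)  # 200-d matching vector fed to an aggregator
```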
In Part II, we use the sequence matching model to address machine reading comprehension, where the model must answer a question based on a specific passage. In Chapter 5, we explore the power of word-level matching for locating the answer span in the given passage for each question. We propose an end-to-end neural architecture based on match-LSTM and Pointer Net, which constrains the output tokens to come from the given passage, and we further propose two ways of using Pointer Net for this task. Our experiments show that both of our models substantially outperform the best results obtained with logistic regression and manually crafted features. Moreover, our boundary model achieved the best performance on the SQuAD and MS MARCO datasets. In Chapter 6, we explore another challenging task, multiple-choice reading comprehension, where several candidate answers are given in addition to the question-related passage. We propose a new co-matching approach to this problem, which jointly models whether a passage can match both a question and a candidate answer.
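The boundary idea can be illustrated with a toy sketch: score every passage position as a potential answer start and end, then select the best valid span. The shapes and random weights below are assumptions for illustration, not the trained architecture.

```python
# Toy boundary-style span prediction: per-token match representations H
# are scored for start/end, and the highest-probability span is chosen.
import numpy as np

rng = np.random.default_rng(0)
L, d = 30, 64                      # hypothetical passage length / hidden size
H = rng.standard_normal((L, d))    # per-token match representations
w_start = rng.standard_normal(d)   # scoring vectors (learned in practice)
w_end = rng.standard_normal(d)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

p_start = softmax(H @ w_start)     # distribution over start positions
p_end = softmax(H @ w_end)         # distribution over end positions

# Choose the span (s, e) with s <= e maximizing p_start[s] * p_end[e].
best = max(((s, e) for s in range(L) for e in range(s, L)),
           key=lambda se: p_start[se[0]] * p_end[se[1]])
print("predicted answer span:", best)
```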
In Part III, we focus on open-domain question answering, where, unlike in reading comprehension, no specific passage is given. Our models for this problem still rely on the textual sequence matching model to build ranking and reading comprehension components. In Chapter 7, we present a novel open-domain QA system called Reinforced Ranker-Reader (R3), which jointly trains the Ranker along with an answer-extraction Reader model using reinforcement learning. We report extensive experimental results showing that our method significantly improves on the state of the art for multiple open-domain QA datasets. Because this system can only use a single retrieved passage to answer the question, in Chapter 8 we propose two models, strength-based re-ranking and coverage-based re-ranking, which use multiple passages to generate answers. Our models achieve state-of-the-art results on three public open-domain QA datasets: Quasar-T, SearchQA, and the open-domain version of TriviaQA, with improvements of about 8 percentage points on the first two datasets.
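The intuition behind strength-based re-ranking can be shown in a few lines: a candidate answer extracted from many retrieved passages accumulates evidence across them. The scores below are hypothetical placeholders, not output of the actual system.

```python
# Simplified strength-based re-ranking: sum reader confidence for each
# candidate answer over all retrieved passages, then pick the strongest.
from collections import defaultdict

# (candidate answer, reader confidence) pairs, one per retrieved passage
extractions = [("1989", 0.6), ("1989", 0.5), ("1991", 0.9), ("1989", 0.4)]

strength = defaultdict(float)
for answer, score in extractions:
    strength[answer] += score   # evidence aggregated across passages

# "1989" wins despite a single higher-scoring "1991" extraction
print(max(strength, key=strength.get))
```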
Financial Market Implications of Marketing Actions and the Disclosure of Marketing Information
This dissertation contributes to the marketing literature by examining the effects of marketing actions and the disclosure of marketing information from a financial-market perspective, which has high managerial relevance and policy implications. Specifically, the first essay provides the first empirical examination of the effects of the disclosure of advertising spending on investors’ and analysts’ uncertainty. It responds to calls by the Marketing Science Institute and the Marketing Accountability Standards Board to examine the consequences of disclosing marketing metrics. The second essay examines the effect of advertising spending on firm value and explores firm- and market-level contingency effects. In doing so, the second essay identifies relevant firm financial conditions and market environments and provides managerial implications. Importantly, to identify the proposed effects in both essays, I apply an instrumentation strategy to address potential endogeneity concerns related to the disclosure of advertising spending and the level of advertising spending. Specifically, I develop and propose instruments that are alternatives to those frequently used in the extant marketing literature. Accordingly, this dissertation contributes to the marketing literature by proposing alternative instruments that can mitigate the critiques of Rossi (2014) and Angrist (2014) and can potentially be used in other contexts.
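As generic context for the instrumentation strategy, a minimal two-stage least squares sketch on synthetic data is shown below; the essays' actual instruments and variables are not reproduced, and all names and numbers here are illustrative.

```python
# Generic 2SLS on synthetic data: z instruments the endogenous regressor x.
import numpy as np

rng = np.random.default_rng(1)
n = 5000
z = rng.standard_normal(n)                    # instrument (relevant, excluded)
u = rng.standard_normal(n)                    # unobserved confounder
x = 0.8 * z + u + rng.standard_normal(n)      # endogenous regressor
y = 1.5 * x + u + rng.standard_normal(n)      # outcome; true effect = 1.5

Z = np.column_stack([np.ones(n), z])          # first-stage design

# Stage 1: project x on the instrument; Stage 2: regress y on fitted x
x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
X_hat = np.column_stack([np.ones(n), x_hat])
beta = np.linalg.lstsq(X_hat, y, rcond=None)[0]
print("2SLS estimate:", beta[1])              # near 1.5; OLS would be biased up
```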
This dissertation discusses two questions in international economics. The first two chapters focus on global value chains. We first explore Singapore’s participation in global value chains by characterizing its position in the global network and identifying its key upstream and downstream trade partners. This is done both at the country aggregate and at the sector level. We trace how the country’s position in global value chains has changed over the past two decades: whether it has moved upstream or downstream, how involved it is in global value chains, and how its trend compares with those of other major Asian exporters (including Japan, Korea, China, Taiwan, and Hong Kong). In addition, the chapter identifies the key sectors of Singapore that play a major role in global trade networks.
The second chapter expands the analysis to a larger trading bloc – the Comprehensive and Progressive Agreement for Trans-Pacific Partnership (CPTPP). This is an example of a “mega-regional” free trade agreement, whose provisions on rules of origin and trade facilitation can have potentially large impacts on CPTPP-wide supply chains. We investigate whether the CPTPP members are key upstream and downstream trade partners to one another in global value chains. In doing so, we evaluate how closely connected the CPTPP members were with one another in global value chains before the formation of the CPTPP, and ask whether alternative groupings with the addition of some third country would enhance the tightness of the network. We develop formulas for bilateral upstreamness and downstreamness based on the gross-export decomposition framework of Koopman, Wang, and Wei (2014) and Borin and Mancini (2014). The chapter demonstrates how the decomposition of gross exports can be used to construct informative measures of countries’ positions in global value chains.
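For context, one standard single-economy formulation of upstreamness from the input-output literature is sketched below; the chapter's bilateral measures extend this type of recursion through the gross-export decomposition, and its exact formulas are not reproduced here.

```latex
% Illustrative upstreamness recursion (input-output literature, not the
% chapter's bilateral measures):
U_i \;=\; 1 \;+\; \sum_{j} \frac{d_{ij}\, Y_j}{Y_i}\, U_j ,
\qquad\text{i.e.}\qquad
U \;=\; \left[\, I - \Delta \,\right]^{-1} \mathbf{1},
\quad \Delta_{ij} = \frac{d_{ij}\, Y_j}{Y_i},
```

where $d_{ij}$ is the amount of sector $i$'s output required to produce one dollar of sector $j$'s output and $Y_i$ is sector $i$'s gross output; a larger $U_i$ indicates production further upstream from final demand.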
The final chapter explores the effects of fluctuations in the crude oil market on various external accounts. We employ a structural vector autoregression model to investigate the impact of oil price changes on the external balances of oil-exporting and oil-importing countries. We look deeper into each country’s non-oil trade balance to determine the dynamics of durable and non-durable trade in response to both demand and supply oil price shocks. We find that the source of crude oil price fluctuations leads to diverse effects on macroeconomic aggregates as well as on the exports and imports of goods. The chapter reaffirms the importance of distinguishing between shocks in the energy market when studying their effects and formulating appropriate policies.
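A minimal recursive (Cholesky-identified) VAR sketch is shown below using synthetic data; the variable ordering loosely follows the oil-market literature, but the chapter's actual specification, identification scheme, and data are not reproduced.

```python
# Recursive VAR on synthetic data with a stylized oil-market ordering;
# orthogonalized impulse responses correspond to Cholesky identification.
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(2)
T = 300
data = pd.DataFrame(rng.standard_normal((T, 3)),
                    columns=["oil_supply", "real_activity", "oil_price"])

res = VAR(data).fit(2)       # 2 lags, chosen here for illustration only
irf = res.irf(12)            # impulse responses up to 12 periods ahead
# Shape (13, 3, 3): horizon x responding variable x identified shock
print(irf.orth_irfs.shape)
```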
Essays in Asset Pricing
My dissertation centers on two areas related to market microstructure. The first is the role of retail traders, and the information they possess, in financial markets; I use retail short selling as a lens and study these traders' patterns and strategies. The second is the role of passive indexers, such as index funds and ETFs, in financial markets; I analyze the channels through which these nominally uninformed traders influence the returns and market quality of the underlying stocks.
In Chapter 2, I study retail short sellers. It is interesting to paint retail short sellers in a positive light, because most of the literature assumes that retail investors are noise traders who are unlikely to take short positions. First, using novel short-sale transaction data from 2010 to 2016, this paper is the first to provide a comprehensive sample of short selling initiated by retail investors. I find that retail short selling is far from negligible, accounting for around 11% of retail trading. Second, using this sample, I find that retail short selling predicts negative stock returns. A trading strategy that mimics weekly retail shorting earns an annualized risk-adjusted value- (equal-) weighted return of 6% (12.25%). This predictive ability goes beyond that of overall retail investors as a group or of off-exchange institutional short sellers. Third, my results suggest that retail short sellers can profitably exploit public information, especially when it is negative. Retail short sellers also tend to be contrarians who provide liquidity when the market is one-sided due to (institutional) buying pressure. Therefore, this paper broadens our understanding of the heterogeneity of short sellers, sheds new light on the strategies of informed traders, and complements a growing literature on the informativeness of retail investors.
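The mechanics of such a mimicking strategy can be sketched as a weekly decile sort; the data below are random placeholders, not the actual retail short-sale sample, and the sorting variable name is an assumption.

```python
# Schematic weekly long-short portfolio: short stocks with heavy retail
# shorting, buy stocks with light retail shorting, rebalance weekly.
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
n = 500
week = pd.DataFrame({
    "retail_short_ratio": rng.uniform(0, 0.3, n),    # retail shorting / volume
    "next_week_ret": rng.standard_normal(n) * 0.02,  # next-week return
})

# Sort into deciles; long D1 (light shorting), short D10 (heavy shorting)
week["decile"] = pd.qcut(week["retail_short_ratio"], 10, labels=False)
long_leg = week.loc[week["decile"] == 0, "next_week_ret"].mean()
short_leg = week.loc[week["decile"] == 9, "next_week_ret"].mean()
print("equal-weighted long-short return:", long_leg - short_leg)
```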
In Chapter 3, I turn to passive investing. Since the decision to buy or sell stocks in passive investments is often driven by broader fund flows and rebalancing rather than by stock fundamentals, we construct proxies for the two sources of trading in passive investments: proportional flow-induced trading and disproportional index rebalancing. We then consider systematic information and use three measures of price efficiency: price delay, the variance ratio, and return synchronicity. We find that indexing significantly increases price efficiency, especially with respect to market- (industry-) wide information. Two related channels drive this positive effect. The first is arbitrage: passive investing creates price discrepancies and lowers arbitrage costs, which increases the speed at which systematic information is incorporated into stock prices. The second is short selling: additions to indexes increase the supply of lendable shares, which reduces the cost of short selling and thus speeds up the incorporation of (negative) information into prices. Overall, this paper establishes and explains the link between indexing and market quality.
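Of the three measures, the variance ratio is the simplest to illustrate: under a random walk, the variance of k-period returns is k times the variance of one-period returns, so deviations of the ratio from one proxy for inefficiency. The sketch below uses placeholder returns and is not the chapter's exact estimator.

```python
# Variance ratio on placeholder daily returns; |VR - 1| near 0 indicates
# random-walk-like (efficient) pricing.
import numpy as np

def variance_ratio(returns, k):
    # Overlapping k-period returns vs. k times the one-period variance
    multi = np.convolve(returns, np.ones(k), mode="valid")
    return multi.var() / (k * returns.var())

rng = np.random.default_rng(4)
r = rng.standard_normal(1000) * 0.01    # placeholder daily returns
print(abs(variance_ratio(r, 5) - 1))    # close to 0 for a random walk
```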
In Chapter 4, I start instead from passive holdings. Notably, index funds are now the largest stakeholders of most S&P 500 stocks, and they are regarded as long-term holders. It is therefore important to examine how fund managers are motivated to monitor, vote, and engage with firm-level governance and long-term performance. We find that stocks with the longest passive holding periods indeed outperform. We stress that the active monitoring role of passive funds contributes to long-term value creation.
Dynamic analysis is widely used in malware detection, taint analysis, vulnerability detection, and other areas that enhance the security of Android. Compared to static analysis, dynamic analysis is immune to common code obfuscation techniques and dynamic code loading. Existing dynamic analysis techniques rely on in-lab running environments (e.g., modified systems, rooted devices, or emulators) and require automatic input generators to execute the target app. However, these techniques can be bypassed by anti-analysis techniques that allow apps to hide sensitive behavior when an in-lab environment is detected through predefined heuristics (e.g., the IMEI number of the device is invalid). Meanwhile, current input generators are still not intelligent enough to invoke adequate app behavior and provide sufficient code coverage. It is therefore an important research direction to investigate dynamic analysis techniques that enable more complete execution in real running environments. This thesis focuses on dynamically analyzing app behavior using public APIs and side-channel information, so that the techniques can be deployed on unrooted devices used by the public.
We first propose an advanced code obfuscation technique that hides small pieces of sensitive code using a code-reuse technique. This technique can hinder existing static analysis as well as dynamic analysis based on code-level events, such as API calls or Dalvik instructions. We implement a semi-automatic tool named AndroidCubo and show that it protects both Java and native code with small runtime overhead.
Since code-level event monitoring for revealing underlying app behavior can be bypassed by obfuscation and anti-analysis techniques, we propose a novel technique for dynamically monitoring apps by observing changes to public resources on the device. We observe these resources through public APIs and virtual file interfaces to monitor sensitive behavior, and then use machine learning techniques to identify the app that initiated the behavior. We implement a system named UpDroid, which consists of a monitor published on Google Play and a server-side analyzer. UpDroid can be easily deployed on devices used by the public and successfully monitors the sensitive behavior of the app under analysis. This work demonstrates that dynamic analysis on unrooted devices is feasible.
To conduct more fine-grained analysis of apps, we propose using GPU interrupt timing information to infer the launched app and concrete behavior within a running app, such as layout switching. We obtain GPU interrupt timing information from a side channel, /proc/interrupts. We sample the number of raised GPU interrupts to obtain a timing series while an activity occurs on the device, and use the series to generate a feature vector for that activity. We then use machine learning techniques to train classification models for the activities. With these models, we are able to identify different types of app activities, e.g., identify the launched app or distinguish activities within an app. This work further demonstrates the effectiveness of dynamic analysis on unrooted devices.
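The sampling step can be sketched as below; it requires a Linux-style /proc filesystem, and the GPU interrupt line label ("kgsl" here) is a device-specific assumption, as is the choice of sampling interval. The resulting per-interval counts form the feature vector fed to a classifier.

```python
# Poll /proc/interrupts, extract the GPU line's per-CPU counts, and
# build a timing series of interrupts raised per sampling interval.
import time

def gpu_interrupt_count(label="kgsl"):
    with open("/proc/interrupts") as f:
        for line in f:
            if label in line:
                fields = line.split()
                # fields[0] is the IRQ number ("170:"); the purely
                # numeric fields that follow are per-CPU counts
                total = 0
                for tok in fields[1:]:
                    if not tok.isdigit():
                        break
                    total += int(tok)
                return total
    return 0

def sample_series(duration_s=2.0, interval_s=0.01):
    series, t_end = [], time.time() + duration_s
    while time.time() < t_end:
        series.append(gpu_interrupt_count())
        time.sleep(interval_s)
    # Differences give interrupts per interval: the activity's feature vector
    return [b - a for a, b in zip(series, series[1:])]

features = sample_series()   # one vector per observed activity
```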
Finally, we conduct a simulation study to dynamically analyze the factors that affect malware spreading on unrooted devices. In this work, we recruit participants to spread messages to their friends, simulating the spreading messages sent by malware from infected mobile devices. Each message contains a malicious-looking link to simulate a malware downloading link. While the participants spread the messages, we use dynamic analysis to monitor the status of their devices and record the infection rate. By analyzing the infection rates across device statuses and across variations of the spreading messages, we show that the spreading method, the sender-receiver relationship, and contact frequency significantly affect the spreading of malware.