Etd

Modeling User Behaviors in Online Systems and Ubiquitous Devices

Public Deposited

The proliferation of online systems and near-ubiquitous ownership of smart devices have not only dramatically changed people's lifestyles but also opened up new avenues for research. The data we generate in our everyday lives, captured through sensor signals in smart devices and reflected in posts on online social networks, is a treasure trove for understanding user behaviors. This dissertation makes three novel contributions that harness this data across three problem settings.

First, we tackle single-modality, single-label classification problems, a common scenario in which data is available but insufficient. This is particularly relevant in natural language processing, where text classification problems abound. Our research addresses character-level and phonetic-level manipulations that current detection frameworks struggle with, and proposes practical solutions that combine domain-specific augmentation methods. These solutions have the potential to significantly improve the accuracy and efficiency of text classification, offering a comprehensive approach to general text classification. Our experiments use posts from online social networks together with benchmark datasets, and the resulting performance gains are reported in several research papers.

Second, we delve into more challenging problems that are multimodal in nature and for which very little data is accessible. To counter the lack of data, we propose several scalable, automatic data collection and filtering (semi-labeling) methods, used together with a labeled dataset. We then benchmark the results with multimodal frameworks. Our data analysis also yields insights that improve our understanding of the data collection and filtering process and provide a solid foundation for future research in this area. This work draws on image and text data from online social networks. We publicly share our labeled dataset and provide benchmarking results.

Lastly, we address existing multi-label classification problems through modality transformation and beyond. We focus on human activity recognition tasks in which data is collected through smart mobile sensors. We provide several transformations of the label encoding for multi-label classification that go beyond straightforward one-hot encodings. In particular, we provide graph transformations that capture label co-occurrence relationships, and we build on them with various heterogeneous hypergraph learning frameworks for better classification results. We further discuss potential future work on leveraging language models that better capture the semantic meaning of the classification labels. These designs improved performance across multiple datasets.

Creator
Contributors
Degree
Unit
Publisher
Identifier
  • etd-121233
Advisor
Committee
Defense date
Year
  • 2024
Date created
  • 2024-04-19
Resource type
Source
  • etd-121233
Rights statement
Last modified
  • 2024-05-29

Relations

In Collection:

Content

Articles

Permanent link to this page: https://digital.wpi.edu/show/vq27zs266