SaTC: Core: Small: Enabling the Automated Delivery of Context-Aware Notifications

Sponsor: National Science Foundation

Award Number: 2346845

Abstract:

Many online platforms notify users when a communication may be harmful, as in the case of false advertising, cyberbullying, scams, or personal threats. These notifications let users see the source or context of what they receive, rather than take it at face value. Previous research, however, has shown that these efforts fail to flag most fraudulent material. The shortfall stems from the difficulty of automatically flagging online material, which forces service providers to rely heavily on manual verification, a scarce resource that cannot keep up with the number of communications posted every day. In this project, the team is developing tools that enable the automated identification of fraudulent material so that users can receive correct notifications.

The project is improving the state of the art in automated identification of fraudulent online material. First, the project team is developing robust stance detection techniques powered by recent advances in large language models. These techniques can enable more precise and effective identification of fraudulent material and raise awareness of its context. Second, the team is developing multi-modal techniques that combine the textual and image components of communications and analyze them together, by adapting computer vision techniques such as perceptual hashing, optical character recognition, and multi-modal embeddings. A main goal throughout the project is to develop techniques that are scalable and can operate on vast amounts of posted material using limited hardware resources. To this end, the researchers are working on reducing the size of the machine learning models involved through model distillation, and on taking advantage of specialized technologies, such as vector databases, to handle embeddings efficiently.
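
As an illustration of one technique named above, perceptual hashing, the following is a minimal sketch of an average hash. It is not the project's implementation. It assumes the image has already been converted to grayscale and downscaled to a small grid of pixel values (a real pipeline would do that step with an imaging library), and the function names and the distance threshold are hypothetical choices made for this example.

```python
from typing import List


def average_hash(pixels: List[List[int]]) -> int:
    """Return an average hash for a small grayscale image.

    Assumption: `pixels` is an already-downscaled grid of 0-255 values.
    Each pixel contributes one bit: 1 if brighter than the mean, else 0.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: int, b: int, max_distance: int = 2) -> bool:
    """Treat two hashes as the same image if few bits differ.

    The threshold of 2 is an illustrative assumption, not a tuned value.
    """
    return hamming_distance(a, b) <= max_distance


# A 4x4 checker-like image, a uniformly brightened copy, and its inverse.
original = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [200, 200, 10, 10],
    [200, 200, 10, 10],
]
brightened = [[p + 20 for p in row] for row in original]
inverted = [[255 - p for p in row] for row in original]

if __name__ == "__main__":
    h_orig = average_hash(original)
    print(is_near_duplicate(h_orig, average_hash(brightened)))  # True
    print(is_near_duplicate(h_orig, average_hash(inverted)))    # False
```

Because each bit records only whether a pixel is above the image's own mean, a uniform brightness change leaves the hash unchanged, which is why a re-encoded or lightly edited copy of an image can still be matched to the original.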
