Security and Privacy Analysis of Web Browsing

Web privacy

Digital advertising drives much of today’s online world, and web tracking has become omnipresent and harmful to users’ privacy. As individuals move between web pages, an intricate web of cookies, hidden pixels, fingerprinting scripts, and sophisticated algorithms silently monitors their every click. The result is an ongoing tug-of-war between web privacy technologies and web tracking techniques.

OmniCrawl: Comprehensive Measurement of Web Tracking With Real Desktop and Mobile Browsers

The OmniCrawl project focuses on web tracking in the mobile environment. Most prior research has concentrated on desktop browsers or emulated mobile environments, potentially overlooking the nuances of mobile tracking. To address this gap, we introduce OmniCrawl, a novel web measurement infrastructure that runs measurements with real desktop and mobile browsers, and we use it to highlight the limitations of emulated mobile browsers in research. Using OmniCrawl, we found that the third-party advertising and tracking ecosystem in mobile browsers is more comparable to that of desktop browsers than previously thought. We also show that common methodological choices in web measurement studies, such as the use of emulated mobile browsers and Selenium, can lead to website behavior that differs from what users actually experience.

Fig 1. OmniCrawl overview and workflow

Investigating Advertisers’ Domain-changing Behaviors and Their Impacts on Ad-blocker Filter Lists

The second study shifts the focus to ad blockers and their susceptibility to replica ad domains (RAD domains). Ad blockers traditionally rely on filter lists to block ads and trackers, but a rising trend of registering new domains that are akin to the original ones has raised concerns about the efficacy of filter lists.
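To illustrate why a newly registered look-alike domain can slip past a filter list, here is a minimal sketch of domain-based blocking. It is not the matching logic of any particular ad blocker or of this study; the function names, the `BLOCKED` set, and the `.test` domains are all hypothetical, and real filter lists support far richer rule syntax than exact domain entries.

```python
# Hypothetical filter list: blocked ad domains (".test" is a reserved TLD).
BLOCKED = {"ads-example.test"}


def domain_suffixes(host: str) -> set[str]:
    """Return the host and each parent domain, excluding the bare TLD."""
    parts = host.lower().split(".")
    return {".".join(parts[i:]) for i in range(len(parts) - 1)}


def is_blocked(host: str, blocked_domains: set[str]) -> bool:
    """Block a host if it, or any parent domain, is on the list."""
    return not domain_suffixes(host).isdisjoint(blocked_domains)
```

Under these assumptions, `is_blocked("cdn.ads-example.test", BLOCKED)` is true because the parent domain is listed, while a replica such as `ads-examp1e.test` is not blocked until someone adds it to the list.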

Fig 2. The percentage of ad domains and RAD domains, by purpose. The populations are ad domains (5,133) and RAD domains (420) with identifiable purposes, respectively.

We conducted an in-depth investigation of RAD domains, aiming to quantify their prevalence and impact. From a crawl of 50,000 websites, we identified 1,748 unique RAD domains, 1,096 of which survived for an average of 410.5 days before they were blocked. Additionally, we discovered that RAD domains affected 10.2% of the websites we crawled, and that 23.7% of the RAD domains exhibited privacy-intrusive behaviors, undermining ad blockers’ privacy protection.

Website Fingerprinting

Website fingerprinting (WF) attacks allow an adversary who can observe a victim’s traffic patterns to predict which website the victim is visiting. On the negative side, website fingerprinting can erode users’ privacy, even when they use Tor. On the positive side, it can help detect abnormal websites based solely on their traffic patterns so that they can be blocked.

FALCO: Detecting JavaScript-based Cyber Attack Using Website Fingerprints

We developed FALCO, a system that detects JavaScript injection attacks, which expose users to browser-based DDoS and unwanted ads. FALCO detects these attacks by identifying discrepancies in website behavior fingerprints, which are obtained from a site’s external domain dependencies. FALCO achieves a 96.98% detection rate. We also offer an easy-to-use browser extension for users.

Fig 3. Obtain fingerprint from requests using Bloom filter
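The idea of deriving a fingerprint from a page's external requests with a Bloom filter can be sketched as follows. This is only an illustration under stated assumptions, not FALCO's actual implementation: the filter size, the number of hash functions, the SHA-256-based hashing, the simple same-site check, and all function names are hypothetical choices made for the example.

```python
import hashlib
from urllib.parse import urlsplit

M_BITS = 1024   # Bloom filter size in bits (assumed value)
K_HASHES = 3    # hash functions per item (assumed value)


def _positions(item: str):
    """Yield K_HASHES bit positions for an item, derived from SHA-256."""
    for i in range(K_HASHES):
        digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
        yield int.from_bytes(digest[:8], "big") % M_BITS


def fingerprint(page_host: str, request_urls) -> int:
    """Fold the external hosts contacted by a page into a Bloom filter.

    The filter is stored as an integer bitmask. Requests to the page's
    own host or its subdomains are skipped (a simplistic same-site test).
    """
    bits = 0
    for url in request_urls:
        host = urlsplit(url).hostname
        if not host or host == page_host or host.endswith("." + page_host):
            continue
        for pos in _positions(host):
            bits |= 1 << pos
    return bits


def unexpected_bits(baseline: int, observed: int) -> int:
    """Count bits set in the observed fingerprint but not in the baseline."""
    return bin(observed & ~baseline).count("1")
```

In this sketch, a visit whose fingerprint sets bits that are absent from the known-good baseline has contacted at least one external domain the baseline never saw, which is the kind of discrepancy that would warrant a closer look. Because Bloom filters admit false positives, a new domain can occasionally hide behind bits that are already set, so the size parameters trade memory against missed detections.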

Know Your Victim: Tor Browser Setting Identification via Network Traffic Analysis

In another project, we developed a method to identify users’ browser settings through network traffic analysis, addressing privacy concerns in the Tor network. We demonstrate that browser settings significantly affect traffic, and we build a classifier that achieves over 99% accuracy under closed-world assumptions. The project contributes insights into the relationship between browser settings and network behavior.

Fig 4. Feature Set Summary

Publications