Web Surfing Under Surveillance: What Data Do Browser Developers Collect?
When you browse websites, someone is watching you. This has become almost normal: data collection is now built not only into web pages but also into many programs. We conducted a study to find out exactly what information popular browser developers learn about you and how much this affects your privacy.
Methodology
When it comes to web surfing, there are two fundamentally different types of data collection: what the browser itself does, and what scripts on websites do.
We used OpenStat statistics to select the most popular browsers in Russia, focusing on desktop versions for Windows. The most common were Google Chrome (almost half the market), Yandex Browser (often installed alongside other programs), Mozilla Firefox, and Opera. Apple Safari was excluded since its Windows version hasn’t been updated since 2012. Microsoft Edge, pre-installed in Windows 10, had a small market share, but we expected it to be the most “spy-like.” We also included Internet Explorer, which is often overrepresented in stats due to other programs identifying as IE.
We evaluated the “spying” behavior of browsers in several stages. First, we downloaded the latest official versions, installed them on clean operating systems, and launched them with default settings. Then we changed the start page to blank and repeated the experiment. Finally, we left the browser open on a blank page (about:blank) for an hour to see if it made any network requests besides checking for updates.
All tests were run in virtual machines under both Windows 10 and Windows XP, so that the browsers’ own traffic could be separated from the operating system’s background traffic. As we found in our earlier Windows 10 study, the OS itself sends a lot of data to Microsoft, which makes browser activity hard to isolate. Edge (and other browsers) can also send requests through system processes, so simply filtering traffic by process name is unreliable.
We used several tools to monitor browser network activity: TCPView for real-time connections, MakeCert to generate a fake security certificate used to decrypt HTTPS traffic, AppContainer Loopback Exemption to bypass Windows 10 app isolation (especially for Edge and IE), and Wireshark for detailed log analysis. To ensure no packet went unnoticed, we also used a hardware sniffer: a TP-Link MR3040 router flashed with OpenWrt and set up as an intermediate router to log all traffic in real time.
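The results below count connections by the subnet they go to. As a rough illustration of that bookkeeping, here is a minimal Python sketch that groups destination addresses from a capture by network. The function name, the /24 prefix length, and the sample addresses (taken from reserved documentation ranges) are assumptions made for the example, not values from our logs.

```python
import ipaddress
from collections import defaultdict


def group_by_subnet(addresses, prefix_len=24):
    """Group IPv4 destination addresses by the /prefix_len network they fall in."""
    groups = defaultdict(list)
    for addr in addresses:
        # strict=False lets us pass a host address and get its enclosing network.
        network = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        groups[str(network)].append(addr)
    return dict(groups)


# Hypothetical destinations from documentation ranges, not real capture data.
captured = [
    "192.0.2.10",
    "192.0.2.77",
    "198.51.100.5",
    "203.0.113.9",
    "198.51.100.200",
]

summary = group_by_subnet(captured)
for network, hosts in sorted(summary.items()):
    print(network, len(hosts))
```

In practice the prefix length would have to be chosen per address block (for example from WHOIS data), since providers rarely allocate neat /24 networks.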
Legalized Surveillance
The idea that someone else knows what you do on your computer has become familiar. Many people are calm about it because they don’t understand the volume and nature of the data sent about their activity. On the other hand, privacy advocates may see any log transmission as a violation. The truth is somewhere in between, and we tried to get as close to it as possible.
Most users believe only “anonymous statistics for product improvement” are collected, as stated in installation warnings. However, the wording is often vague, ending with “…and other information,” giving company lawyers free rein.
Google knows all your contacts, addresses, and even your health status. Microsoft can identify your handwriting. Free antivirus programs (and many paid ones) can legally send any file to their developers as “suspicious.” Compared to this, browsers seem relatively harmless, but their data collection can still have consequences. Let’s see what they send and where.
Google Chrome
On first launch, Chrome 56.0 makes nine connections to Google servers in four subnets. One subnet is in Russia, serviced by Rostelecom. Chrome sends information about its version, OS version, and recent user network activity. If there’s no activity (first launch), the log says “No recent network activity.”
It requests certificates to verify Google.com and its mirrors, including analytics and statistics sites. If you log into Google, extra traffic goes to other Google subnets and servers. Every new tab triggers connections to the same subnets. Chrome generates a unique identifier (X-Client-Data) and uses cookies (NID). All tabs in one browser share the same X-Client-Data identifier.
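To show where these two identifiers sit in a request, here is a small Python sketch that pulls the X-Client-Data header and the NID cookie out of a raw HTTP request. The request text and both values are made-up placeholders, and the function name is hypothetical. It only illustrates how such fields can be located in decrypted traffic, not what Chrome actually puts in them.

```python
from http.cookies import SimpleCookie


def extract_identifiers(raw_request: str) -> dict:
    """Pull the X-Client-Data header and the NID cookie out of a raw HTTP request."""
    headers = {}
    # Skip the request line; stop at the blank line that ends the header block.
    for line in raw_request.split("\r\n")[1:]:
        if not line:
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    cookies = SimpleCookie()
    cookies.load(headers.get("cookie", ""))
    nid = cookies["NID"].value if "NID" in cookies else None
    return {"x_client_data": headers.get("x-client-data"), "nid": nid}


# Placeholder request: the header and cookie values are invented for the example.
sample = (
    "GET / HTTP/1.1\r\n"
    "Host: www.google.com\r\n"
    "X-Client-Data: EXAMPLEVALUE==\r\n"
    "Cookie: NID=placeholder123; OTHER=1\r\n"
    "\r\n"
)

print(extract_identifiers(sample))
```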
Some traffic goes to Yandex servers, but in our test, only empty “keep alive” packets were sent. Other traffic is related to Google SafeBrowsing and update checks.
Yandex Browser
Yandex Browser 17.3 is more active from the start, making dozens of connections, not just to Yandex but also to Mail.ru, VKontakte, and even Google. This is likely due to partnerships and alternative search options. Some traffic is sent through system processes and goes to the same addresses as the browser’s own connections.
The most detailed data goes to api.browser.yandex.ru, including computer and browser configuration, password manager status, and number of bookmarks. It also reports on other browsers installed and their status. In our test, this data totaled 86 KB in plain text, even though the browser was freshly installed. The log even showed our video card as “VirtualBox Graphics Adapter,” meaning Yandex Browser can detect if it’s running in a virtual environment.
Yandex Browser also determines the device’s physical location using the Wi2Geo geolocation service, sending coordinates and accuracy to wi2geo.mobile.yandex.net, even if you don’t explicitly allow it.
Microsoft Edge
We tested Microsoft Edge 38.14 on Windows 10 build 1607. Edge is almost always active, even if you don’t launch it, making connections to Microsoft servers in the background. On launch, it connects to seven major Microsoft networks, which serve both content delivery and large-scale data collection.
Surprisingly, Edge didn’t show obvious suspicious activity. The only indirect identifiers were basic telemetry, User-Agent, and cookies. With a blank start page, traffic was minimal. The only oddity was a string with DefaultLocation and MUID values sent to msn.com, but these were encoded.
Based on previous research, we suspect Edge’s modest behavior is an illusion. Because Edge is part of Windows 10, Microsoft has many ways to collect detailed user and network activity data, not necessarily through the browser itself.
Opera
During installation, Opera 43.0 sends traffic not only to opera.com but also to BitGravity and EdgeCast servers. That traffic contains only anonymized IDs and the browser and OS versions. On each start, Opera displays ads for various brands as part of its monetization model. We even saw a comment from Booking.com in the intercepted traffic, inviting developers to work in Amsterdam.
Besides opera.com, Opera often connects to Wikimedia’s Dutch network for SSL certificate checks. All “personal data” is limited to the User-Agent string. Data compression via Opera Turbo uses a system process and sends traffic to opera-mini.net servers.
In our test, Opera behaved modestly. With default settings, it loaded a lot of ad content at startup, but these connections soon closed. Opera did not disclose any sensitive details.
Mozilla Firefox
Mozilla Firefox uses Amazon Web Services, as shown by multiple connections to compute.amazonaws.com on startup. Traffic also goes to Akamai, Cloudflare, EdgeCast, and Google for updates and quick search requests. By default, new tabs show links to other Mozilla projects, with images loaded from the web.
Main statistics are sent to telemetry.mozilla.org and are minimal and harmless. Physical location is determined via Mozilla Location Service only if the user allows it in settings (“Menu → Tools → Page Info → Permissions → Access Your Location”). We found no suspicious activity; all traffic matched the user agreement.
Encryption vs. Encoding
Encryption makes data unreadable without a key, while encoding only converts data into a compact, standardized form that anyone can reverse without a key. Browsers often send traffic that is both encrypted and encoded. Some variables are obvious (e.g., s:1440x900x24 for screen resolution and color depth), while others are less clear (e.g., _ym_uid or fpr:335919976901). These are usually processed by automated systems, not humans.
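The difference is easy to see in a few lines of Python. In this sketch, Base64 stands in for "encoding" (our choice for the example; the browsers may use other schemes), and the helper that splits the s: field is a hypothetical name. The point is that the encoded value is recovered without any key.

```python
import base64


def parse_screen_field(field: str) -> dict:
    """Split a value like 's:1440x900x24' into width, height and color depth."""
    key, _, value = field.partition(":")
    width, height, depth = (int(part) for part in value.split("x"))
    return {"key": key, "width": width, "height": height, "depth": depth}


# Encoding: reversible by anyone, no secret involved.
encoded = base64.b64encode(b"s:1440x900x24")
decoded = base64.b64decode(encoded).decode("ascii")

print(encoded)
print(parse_screen_field(decoded))
```

An encrypted value, by contrast, could not be turned back into s:1440x900x24 without the key, which is why part of the intercepted traffic stays opaque.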
How to Increase Your Privacy
Preventing most browser statistics from being sent is fairly simple. Uncheck “Send usage statistics” or similar options during installation or in the privacy settings. Also enable “Send Do Not Track requests” and “Ask before sending my location,” and disable “Automatically send problem reports.”
“Do Not Track” adds a header to outgoing traffic, but it’s up to each website to honor it. Disabling automatic location sharing means sites can’t determine your location without permission. Disabling problem reports prevents sending detailed crash or error data to developers. If you don’t use “hacker” extensions or settings, you can help developers improve the browser by leaving some options enabled.
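For reference, the Do Not Track signal is just one extra request header, DNT with the value 1. The Python sketch below builds (but does not send) a request carrying it; the function name and URL are placeholders for the example.

```python
import urllib.request


def build_request(url: str, do_not_track: bool = True) -> urllib.request.Request:
    """Build a request object, optionally carrying the Do Not Track header."""
    request = urllib.request.Request(url)
    if do_not_track:
        # urllib normalizes the stored name to "Dnt"; HTTP header names
        # are case-insensitive, so servers treat it the same as "DNT".
        request.add_header("DNT", "1")
    return request


req = build_request("https://example.com/")
print(req.header_items())
```

Nothing in the protocol forces the receiving site to act on this header, which is why the setting is a request rather than a guarantee.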
Conclusions
We tested popular browsers and intercepted and analyzed the traffic they generate automatically. Our conclusions are cautious: some of the data is encrypted and encoded, so its purpose remains unknown. The situation can change at any time with new browser versions or privacy policy changes.
The browsers we tested do send data to developers and partners during use, whether on a computer or smartphone. However, this data is generally not personal or sensitive. Most is technical and fairly minimal: screen resolution, processor architecture (but not model or serial number), number of open tabs (but not their addresses), and the number of saved passwords (not the passwords themselves).
Before logging into any online service, a user can only be identified indirectly. However, even general technical data can form a unique fingerprint. It’s unlikely many people have the same OS version, browser, install date, plugins, bookmarks, monitor resolution, processor type, RAM, and other small details. This digital fingerprint doesn’t reveal your identity but can reliably distinguish you from others.
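A simplified Python sketch shows why this works. It hashes a set of technical attributes into a single string. The attribute names and values are invented for the example, and SHA-256 over sorted JSON is our assumption, not a method we observed any browser or site using.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Hash a set of technical attributes into one stable identifier."""
    # Sorting keys makes the result independent of the order attributes were collected in.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical machines that differ in a single detail.
machine_a = {
    "os": "Windows 10 build 1607",
    "browser": "Chrome 56.0",
    "screen": "1440x900x24",
    "ram_gb": 8,
}
machine_b = dict(machine_a, screen="1920x1080x24")

print(fingerprint(machine_a))
print(fingerprint(machine_b))
```

The two machines get different identifiers even though neither value contains a name, an address, or any other personal detail.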
Browsers and sites assign users anonymous IDs (alphanumeric strings) to keep statistics separate. Developers don’t care about your name or preferences; that’s the domain of marketing departments and social networks. Search engines, social networks, online games, dating, and job sites are the main hunters for personal data, but that’s another story.