Maintaining Privacy While Collecting WhatsApp Data: Insights from Research Associate Pallab Deb in India

12/10/2024
Photo of multiple phones laid out on top of a desk. The phones are opened to the WhatsApp app and the text on WhatsApp messages is blurred.

At the Digital Witness Lab, we build platforms and tools in service of public-interest research. This means a lot of our work—coding, building, and labeling—happens behind the scenes. Our WhatsApp Watch project has led us to create one of the largest sets of WhatsApp message data. To understand what we’re learning from the data, we turn to our team members working on it. Keep reading to hear from Pallab Deb, one of our India-based Research Associates.

The conversation has been edited for clarity.

What do you do at the Digital Witness Lab? I’m a Research Associate at the Lab, and the larger project I’m working on is WhatsApp Watch. We’re collecting messages from WhatsApp groups to monitor the platform, and for that we need to be inside a lot of groups. My job is largely sourcing those groups for data collection as well as analyzing the data collected. I also contribute to the Lab’s writing and publications.

Where did you start finding groups? Our first project was around elections, so we were looking to join groups either formally or informally associated with political parties. I started by visiting political parties’ social media profiles and checking whether they’d mentioned any WhatsApp groups there. But I quickly realized that political party groups aren’t often shared online. We’ve had the most success in joining relevant WhatsApp groups through our contacts. I was able to reach out to political consultants who had access to WhatsApp groups and were willing to share them with us. Once we’re inside these groups, members post invitation links to other groups, which we join as well. For some parties, we did manage to find invitation links online, which eventually led to finding more links within the groups.

Are there any other ways you’re collecting data? We’ve also worked with data donors who have given us access to WhatsApp groups that they’re already a part of. Building trust has been essential for that. We understand that it can feel scary to link WhatsApp groups with external systems, but data donors are able to choose which groups they want to share data from, and we assure them of our policy of not collecting personally identifiable information.

How do you maintain group members’ privacy in your research? As a researcher, I’m looking at the content of a message and its metadata outside of the WhatsApp interface. We don’t collect names, user names, phone numbers, profile photos or statuses. Basically anything that could link a particular person to a particular group or message, we’re not collecting.
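To make that policy concrete, here is a minimal sketch of how a collector might keep only non-identifying fields from a message record. It is an illustration, not the Lab's actual pipeline: the field names (`text`, `timestamp`, `forwarded`, `media_type`, `sender_name`, `phone_number`) and the function `keep_safe_fields` are all hypothetical, since the interview does not describe the real schema.

```python
from typing import Any

# Hypothetical allow-list: only these fields are ever stored.
# The real schema used by the Lab is not described in this interview.
SAFE_FIELDS = {"text", "timestamp", "forwarded", "media_type"}


def keep_safe_fields(message: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the message containing only allow-listed fields."""
    return {key: value for key, value in message.items() if key in SAFE_FIELDS}


# Example record with made-up values.
raw_message = {
    "text": "Rally at 5 pm",
    "timestamp": "2024-05-01T17:00:00",
    "forwarded": True,
    "sender_name": "Example Person",
    "phone_number": "+00 0000 000000",
}

clean_message = keep_safe_fields(raw_message)
```

An allow-list is used here rather than a list of fields to remove, so that any new identifying field is dropped by default. Note that this sketch does not address identifying details that might appear inside the message text itself.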

How are the messages collected? At first we had one phone where we had created a WhatsApp account and began joining groups in order to collect message data from them. We joined so many groups at first that the phone started crashing; the volume of multimedia and messages coming in really affects the health of the device. So, soon after that, our Technical Lead Micha Gorelick got us a second and third phone to enable more data collection, and now I believe we have 13 phones with four more on the way.

What did you end up learning about elections and WhatsApp? The major takeaway has been seeing the level of coordination that political parties have on WhatsApp. There are groups for the district that you’re in, the subdistrict, the city, and even the particular area of your city. So WhatsApp allows political actors to be very granular in their targeting. For example, if a party wants to send a particular message to only women living in a particular city, they can often find a group for that.

The Bharatiya Janata Party (BJP) has really squeezed everything it can out of WhatsApp. Now it has an actual informal role in the party called WhatsApp pramukh or chief, where the person’s job is to manage WhatsApp groups.

What makes studying WhatsApp different from other platforms? WhatsApp is more insidious in a lot of ways. First of all, for a majority of people in India with smartphones, WhatsApp is the primary messaging app. So imagine the app that you use to talk to your friends or your family daily is also flooding you with political content and potentially harmful misinformation. As opposed to, say, going on Facebook or Instagram and then finding things there, WhatsApp messages are really pushed onto people.
