By Raúl Limón
Artificial intelligence is a data devourer. To be effective, it has to be, but a scarcity of what it feeds on can be a serious problem, particularly for AI agents: conversational bots able to act on behalf of users to make purchases, answer emails, and manage invoices and schedules, among dozens of other tasks. To do so, they need to know the person they are talking to, learn about their life, and violate their privacy, something they sometimes have permission to do. Big tech companies are already investigating how to tackle this problem on several fronts. But in the meantime, according to Hervé Lambert, global consumer operations manager at Panda Security, AI access to data poses risks of “commercial manipulation, exclusion, or even extortion.”
AI’s problematic relationship with private information has been demonstrated by researchers at University College London (UCL) and the Mediterranea University of Reggio Calabria in a study presented at the USENIX Security Symposium in Seattle. According to the report, AI web browser assistants engage in widespread tracking, profiling, and personalization practices that raise serious privacy concerns.
In tests using a user profile invented by the researchers, AI web browser assistants shared search information with their servers, and even banking and health data, as well as the user’s IP address. All demonstrated the ability to infer attributes such as users’ age, sex, salary, and interests, and they used that information to personalize responses, even across different browsing sessions. Only one assistant, Perplexity, showed no evidence of profiling or personalization.
“Although many people are aware that search engines and social media platforms compile information about them for targeted advertising, AI web browser assistants operate with unprecedented access to users’ behavior in areas of their online life that should remain private. Even if they offer convenience, our findings show that they sometimes do so at the cost of user privacy, without any transparency or consent and, at times, in violation of privacy legislation and their companies’ own terms of service. This collection and exchange of information is not trivial: beyond the sale and exchange of data with third parties, in a world where mass data breaches are frequent, there is no way of knowing what happens to a search history once it has been collected,” explains Anna Maria Mandalari, lead author of the study, which was conducted in UCL’s electronic and electrical engineering department.
Lambert agrees with the study’s conclusions. “Technology is collecting users’ data, even personal data, to train and improve intelligent systems and machine learning models. This helps companies offer, to put it diplomatically, more personalized services. But developing these new technologies obviously raises a host of questions and concerns about privacy and user consent. Ultimately, we don’t know how companies and their smart systems are using our personal data.”
Among the potential risks cited by Lambert are commercial and geopolitical manipulation, exclusion, extortion, and identity theft. These dangers exist even when users have given their consent, consciously or otherwise. “Platforms,” adds Lambert, “are updating their privacy policies, and that’s a little suspicious. In fact, such updates (and this is important) include clauses that allow for the use of data.” But consumers, in the vast majority of cases, accept the conditions without reading or thinking about them, either to keep using the service or out of sheer haste.
Google is one of the companies that recently changed its privacy terms in order, according to an email sent to its users, to “improve our services.” In that statement, it acknowledges using interactions with its AI applications through Gemini, and it has launched a new feature for those who wish to opt out: the so-called “temporary chat,” which deletes recent queries and prevents the company from using them “to personalize” future queries or “to train models.”
Users have to be proactive to protect themselves from these functions, deactivating the “keep activity” setting and managing and deleting their Gemini app activity. If they fail to do so, their lives will be shared with the company. “A subset of uploads submitted starting September 2 — like files, videos, screens you ask about, and photos shared with Gemini — will also be used to help improve Google services for everyone,” states the corporation. It will also use audio recorded by its AI tools and data from Gemini Live recordings.
“As before, when Google uses your activity to improve its services (including training generative AI models), it gets help from human reviewers. To protect your privacy, we disconnect chats from your account before sending them to service providers,” the company explains in its statement, in which it admits that, even when disconnected from the user’s account, it uses and has used personal data (“As before”) and that it sells or shares it (“sending them to service providers”).
Marc Rivero, lead security researcher at Kaspersky, agrees on the risks involved with the dissemination of information, pointing to the use of WhatsApp data for AI: “It raises serious privacy concerns. Private messaging apps are one of the most sensitive digital environments for users, as they contain intimate conversations, personal data, and even confidential information. Allowing an AI tool to automatically access these messages without clear and explicit consent undermines user trust.”
He adds: “From a cybersecurity perspective, this is also troubling. Cybercriminals are increasingly taking advantage of AI to expand their social engineering attacks and their collection of personal data. If attackers find a way to exploit this kind of interaction, we could be facing a new path to fraud, identity theft, and other criminal activities.”
WhatsApp insists that “your personal messages with friends and family are off limits.” Its AI is trained through direct interaction with the artificial intelligence application and, according to the company, “you have to take action to start the conversation by opening a chat or sending a message to the AI. Only you or a group participant can initiate this, not Meta or WhatsApp. Talking to an AI provided by Meta doesn’t link your personal WhatsApp account information on Facebook, Instagram, or any other apps provided by Meta.” Nonetheless, it does offer a warning: “What you send to Meta may be used to provide you with accurate responses or to improve Meta’s AI models, so don’t send messages to Meta with information you don’t want it to know.”
Storage and file transfer services have also come under scrutiny. The latest example came after the popular site WeTransfer modified its terms of service, a change that was read as a demand for unlimited access to user data to improve future artificial intelligence systems. In response to consumer concerns about the possible free use of their documents and creations, the company was forced to reformulate the clause, offering this clarification: “To be extra clear: YES — your content is always your content. In fact, section 6.2 of our Terms of Service clearly states that you ‘own and retain all right, title, and interest, including all intellectual property rights, in and to the Content.’ YES — you’re granting us permission to ensure we can run and improve the WeTransfer service properly. YES — our terms are compliant with applicable privacy laws, including the GDPR [the European Union’s General Data Protection Regulation]. NO — we are not using your content to train AI models. NO — we do not sell your content to third parties.”
Given the proliferation of intelligent devices, which go far beyond conversational AI chats, Eusebio Nieva, technical director of Check Point Software for Spain and Portugal, advocates for rules that guarantee transparency and explicit consent, security requirements for devices, and prohibitions of and restrictions on high-risk providers, as in the European regulation. “Privacy violations underline the need for consumers, regulators, and companies to work together to guarantee security,” he says.
Lambert agrees and calls on users and companies to take responsibility in this new landscape. He rejects the idea that preventive regulation represents a step backward for development. “Protecting our users does not mean that we are going to slow down; it means that, from the outset of a project, we include privacy and digital footprint protection, thereby becoming more effective and efficient in protecting our most important assets, which are our users.”
Tech companies are aware of the problem generated by the use of personal data, not just because of the ethical and legal privacy conflicts, but also because, they say, limitations on access to that data are slowing the development of their systems.
Meta founder Mark Zuckerberg has directed the work of the company’s Superintelligence Lab toward “self-improving AI”: systems capable of increasing the performance of artificial intelligence through advances in hardware (particularly processors), in programming (including self-programming), and through the AI itself training the language models on which it is based.
And it’s not just experiments based on synthetic data: tools and guidelines are also being used to adapt behavior to user needs. The startup Sakana AI has created a system called the Darwin Gödel Machine, in which an AI agent modifies its own code to improve its performance on the tasks it is assigned.
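Stripped to its essentials, that kind of loop can be illustrated in a few lines of Python. The sketch below is purely hypothetical and is not Sakana AI’s code: the agent is reduced to a single numeric capability score, and propose_variant() stands in for the real step in which a language model rewrites the agent’s own source. Only the propose-evaluate-keep cycle reflects the idea described above.

import random

random.seed(0)

# A fixed toy benchmark: each task has a difficulty between 0 and 1.
BENCHMARK = [random.random() for _ in range(50)]

def evaluate(capability):
    # Fraction of benchmark tasks the agent "solves" under a toy success rule.
    solved = sum(1 for difficulty in BENCHMARK if capability >= difficulty)
    return solved / len(BENCHMARK)

def propose_variant(capability):
    # Hypothetical stand-in for an LLM rewriting the agent's own code:
    # here it is just a small random tweak to the capability score.
    return min(1.0, max(0.0, capability + random.uniform(-0.05, 0.1)))

# Archive of (agent, score) pairs, starting from a deliberately weak agent.
archive = [(0.2, evaluate(0.2))]

for _ in range(200):
    parent, parent_score = random.choice(archive)  # pick any archived agent
    child = propose_variant(parent)                # the "self-modification" step
    child_score = evaluate(child)
    if child_score > parent_score:                 # keep only improvements
        archive.append((child, child_score))

best_agent, best_score = max(archive, key=lambda pair: pair[1])
print(f"best agent solves {best_score:.0%} of the benchmark tasks")

In this sketch, keeping an archive of variants rather than a single lineage simply means that any promising earlier version can serve as the starting point for the next modification.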
All these advances toward AI that surpasses human intelligence by overcoming obstacles such as data limitations also carry risks. Chris Painter, policy director at the non-profit AI research organization METR, warns that if AI accelerates the development of its own capabilities, it could also be used for hacking, weapons design, and human manipulation.
“The rise in geopolitical tensions, economic volatility, and increasingly complex operational environments, alongside attacks carried out using AI, have left organizations more vulnerable to cyber threats,” says Agustín Muñoz-Grandes, director of Accenture Security in Spain and Portugal. “Cybersecurity can no longer be a last-minute fix. It should be integrated from the design stage of every AI initiative.”