Edit 14-05-2025 7:00am PT: Microsoft, via Twitter (below), has now stated that the company does not use the data to train its large language models (AI models).

In the M365 apps, we do not use customer data to train LLMs. This setting only enables features requiring internet access like co-authoring a document. https://t.co/o9DGn9QnHb November 25, 2024

Microsoft Office 2024

It is not a secret that Microsoft’s Office has Connected Experiences, which analyze content created by users. However, according to @nixCraft, an author of Cyberciti.biz, Microsoft’s Connected Experiences feature automatically gathers data from Word and Excel files to train the company’s AI models. This feature is turned on by default, meaning user-generated content is included in AI training unless manually deactivated. However, this deactivation is a very convoluted process. Microsoft has yet to comment on the information, so take it with a grain of salt [EDIT: as stated above, Microsoft has now said this feature does not enable AI training].

This default setting allows Microsoft to use documents such as articles, novels, or other works intended for copyright or commercial purposes without explicit consent. The implications are significant for creators and businesses relying on Microsoft Office for proprietary work, as their data could become part of the company’s AI development. For this reason, anyone concerned about protecting their intellectual property or sensitive information should take action immediately.

Anton Shilov

To do so, users must actively opt out by finding and disabling the feature in settings. The process requires unchecking the box ‘Turn on optional connected experiences’ that is enabled by default.

On a Windows PC, the steps include going to File > Options > Trust Center > Trust Center Settings > Privacy Options > Privacy Settings > Optional Connected Experiences and unchecking the box. Requiring seven steps to disable a critical feature that is turned on automatically seems very convoluted.
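For readers who would rather script the change than click through the menus, the same opt-out can in principle be expressed as a registry policy. The sketch below is illustrative only: the registry path, the value name `controllerconnectedservicesenabled`, and the meaning of the value `2` are assumptions based on Microsoft's documented privacy policy settings for Microsoft 365 Apps, not something stated in the report above, and consumer editions of Office may not honor them. The helper `build_reg_file` is a hypothetical name; it only produces the text of a `.reg` file and does not touch the registry.

```python
# Sketch only: the registry path and value name are assumptions taken from
# Microsoft's documented privacy policy settings. Verify them before use.
POLICY_KEY = r"HKEY_CURRENT_USER\Software\Policies\Microsoft\office\16.0\common\privacy"
VALUE_NAME = "controllerconnectedservicesenabled"  # assumed policy value name
DISABLED = 2  # assumed meaning: 1 = enabled, 2 = disabled


def build_reg_file(key: str = POLICY_KEY, name: str = VALUE_NAME, value: int = DISABLED) -> str:
    """Return the text of a .reg file that sets a single DWORD policy value."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        f'"{name}"=dword:{value:08x}',  # .reg files store DWORDs as 8 hex digits
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Print the file contents for review; nothing is written to the registry.
    print(build_reg_file())
```

Saving the printed text as a `.reg` file and importing it on a test machine would be one way to check whether the checkbox in Privacy Settings actually changes state.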

Microsoft’s approach mirrors a broad trend in the tech industry, where other companies have introduced similar features to train their AI models. While all AI models are trained on something generated by humans, doing so without their consent is unethical, to put it mildly.

Microsoft has not publicly confirmed or denied that it uses content from Excel and Word documents generated by users of Microsoft Office to train its AI models. Nonetheless, there is a clause in Microsoft’s Services Agreement that grants the company ‘a worldwide and royalty-free intellectual property license to use Your Content.’


“To the extent necessary to provide the Services to you and others, to protect you and the Services, and to improve Microsoft products and services, you grant to Microsoft a worldwide and royalty-free intellectual property license to use Your Content, for example, to make copies of, retain, transmit, reformat, display, and distribute via communication tools Your Content on the Services,” the clause reads.

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers, and from modern process technologies and the latest fab tools to high-tech industry trends.