Privacy-preserving machine learning with tensor networks

Paper title

Privacy-preserving machine learning with tensor networks

Authors

Alejandro Pozas-Kerstjens, Senaida Hernández-Santana, José Ramón Pareja Monturiol, Marco Castrillón López, Giannicola Scarpa, Carlos E. González-Guillén, David Pérez-García

Abstract

Tensor networks, widely used for providing efficient representations of low-energy states of local quantum many-body systems, have recently been proposed as machine learning architectures that could present advantages over traditional ones. In this work we show that tensor-network architectures have especially promising properties for privacy-preserving machine learning, which is important in tasks such as the processing of medical records. First, we describe a new privacy vulnerability that is present in feedforward neural networks, illustrating it in synthetic and real-world datasets. Then, we develop well-defined conditions to guarantee robustness to such a vulnerability, which involve the characterization of models equivalent under gauge symmetry. We rigorously prove that such conditions are satisfied by tensor-network architectures. In doing so, we define a novel canonical form for matrix product states, which has a high degree of regularity and fixes the residual gauge that is left in the canonical forms based on singular value decompositions. We supplement the analytical findings with practical examples where matrix product states are trained on datasets of medical records, which show large reductions in the probability of an attacker extracting information about the training dataset from the model's parameters. Given the growing expertise in training tensor-network architectures, these results imply that one may not be forced to choose between accuracy in prediction and ensuring the privacy of the information processed.
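The gauge symmetry mentioned in the abstract can be illustrated with a small sketch. It is not the paper's canonical form or attack; it only shows the standard gauge freedom of a matrix product state: inserting an invertible matrix X and its inverse on a bond changes the stored parameters but not the function the model represents. The tensor values, the helper names `matmul` and `amplitudes`, and the choice of X below are all hypothetical, picked as integers so the comparison is exact.

```python
from itertools import product

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def amplitudes(mps):
    """Contract an open-boundary MPS into a dict of all its amplitudes.

    Each site is a list indexed by the physical index s (0 or 1),
    holding the matrix associated with that value of s.
    """
    out = {}
    for s in product(range(2), repeat=len(mps)):
        m = mps[0][s[0]]
        for site, idx in zip(mps[1:], s[1:]):
            m = matmul(m, site[idx])
        out[s] = m[0][0]  # boundary bonds have dimension 1
    return out

# Hypothetical 3-site MPS with bond dimension 2 (illustrative values only).
A1 = [[[1, 2]], [[0, 1]]]                    # 1x2 matrices
A2 = [[[1, 0], [2, 1]], [[3, 1], [0, 2]]]    # 2x2 matrices
A3 = [[[1], [1]], [[2], [-1]]]               # 2x1 matrices

# An invertible gauge matrix and its exact inverse (assumed example).
X = [[2, 1], [1, 1]]
X_inv = [[1, -1], [-1, 2]]

# Gauge transformation on the bond between sites 1 and 2.
B1 = [matmul(a, X) for a in A1]
B2 = [matmul(X_inv, a) for a in A2]

original = [A1, A2, A3]
regauged = [B1, B2, A3]

# Different parameters, identical represented tensor.
print(B1 != A1, amplitudes(original) == amplitudes(regauged))
```

Because many parameter sets describe the same model, the stored parameters need not carry a trace of the particular training run, which is the intuition behind characterizing models that are equivalent under gauge symmetry.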
