Privacy Artificial Intelligence

Sherpa.ai Federated Learning and Differential Privacy Framework: Protect user privacy without renouncing the power of Artificial Intelligence

The Sherpa.ai Federated Learning and Differential Privacy Framework is an open-source Machine Learning framework that enables collaborative learning without sharing private data. It has been developed to facilitate open research and experimentation in Federated Learning and Differential Privacy.

Federated Learning is a Machine Learning paradigm for learning models from decentralized data, such as data located on users’ smartphones, in hospitals, or in banks, while preserving data privacy. This is achieved by training the model locally on each node (e.g., on each smartphone, at each hospital, or at each bank), sharing the updated local model parameters (not the data), and securely aggregating them to build a better global model.

Federated Learning can be combined with Differential Privacy to ensure a higher degree of privacy. Differential Privacy is a statistical technique for releasing data aggregations while avoiding the leakage of individual data records. It ensures that malicious agents who intercept the communication of local parameters cannot trace this information back to the data sources, which adds a further layer of data privacy.
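The loop described above (train locally, share parameters, aggregate) can be illustrated with a minimal NumPy sketch of federated averaging on a toy linear-regression problem. This is an illustration under stated assumptions, not the framework's implementation: the function names (`local_update`, `federated_average`, `run_federated_rounds`), the synthetic data, and the learning-rate settings are all hypothetical, and the aggregation is a plain weighted mean with no secure aggregation.

```python
import numpy as np

def local_update(weights, features, targets, lr=0.1, epochs=5):
    """Run a few gradient-descent steps of linear regression on one node's private data."""
    w = weights.copy()
    for _ in range(epochs):
        residual = features @ w - targets
        grad = features.T @ residual / len(targets)
        w -= lr * grad
    return w

def federated_average(local_weights, num_samples):
    """Average the nodes' parameters, weighting each node by its sample count."""
    return np.average(np.stack(local_weights), axis=0,
                      weights=np.asarray(num_samples, dtype=float))

def run_federated_rounds(node_data, dim, rounds=50):
    """Alternate local training and aggregation; only parameters leave a node."""
    global_w = np.zeros(dim)
    for _ in range(rounds):
        updates = [local_update(global_w, X, y) for X, y in node_data]
        sizes = [len(y) for _, y in node_data]
        global_w = federated_average(updates, sizes)
    return global_w

# Hypothetical setup: three nodes, each holding a private sample of the same relation.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
node_data = []
for n in (30, 50, 80):
    X = rng.normal(size=(n, 2))
    y = X @ true_w + 0.01 * rng.normal(size=n)
    node_data.append((X, y))

global_w = run_federated_rounds(node_data, dim=2)
print(global_w)  # should land close to true_w
```

In this sketch the raw `(X, y)` pairs never leave the loop over nodes; only the updated weight vectors are passed to the aggregation step.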

This technology could be disruptive in cases where it is compulsory to ensure data privacy, as in the following examples:

  • When data contains sensitive information, such as email accounts, personalized recommendations, and health information, applications should employ data privacy mechanisms to learn from a population of users whilst the sensitive data remains on each user’s device.
  • When data is locked in data silos: an automotive parts manufacturer, for example, may be reluctant to disclose its data, but would benefit from models that learn from other manufacturers' data in order to improve production and supply chain management.
  • Due to data-privacy legislation, banks and telecom companies, for example, cannot share individual records, but would benefit from models that learn from data across several entities.

Sherpa.ai is focused on democratizing Federated Learning by providing methodologies, pipelines, and evaluation techniques specifically designed for Federated Learning. The Sherpa.ai Federated Learning SDK enables developers to simulate Federated Learning scenarios with models, algorithms, and data provided by the framework, as well as their own data.

import numpy as np
import tensorflow as tf
import shfl

from shfl.federated_government import FederatedGovernment

# Load data to use in simulation
database = shfl.data_base.Emnist()
train_data, train_labels, test_data, test_labels = database.load_data()

# Deploy data over data nodes; this call also returns the test split
# that is used for evaluation below
iid_dist = shfl.data_distribution.IidDataDistribution(database)
federated_data, test_data, test_labels = iid_dist.get_federated_data(num_nodes=20,
                                                                     percent=10)

# Create function that builds a model
def model_builder():
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Flatten(input_shape=(28,28)))
    model.add(tf.keras.layers.Dense(64, activation='relu'))
    model.add(tf.keras.layers.Dropout(0.1))
    model.add(tf.keras.layers.Dense(10, activation='softmax'))

    model.compile(optimizer="rmsprop",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])

    return shfl.model.DeepLearningModel(model)

# Choose aggregator
aggregator = shfl.federated_aggregator.FedAvgAggregator()
fed_government = FederatedGovernment(model_builder, federated_data, aggregator)

# Run a few rounds of federated learning, evaluating on the test split
# returned by get_federated_data
fed_government.run_rounds(3, test_data, test_labels)
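
The example above distributes the dataset across nodes with `IidDataDistribution`. To show what the choice between an IID and a non-IID distribution means in practice, here is a small NumPy sketch. It does not reproduce the framework's own distribution classes: the helper names (`iid_partition`, `label_sorted_partition`, `classes_per_node`) and the synthetic labels are hypothetical, and sorting by label is only one simple way to produce a label-skewed split.

```python
import numpy as np

def iid_partition(labels, num_nodes, rng):
    """Shuffle all sample indices and deal them out evenly across nodes."""
    indices = rng.permutation(len(labels))
    return np.array_split(indices, num_nodes)

def label_sorted_partition(labels, num_nodes):
    """Sort samples by label, then cut into contiguous shards so that
    each node only sees a few classes (a simple non-IID split)."""
    order = np.argsort(labels, kind="stable")
    return np.array_split(order, num_nodes)

def classes_per_node(labels, partitions):
    """Count how many distinct classes each node holds."""
    return [len(np.unique(labels[idx])) for idx in partitions]

# Hypothetical dataset: 10 classes with 100 samples each.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(10), 100)

iid_nodes = iid_partition(labels, num_nodes=5, rng=rng)
non_iid_nodes = label_sorted_partition(labels, num_nodes=5)

print(classes_per_node(labels, iid_nodes))      # every node sees a broad mix of classes
print(classes_per_node(labels, non_iid_nodes))  # [2, 2, 2, 2, 2]
```

Under the non-IID split each node's local data is unrepresentative of the whole population, which is the harder and often more realistic setting for evaluating federated aggregation.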

Use Cases

Improve Diagnostics and Care Using Secure and Private Patient Data

Sensitive data from the healthcare industry is subject to strict data protection regulations. In order to learn from healthcare information and share patient data securely, Federated Learning can be employed so that medical institutions can ensure data privacy, while providing patients with the most advanced processes, diagnostic tools, and care possible.

Keep Funds Secure Without Sharing Customer Data

Banks and financial institutions can use Federated Learning to identify money-laundering transactions by training more capable models on private transaction data. All banks using the same system benefit from each other’s transaction data, without exposing their own raw data or their customers' data to competitors.

Deploy Industry 4.0 without Disclosing Sensitive Data

Companies providing operations and maintenance services to customers across the globe can benefit from Federated Learning and Differential Privacy by learning from all equipment data available, without disclosing any sensitive customer data. Through anonymous collaboration, plants, machines, and factories of all sizes can be run more efficiently and intelligently, while private data remains protected.

Advance Research Using a Private Framework

Universities and research institutions can use Federated Learning to anonymously combine their efforts, advance their research, and amplify their findings, while ensuring their data remains private, thanks to a Federated Framework.

Train Automatic Surveillance Models while Ensuring Anonymity

Automatic surveillance systems can be trained using Machine Learning models from multiple facilities and their respective security equipment and information, without accessing surveillance images or information. The use of this technology ensures anonymity and privacy, while providing a way to increase safety and security measures using Artificial Intelligence techniques.

Facilitate Edge Computing and Train Models at the Data Source

The accelerated development of devices with increasing computational capabilities, such as mobile and IoT devices, has created the opportunity to learn complex models and decentralize data, using Edge Computing. Federated Learning helps to improve Machine Learning models on distributed devices by sharing global information among nodes, while ensuring data remains private on each device.

FRANCISCO HERRERA, PH.D.

It is the most powerful framework on the market that respects user privacy, based on cutting edge Federated Learning technology.

  • Ph.D. in Mathematics
  • Highly Cited Researcher (Thomson Reuters) in the areas of Engineering and Computer Sciences
  • Spanish National Award in Computer Science
  • More than 331 Journal papers published, which account for 85,240 citations in Google Scholar.



Competitive Benchmarking

Federated Learning and Differential Privacy Features

Federated Learning Framework
  • Use Federated Models with different datasets
  • Support other libraries
  • Sampling Environment: IID or Non-IID data distribution
  • Federated Aggregation Mechanisms
  • Federated Attack Simulation

Differential Privacy
  • Mechanisms: Exponential, Laplacian, Gaussian
  • Sensitivity Sampler
  • Subsampling methods to increase privacy
  • Adaptive Differential Privacy

Desired Properties
  • Documentation & Tutorials
  • High-level API
  • Ability to extend the framework with new properties

Legend
  • Complete
  • Partial
  • Not available
  • Not specified
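
As an illustration of the Laplacian mechanism named among the Differential Privacy features above, the following is a minimal NumPy sketch that releases a noisy mean. It is a textbook-style example, not the framework's API: the function names (`laplace_mechanism`, `private_mean`), the clipping bounds, and the sample data are hypothetical, and the sensitivity formula assumes that the number of records is public and that neighboring datasets differ by replacing one record.

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon, rng):
    """Add Laplace noise with scale sensitivity / epsilon to a numeric query result."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = sensitivity / epsilon
    noise = rng.laplace(loc=0.0, scale=scale, size=np.shape(value))
    return value + noise

def private_mean(values, lower, upper, epsilon, rng):
    """Release a noisy mean of the values after clipping them to [lower, upper]."""
    clipped = np.clip(values, lower, upper)
    # Replacing one clipped record can shift the mean by at most this amount.
    sensitivity = (upper - lower) / len(clipped)
    return float(laplace_mechanism(clipped.mean(), sensitivity, epsilon, rng))

# Hypothetical data: 10,000 ages held by a single data owner.
rng = np.random.default_rng(42)
ages = rng.uniform(20, 60, size=10_000)

noisy_mean = private_mean(ages, lower=0.0, upper=100.0, epsilon=1.0, rng=rng)
print(noisy_mean)
```

A smaller `epsilon` gives stronger privacy and a larger noise scale; with many records the sensitivity of the mean is small, so the released value stays close to the true mean.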

Provisions

  1. Definition of Confidential Information. Confidential Information is understood as any information disclosed by Sherpa, either in writing or orally, in the form of samples, models, software or any other type of technical data, trade secrets, manufacturing processes or other know-how of Sherpa, whether disclosed prior to the date of this Agreement or thereafter, including, but not limited to, information regarding business plans, products or services, financial provisions, patents, trademarks, utility models and any other intellectual or industrial property rights and/or applications for them (whether registered or not), computer passwords and/or their source code, inventions, processes, designs (whether graphic or not), engineering, advertising or finance, which is identified in writing as confidential or exclusive, or which an average person would consider as such, taking into account the circumstances. Information, technical data or know-how shall not be considered Confidential Information where it: (i) was known to the User prior to receiving any of the Confidential Information from the other party, and this circumstance can be proven; (ii) has become publicly known, unless this results from an act or failure to act of the User; (iii) was received by the User from a third party who is not bound by secrecy; or (iv) is information that Sherpa has given written permission to disclose.
  2. Obligation of confidentiality. The User agrees not to disclose, or permit to be disclosed, any Sherpa Confidential Information to third parties, and undertakes to take appropriate measures to ensure the secrecy of any Sherpa Confidential Information so that it does not become public knowledge or fall into the possession of unauthorized third parties. These measures must meet the maximum standard of diligence, similar to the way the User protects its own confidential information of similar content, and may not be inferior to the diligence of a reasonable businessman.
  3. Term of the obligation of confidentiality. The obligations assumed by the User under this agreement shall remain in force while the visits and meetings with Sherpa take place and, after the termination of such activities, until the Confidential Information becomes public knowledge, provided that this happens for a reason other than a breach of this agreement by the User.
  4. Non-compliance with obligations. The User understands that monetary damages arising from a breach of this agreement may not be a sufficient remedy, and therefore expressly agrees that any present or future breach of the obligations contained in this agreement shall, notwithstanding any other rights or remedies that may apply, be sufficient grounds to request that the competent jurisdiction adopt the precautionary measures necessary to remedy and/or prevent such breach, without having to prove that damage has been caused.
  5. Return of information exchanged. The User agrees to return to Sherpa all documents containing Confidential Information which may have been supplied, and all copies thereof, and to permanently erase or destroy all Confidential Information or any part thereof stored in electronic files. For this purpose, "documents" is understood to mean any storage device containing data or information.
  6. Ownership of rights. This agreement does not grant the User any license or property rights under any intellectual property of Sherpa, nor is any right granted to the User over the Sherpa Confidential Information, except as provided herein.
  7. Non-transferability. This agreement is a personal agreement between Sherpa and the User; it may not be transferred, in whole or in part, without the prior written consent of the other party.
  8. Applicable law and jurisdiction. This agreement shall be governed by Spanish law and shall be interpreted in accordance with it. The Parties to this agreement expressly submit to the exclusive jurisdiction of the Courts of Bilbao, renouncing any other jurisdiction that may correspond to them, for the resolution of any dispute arising in relation to the implementation of this agreement. By accepting this message, the User accepts all the clauses contained in this agreement.