How to Deploy a Machine Learning Model With Concrete ML

May 30, 2023

—

Luis Montero

Concrete ML is a set of privacy-preserving machine learning (PPML) tools that simplifies the use of Fully Homomorphic Encryption (FHE), so developers can automatically turn machine learning models into their homomorphic equivalents.

Concrete ML v1.0.0 introduced several new features, such as improved performance and better model development assistance. Let’s look at how to use Concrete ML v1.0.0 to deploy machine learning models. The scripts in this blog post are illustrative of the deployment tools that you can build.

To start, access the code examples for this simple Concrete ML model, which performs breast cancer classification. Keep in mind that some of them are not part of the Concrete ML PyPI package. The scripts are based on Boto3 and deploy Concrete ML models to a FastAPI server hosted on AWS EC2.

Let's review the provided example that focuses on the confidential diagnosis of breast cancer using Fully Homomorphic Encryption (FHE).

Model serialization

So you trained a Concrete ML model and you want to serve it to your users?

First, compile the model and use the simulation feature to make sure that the predictions of your model match what you expect.

model.predict(X_test, fhe="simulate")  # X_test: your held-out inputs
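A minimal sketch of that check, assuming `model` is an already compiled Concrete ML classifier and `X_test` is a held-out input array. The helper names `agreement_rate` and `check_simulation` and the 0.99 threshold are placeholders, not part of Concrete ML. It compares the simulated predictions with the quantized predictions computed in the clear:

```python
from typing import Sequence


def agreement_rate(reference: Sequence[int], candidate: Sequence[int]) -> float:
    """Return the fraction of positions where two prediction sequences agree."""
    if len(reference) != len(candidate):
        raise ValueError("prediction sequences must have the same length")
    if len(reference) == 0:
        raise ValueError("cannot compare empty prediction sequences")
    matches = sum(int(r == c) for r, c in zip(reference, candidate))
    return matches / len(reference)


def check_simulation(model, X_test, threshold: float = 0.99) -> bool:
    """Compare clear and simulated predictions of a compiled model.

    Assumption: `model` exposes predict(X, fhe=...) with the "disable"
    (clear) and "simulate" modes, as Concrete ML built-in models do.
    """
    clear = model.predict(X_test, fhe="disable")
    simulated = model.predict(X_test, fhe="simulate")
    return agreement_rate(list(clear), list(simulated)) >= threshold
```

If the check fails, it is usually worth revisiting the quantization settings before serializing anything.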

Once you’re happy with the accuracy, serialize your model for deployment:

from concrete.ml.deployment import FHEModelDev

dev = FHEModelDev("./path_to_model", model)
dev.save()

Replace “./path_to_model” with a directory name of your choice. For cross-platform compatibility (e.g., training your model on a Mac M1 and deploying to an Intel-based server), you can instead use:

dev.save(via_mlir=True)

For the breast cancer example, train and serialize the model with train_with_docker.sh:

# From the root of the Concrete ML repository
cd ./use_case_examples/deployment/breast_cancer_builtin

# Creates a dev folder containing the client.zip and server.zip files
bash train_with_docker.sh

This will pull Concrete ML’s Docker image, which may take a bit of time.

After the model is trained and saved, all assets are ready for deployment to a cloud provider.
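Before deploying, it can help to confirm that the serialized assets are really there. The sketch below assumes the output directory is `./dev` and that it should contain `client.zip` and `server.zip`, as described above. The helper name `missing_artifacts` is hypothetical.

```python
from pathlib import Path
from typing import List

# File names taken from the training step described above.
EXPECTED_ARTIFACTS = ("client.zip", "server.zip")


def missing_artifacts(model_dir: str) -> List[str]:
    """Return the expected deployment files that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_ARTIFACTS if not (root / name).is_file()]
```

Calling `missing_artifacts("./dev")` returns an empty list when both archives are present, so a deployment script can stop early otherwise.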

Deployment

Deploying the server to AWS.

You can test the deployment locally on your personal machine, but in production environments, FHE works best when deployed to a cloud provider that offers a wide variety of powerful machines. Since FHE can be computationally intensive, it makes sense to deploy your model on a compute-optimized instance.

Concrete ML includes utility scripts to ease deployment to AWS. This takes the form of a simple CLI that leverages Boto3 under the hood.

# Create an AWS EC2 instance and launch the server
python -m concrete.ml.deployment.deploy_to_aws \
	--path-to-model "./dev" \
	--port 5000 \
	--instance-type "c5.4xlarge" \
	--instance-name "my_super_model" \
	--verbose 1 \
	--wait-bar 1

You can change the options in this command: --instance-type accepts any instance type available on AWS, while --instance-name is just an identifier that lets you find your instance in the AWS console.
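If you prefer to drive the CLI from Python, the command can be assembled as an argument list. This is only a sketch: it uses the flags shown in the command above and nothing else, and the function name `build_deploy_command` is hypothetical.

```python
import sys
from typing import List


def build_deploy_command(
    path_to_model: str,
    instance_type: str,
    instance_name: str,
    port: int = 5000,
    verbose: bool = True,
    wait_bar: bool = True,
) -> List[str]:
    """Build the argument list for the deploy_to_aws CLI shown above."""
    return [
        sys.executable, "-m", "concrete.ml.deployment.deploy_to_aws",
        "--path-to-model", path_to_model,
        "--port", str(port),
        "--instance-type", instance_type,
        "--instance-name", instance_name,
        "--verbose", str(int(verbose)),
        "--wait-bar", str(int(wait_bar)),
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Keeping it as a list avoids shell quoting issues with instance names.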

This command line performs the following steps:

1. Creates an AWS EC2 instance with the proper permissions, security group, SSH keys, and a public IP address
2. Waits until the instance is available through SSH (note that instance start-up can take a few seconds)
3. Copies the needed files, mainly the source for the server application and the serialized model, from your local machine to the remote instance using scp
4. Installs all needed dependencies on the server (in a tmux session)
5. Runs the server (also in a tmux session)
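To illustrate the waiting step, here is one way to poll until a host accepts TCP connections on the SSH port. This is not the deployment script's actual implementation, only a sketch using the standard library. The function name and the default timings are assumptions.

```python
import socket
import time


def wait_for_port(
    host: str, port: int = 22, timeout: float = 120.0, interval: float = 2.0
) -> bool:
    """Poll host:port until a TCP connection succeeds or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A reachable port only shows that the SSH daemon is listening. Authentication can still fail, so a real script would follow up with an actual SSH command.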

For more advanced users who may want to have a look at how it works under the hood, details are given in server.py.

That’s it! You have now deployed a Concrete ML model.

In the logs of the deployment script, you'll find the URL of the FastAPI server. You will need to keep this URL in order to use it in the client code.

Creating the client.

Depending on your data or use case, you might need to develop a client application. Here are several examples of how to write client application code.

For the example discussed here on Breast Cancer Diagnosis, the client code is fairly straightforward and shows how simple an FHE client can be. To run it, build the Docker image for the client using the script provided, then launch the Docker container using the appropriate script:

# Build docker image
python build_docker_client_image.py  

# Run and attach the terminal to the docker container
bash client.sh  

# Launch in the container. This triggers inference on the remote server.
# Set URL to the server address printed in the deployment logs.
URL="" python client.py

When launching the inference of your model in FHE, specify the IP address of the FHE endpoint as the URL environment variable.
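A client script might read and sanity-check that variable along these lines. This is a sketch rather than the code in client.py: the helper name `endpoint_from_env` is hypothetical, and defaulting to the `http` scheme and to port 5000 (the port used in the deployment command above) is an assumption.

```python
import os
from urllib.parse import urlsplit


def endpoint_from_env(var_name: str = "URL", default_port: int = 5000) -> str:
    """Read the server address from the environment and normalize it."""
    raw = os.environ.get(var_name, "").strip()
    if not raw:
        raise RuntimeError(f"set the {var_name} environment variable to the server address")
    if "://" not in raw:
        raw = f"http://{raw}"  # assumption: plain HTTP when no scheme is given
    parts = urlsplit(raw)
    if not parts.hostname:
        raise RuntimeError(f"{var_name} does not contain a host: {raw!r}")
    # Keep an explicit port, otherwise fall back to the default one.
    netloc = parts.netloc if parts.port else f"{parts.netloc}:{default_port}"
    return f"{parts.scheme}://{netloc}"
```

Failing early with a clear message is friendlier than letting the first HTTP request time out against an empty address.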

Going further.

Check out the Concrete ML documentation for more information. You’ll find an example on the usage of the Client/Server APIs and a section about deployment.
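As a rough sketch of the Client/Server flow, assume `client` is a `concrete.ml.deployment.FHEModelClient` and `server` is a `concrete.ml.deployment.FHEModelServer`, both created from the serialized model directory. The function name `encrypted_round_trip` is hypothetical, and both halves run in one process here purely for illustration.

```python
def encrypted_round_trip(client, server, x):
    """Run one encrypted inference, with client and server in the same process.

    Assumption: `client` and `server` follow the FHEModelClient and
    FHEModelServer interfaces from concrete.ml.deployment.
    """
    # Client side: create the keys and encrypt the input.
    client.generate_private_and_evaluation_keys()
    evaluation_keys = client.get_serialized_evaluation_keys()
    encrypted_input = client.quantize_encrypt_serialize(x)

    # Server side: compute on ciphertexts using only the evaluation keys.
    encrypted_output = server.run(encrypted_input, evaluation_keys)

    # Client side: decrypt and dequantize the result.
    return client.deserialize_decrypt_dequantize(encrypted_output)
```

In a real deployment the two halves run on different machines, and only the serialized evaluation keys and the ciphertexts travel over the network. The private key never leaves the client.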

Keep in mind that the script to deploy to AWS makes some assumptions about the tools available on the machine used to deploy the model (such as ssh). These assumptions may not hold on some systems. A future release of Concrete will include the use of AWS ECR and AWS ECS, but in the meantime you can use the approach described here on most Linux systems.

If you use a different cloud provider or if you encounter an issue deploying to AWS, you can always use the provided Dockerfile and the corresponding API to build your own Docker image to serve your model.

Note that the server currently holds all public keys in memory and that, for some models, this might be an issue. To solve this, you can either modify the server to use a database to manage the keys or you can use a deployment machine with enough RAM to hold your keys in memory.
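For the database option, one possible starting point is a small store that keeps serialized keys on disk. The sketch below uses SQLite from the Python standard library. It is not part of the provided server, and the class name, table name, and column names are all hypothetical.

```python
import sqlite3


class KeyStore:
    """Store serialized evaluation keys on disk instead of in memory."""

    def __init__(self, db_path: str = "keys.sqlite3") -> None:
        self._conn = sqlite3.connect(db_path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS evaluation_keys "
            "(uid TEXT PRIMARY KEY, payload BLOB NOT NULL)"
        )
        self._conn.commit()

    def put(self, uid: str, payload: bytes) -> None:
        """Insert or overwrite the key registered under uid."""
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO evaluation_keys (uid, payload) VALUES (?, ?)",
                (uid, payload),
            )

    def get(self, uid: str) -> bytes:
        """Return the key registered under uid, or raise KeyError."""
        row = self._conn.execute(
            "SELECT payload FROM evaluation_keys WHERE uid = ?", (uid,)
        ).fetchone()
        if row is None:
            raise KeyError(uid)
        return bytes(row[0])

    def close(self) -> None:
        self._conn.close()
```

A multi-threaded server would need one connection per thread or a different database, and very large keys may be better stored as files with only their paths in the table.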

If you don’t like bash scripting, the documentation shows you how to perform all the steps in this tutorial without leaving Python.

Conclusion

With a few simple scripts, you can deploy Concrete ML models on AWS. These scripts, though minimalistic, illustrate the deployment tools that you can build, which could also feature user management, key management, and more. They can also be used for prototyping and for identifying the bottlenecks of your FHE application (key size, FHE runtime, and so on) in a real-world client-server setting.

Additional links

Star the Concrete ML GitHub repository to endorse our work.
Review the Concrete ML documentation.
Get support on our community channels.
Learn FHE, help us advance the space and make money with the Zama Bounty Program.
