Concrete ML v1.4: Encrypted Training and Faster Decision Trees

January 19, 2024

Andrei Stoian

This new version of Concrete ML introduces a feature our community has long anticipated. Until now, Concrete ML focused on securing machine learning inference; with v1.4, you can also train a model on encrypted data.

Plus, significant effort has gone into speeding things up! The latest update dramatically improves the speed of tree-based models like XGBoost, random forests, and decision trees: expect performance boosts of 2-3 times in common quantization scenarios. And that's not all; it also means you can now run more complex, high bit-width tree-based models without a steep increase in latency.

Training on encrypted data

Think about all the sensitive data stored in encrypted form, whether it's in the cloud or on your own servers. Machine learning has the potential to tap into the value of this data. But here's the snag: traditionally, training a model comes with a security hurdle. You have to decrypt the data first, and that involves some back-and-forth between the data scientist and the database. Moving and decrypting data can be a headache and can really slow things down, especially when you're trying to collaborate with others on sensitive information.

Concrete ML v1.4 introduces a new feature for logistic regression (read more here). Now, you can train your model directly on encrypted data. No need to decrypt it first! This means your sensitive data stays secure throughout the process. Plus, you can work together with others more easily, combining data to create even more powerful models.

Training a logistic regression model on encrypted data with Concrete ML doesn't compromise accuracy: the results are just as accurate as if you had trained on unencrypted data. Training time increases linearly with the number of features and with the size of the training set. For instance, training a model on a dataset with 10,000 rows and 10 features takes about 1 hour. It's a small investment of time for the significant benefit of keeping your data secure throughout the training process.
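To get a feel for what linear scaling means in practice, here is a minimal back-of-the-envelope sketch. It is not part of Concrete ML: the function name `estimate_training_hours` is hypothetical, and it assumes that training time is proportional to rows times features, anchored on the single data point quoted above (10,000 rows and 10 features in about 1 hour). Real timings will depend on hardware and configuration.

```python
# Reference point quoted in the post: 10,000 rows x 10 features takes about 1 hour.
REF_ROWS = 10_000
REF_FEATURES = 10
REF_HOURS = 1.0


def estimate_training_hours(n_rows: int, n_features: int) -> float:
    """Extrapolate encrypted training time from the reference point.

    Assumption (not a benchmark): time is proportional to
    n_rows * n_features, i.e. linear in each quantity separately.
    """
    if n_rows <= 0 or n_features <= 0:
        raise ValueError("n_rows and n_features must be positive")
    scale = (n_rows / REF_ROWS) * (n_features / REF_FEATURES)
    return REF_HOURS * scale


if __name__ == "__main__":
    # Print rough estimates for a few hypothetical dataset shapes.
    for rows, features in [(10_000, 10), (20_000, 10), (10_000, 30), (5_000, 5)]:
        hours = estimate_training_hours(rows, features)
        print(f"{rows:>6} rows x {features:>2} features: ~{hours:.2f} h")
```

Under this assumption, doubling the number of rows doubles the estimate, and tripling the number of features triples it. Treat the numbers as an order-of-magnitude guide only.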

Optimized latency for tree-based models

Tree-based models like XGBoost and random forest come with a handy feature in Concrete ML: a precision parameter. This parameter is crucial for tailoring these models to work with Fully Homomorphic Encryption (FHE). Its default value has been adjusted to strike a better balance between speed and accuracy. In the past, cranking up the precision meant dealing with a significant slowdown. Now, thanks to improvements in Concrete ML, the added cost of higher precision grows linearly, making latency more predictable and manageable.

What's more, this release integrates a faster comparison operator, so you can expect up to twice the speed for secure inference in common quantization scenarios. You can also opt for higher quantization precision, from 10 to 24 bits, without facing a steep climb in latency. This is particularly useful for datasets with non-uniform distributions, where such precision is often crucial. Plus, it opens the door to deeper decision trees, which can be a game changer for analyzing large datasets.
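To see why higher precision matters for non-uniform data, consider the following minimal sketch. It uses generic uniform quantization written in plain Python; it is an illustration only and does not reproduce Concrete ML's internal quantization scheme. The helper names `quantize` and `distinct_levels` and the toy dataset are hypothetical.

```python
def quantize(values, n_bits):
    """Uniformly map floats onto integers in [0, 2**n_bits - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant feature: every value falls in the same bin.
        return [0 for _ in values]
    levels = (1 << n_bits) - 1
    scale = levels / (hi - lo)
    return [round((v - lo) * scale) for v in values]


def distinct_levels(values, n_bits):
    """Count how many different integer bins the values occupy."""
    return len(set(quantize(values, n_bits)))


# Toy skewed feature: 100 values packed between 0 and 1, plus one outlier.
skewed = [i / 100 for i in range(100)] + [1000.0]

if __name__ == "__main__":
    for bits in (6, 10, 16, 24):
        print(f"{bits:>2} bits: {distinct_levels(skewed, bits)} distinct bins "
              f"for {len(skewed)} values")
```

At low bit-widths, the outlier stretches the range so much that the clustered values collapse into the same bin, and a tree can no longer split on them. At higher bit-widths, the clustered values land in separate bins again.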

Additional links

Star the Concrete ML GitHub repository to endorse our work.
Review the Concrete ML documentation.
Get support on our community channels.

