Concrete ML v1.4: Encrypted Training and Faster Decision Trees

January 19, 2024 - Andrei Stoian

This new version of Concrete ML introduces a feature our community has been eagerly awaiting. Until now, Concrete ML focused on securing machine learning inference; with v1.4, you can also train a model on encrypted data.

Plus, significant effort has gone into speeding things up! This release makes tree-based models such as XGBoost, random forests, and decision trees considerably faster: expect speedups of 2-3 times in common quantization scenarios. And that's not all: you can now run more complex, high bit-width tree-based models without a steep increase in latency.

Training on encrypted data

Think about all the sensitive data stored in encrypted form, whether in the cloud or on your own servers. Machine learning has the potential to unlock the value of this data. But here's the snag: traditionally, training a model comes with a security hurdle. You typically have to decrypt the data first, which involves a back-and-forth between the data scientist and the database. Moving and decrypting data is a headache, especially when you are collaborating with others on sensitive information, and it can really slow things down.

Concrete ML v1.4 introduces encrypted training for logistic regression. You can now train your model directly on encrypted data, with no need to decrypt it first, so your sensitive data stays secure throughout the process. It also makes collaboration easier: several parties can combine their data to create even more powerful models.

Training a logistic regression model on encrypted data with Concrete ML doesn't compromise accuracy: you get results just as accurate as if you were training on unencrypted data. Training time increases linearly with the number of features and the size of the training set. For instance, training a model on a dataset with 10,000 rows and 10 features takes about 1 hour. It's a small investment of time for the significant benefit of keeping your data secure throughout the training process.
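To get a feel for what linear scaling means in practice, here is a minimal back-of-the-envelope sketch in plain Python. It does not use Concrete ML. The only figure taken from this post is the reference point (10,000 rows, 10 features, about 1 hour); the function name `estimate_training_hours` is hypothetical, and treating rows and features as independent linear factors is an assumption. Real timings will depend on your hardware and settings.

```python
# Rough estimate only. The reference point comes from the post; scaling
# linearly and independently in rows and features is an assumption.
REFERENCE_ROWS = 10_000
REFERENCE_FEATURES = 10
REFERENCE_HOURS = 1.0


def estimate_training_hours(n_rows: int, n_features: int) -> float:
    """Extrapolate encrypted training time linearly from the reference point."""
    if n_rows <= 0 or n_features <= 0:
        raise ValueError("n_rows and n_features must be positive")
    row_factor = n_rows / REFERENCE_ROWS
    feature_factor = n_features / REFERENCE_FEATURES
    return REFERENCE_HOURS * row_factor * feature_factor


if __name__ == "__main__":
    for rows, features in [(10_000, 10), (20_000, 10), (50_000, 20)]:
        hours = estimate_training_hours(rows, features)
        print(f"{rows} rows x {features} features: ~{hours:.1f} h")
```

Under these assumptions, doubling either the number of rows or the number of features doubles the estimate. Treat the output as an order of magnitude, not a benchmark.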

Optimized latency for tree-based models

Tree-based models like XGBoost and random forest come with a handy setting in Concrete ML: a precision parameter, which controls the number of bits used to quantize the model's inputs. This parameter is crucial for tailoring these models to work with Fully Homomorphic Encryption (FHE). Its default value has been adjusted to strike a better balance between speed and accuracy. In the past, cranking up the precision meant a significant slowdown. Now, thanks to improvements in Concrete ML, the added cost of higher precision grows linearly, making latency more predictable and manageable.
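To make the idea of a precision parameter concrete, here is a minimal sketch of uniform min-max quantization in plain Python. It is an illustration only: it does not call Concrete ML, the helpers `quantize`, `dequantize`, and `max_error` are hypothetical, and the quantizer Concrete ML actually uses may differ.

```python
def quantize(values, n_bits):
    """Map floats to integer codes in [0, 2**n_bits - 1] (uniform min-max)."""
    if n_bits < 1:
        raise ValueError("n_bits must be at least 1")
    lo, hi = min(values), max(values)
    levels = (1 << n_bits) - 1
    # Width of one quantization step; fall back to 1.0 for constant inputs.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Map integer codes back to approximate float values."""
    return [lo + c * scale for c in codes]


def max_error(values, n_bits):
    """Largest absolute error introduced by a quantize/dequantize round trip."""
    codes, scale, lo = quantize(values, n_bits)
    restored = dequantize(codes, scale, lo)
    return max(abs(v - r) for v, r in zip(values, restored))


if __name__ == "__main__":
    samples = [0.0, 0.1, 0.37, 0.9, 1.0]
    for bits in (2, 6, 10):
        print(f"{bits} bits: max round-trip error = {max_error(samples, bits):.6f}")
```

Each extra bit halves the step size, so the round-trip error shrinks as precision goes up. That accuracy gain is what has to be weighed against latency under FHE.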

What's more, a faster comparison operator has been integrated. You can expect up to twice the speed for secure inference in common quantization scenarios. You can also opt for higher quantization precision, from 10 to 24 bits, without facing a steep climb in latency. This is particularly useful for datasets with non-uniform distributions, where such precision is often crucial. It also opens the door to deeper decision trees, which can be a game changer for analyzing large datasets.
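Why does extra precision matter for non-uniform data? The sketch below, in plain Python, counts how many distinct integer codes a skewed feature keeps at different bit widths. It assumes simple uniform min-max quantization, which is not necessarily what Concrete ML does internally, and both the synthetic data and the helper `distinct_codes` are hypothetical.

```python
def distinct_codes(values, n_bits):
    """Count the distinct integer codes after uniform min-max quantization."""
    lo, hi = min(values), max(values)
    if hi <= lo:
        return 1  # constant feature: everything maps to a single code
    levels = (1 << n_bits) - 1
    scale = (hi - lo) / levels
    return len({round((v - lo) / scale) for v in values})


# Synthetic skewed feature: 1,000 values packed into [0, 1) plus one outlier.
skewed = [i / 1000 for i in range(1000)] + [1000.0]

if __name__ == "__main__":
    for bits in (6, 10, 16, 24):
        kept = distinct_codes(skewed, bits)
        print(f"{bits:>2} bits: {kept} distinct codes for {len(skewed)} values")
```

With this data, low bit widths collapse almost the whole cluster near zero into one or two codes, so a tree threshold cannot separate those samples. At 24 bits every value keeps its own code.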
