
Google has rolled out VaultGemma, a new AI model developed with privacy-preserving techniques to keep training data confidential. VaultGemma is a small language model (SLM) with one billion parameters, touted as the largest open model trained from scratch with differential privacy (DP). The model was developed by applying a new set of scaling laws derived by Google researchers in partnership with Google DeepMind.
The release is part of Google’s Gemma family of models and is aimed at researchers and developers who want to experiment with privacy-preserving AI systems. By open-sourcing the model, Google hopes to speed up work on secure machine learning and make privacy-focused approaches easier to test and deploy.
VaultGemma is trained with differential privacy, a mathematical framework that limits how much information about any single individual can be learned from a model. Google says the model can safely be trained on sensitive datasets because the technique bounds the influence any one training example can have on the final model.
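In practice, differentially private training typically follows the DP-SGD recipe: clip each example's gradient to a fixed norm, then add calibrated Gaussian noise before updating the model. The following NumPy sketch illustrates that mechanic; the function name and hyperparameters are illustrative, not Google's actual training code.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm=1.0,
                noise_multiplier=1.0, learning_rate=0.1, rng=None):
    """One illustrative DP-SGD update: clip per-example gradients,
    sum them, add Gaussian noise scaled to the clip norm, average."""
    rng = rng or np.random.default_rng()
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale each gradient so its L2 norm is at most clip_norm,
        # bounding how much any single example can move the model.
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    batch_size = len(clipped)
    # Gaussian noise calibrated to the sensitivity (clip_norm) of the sum.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (np.sum(clipped, axis=0) + noise) / batch_size
    return params - learning_rate * noisy_mean
```

The clip norm and noise multiplier together determine the privacy budget spent per step; stronger noise means stronger privacy but noisier updates, which is why scaling laws for DP training matter.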
The team built VaultGemma using open datasets and synthetic data. The goal was to create a model that does not memorize specific details from its training data. This reduces the risk of data leaks through model outputs, a problem that has been seen in other large language models.
Google highlighted in its announcement that VaultGemma meets the strict, formal definition of differential privacy, which has been independently verified by external reviewers. This sets it apart from models that claim to be privacy-preserving without meeting formal standards.
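For reference, the formal standard the announcement alludes to is the standard (ε, δ)-differential privacy guarantee: a training mechanism M satisfies it if, for any two datasets D and D′ differing in a single record and any set of possible outputs S,

```latex
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,M(D') \in S\,] + \delta
```

Smaller ε and δ mean the model's behavior reveals less about whether any particular record was in the training data.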