
I already explained differentially private machine learning in another post 1, so this blog post covers the question: how can we design a service that lets customers fine-tune Large Language Models in a privacy-preserving way?

With the rise of data-privacy laws like the GDPR (DSGVO) and CCPA, companies face increased scrutiny of their data-handling practices. The demand for privacy-preserving AI models is growing, especially in highly regulated industries. Despite this demand, many businesses lack the in-house expertise to fine-tune models themselves. And although plenty of third-party fine-tuning services exist, they offer no privacy guarantees for potentially sensitive datasets.

This blog post sketches a potential cloud service offering privacy-preserving LLM fine-tuning.

Our goal is to design a system where:

  • Users upload their dataset: This dataset may contain sensitive information.
  • The dataset is perturbed: This ensures that the original sensitive data cannot be reconstructed.
  • The model is fine-tuned: The perturbed data is used to fine-tune a pre-trained model, preserving its utility while protecting privacy.
  • Users receive fine-tuned model weights: The resulting model is privacy-preserving and ready for deployment.

The complete code, including the Flask web-service implementation, is available here: https://github.com/anon767/DP_FinetuningService/tree/main

Process

To ensure privacy, we perturb the dataset directly in the user’s browser before it is sent to the server, so no sensitive cleartext ever leaves the user’s environment. We achieve this by training a Word2Vec (W2V) model and applying a noise mechanism 2.

We therefore train a W2V model on the text8 dataset with a 100k-word vocabulary and the following parameters:

  • AdamW optimizer (Adam with decoupled weight decay)
  • 50 dimensions (inference in the browser needs to be fast)
  • Embeddings clipped to [-1, 1], which bounds the sensitivity used to calculate epsilon
  • Window size of 5 and 8 negative samples for the noise-contrastive loss
  • We apply either a Laplacian or a Gaussian noise mechanism; initial experiments show Gaussian noise performs better. With an embedding range of two (the global sensitivity) and a noise standard deviation of 0.1, our LDP epsilon value is approximately 96.9 (assuming delta = 10^-5); see the sketch after this list.
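
As a back-of-the-envelope check, here is a minimal Python sketch of that epsilon calculation, assuming the classic Gaussian-mechanism bound from Dwork and Roth 2, sigma = sqrt(2 ln(1.25/delta)) * sensitivity / epsilon, solved for epsilon:

import math

def gaussian_mechanism_epsilon(sensitivity, sigma, delta):
    # Classic Gaussian mechanism bound, solved for epsilon
    return math.sqrt(2 * math.log(1.25 / delta)) * sensitivity / sigma

# Embeddings clipped to [-1, 1] give a range (global sensitivity) of 2
print(gaussian_mechanism_epsilon(sensitivity=2.0, sigma=0.1, delta=1e-5))  # ~96.9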

Eventually we load the Word2Vec model with TensorFlow.js in the browser and perturb the text in JavaScript:

async function loadModel() {
    model = await tf.loadGraphModel('/static/model.json');
    console.log('Custom Word2Vec model loaded');
}

async function loadVocab() {
    const response = await fetch('/static/word_index.json');
    vocab = await response.json();
    console.log('Vocabulary loaded');
}

We perturb each word vector with Gaussian noise:

function perturbVector(vector) {
    const stdDev = 0.1;
    return vector.map(v => v + gaussianNoise(stdDev));
}

function gaussianNoise(stdDev) {
    // Box-Muller transform; using 1 - Math.random() avoids log(0)
    const u1 = 1 - Math.random();
    const u2 = Math.random();
    const z0 = Math.sqrt(-2.0 * Math.log(u1)) * Math.cos(2.0 * Math.PI * u2);
    return z0 * stdDev;
}

async function perturbText(text) {
    const words = text.split(/\s+/);  // Tokenize by whitespace
    const perturbedVectors = [];
    for (const word of words) {
        const embedding = await getWordEmbedding(word);    // Look up the word's embedding vector
        const perturbedVector = perturbVector(embedding);  // Add Gaussian noise to every dimension
        perturbedVectors.push(perturbedVector);
    }
    return perturbedVectors;
}

The cool thing is that the user does not have to trust us up to this point: the cleartext data never leaves their premises. Given an original text like:

Harry Potter was a highly unusual boy in many ways.
He was born a wizard, and his life changed forever when he received a letter from Hogwarts.
Harry couldn't believe that he was going to a school for wizards.
"You're a wizard, Harry," said Hagrid, as he handed him the letter.
The scar on his forehead, shaped like a lightning bolt, marked him as someone special.
Voldemort, the dark wizard, had tried to kill Harry when he was just a baby.
Ron Weasley and Hermione Granger quickly became Harry's best friends.
The trio went on many adventures, from discovering secret rooms to battling dark forces.
At Hogwarts, Harry learned the importance of friendship, courage, and loyalty.

Instead, we send perturbed vectors, which decode back to roughly this text:

harry potter c by highly unusual boy a many ways
even city born no wizard work was life changed forever as modern received an letter up censoring
harry handloading believe head he then going general york school an wizards guillotin strong wizard
rolex had outscored as a handed him rather arts best precocial model his bachs shaped like an lightning reproduction
marked him religious someone special naslund non dark wizard had tried to kill harry when up c just a baby ron
simplify information lochaber arresting quickly became mallard best friends the iroquoian went on standard adventures
than refereeing secret rooms has pi dark both for gaspard harry learned main importance series friendship courage and loyalty

Once the perturbed data reaches the server, the backend fine-tunes the selected model. To prevent the model from memorizing sensitive information, we apply Differentially Private Stochastic Gradient Descent (DP-SGD). Additionally, we use Low-Rank Adaptation (LoRA) to fine-tune only a small subset of the model’s parameters, preserving the original weights. By clipping per-sample gradients to a norm of 1 and adding noise with a standard deviation of 0.1, we achieve an epsilon of approximately 48.45 (assuming delta = 10^-5).
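
Conceptually, a single DP-SGD step clips each example’s gradient and then adds calibrated noise. The following standalone PyTorch sketch only illustrates that clip-then-noise idea (dp_sgd_update and its inputs are hypothetical; the actual training below uses Opacus):

import torch

def dp_sgd_update(per_sample_grads, clip_norm=1.0, noise_std=0.1, lr=5e-5):
    # Clip every per-example gradient to an L2 norm of at most clip_norm
    clipped = []
    for g in per_sample_grads:
        factor = min(1.0, clip_norm / (float(g.norm()) + 1e-12))
        clipped.append(g * factor)
    # Sum the clipped gradients and add Gaussian noise scaled by the clipping bound
    noisy_sum = torch.stack(clipped).sum(dim=0) \
        + torch.randn_like(per_sample_grads[0]) * noise_std * clip_norm
    # Average over the batch and return the (negative) parameter update
    return -lr * noisy_sum / len(per_sample_grads)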

So how does LoRA work 3? Imagine a feedforward network: $$X^t = WX^{(t-1)} + b$$ where $X^0 \in R^{[N, |inputFeatures|]}$ is the input-feature matrix and $W \in R^{[|outputFeatures|, |inputFeatures|]}$ is the weight matrix. The weight matrix can be very large, potentially holding billions of parameters. So instead of fine-tuning that matrix, we keep the original $W$ untouched and add another weight matrix of low rank: $$W_{LoRA} = AB$$ with $A \in R^{[|outputFeatures|, r]}$ and $B \in R^{[r, |inputFeatures|]}$. If $r$ is sufficiently small, we optimize only $A$ and $B$, which together have far fewer parameters; their product $W_{LoRA}$ recovers the original dimensions.

This gives us: $$X^t = WX^{(t-1)} + b + W_{LoRA}X^{(t-1)} $$

So instead of optimizing $|outputFeatures| \cdot |inputFeatures|$ parameters, we optimize $(|outputFeatures| + |inputFeatures|) \cdot r$ parameters. For this to actually reduce the total parameter count, the rank must satisfy $r < \frac{|outputFeatures| \cdot |inputFeatures|}{|outputFeatures| + |inputFeatures|}$, i.e. half the harmonic mean of the two dimensions; in practice $r$ is chosen far smaller.
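
To make this concrete, here is a quick back-of-the-envelope calculation for a hypothetical 768x768 projection (768 is the hidden size of GPT-2 small; rank 16 matches the LoRA config used below):

d_out, d_in, r = 768, 768, 16

full = d_out * d_in        # 589,824 parameters in the frozen weight matrix
lora = (d_out + d_in) * r  # 24,576 trainable parameters in A and B
print(full, lora, lora / full)  # LoRA trains roughly 4% of the layer's parameters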

We load the selected model and find all modules that can potentially be fine-tuned with LoRA. We could hardcode these, but since we don’t know in advance which model the user will select, we detect them dynamically.

import torch.nn as nn
from peft import get_peft_model, LoraConfig, TaskType
from transformers import AutoTokenizer, AutoModelForCausalLM
from transformers.pytorch_utils import Conv1D  # GPT-2 uses Conv1D instead of nn.Linear

# Load pre-trained model
selected_model = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(selected_model)
model = AutoModelForCausalLM.from_pretrained(selected_model)

# Collect all modules that LoRA can target
lora_compatible_modules = []
for name, module in model.named_modules():
    # LoRA supports (among others) linear and convolutional layers,
    # plus Hugging Face's Conv1D used in GPT-2's attention blocks
    if isinstance(module, (nn.Linear, nn.Conv2d, Conv1D)):
        lora_compatible_modules.append(name)

# Apply LoRA
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=lora_compatible_modules,  # Fine-tune only these layers
    task_type=TaskType.CAUSAL_LM
)
model = get_peft_model(model, lora_config)
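
As a quick sanity check that only the adapters are trainable, peft offers a helper that prints the trainable versus total parameter counts (the exact numbers depend on the model and rank):

# Prints something like "trainable params: ... || all params: ... || trainable%: ..."
model.print_trainable_parameters()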

Finally, we can use Opacus to train the LoRA adapters of the LLM with DP-SGD 4.

from opacus import PrivacyEngine
from torch.optim import AdamW
from torch.utils.data import DataLoader

# Opacus expects the optimizer to hold only trainable parameters,
# so we filter out the frozen base-model weights
trainable_params = [p for p in model.parameters() if p.requires_grad]

privacy_engine = PrivacyEngine()
model, optimizer, dataloader = privacy_engine.make_private(
    module=model,
    optimizer=AdamW(trainable_params, lr=5e-5),
    data_loader=dataloader,  # built from the uploaded, perturbed dataset
    noise_multiplier=1.0,
    max_grad_norm=1.0
)

for epoch in range(epochs):  # epochs is set by the service configuration
    for batch in dataloader:
        optimizer.zero_grad()
        outputs = model(**batch)  # batches include labels, so outputs.loss is populated
        loss = outputs.loss
        loss.backward()
        optimizer.step()
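
After training, Opacus can report the privacy budget that was actually spent through its accountant (a sketch; the exact epsilon depends on the sampling rate and the number of steps taken):

# Query the accountant for the epsilon spent at our target delta
epsilon = privacy_engine.get_epsilon(delta=1e-5)
print(f"Trained with (epsilon, delta) = ({epsilon:.2f}, 1e-5)")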

In this post, we have outlined a framework for privacy-preserving fine-tuning of large language models (LLMs). The proposed system lets users upload sensitive datasets with confidence, since the data is protected by Local Differential Privacy (LDP) mechanisms applied directly in their browser. On the server, the model is then fine-tuned on the perturbed data with Differentially Private Stochastic Gradient Descent (DP-SGD), while Low-Rank Adaptation (LoRA) keeps the optimization parameter-efficient.


  1. https://thecout.com/blog/dp/ ↩︎

  2. Dwork, C., & Roth, A. (2014). “The Algorithmic Foundations of Differential Privacy.” Foundations and Trends® in Theoretical Computer Science. (Comprehensive introduction to DP basics.) ↩︎

  3. Hu, E. J., Shen, Y., Wallis, P., et al. (2021). “LoRA: Low-Rank Adaptation of Large Language Models.” arXiv preprint arXiv:2106.09685. (Original paper on LoRA.) ↩︎

  4. Abadi, M., Chu, A., Goodfellow, I., et al. (2016). “Deep Learning with Differential Privacy.” Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS'16). (Foundational work introducing DP-SGD.) ↩︎
