MoreLoRA
Original LoRA:
$W = W_0 + UV$ and $\operatorname{rank}(UV) \leq r$
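
As a minimal sketch of the formula above (the sizes `m`, `n`, `r` and the helper name `lora_weight` are illustrative assumptions, not part of this repository), the update $UV$ can be formed with NumPy and its rank checked against $r$:

```python
import numpy as np

def lora_weight(W0, U, V):
    """Return W = W0 + U @ V; W0 is frozen, U and V are the trainable factors."""
    return W0 + U @ V

rng = np.random.default_rng(0)
m, n, r = 8, 6, 2                      # hypothetical sizes
W0 = rng.standard_normal((m, n))       # stands in for a pretrained weight
U = rng.standard_normal((m, r))
V = rng.standard_normal((r, n))

W = lora_weight(W0, U, V)
update_rank = np.linalg.matrix_rank(U @ V)
print(W.shape, update_rank)            # the rank never exceeds r
```
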
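Better initialization:
$W = W_0 - U_0 V_0 + UV$ where $U_0$ and $V_0$ are the frozen initial values of $U$ and $V$, so that $W = W_0$ at the start of training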
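
The sketch below illustrates why subtracting $U_0 V_0$ is useful: both factors can start non-zero while the effective weight still equals $W_0$ at step zero. The Gaussian initialization and the names `init_factors` and `effective_weight` are assumptions made for illustration only.

```python
import numpy as np

def init_factors(m, n, r, seed=0):
    """Draw non-zero initial factors (the Gaussian scale is an assumed choice)."""
    rng = np.random.default_rng(seed)
    U0 = rng.standard_normal((m, r)) / np.sqrt(r)
    V0 = rng.standard_normal((r, n)) / np.sqrt(r)
    return U0, V0

def effective_weight(W0, U0, V0, U, V):
    """W = W0 - U0 @ V0 + U @ V, with U0 and V0 kept frozen."""
    return W0 - U0 @ V0 + U @ V

m, n, r = 8, 6, 2                                  # hypothetical sizes
W0 = np.random.default_rng(1).standard_normal((m, n))
U0, V0 = init_factors(m, n, r)
U, V = U0.copy(), V0.copy()                        # trainable copies start at U0, V0

W_start = effective_weight(W0, U0, V0, U, V)
print(np.allclose(W_start, W0))
```
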
Additive LoRA:
$W = W_0 + U I_{r(1 \times \frac{n}{r})} + I_{r(\frac{m}{r} \times 1)} V$ where $U \in \mathbb{R}^{m \times r}, V \in \mathbb{R}^{r \times n}$ and $\operatorname{rank}(U I_{r(1 \times \frac{n}{r})} + I_{r(\frac{m}{r} \times 1)} V) \leq 2r$
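
One possible reading of the notation, stated here as an assumption: $I_{r(1 \times \frac{n}{r})}$ is the $r \times r$ identity tiled into a $1 \times \frac{n}{r}$ block row (shape $r \times n$), and $I_{r(\frac{m}{r} \times 1)}$ is the same identity tiled into a block column (shape $m \times r$). Under that reading, and assuming $r$ divides both $m$ and $n$, the update can be sketched as follows (`additive_delta` is a hypothetical helper name):

```python
import numpy as np

def additive_delta(U, V):
    """Return U @ I_row + I_col @ V using tiled identity matrices."""
    m, r = U.shape
    r2, n = V.shape
    if r != r2 or m % r or n % r:
        raise ValueError("need matching rank r that divides both m and n")
    I_row = np.tile(np.eye(r), (1, n // r))   # shape (r, n)
    I_col = np.tile(np.eye(r), (m // r, 1))   # shape (m, r)
    return U @ I_row + I_col @ V

rng = np.random.default_rng(0)
m, n, r = 8, 6, 2                             # hypothetical sizes
U = rng.standard_normal((m, r))
V = rng.standard_normal((r, n))
delta = additive_delta(U, V)
print(delta.shape, np.linalg.matrix_rank(delta), "<=", 2 * r)
```

Entry-wise this gives `delta[i, j] = U[i, j % r] + V[i % r, j]`, so the update is a sum of two matrices of rank at most $r$ each.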
Hadamard Mul LoRA:
$W = W_0 + \odot_{i=1}^{k} (\Delta_i)$ where $\Delta_i = U_i V_i$
$r' = \frac{r}{k}$, $U_i \in \mathbb{R}^{m \times r'}$, $V_i \in \mathbb{R}^{r' \times n}$ and $\operatorname{rank}(\odot_{i=1}^{k} (\Delta_i)) \leq (\frac{r}{k})^k$
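
A small numerical sketch of the rank bound, with hypothetical sizes and a hypothetical helper name `hadamard_mul_delta`: splitting the rank budget $r$ across $k$ factors keeps the parameter count at $(m + n) r$, while the element-wise product can reach rank up to $(\frac{r}{k})^k$.

```python
import numpy as np
from functools import reduce

def hadamard_mul_delta(Us, Vs):
    """Element-wise (Hadamard) product of the k low-rank terms U_i @ V_i."""
    return reduce(np.multiply, [U @ V for U, V in zip(Us, Vs)])

rng = np.random.default_rng(0)
m, n, r, k = 32, 32, 8, 2                     # hypothetical sizes
rp = r // k                                   # r' = r / k
Us = [rng.standard_normal((m, rp)) for _ in range(k)]
Vs = [rng.standard_normal((rp, n)) for _ in range(k)]

delta = hadamard_mul_delta(Us, Vs)
n_params = sum(U.size + V.size for U, V in zip(Us, Vs))
delta_rank = np.linalg.matrix_rank(delta)
print(n_params, delta_rank, "<=", rp ** k)
```

With these sizes the bound $(\frac{r}{k})^k = 16$ is larger than the plain LoRA bound $r = 8$ at the same parameter count.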
Hadamard Add LoRA:
$W = W_0 + \odot_{i=1}^{k} (\Delta_i)$ where $\Delta_i = U_i I_{r'(1 \times \frac{n}{r'})} + I_{r'(\frac{m}{r'} \times 1)} V_i$
$r' = \frac{r}{k}$, $U_i \in \mathbb{R}^{m \times r'}$, $V_i \in \mathbb{R}^{r' \times n}$ and $\operatorname{rank}(\odot_{i=1}^{k} (\Delta_i)) \leq (\frac{2r}{k})^k$
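
The following sketch combines the two ideas: each factor is an additive update of rank at most $2r'$, and the factors are multiplied element-wise. It assumes the tiled-identity reading of $I_{r'(\cdot)}$, that $r'$ divides $m$ and $n$, and uses hypothetical helper names.

```python
import numpy as np
from functools import reduce

def additive_term(U, V):
    """One factor: U @ I_row + I_col @ V with tiled r' x r' identities."""
    m, rp = U.shape
    n = V.shape[1]
    I_row = np.tile(np.eye(rp), (1, n // rp))  # shape (r', n)
    I_col = np.tile(np.eye(rp), (m // rp, 1))  # shape (m, r')
    return U @ I_row + I_col @ V

def hadamard_add_delta(Us, Vs):
    """Hadamard product of the k additive terms."""
    return reduce(np.multiply, [additive_term(U, V) for U, V in zip(Us, Vs)])

rng = np.random.default_rng(0)
m, n, r, k = 32, 32, 4, 2                      # hypothetical sizes
rp = r // k                                    # r' = r / k
Us = [rng.standard_normal((m, rp)) for _ in range(k)]
Vs = [rng.standard_normal((rp, n)) for _ in range(k)]

delta = hadamard_add_delta(Us, Vs)
delta_rank = np.linalg.matrix_rank(delta)
print(delta_rank, "<=", (2 * rp) ** k)
```
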
Hadamard LoRA: Activation
$\Delta = \odot_{i=1}^{k} (\tanh(U_i V_i^T))$
$\Delta = \odot_{i=1}^{k} (\sigma(U_i V_i^T))$
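
A minimal sketch of the activated variants, assuming $\sigma$ denotes the logistic sigmoid and that each $V_i$ is stored with shape $n \times r'$ so that $U_i V_i^T$ is $m \times n$ (both are assumptions; the helper names are hypothetical):

```python
import numpy as np
from functools import reduce

def sigmoid(x):
    """Logistic sigmoid, assumed to be what sigma denotes."""
    return 1.0 / (1.0 + np.exp(-x))

def activated_delta(Us, Vs, act=np.tanh):
    """Hadamard product of act(U_i @ V_i.T) over the k factors."""
    return reduce(np.multiply, [act(U @ V.T) for U, V in zip(Us, Vs)])

rng = np.random.default_rng(0)
m, n, rp, k = 16, 16, 2, 2                     # hypothetical sizes
Us = [rng.standard_normal((m, rp)) for _ in range(k)]
Vs = [rng.standard_normal((n, rp)) for _ in range(k)]

delta_tanh = activated_delta(Us, Vs, act=np.tanh)
delta_sigm = activated_delta(Us, Vs, act=sigmoid)
print(delta_tanh.shape, np.linalg.matrix_rank(delta_tanh))
```

Because the activation is applied element-wise, each factor is no longer constrained to rank $r'$, and every entry of $\Delta$ is bounded (in $[-1, 1]$ for $\tanh$, in $[0, 1]$ for the sigmoid).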
DyLoRA:
Randomly update a range of ranks
Update part of the layers
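
The first point can be sketched roughly as follows. This is only one interpretation: at each step a rank cutoff `b` is sampled and only the first `b` rank components of $U$ and $V$ take part in the update. The uniform sampling and the helper names are assumptions, not a description of this repository's code.

```python
import numpy as np

def sample_rank(rng, r):
    """Draw a rank cutoff b uniformly from 1..r (an assumed sampling scheme)."""
    return int(rng.integers(1, r + 1))

def truncated_delta(U, V, b):
    """Update built from the first b rank components only."""
    return U[:, :b] @ V[:b, :]

rng = np.random.default_rng(0)
m, n, r = 8, 6, 4                              # hypothetical sizes
U = rng.standard_normal((m, r))
V = rng.standard_normal((r, n))

b = sample_rank(rng, r)                        # would be re-sampled every step
delta = truncated_delta(U, V, b)
print(b, np.linalg.matrix_rank(delta))
```
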
References:
@online{kexuefm-9590,
  title = {梯度视角下的LoRA:简介、分析、猜测及推广},
  author = {苏剑林},
  year = {2023},
  month = {Apr},
  url = {\url{https://spaces.ac.cn/archives/9590}},
}
@misc{hyeonwoo2023fedpara,
  title = {FedPara: Low-Rank Hadamard Product for Communication-Efficient Federated Learning},
  author = {Nam Hyeon-Woo and Moon Ye-Bin and Tae-Hyun Oh},
  year = {2023},
  eprint = {2108.06098},
  archivePrefix = {arXiv},
  primaryClass = {cs.LG}
}