MoreLoRA
Original LoRA:
$W = W_0 + UV$ with $U \in \mathbb{R}^{m \times r}$, $V \in \mathbb{R}^{r \times n}$, so $\operatorname{rank}(UV) \leq r$
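A minimal NumPy sketch of the original LoRA update; the dimensions $m, n, r$ are illustrative choices, not values from the note:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 8, 6, 2                 # illustrative sizes, r << min(m, n)
W0 = rng.standard_normal((m, n))  # frozen pretrained weight
U = rng.standard_normal((m, r))   # trainable low-rank factors
V = rng.standard_normal((r, n))

delta = U @ V                     # low-rank update, rank(UV) <= r
W = W0 + delta

print(np.linalg.matrix_rank(delta) <= r)  # True
```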
Better initialization:
$W = W_0 - U_0 V_0 + UV$, where $U_0, V_0$ are the frozen initial values of $U, V$; both factors can then be initialized randomly while still guaranteeing $W = W_0$ before training.
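A sketch of why subtracting the frozen initial product $U_0 V_0$ lets both factors start random (instead of zero-initializing one of them) while the effective weight still equals $W_0$ at step 0; sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, r = 8, 6, 2
W0 = rng.standard_normal((m, n))

# Both factors start random; U0, V0 are frozen copies of the initial values.
U0 = rng.standard_normal((m, r))
V0 = rng.standard_normal((r, n))
U, V = U0.copy(), V0.copy()       # trainable copies

# The frozen correction -U0 @ V0 cancels U @ V before any update.
W = W0 - U0 @ V0 + U @ V
print(np.allclose(W, W0))  # True at initialization
```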
Additive LoRA:
$W = W_0 + U\, I_{r,(1 \times \frac{n}{r})} + I_{r,(\frac{m}{r} \times 1)}\, V$, where $U \in \mathbb{R}^{m \times r}$, $V \in \mathbb{R}^{r \times n}$, $I_{r,(a \times b)}$ denotes the $r \times r$ identity tiled into an $a \times b$ grid of blocks, and $\operatorname{rank}(W - W_0) \leq 2r$
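A sketch of the additive variant, reading $I_{r,(1 \times \frac{n}{r})}$ as the $r \times r$ identity tiled horizontally and $I_{r,(\frac{m}{r} \times 1)}$ as the identity tiled vertically (assumes $r$ divides $m$ and $n$; sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, r = 8, 6, 2                 # assumes r | m and r | n
W0 = rng.standard_normal((m, n))
U = rng.standard_normal((m, r))
V = rng.standard_normal((r, n))

# I_{r,(1 x n/r)}: r x r identity tiled horizontally -> r x n
I_right = np.tile(np.eye(r), (1, n // r))
# I_{r,(m/r x 1)}: r x r identity tiled vertically -> m x r
I_left = np.tile(np.eye(r), (m // r, 1))

# Sum of two terms, each of rank <= r, so the delta has rank <= 2r.
delta = U @ I_right + I_left @ V
W = W0 + delta

print(np.linalg.matrix_rank(delta) <= 2 * r)  # True
```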
Hadamard-mul LoRA:
$W = W_0 + \bigodot_{i=1}^{k} \Delta_i$ where $\Delta_i = U_i V_i$ and $\odot$ is the elementwise (Hadamard) product
$r' = \frac{r}{k}$, $U_i \in \mathbb{R}^{m \times r'}$, $V_i \in \mathbb{R}^{r' \times n}$, and $\operatorname{rank}\bigl(\bigodot_{i=1}^{k} \Delta_i\bigr) \leq \bigl(\frac{r}{k}\bigr)^k$
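A sketch of the Hadamard-product construction, using the bound $\operatorname{rank}(A \odot B) \leq \operatorname{rank}(A)\,\operatorname{rank}(B)$; sizes and $k$ are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, r, k = 8, 6, 4, 2
rp = r // k                       # r' = r / k

# k independent low-rank deltas, each of rank <= r'.
deltas = [rng.standard_normal((m, rp)) @ rng.standard_normal((rp, n))
          for _ in range(k)]

# Combine them with the elementwise (Hadamard) product.
delta = np.ones((m, n))
for d in deltas:
    delta *= d

# rank(A ∘ B) <= rank(A) * rank(B), so rank(delta) <= (r/k)**k.
print(np.linalg.matrix_rank(delta) <= rp ** k)  # True
```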
Hadamard-add LoRA:
$W = W_0 + \bigodot_{i=1}^{k} \Delta_i$ where $\Delta_i = U_i\, I_{r',(1 \times \frac{n}{r'})} + I_{r',(\frac{m}{r'} \times 1)}\, V_i$
$r' = \frac{r}{k}$, $U_i \in \mathbb{R}^{m \times r'}$, $V_i \in \mathbb{R}^{r' \times n}$, and $\operatorname{rank}\bigl(\bigodot_{i=1}^{k} \Delta_i\bigr) \leq \bigl(\frac{2r}{k}\bigr)^k$
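A sketch combining the two previous ideas: each Hadamard factor is an additive (tiled-identity) delta of rank at most $2r'$, so the product has rank at most $(2r/k)^k$. Assumes $r'$ divides $m$ and $n$; sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, r, k = 8, 8, 4, 2
rp = r // k                                   # r' = r / k; assumes r' | m, r' | n

I_right = np.tile(np.eye(rp), (1, n // rp))   # r' x n tiled identity
I_left = np.tile(np.eye(rp), (m // rp, 1))    # m x r' tiled identity

delta = np.ones((m, n))
for _ in range(k):
    U_i = rng.standard_normal((m, rp))
    V_i = rng.standard_normal((rp, n))
    delta *= U_i @ I_right + I_left @ V_i     # each factor has rank <= 2r'

# Hadamard product of k rank-(<=2r') factors: rank <= (2r/k)**k.
print(np.linalg.matrix_rank(delta) <= (2 * rp) ** k)  # True
```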
Hadamard LoRA with activation:
$\Delta = \bigodot_{i=1}^{k} \tanh(U_i V_i)$
$\Delta = \bigodot_{i=1}^{k} \sigma(U_i V_i)$
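A sketch of the activated variant with $\tanh$; note that applying a nonlinearity elementwise to $U_i V_i$ generally breaks the low-rank structure, so the rank bounds above no longer apply. Sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
m, n, rp, k = 8, 6, 2, 2

delta = np.ones((m, n))
for _ in range(k):
    # tanh is applied elementwise to each low-rank product before combining.
    delta *= np.tanh(rng.standard_normal((m, rp)) @ rng.standard_normal((rp, n)))

print(delta.shape)
```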
DyLoRA:
Randomly sample the update rank from a range of ranks at each training step
Update only part of the layers
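A sketch of the rank-sampling idea: at each step a rank $b \leq r$ is drawn, and only the first $b$ columns of $U$ and rows of $V$ are used for the update (a truncated LoRA). Sizes and the sampling range are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)
m, n, r = 8, 6, 4
U = rng.standard_normal((m, r))
V = rng.standard_normal((r, n))

b = int(rng.integers(1, r + 1))   # rank sampled for this training step
delta = U[:, :b] @ V[:b, :]       # truncated update, rank <= b

print(np.linalg.matrix_rank(delta) <= b)  # True
```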
References:
@online{kexuefm-9590,
  title = {梯度视角下的LoRA:简介、分析、猜测及推广},
  author = {苏剑林 (Jianlin Su)},
  year = {2023},
  month = {Apr},
  url = {https://spaces.ac.cn/archives/9590},
}

@misc{hyeonwoo2023fedpara,
  title = {FedPara: Low-Rank Hadamard Product for Communication-Efficient Federated Learning},
  author = {Nam Hyeon-Woo and Moon Ye-Bin and Tae-Hyun Oh},
  year = {2023},
  eprint = {2108.06098},
  archivePrefix = {arXiv},
  primaryClass = {cs.LG}
}