hfppl

1.0.0

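LLaMPPL + HuggingFace

LLaMPPL is a research prototype for language model probabilistic programming: you specify a language generation task by writing a probabilistic program that combines calls to LLMs, symbolic program logic, and probabilistic conditioning. To solve these tasks, LLaMPPL uses a specialized sequential Monte Carlo (SMC) inference algorithm. This technique, SMC steering, is described in a recent workshop abstract.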
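
As a rough intuition for sequential Monte Carlo, the sketch below runs a generic extend, weight, and resample loop on a toy character model. It is not the SMC steering algorithm and does not use hfppl: the function `toy_smc`, the uniform "model" over a five-letter alphabet, and the multinomial resampling step are all assumptions chosen for illustration.

```python
import random


def toy_smc(num_particles, num_steps, forbidden="e", alphabet="abcde", seed=0):
    """Generic SMC on a toy uniform character model (illustrative only)."""
    rng = random.Random(seed)
    particles = [""] * num_particles
    for _ in range(num_steps):
        # Extend: every particle samples its next character from the toy model.
        particles = [p + rng.choice(alphabet) for p in particles]
        # Weight: a hard constraint scores 1 if satisfied and 0 otherwise.
        weights = [0.0 if p.endswith(forbidden) else 1.0 for p in particles]
        if sum(weights) == 0:
            raise RuntimeError("every particle violated the constraint")
        # Resample: draw a new population in proportion to the weights,
        # so particles that violated the constraint are discarded.
        particles = rng.choices(particles, weights=weights, k=num_particles)
    return particles


samples = toy_smc(num_particles=20, num_steps=8)
for s in sorted(set(samples)):
    print(s)
```

Every surviving string has eight characters and avoids the forbidden letter, because any particle that sampled it received weight zero and was dropped at the next resampling step.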

This repository implements LLaMPPL for use with HuggingFace Transformers.

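Installation

If you just want to try out LLaMPPL, check out the demo notebook on Colab, which performs a simple constrained generation task using GPT-2. (Larger models may require more RAM or GPU resources than Colab's free version provides.)

Note

We use Poetry to manage dependencies. If you don't have Poetry installed, you can install it with pip install poetry.

To get started on your own machine, clone this repository and run poetry install to install hfppl and its dependencies.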

 git clone https://github.com/probcomp/hfppl
cd hfppl
poetry install

Then try running an example. Note that this will download the weights for Vicuna-7b-v1.5.

 poetry run python examples/hard_constraints.py

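If everything is working, you should see the model generate political news using only words of at most five letters (e.g., "Dr. Jill Biden may still be a year away from the White House but she is set to make her first trip to the U.N. today.").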
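
To check a generated sentence against the five-letter limit by hand, a small helper like the one below can be used. It is a sketch and not part of hfppl: the helper names and both test sentences are hypothetical, and it assumes that a word's length is counted after stripping surrounding punctuation, which may differ from how the example script counts letters.

```python
import string


def max_word_length(text):
    """Length of the longest whitespace-separated word, ignoring edge punctuation."""
    words = [w.strip(string.punctuation) for w in text.split()]
    return max((len(w) for w in words if w), default=0)


def satisfies_constraint(text, limit=5):
    """True if no word in `text` is longer than `limit` characters."""
    return max_word_length(text) <= limit


print(satisfies_constraint("The vote on the bill is set for today."))  # True
print(satisfies_constraint("Congress votes tomorrow."))                # False
```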

Modeling with LLaMPPL

A LLaMPPL program is a subclass of the hfppl.Model class.

from hfppl import Model, LMContext, CachedCausalLM

# A LLaMPPL model subclasses the Model class
class MyModel(Model):

    # The __init__ method is used to process arguments
    # and initialize instance variables.
    def __init__(self, lm, prompt, forbidden_letter):
        super().__init__()

        # A stateful context object for the LLM, initialized with the prompt
        self.context = LMContext(lm, prompt)
        self.eos_token = lm.tokenizer.eos_token_id

        # The forbidden letter
        self.forbidden_tokens = set(i for (i, v) in enumerate(lm.vocab)
                                    if forbidden_letter in v)

    # The step method is used to perform a single 'step' of generation.
    # This might be a single token, a single phrase, or any other division.
    # Here, we generate one token at a time.
    async def step(self):
        # Condition on the next token *not* being a forbidden token.
        await self.observe(self.context.mask_dist(self.forbidden_tokens), False)

        # Sample the next token from the LLM -- automatically extends `self.context`.
        token = await self.sample(self.context.next_token())

        # Check for EOS or end of sentence
        if token.token_id == self.eos_token or str(token) in ['.', '!', '?']:
            # Finish generation
            self.finish()

    # To improve performance, a hint that `self.forbidden_tokens` is immutable
    def immutable_properties(self):
        return set(['forbidden_tokens'])
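
The forbidden_tokens computation in __init__ can be tried in isolation. The sketch below applies the same comprehension to a small hand-written vocabulary; `toy_vocab` and `forbidden_token_ids` are hypothetical stand-ins, and a real `lm.vocab` is far larger.

```python
# Hypothetical miniature vocabulary; a real tokenizer has tens of thousands of entries.
toy_vocab = ["the", " sun", " rain", " East", " e", ".", "</s>"]


def forbidden_token_ids(vocab, forbidden_letter):
    """Indices of every vocabulary entry whose text contains the letter."""
    return {i for i, piece in enumerate(vocab) if forbidden_letter in piece}


ids = forbidden_token_ids(toy_vocab, "e")
allowed = [piece for i, piece in enumerate(toy_vocab) if i not in ids]

print(sorted(ids))  # [0, 4]
print(allowed)      # [' sun', ' rain', ' East', '.', '</s>']
```

The substring test is case-sensitive, so " East" stays allowed when the forbidden letter is a lowercase "e". If uppercase occurrences should be excluded too, one option is to compare against `piece.lower()`.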

The Model class provides a number of useful methods for specifying a LLaMPPL program:

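self.sample(dist[, proposal]) samples from the given distribution. Providing a proposal does not modify the task description, but it can improve inference. For example, a proposal could pre-emptively avoid the forbidden letter.
self.condition(cond) conditions on the given Boolean expression.
self.finish() indicates that generation is complete.
self.observe(dist, obs) performs a form of 'soft conditioning' on the given distribution. It is equivalent to sampling a value v from dist and then immediately running condition(v == obs).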
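
One way to read the equivalence stated for self.observe is through importance weights. The sketch below is a toy illustration and does not use hfppl; how the library implements weighting internally is an assumption here. Sampling v and then conditioning on v == obs gives a weight of 1 or 0, and the average of that weight over many draws approaches the probability that dist assigns to obs, which is the factor a direct observation would apply in one step.

```python
import random

# Hypothetical toy distribution over three outcomes.
dist = {"sunny": 0.5, "rainy": 0.3, "windy": 0.2}


def observe_weight(dist, obs):
    """Weight factor from observing `obs` directly: its probability under dist."""
    return dist.get(obs, 0.0)


def sample_then_condition_weight(dist, obs, rng):
    """Sample a value, then condition on equality: weight is 1 on a match, else 0."""
    value = rng.choices(list(dist), weights=list(dist.values()), k=1)[0]
    return 1.0 if value == obs else 0.0


rng = random.Random(0)
trials = 20000
average = sum(
    sample_then_condition_weight(dist, "rainy", rng) for _ in range(trials)
) / trials
exact = observe_weight(dist, "rainy")

print(f"sample-then-condition average: {average:.3f}, observe weight: {exact:.3f}")
```

Both routes target the same expected weight, but the direct observation has no sampling noise and never discards a particle outright.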

To run inference, we use the smc_steer or smc_standard methods:

import asyncio
from hfppl import CachedCausalLM, smc_steer

# Initialize the HuggingFace model (replace the placeholder with your own token)
lm = CachedCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf", auth_token="<YOUR_HUGGINGFACE_API_TOKEN_HERE>")

# Create a model instance
model = MyModel(lm, "The weather today is expected to be", "e")

# Run inference
particles = asyncio.run(smc_steer(model, 5, 3))  # number of particles N, and beam factor K

Sample output:

 sunny.
sunny and cool.
34° (81°F) in Chicago with winds at 5mph.
34° (81°F) in Chicago with winds at 2-9 mph.
hot and humid with a possibility of rain, which is not uncommon for this part of Mississippi.

Further documentation can be found at https://probcomp.github.io/hfppl.
