Jlama

v0.8.3

- Jlama: A modern LLM inference engine for Java

Features

Model Support:

Gemma & Gemma 2 Models
Llama & Llama2 & Llama3 Models
Mistral & Mixtral Models
Qwen2 Models
IBM Granite Models
GPT-2 Models
BERT Models
BPE Tokenizers
WordPiece Tokenizers

Implements:

Paged Attention
Mixture of Experts
Tool Calling
Generate Embeddings
Classifier Support
HuggingFace SafeTensors model and tokenizer format
Support for F32, F16, BF16 types
Support for Q8, Q4 model quantization
Fast GEMM operations
Distributed Inference!

Jlama requires Java 20 or later and uses the new Vector API for faster inference.

- What is it used for?

Add LLM inference directly to your Java application.

- Quick Start

How to use as a local client (with jbang!)

Jlama includes a command line tool that makes it easy to use.

The CLI can be run with jbang.

# Install jbang (or https://www.jbang.dev/download/)
curl -Ls https://sh.jbang.dev | bash -s - app setup

# Install Jlama CLI (will ask if you trust the source)
jbang app install --force jlama@tjake

Now that you have Jlama installed, you can download a model from HuggingFace and chat with it. Note that I have pre-quantized models available at https://hf.co/tjake

# Run the openai chat api and UI on a model
jlama restapi tjake/Llama-3.2-1B-Instruct-JQ4 --auto-download

Open a browser to http://localhost:8080/
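Because the `restapi` command describes its API as OpenAI compatible, a Java client can also talk to it over plain HTTP instead of the browser UI. The following is a minimal sketch using the JDK's `java.net.http` client. The `/chat/completions` path and the JSON field names follow the OpenAI chat API convention and are assumptions here, not something this page confirms, so check the running server for the exact route. `JlamaRestSketch` and its methods are hypothetical names.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class JlamaRestSketch {

    // Escapes the characters that would break a JSON string literal.
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n");
    }

    // Builds an OpenAI-style chat request body with a single user message.
    static String chatBody(String model, String userMessage) {
        return "{\"model\":\"" + jsonEscape(model) + "\","
                + "\"messages\":[{\"role\":\"user\",\"content\":\""
                + jsonEscape(userMessage) + "\"}]}";
    }

    // Assumption: the server exposes POST {baseUrl}/chat/completions.
    static HttpRequest chatRequest(String baseUrl, String model, String userMessage) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/chat/completions"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(chatBody(model, userMessage)))
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = chatRequest(
                "http://localhost:8080",
                "tjake/Llama-3.2-1B-Instruct-JQ4",
                "What is the best season to plant avocados?");
        // Requires `jlama restapi ...` to be running locally.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Building the request is kept separate from sending it so the body and URL can be inspected without a running server.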

Usage:

jlama [COMMAND]

Description:

Jlama is a modern LLM inference engine for Java !
Quantized models are maintained at https://hf.co/tjake

Choose from the available commands:

Inference:
  chat                 Interact with the specified model
  restapi              Starts a openai compatible rest api for interacting with this model
  complete             Completes a prompt using the specified model

Distributed Inference:
  cluster-coordinator  Starts a distributed rest api for a model using cluster workers
  cluster-worker       Connects to a cluster coordinator to perform distributed inference

Other:
  download             Downloads a HuggingFace model - use owner/name format
  list                 Lists local models
  quantize             Quantize the specified model

How to use in your Java project

The main purpose of Jlama is to provide a simple way to use large language models in Java.

The simplest way to embed Jlama in your app is with the Langchain4j integration.

If you would like to embed Jlama without Langchain4j, add the following Maven dependencies to your project:

<dependency>
  <groupId>com.github.tjake</groupId>
  <artifactId>jlama-core</artifactId>
  <version>${jlama.version}</version>
</dependency>

<dependency>
  <groupId>com.github.tjake</groupId>
  <artifactId>jlama-native</artifactId>
  <!-- supports linux-x86_64, macos-x86_64/aarch_64, windows-x86_64
       Use https://github.com/trustin/os-maven-plugin to detect os and arch -->
  <classifier>${os.detected.name}-${os.detected.arch}</classifier>
  <version>${jlama.version}</version>
</dependency>

Jlama uses Java 21 preview features. You can enable the features globally with:

export JDK_JAVA_OPTIONS="--add-modules jdk.incubator.vector --enable-preview"

or enable the preview features by configuring the Maven compiler and failsafe plugins.
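A quick way to confirm that these options actually took effect is a small preflight check run on the same JVM. The sketch below is not part of Jlama; `VectorPreflight` is a hypothetical helper that uses only standard JDK calls to compare the running feature version against the Java 20 minimum stated earlier and to see whether the `jdk.incubator.vector` module was resolved at startup.

```java
public class VectorPreflight {

    // Jlama's stated minimum Java version, taken from the requirements above.
    static final int MIN_JAVA_FEATURE_VERSION = 20;

    // True when the given feature version meets the minimum.
    static boolean javaVersionOk(int featureVersion) {
        return featureVersion >= MIN_JAVA_FEATURE_VERSION;
    }

    // True when the incubating Vector API module is part of the boot layer,
    // which is what --add-modules jdk.incubator.vector requests.
    static boolean vectorModuleEnabled() {
        return ModuleLayer.boot().findModule("jdk.incubator.vector").isPresent();
    }

    public static void main(String[] args) {
        int feature = Runtime.version().feature();
        System.out.println("Java feature version: " + feature
                + (javaVersionOk(feature) ? " (ok)" : " (too old)"));
        System.out.println("jdk.incubator.vector enabled: " + vectorModuleEnabled());
    }
}
```

If the second line prints `false`, the `--add-modules` flag did not reach this JVM, for example because `JDK_JAVA_OPTIONS` was set in a different shell.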

Then you can use the Model classes to run models:

import java.io.File;
import java.io.IOException;
import java.util.UUID;
// Downloader, ModelSupport, AbstractModel, DType, PromptContext and Generator
// come from the jlama-core dependency.

public void sample() throws IOException {
    String model = "tjake/Llama-3.2-1B-Instruct-JQ4";
    String workingDirectory = "./models";

    String prompt = "What is the best season to plant avocados?";

    // Downloads the model or just returns the local path if it's already downloaded
    File localModelPath = new Downloader(workingDirectory, model).huggingFaceModel();

    // Loads the quantized model and specifies the use of quantized memory
    AbstractModel m = ModelSupport.loadModel(localModelPath, DType.F32, DType.I8);

    PromptContext ctx;
    // Checks if the model supports chat prompting and adds the prompt in the expected format for this model
    if (m.promptSupport().isPresent()) {
        ctx = m.promptSupport()
                .get()
                .builder()
                .addSystemMessage("You are a helpful chatbot who writes short responses.")
                .addUserMessage(prompt)
                .build();
    } else {
        ctx = PromptContext.of(prompt);
    }

    System.out.println("Prompt: " + ctx.getPrompt() + "\n");
    // Generates a response to the prompt and prints it
    // The api allows for streaming or non-streaming responses
    // The response is generated with a temperature of 0.0 and a max token length of 256
    Generator.Response r = m.generate(UUID.randomUUID(), ctx, 0.0f, 256, (s, f) -> {});
    System.out.println(r.responseText);
}
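The last argument to `generate` in the sample above is an empty callback, `(s, f) -> {}`, which is where streaming would hook in. As a rough sketch, assuming that callback is a `BiConsumer<String, Float>` that receives each generated text fragment (the meaning of the `Float` argument is not stated here, so it is ignored), a collector can print fragments as they arrive and still keep the full text. `StreamingCollector` is a hypothetical helper, and the fragments fed in `main` are simulated so the example runs without a model.

```java
import java.util.function.BiConsumer;

public class StreamingCollector implements BiConsumer<String, Float> {

    private final StringBuilder text = new StringBuilder();
    private int fragments = 0;

    // Called once per generated fragment; the Float is ignored (see assumption above).
    @Override
    public void accept(String fragment, Float ignored) {
        System.out.print(fragment);   // stream to the console immediately
        text.append(fragment);        // keep the full response for later use
        fragments++;
    }

    String fullText() {
        return text.toString();
    }

    int fragmentCount() {
        return fragments;
    }

    public static void main(String[] args) {
        StreamingCollector collector = new StreamingCollector();
        // Simulated fragments. With Jlama the collector would instead be passed
        // as the last argument: m.generate(UUID.randomUUID(), ctx, 0.0f, 256, collector)
        for (String fragment : new String[] {"Spring", " is", " best."}) {
            collector.accept(fragment, 0.0f);
        }
        System.out.println();
        System.out.println("fragments: " + collector.fragmentCount());
    }
}
```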

Give us a Star!

If you like this project or are using it to build your own, please give us a star. It's a free way to show your support.

Roadmap

Support more and more models
~~Add pure Java tokenizers~~
~~Support quantization (e.g. k-quantization)~~
Add LoRA support
GraalVM support
~~Add distributed inference~~

License and Citation

The code is available under the Apache License.

If you find this project helpful in your research, please cite this work as

@misc{jlama2024,
    title = {Jlama: A modern Java inference engine for large language models},
    url = {https://github.com/tjake/jlama},
    author = {T Jake Luciani},
    month = {January},
    year = {2024}
}

More information

Version v0.8.3
Type Other source code
Updated 2025-02-25
Size 3.19MB
Source GitHub
