OpenAI Realtime Console

The OpenAI Realtime Console is intended as an inspector and interactive API reference for the OpenAI Realtime API. It comes packaged with two utility libraries: openai/openai-realtime-api-beta, which acts as a reference client (for browser and Node.js), and /src/lib/wavtools, which allows for simple audio management in the browser.

Starting the console

This is a React project created using create-react-app and bundled via Webpack. Install it by extracting the contents of this package and running:

$ npm i

Start your server with:

$ npm start

It should be available via localhost:3000.

Table of contents

Using the console
1. Using a relay server
Realtime API reference client
1. Sending streaming audio
2. Adding and using tools
3. Interrupting the model
4. Reference client events
Wavtools
1. WavRecorder quickstart
2. WavStreamPlayer quickstart
Acknowledgements and contact

Using the console

The console requires an OpenAI API key (user key or project key) that has access to the Realtime API. You will be prompted to enter it on startup. It is saved via localStorage and can be changed at any time from the UI.

To start a session you need to connect, which requires microphone access. You can then choose between the manual (push-to-talk) and vad (Voice Activity Detection) conversation modes, and switch between them at any time.

There are two functions enabled:

get_weather: Ask for the weather anywhere and the model will do its best to pinpoint the location, show it on a map, and get the weather for that location. Note that it has no location access, and coordinates are "guessed" from the model's training data, so accuracy may not be perfect.
set_memory: You can ask the model to remember information for you, and it will store it in a JSON blob on the left.

You can freely interrupt the model at any time in push-to-talk or VAD mode.

Using a relay server

If you would like to build a more robust implementation and try the reference client with your own server, we have included a Node.js relay server.

$ npm run relay

It will start automatically on localhost:8081.

You will need to create a .env file with the following configuration:

OPENAI_API_KEY=YOUR_API_KEY
REACT_APP_LOCAL_RELAY_SERVER_URL=http://localhost:8081

You will need to restart both your React app and the relay server for the .env changes to take effect. The local server URL is loaded via ConsolePage.tsx. To stop using the relay server at any time, simply delete the environment variable or set it to an empty string.

/**
 * Running a local relay server will allow you to hide your API key
 * and run custom logic on the server
 *
 * Set the local relay server address to:
 * REACT_APP_LOCAL_RELAY_SERVER_URL=http://localhost:8081
 *
 * This will also require you to set OPENAI_API_KEY= in a `.env` file
 * You can run it with `npm run relay`, in parallel with `npm start`
 */
const LOCAL_RELAY_SERVER_URL: string =
  process.env.REACT_APP_LOCAL_RELAY_SERVER_URL || '';

This server is only a simple message relay, but it can be extended to:

Hide API credentials if you would like to ship an app to play with online
Handle certain calls you would like to keep secret (e.g. instructions) on the server directly
Restrict what types of events the client can receive and send

You will have to implement these features yourself.
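
One of the extensions listed above, restricting which events a client may send, could be sketched as an allowlist check applied to each message before it is forwarded. This is a minimal sketch and not the relay server's actual code: the name `filterClientEvent` and the contents of the allowlist are hypothetical, and the only assumption about the traffic is that each relayed message is a JSON string with a `type` field.

```javascript
// Hypothetical allowlist of client event types the relay agrees to forward.
const ALLOWED_CLIENT_EVENTS = new Set([
  'input_audio_buffer.append',
  'input_audio_buffer.commit',
  'conversation.item.create',
  'response.create',
  'response.cancel',
]);

// Returns the parsed event if it may be forwarded, otherwise null.
function filterClientEvent(rawMessage, allowed = ALLOWED_CLIENT_EVENTS) {
  let event;
  try {
    event = JSON.parse(rawMessage);
  } catch (err) {
    return null; // not valid JSON, drop it
  }
  if (!event || typeof event.type !== 'string') {
    return null; // no usable event type, drop it
  }
  return allowed.has(event.type) ? event : null;
}
```

In this sketch `session.update` is left off the list on purpose, so a browser client could not overwrite instructions that the server wants to keep private.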

Realtime API reference client

The latest reference client and documentation are available on GitHub at openai/openai-realtime-api-beta.

You can use this client yourself in any React (front-end) or Node.js project. For full documentation, refer to the GitHub repository, but you can use the guide here as a primer to get started.

import { RealtimeClient } from '/src/lib/realtime-api-beta/index.js';

const client = new RealtimeClient({ apiKey: process.env.OPENAI_API_KEY });

// Can set parameters ahead of connecting
client.updateSession({ instructions: 'You are a great, upbeat friend.' });
client.updateSession({ voice: 'alloy' });
client.updateSession({ turn_detection: 'server_vad' });
client.updateSession({ input_audio_transcription: { model: 'whisper-1' } });

// Set up event handling
client.on('conversation.updated', ({ item, delta }) => {
  const items = client.conversation.getItems(); // can use this to render all items
  /* includes all changes to conversations, delta may be populated */
});

// Connect to Realtime API
await client.connect();

// Send an item and triggers a generation
client.sendUserMessageContent([{ type: 'text', text: `How are you?` }]);

Sending streaming audio

To send streaming audio, use the .appendInputAudio() method. If you are in turn_detection: 'disabled' mode, you need to call .createResponse() to tell the model to respond.

// Send user audio, must be Int16Array or ArrayBuffer
// Default audio format is pcm16 with sample rate of 24,000 Hz
// This populates 1s of noise in 0.1s chunks
for (let i = 0; i < 10; i++) {
  const data = new Int16Array(2400);
  for (let n = 0; n < 2400; n++) {
    const value = Math.floor((Math.random() * 2 - 1) * 0x8000);
    data[n] = value;
  }
  client.appendInputAudio(data);
}
// Pending audio is committed and model is asked to generate
client.createResponse();
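
Browser audio APIs commonly hand back Float32Array samples in the range [-1, 1], whereas .appendInputAudio() expects an Int16Array or ArrayBuffer. The following is a minimal sketch of that conversion. The helper name `floatTo16BitPCM` is chosen for this illustration and is not presented as part of the client's documented API.

```javascript
// Convert float samples in [-1, 1] to 16-bit signed PCM.
function floatTo16BitPCM(float32Samples) {
  const pcm16 = new Int16Array(float32Samples.length);
  for (let i = 0; i < float32Samples.length; i++) {
    // Clamp first so out-of-range input cannot wrap around
    const s = Math.max(-1, Math.min(1, float32Samples[i]));
    // Negative values scale by 0x8000, positive by 0x7fff, covering the int16 range
    pcm16[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm16;
}
```

The result could then be passed to client.appendInputAudio() in chunks, for example 2400 samples per 0.1 s at the default 24,000 Hz sample rate.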

Adding and using tools

Working with tools is easy. Just call .addTool() and pass a callback as the second parameter. The callback is executed with the parameters for the tool, and the result is automatically sent back to the model.

// We can add tools as well, with callbacks specified
client.addTool(
  {
    name: 'get_weather',
    description:
      'Retrieves the weather for a given lat, lng coordinate pair. Specify a label for the location.',
    parameters: {
      type: 'object',
      properties: {
        lat: {
          type: 'number',
          description: 'Latitude',
        },
        lng: {
          type: 'number',
          description: 'Longitude',
        },
        location: {
          type: 'string',
          description: 'Name of the location',
        },
      },
      required: ['lat', 'lng', 'location'],
    },
  },
  async ({ lat, lng, location }) => {
    const result = await fetch(
      `https://api.open-meteo.com/v1/forecast?latitude=${lat}&longitude=${lng}&current=temperature_2m,wind_speed_10m`
    );
    const json = await result.json();
    return json;
  }
);
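
The set_memory function mentioned under "Using the console" could be defined in the same style. What follows is a sketch under stated assumptions: the `key` and `value` parameter names, the description strings, and the in-memory `memory` object are illustrative and are not taken from the console's source.

```javascript
// In-memory store standing in for the console's JSON blob (illustrative)
const memory = {};

// Tool definition in the same shape as the get_weather example
const setMemoryDefinition = {
  name: 'set_memory',
  description: 'Saves a piece of information about the user under a key.',
  parameters: {
    type: 'object',
    properties: {
      key: { type: 'string', description: 'Name to store the value under' },
      value: { type: 'string', description: 'Value to remember' },
    },
    required: ['key', 'value'],
  },
};

// Callback invoked with the arguments the model supplies
async function setMemoryHandler({ key, value }) {
  memory[key] = value;
  return { ok: true };
}

// With a client instance, registration would look like:
// client.addTool(setMemoryDefinition, setMemoryHandler);
```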

Interrupting the model

You may want to manually interrupt the model, especially in turn_detection: 'disabled' mode. To do this, we can use:

// id is the id of the item currently being generated
// sampleCount is the number of audio samples that have been heard by the listener
client.cancelResponse(id, sampleCount);

This causes the model to stop generating immediately, and it also truncates the item being played by removing all audio after sampleCount and clearing the text response. By using this method you can interrupt the model and prevent it from "remembering" anything it has generated that is ahead of where the user's state is.
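
When playback goes through WavStreamPlayer (see the Wavtools section), the two arguments can come from its .interrupt() call. The sketch below assumes that audio was queued using the item id as the track id, which is a convention chosen for this example. `interruptPlayback` is a hypothetical helper, not part of either library.

```javascript
// Stop local playback, then tell the model how much audio was actually heard.
async function interruptPlayback(client, wavStreamPlayer) {
  // Halt playback and find out how far the listener got
  const trackOffset = await wavStreamPlayer.interrupt();
  if (!trackOffset || !trackOffset.trackId) {
    return false; // no track reported, so there is nothing to cancel
  }
  const { trackId, offset } = trackOffset;
  // Assumption: trackId is the id of the item currently being generated
  await client.cancelResponse(trackId, offset);
  return true;
}
```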

Reference client events

There are five main client events for application control flow in RealtimeClient. Note that this is only an overview of using the client; the full Realtime API event specification is considerably larger. If you need more control, check out the GitHub repository: openai/openai-realtime-api-beta.

// errors like connection failures
client.on('error', (event) => {
  // do thing
});

// in VAD mode, the user starts speaking
// we can use this to stop audio playback of a previous response if necessary
client.on('conversation.interrupted', () => {
  /* do something */
});

// includes all changes to conversations
// delta may be populated
client.on('conversation.updated', ({ item, delta }) => {
  // get all items, e.g. if you need to update a chat window
  const items = client.conversation.getItems();
  switch (item.type) {
    case 'message':
      // system, user, or assistant message (item.role)
      break;
    case 'function_call':
      // always a function call from the model
      break;
    case 'function_call_output':
      // always a response from the user / application
      break;
  }
  if (delta) {
    // Only one of the following will be populated for any given event
    // delta.audio = Int16Array, audio added
    // delta.transcript = string, transcript added
    // delta.arguments = string, function arguments added
  }
});

// only triggered after item added to conversation
client.on('conversation.item.appended', ({ item }) => {
  /* item status can be 'in_progress' or 'completed' */
});

// only triggered after item completed in conversation
// will always be triggered after conversation.item.appended
client.on('conversation.item.completed', ({ item }) => {
  /* item status will always be 'completed' */
});
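
As one possible use of delta.transcript in the conversation.updated handler, text can be accumulated per item for display. This is a minimal sketch that assumes each item carries a unique `id`. The `appendTranscript` helper and the `Map` store are illustrative only.

```javascript
// Maps item id -> transcript text received so far
const transcripts = new Map();

// Append a transcript delta to the stored text for its item and return the result.
function appendTranscript(store, item, delta) {
  if (!delta || typeof delta.transcript !== 'string') {
    return store.get(item.id) || ''; // audio or arguments delta, nothing to add
  }
  const next = (store.get(item.id) || '') + delta.transcript;
  store.set(item.id, next);
  return next;
}

// With a client instance this could be wired up as:
// client.on('conversation.updated', ({ item, delta }) => {
//   appendTranscript(transcripts, item, delta);
// });
```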

Wavtools

Wavtools provides easy management of PCM16 audio streams in the browser, both recording and playing.

WavRecorder quickstart

import { WavRecorder } from '/src/lib/wavtools/index.js';

const wavRecorder = new WavRecorder({ sampleRate: 24000 });
wavRecorder.getStatus(); // "ended"

// request permissions, connect microphone
await wavRecorder.begin();
wavRecorder.getStatus(); // "paused"

// Start recording
// This callback will be triggered in chunks of 8192 samples by default
// { mono, raw } are Int16Array (PCM16) mono & full channel data
await wavRecorder.record((data) => {
  const { mono, raw } = data;
});
wavRecorder.getStatus(); // "recording"

// Stop recording
await wavRecorder.pause();
wavRecorder.getStatus(); // "paused"

// outputs "audio/wav" audio file
const audio = await wavRecorder.save();

// clears current audio buffer and starts recording
await wavRecorder.clear();
await wavRecorder.record();

// get data for visualization
const frequencyData = wavRecorder.getFrequencies();

// Stop recording, disconnects microphone, output file
await wavRecorder.pause();
const finalAudio = await wavRecorder.end();

// Listen for device change; e.g. if somebody disconnects a microphone
// deviceList is array of MediaDeviceInfo[] + `default` property
wavRecorder.listenForDeviceChange((deviceList) => {});
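
To show what an "audio/wav" file of PCM16 data contains, here is a minimal sketch that wraps mono Int16Array samples in a standard 44-byte WAV header. It is not WavRecorder's implementation. `encodeWavPCM16` is a hypothetical helper, and mono audio at 24,000 Hz is assumed as the default.

```javascript
// Wrap mono PCM16 samples in a RIFF/WAVE container and return an ArrayBuffer.
function encodeWavPCM16(samples, sampleRate = 24000) {
  const bytesPerSample = 2;
  const numChannels = 1;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);
  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };
  writeAscii(0, 'RIFF');
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, 'WAVE');
  writeAscii(12, 'fmt ');
  view.setUint32(16, 16, true); // size of the fmt chunk
  view.setUint16(20, 1, true); // audio format 1 = linear PCM
  view.setUint16(22, numChannels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * numChannels * bytesPerSample, true); // byte rate
  view.setUint16(32, numChannels * bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, 'data');
  view.setUint32(40, dataSize, true);
  for (let i = 0; i < samples.length; i++) {
    view.setInt16(44 + i * 2, samples[i], true); // little-endian samples
  }
  return buffer;
}
```

In a browser, the returned buffer could be wrapped as `new Blob([buffer], { type: 'audio/wav' })` for download or playback.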

WavStreamPlayer quickstart

import { WavStreamPlayer } from '/src/lib/wavtools/index.js';

const wavStreamPlayer = new WavStreamPlayer({ sampleRate: 24000 });

// Connect to audio output
await wavStreamPlayer.connect();

// Create 1s of empty PCM16 audio
const audio = new Int16Array(24000);
// Queue 3s of audio, will start playing immediately
wavStreamPlayer.add16BitPCM(audio, 'my-track');
wavStreamPlayer.add16BitPCM(audio, 'my-track');
wavStreamPlayer.add16BitPCM(audio, 'my-track');

// get data for visualization
const frequencyData = wavStreamPlayer.getFrequencies();

// Interrupt the audio (halt playback) at any time
// To restart, need to call .add16BitPCM() again
const trackOffset = await wavStreamPlayer.interrupt();
trackOffset.trackId; // "my-track"
trackOffset.offset; // sample number
trackOffset.currentTime; // time in track
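
The "3s" in the comment above follows from the sample rate: each call queues 24,000 samples at 24,000 Hz. A small sketch of that arithmetic is shown below, with `samplesToSeconds` as a hypothetical helper. It assumes mono audio, and whether trackOffset.offset and trackOffset.currentTime are related in exactly this way is an assumption rather than something stated here.

```javascript
// Convert a mono sample count to a duration in seconds.
function samplesToSeconds(sampleCount, sampleRate = 24000) {
  return sampleCount / sampleRate;
}

// Three queued buffers of 24,000 samples each
const queuedSeconds = samplesToSeconds(3 * 24000);
```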

Acknowledgements and contact

Thanks for checking out the Realtime Console. We hope you have fun with the Realtime API. Special thanks to the whole Realtime API team for making this possible. Please feel free to reach out, ask questions, or give feedback by creating an issue on the repository. You can also reach out and let us know what you think directly!