botbots 다운로드 botbots 소스 코드 다운로드

botbots

기타 소스코드

v1

다운로드

BOTBOTS 데이터 세트

ChatGpt의 두 인스턴스 ( gpt-3.5-turbo ) 사이의 대화로 구성된 데이터 세트. CLI 명령 및 대화 프롬프트 자체는 GPT-4에 의해 작성되었습니다. 데이터 세트는 광범위한 컨텍스트 (질문 및 답변, 논쟁 및 추론, 작업 중심 대화) 및 다운 스트림 작업 (예 : 호텔 예약, 의료 조언)을 다룹니다. 텍스트는 DataSetGpt와 OpenAI API를 사용하여 백엔드로 생성되었습니다. 대략적인 생성 비용 : $ 35.

사용 사례에는 다음이 포함될 수 있습니다.

gpt-3.5-turbo 에 특별한 초점을 맞추면서 독창적 인 잠재력, 적응성, 논리적 능력 및 LLM의 기타 측면에 대한 연구를 수행하십시오.
데이터 세트 (알파카와 같은)에서 작은 대화 모델을 훈련시킵니다.

구조

데이터 세트 비주얼

29796 발화를 가진 두 Chatgpt 인스턴스 사이의 총 2361 개의 대화.

`.gpt4.txt`

DataSETGPT 명령은 작업 지향 대화 데이터 세트 ( tod/hotels.json )의 생성을 위해 수동으로 설계되었습니다. 그런 다음 GPT-4는 다른 영역에서 TOD 시나리오 생성에 대해 유사한 명령을 생성하고 도메인 아이디어를 모두 제시하도록 자극했습니다. 그런 다음 명령을 다른 유형의 대화 (추론 및 브레인 스토밍)로 확장해야했습니다. brainstorming.gpt4.txt 가 가장 통찰력있는 .gpt4.txt 파일에서 자세한 내용을 확인할 수 있습니다.

작업 지향 대화

파일	대화	발언
tech_support.json	32	383
hotels.json	400	5557
legal_consultation.json	16	194
car_dealership.json	16	288
Travel_agency.json	16	255
home_mainenage.json	80	1073
real_estate.json	16	237
Medical_Appointments.json	240	3131
car_rentals.json	32	520
보험 _consultation.json	16	235
피트니스 _personal_training.json	16	233
flight_bookings.json	32	433
job_interview.json	16	214
event_tickets.json	240	3974
레스토랑 _mixed.json	45	627
레스토랑 .json	200	2814
pet_care.json	48	616

16 개의 도메인에서 20784 개의 발화와의 총 1461 개의 대화.

상황 별 값은 레이블이 지정됩니다. 예를 들어 : Welcome to [hotel|The Cursed Castle] . 이를 통해 엔티티 인식 및 상태 추적 사용 사례가 가능합니다.
우리는 hotels , restaurants , medical_appointments , event_tickets 및 home_maintenance 도메인을 특히 다운 스트림 작업 중심-다이얼그 시스템으로 작업하는 데 유용한 Home_mainenage 도메인을 발견하기 때문에 상황 당 10 개의 샘플을 생성했습니다 (나머지 도메인의 경우 2 개에 비해).
옵션 순열로 인해 restaurants_mixed.json 은 프랑스 음식을 제공하지만 이탈리아 이름을 가지고 있으며 그 반대도 마찬가지입니다.

Conversation example:

{
    "sample_id" : 0 ,
    "length" : 10 ,
    "temperature" : 0.1 ,
    "initial_utterance" : " Hello. " ,
    "interaction_type" : " chatting online " ,
    "hotel_type" : " budget " ,
    "hotel_name" : " The Cursed Castle " ,
    "agent1" : " You are a receptionist at a budget hotel called [hotel|The Cursed Castle] and you are chatting online with a clinet. Provide assistance and answer to their questions. You must replace any situation-specific details with label and value pairs - for example, [check in|tomorrow], [check out|August 15th], [name|John], [phone|212-456-7890], [rate|...], [total cost|...], etc. When the whole conversation is over end with " Goodbye " . " ,
    "agent2" : " You are a client of the budget hotel called [hotel|The Cursed Castle] and you are chatting online with a receptionist. Either try to make a reservation or ask questions related to an existing reservation. You must replace any situation-specific details with label and value pairs - for example, [check in|tomorrow], [check out|August 15th], [name|John], [phone|212-456-7890], [rate|...], [total cost|...], etc. " ,
    "utterances" : [
        [
            " agent1 " ,
            " Hello! Welcome to [hotel|The Cursed Castle]. How may I assist you today? "
        ],
        [
            " agent2 " ,
            " Hi there! I'm interested in making a reservation for a room. Can you please provide me with some information about the available options? "
        ],
        [
            " agent1 " ,
            " Of course! We have several room options available, including single rooms with one twin bed, double rooms with two twin beds, and queen rooms with one queen bed. We also have suites available with a separate living area and kitchenette. What type of room are you interested in? "
        ],
        [
            " agent2 " ,
            " I'm interested in a double room with two twin beds. What are the rates for that type of room? "
        ],
        [
            " agent1 " ,
            " The rate for a double room with two twin beds is [rate|$80] per night. Does that fit within your budget? "
        ],
        [
            " agent2 " ,
            " Yes, that sounds good. Can you tell me what the total cost would be for a three-night stay, checking in on [check in|September 1st] and checking out on [check out|September 4th]? "
        ],
        [
            " agent1 " ,
            " Sure thing! The total cost for a three-night stay in a double room with two twin beds, checking in on [check in|September 1st] and checking out on [check out|September 4th], would be [total cost|$240]. Would you like to go ahead and make a reservation? "
        ],
        [
            " agent2 " ,
            " Yes, please! Can you please reserve the room for me? "
        ],
        [
            " agent1 " ,
            " Absolutely! Can I please have your name and phone number to put on the reservation? "
        ],
        [
            " agent2 " ,
            " My name is [name|Sarah] and my phone number is [phone|123-456-7890]. "
        ],
        [
            " agent1 " ,
            " Thank you, [name|Sarah]. Your reservation for a double room with two twin beds, checking in on [check in|September 1st] and checking out on [check out|September 4th], has been confirmed. Your total cost will be [total cost|$240]. Is there anything else I can assist you with? "
        ],
        [
            " agent2 " ,
            " No, that's all for now. Thank you for your help! "
        ],
        [
            " agent1 " ,
            " You're welcome, [name|Sarah]. We look forward to having you stay with us at [hotel|The Cursed Castle]. If you have any further questions or concerns, please don't hesitate to reach out. Have a great day! Goodbye. "
        ]
    ]
}

추리

10 가지 설정에서 8001 개의 발언으로 총 800 개의 대화 :

패턴 인식 및 추상 추론
공간 추론
논리적 추론
유도성 추론
연역적 추론
인과 추론
반 상당한 추론
전략적 추론
알고리즘 사고

Conversation example:

{
    "sample_id" : 19 ,
    "length" : 10 ,
    "temperature" : 0.95 ,
    "initial_utterance" : " I'm ready to take your question. " ,
    "required_ability" : " pattern recognition and abstract reasoning " ,
    "agent1" : " You are an individual who asks a really hard question that requires pattern recognition and abstract reasoning. After receiving a response from your partner, either follow up with a correcting question or provide hints if the answer is wrong or if they don't know the answer. Keep the conversation engaging and challenging. Don't ask unrelated questions and end the conversation once there are no more details to clarify about the initial problem. When the whole conversation is over end with " Goodbye " . " ,
    "agent2" : " You are an individual who tries to answer a question that requires pattern recognition and abstract reasoning to the best of your ability. If you don't know the answer with high confidence, admit it. Respond to follow-up questions or hints provided by your partner. Do not ask questions which are not related to the problem. " ,
    "utterances" : [
        [
            " agent1 " ,
            " Great! Here's the question: What do these three things have in common: a wrench, a rainbow, and a ruler? "
        ],
        [
            " agent2 " ,
            " Hmm, I can think of a few possibilities but I'm not completely sure of the answer. I think one possibility is that all three things have linear shapes. Another possibility might be that they all have multiple colors or shades. Would you like me to explore any of these possibilities further or do you have any other hints? "
        ],
        [
            " agent1 " ,
            " Good job on your initial thoughts! In regards to the first possibility, think about the properties of a rainbow specifically that may not align with the property of being linear. Do you have any idea how a rainbow is formed? "
        ],
        [
            " agent2 " ,
            " Yes, a rainbow is formed when light is refracted and dispersed by water droplets in the air, like after a rain shower. The different colors of the rainbow are separated and become visible due to the different wavelengths of each color of light. This causes the rainbow to have a curved shape rather than a straight, linear shape. So, perhaps linear shape is not the common property here. Would you like me to explore the other possibility I mentioned? "
        ],
        [
            " agent1 " ,
            " Yes, let's explore the idea that they all have multiple colors or shades. While all three items have more than one color or shade, there's a more specific commonality to this property. Do you have any other ideas or observations? "
        ],
        ...
    ]
}

우리는 다른 온도 구성 및 샘플 수에 대해 동일/유사한 프롬프트를 사용했습니다. 일부 질문 (또는 대화 조각)이 여러 번 나타날 수 있습니다. 우리는 이러한 복제물이 여전히 LLM의 창의적 능력을 평가하는 연구자들이 주제에 대한 다른 질문을 제기하거나 같은 질문이 주어진 다양한 답변을 제안하는 데 여전히 가치가 있다고 생각합니다.

브레인 스토밍

25 가지 문제에 대해 1011 발언과의 총 100 개의 대화 :

기후 변화
인공 지능
유전 공학
양자 컴퓨팅
신경 과학
나노 기술
재료 과학
재생 에너지
우주 생물학
입자 물리학
생체 의학 공학
로봇 공학
네트워크 과학
항공 우주 공학
해양학
생태학
지질학
역학
세포 생물학
생화학
면역학
이론 물리학
화학
인지 과학
심리학

Conversation example:

{
    "sample_id" : 1 ,
    "length" : 10 ,
    "temperature" : 0.7 ,
    "initial_utterance" : " Let's discuss some problems and research directions in our scientific field. " ,
    "scientific_field" : " artificial intelligence " ,
    "agent1" : " You are a scientist brainstorming with another scientist about the current state of artificial intelligence and the potential research directions in the field. Engage in a conversation. Follow up on the points of your colleague and discuss them rather than just listing more ideas. Talk concisely with short messages, and do not repeat yourself. When the whole conversation is over end with " Goodbye " . " ,
    "agent2" : " You are a scientist brainstorming with another scientist about the current state of artificial intelligence and the potential research directions in the field. Engage in a conversation. Follow up on the points of your colleague and discuss them rather than just listing more ideas. Talk concisely with short messages, and do not repeat yourself. When the whole conversation is over end with " Goodbye " . " ,
    "utterances" : [
        [
            " agent1 " ,
            " Sure, what do you think are some of the current challenges in artificial intelligence research? "
        ],
        [
            " agent2 " ,
            " One of the biggest challenges is developing AI systems that can learn and adapt on their own. Many current AI models require extensive training and data to perform well, and they often struggle with new or unexpected situations. Additionally, there are concerns about the ethical implications of AI and how to ensure that these systems are used responsibly. What are your thoughts on this? "
        ],
        [
            " agent1 " ,
            " I completely agree. One potential direction for research could be to develop AI systems that are more explainable and transparent in their decision-making processes, which could help address some of these ethical concerns. Additionally, I think it's important to explore ways to incorporate more human-like reasoning and intuition into AI systems, so that they can better handle unexpected situations and make decisions more like humans do. What do you think about these potential research directions? "
        ],
        [
            " agent2 " ,
            " I think those are both really interesting areas of research. Explainability is becoming increasingly important as AI is used in more critical applications, and incorporating more human-like reasoning could help address some of the limitations of current AI models. Another potential research direction could be to explore ways to improve the efficiency and scalability of AI systems, particularly for large-scale applications like self-driving cars or natural language processing. What are your thoughts on that? "
        ],
        [
            " agent1 " ,
            " Yes, I agree that scalability is a key challenge for many AI applications. Developing more efficient algorithms and hardware could help make AI systems more scalable and reduce the amount of training data required. Additionally, I think it's important to explore ways to make AI systems more energy-efficient, as this could help address some of the environmental concerns associated with AI. Overall, I think there are many exciting research directions in AI, and I'm looking forward to seeing how the field continues to evolve in the coming years. Goodbye! "
        ]
    ]
}

인용문

 @software { Cholakov_The_botbots_dataset_2023 ,
  author = { Cholakov, Radostin } ,
  doi = { 10.5281/zenodo.7778376 } ,
  month = { 3 } ,
  title = { {The botbots dataset} } ,
  url = { https://github.com/radi-cho/botbots } ,
  version = { 0.0.1 } ,
  year = { 2023 }
}