neoplanner下载neoplanner源代码下载

neoplanner

其他源码

1.0.0

下载

Neoplanner

抽象的

该仓库包含了一个名为“ Neoplanner”的顺序规划代理的实现。该计划者适用于具有较大状态空间和动作空间的基于文本的环境。它通过查询与基础LLM的疑问协同结合了两种状态空间搜索，以获得最佳的行动计划。奖励信号定量用于驱动搜索。通过最大化状态值的上限置信度来维持探索和剥削的平衡。在需要随机探索的地方，LLM被查询以生成动作计划。从每个试验中学习以文本格式存储为实体关系。这些用于LLM的将来查询以持续改进。科学世界环境中的实验揭示了与当前最佳方法相比提高了124％，这是在多个任务中获得的平均奖励。以下是架构。

开始

首先，克隆回购并导航到Neoplanner目录并安装要求

git clone https://github.com/swarna-kpaul/neoplanner
cd neoplanner
python3 -m pip install -r requirements.txt

然后，您需要修改config/keys.py文件以更新OpenAiaPikey 。您可以通过注册到OpenAI门户来获取API密钥。第一次用户可以免费获得5美元的信用额。您可以从此URL获取API键

此后可以进口包装

 from solver import neoplanner

初始化求解器对象

 # task is the identifier of tasks as specified in 
# stmloadfile is the name of the file (with full path) that contains saved state. The state will be loaded initially. default value is None
# stmstoragefile is the name of the file (with full path) whare intermediate states can be saved. default value is None
# beliefstorefile is the name of the file (with full path) whare intermediate learnings can be saved. default value is None
# beliefloadfile is the name of the file (with full path) that contains intermediate learnings. The learnings will be loaded initially. default value is None
# sigma is exploration probability constant. Increasing its value would increase random exploration by the the LLM.
solverobj = neoplanner ( task = "2-1" , stmloadfile =  None , stmstoragefile = None , beliefstorefile = None , beliefloadfile = None , sigma = 0.3 )

运行求解器。

 env = solverobj . train ()
######## get actionplan from statespace graph
additionalinstructions , actionplan , _ , _ , _ = env . getinstructions ()

培训将继续进行，直到达到目标。您可以中断培训过程之间。在这种情况下，请确保您提供stmstoragefile和ChielStoreFile，以保存中间状态和信念。

您可以加载stmstoragefile并查询env对象以从状态空间图中获取操作计划。

 import pickle
from solver import scienv
env = scienv ( "2-1" )
stmstoragefile = < file name with full path >
with open ( stmstoragefile , 'rb' ) as f :
  rootnodeid , invalidnodeid , DEFAULTVALUE , statespace , totaltrials , actiontrace , environment = pickle . load ( f )
env . model . rootnodeid = rootnodeid
env . model . invalidnodeid = invalidnodeid
env . model . DEFAULTVALUE = DEFAULTVALUE
env . model . statespace = statespace
env . model . totaltrials = totaltrials
env . environment = environment
env . reset ()

additionalinstructions , actionplan , _ , _ , _ = env . getinstructions ()

预处理文件夹包含所有训练有素的状态空间和学习7个任务的文件。您可以使用上述代码来查看任务的求解操作计划，通过使用适当的文件名设置stmstoragefile

通过运行以下代码可以看到学习

 import pickle
beliefloadfile = < belief load file name >
with open ( beliefloadfile , 'rb' ) as f :
    beliefaxioms , totalexplore = pickle . load ( f )
print ( beliefaxioms )

引用

 @misc{paul2023sequential,
      title={Sequential Planning in Large Partially Observable Environments guided by LLMs}, 
      author={Swarna Kamal Paul},
      year={2023},
      eprint={2312.07368},
      archivePrefix={arXiv},
      primaryClass={cs.AI}
}

展开

附加信息