holodeck
v0.3.1

Holodeck is a high-fidelity simulator for reinforcement learning built on top of Unreal Engine 4.
pip install holodeck
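(requires Python >= 3.5)
For complete instructions (including Docker), see Installation.
Holodeck's interface is similar to OpenAI's Gym.
We try to provide a batteries-included approach so that you can start using Holodeck right away with minimal fiddling.
To demonstrate, here is a quick example using the DefaultWorlds package: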
import holodeck

# Load the environment. This environment contains a UAV in a city.
env = holodeck.make("UrbanCity-MaxDistance")

# You must call `.reset()` on a newly created environment before ticking/stepping it
env.reset()

# The UAV takes 3 torques and a thrust as a command.
command = [0, 0, 0, 100]

for i in range(30):
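    state, reward, terminal, info = env.step(command)

state: a dict mapping each sensor name to that sensor's value (a NumPy array).
reward: the reward received from the previous action.
terminal: indicates whether the current state is a terminal state.
info: contains additional environment-specific information.

If you want to access the data of a specific sensor, retrieve the corresponding value from the state dictionary by sensor name:

print(state["LocationSensor"])

Holodeck supports multi-agent environments.
Calls to step only provide an action for the main agent, and then tick the simulation.
act provides a persistent action for a specific agent and does not tick the simulation. After an action has been provided, tick advances the simulation. The action persists until another call to act provides a different one.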
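As a rough mental model of the step, act, and tick behavior described above, the sketch below uses a hypothetical ToyEnv class that is not part of Holodeck. It mimics only the control flow (act stores an action, tick re-applies stored actions, step does both for the main agent) and returns a tick counter instead of real sensor data.

```python
class ToyEnv:
    """Hypothetical stand-in (not Holodeck) that mimics only the act/tick/step flow."""

    def __init__(self, main_agent, agents):
        self.main_agent = main_agent
        # Last action supplied for each agent; None until act() is called.
        self.actions = {name: None for name in agents}
        self.ticks = 0
        self.log = []  # (tick number, snapshot of the actions applied on that tick)

    def act(self, agent_name, action):
        # Store the action; the simulation does not advance.
        self.actions[agent_name] = list(action)

    def tick(self):
        # Advance one tick, re-applying whatever action each agent last received.
        self.ticks += 1
        self.log.append((self.ticks, dict(self.actions)))
        return self.ticks

    def step(self, action):
        # Shorthand: set the main agent's action, then tick once.
        self.act(self.main_agent, action)
        return self.tick()


env = ToyEnv(main_agent="uav0", agents=["uav0", "nav0"])
env.act("uav0", [0, 0, 0, 100])
env.act("nav0", [0, 0, 0])
for _ in range(3):
    env.tick()  # both stored actions are repeated on every tick

env.act("nav0", [1, 0, 0])  # replaces nav0's persistent action
env.step([0, 0, 0, 50])     # updates uav0's action and ticks once
```

After this runs, env.log holds four entries: the first three ticks reuse the actions given before the loop, and the fourth reflects the replacements.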
import holodeck
import numpy as np

env = holodeck.make("CyberPunkCity-Follow")
env.reset()

# Provide an action for each agent
env.act('uav0', np.array([0, 0, 0, 100]))
env.act('nav0', np.array([0, 0, 0]))

# Advance the simulation
for i in range(300):
    # The action provided above is repeated
    states = env.tick()

You can access the reward, terminal, and location for a multi-agent environment as follows:
task = states["uav0"]["FollowTask"]

reward = task[0]
terminal = task[1]
location = states["uav0"]["LocationSensor"]

(uav0 comes from the scenario configuration file)
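When several agents report a task, the same lookups can be wrapped in a small helper. This is a minimal sketch, assuming states is a dictionary keyed by agent name and then by sensor name, with the task entry holding the reward at index 0 and the terminal flag at index 1, as in the snippet above. The summarize_agents function and the sample values are hypothetical and not part of Holodeck.

```python
def summarize_agents(states, task_name, location_sensor="LocationSensor"):
    """Return {agent: (reward, terminal, location)} for agents that report the task."""
    summary = {}
    for agent_name, sensors in states.items():
        # Skip entries that are not per-agent sensor dictionaries or lack the task.
        if not isinstance(sensors, dict) or task_name not in sensors:
            continue
        task = sensors[task_name]
        reward, terminal = task[0], bool(task[1])
        # location is None if the agent has no location sensor.
        summary[agent_name] = (reward, terminal, sensors.get(location_sensor))
    return summary


# Hand-written sample shaped like the `states` dictionary; the values are made up.
sample_states = {
    "uav0": {"FollowTask": [0.5, 0], "LocationSensor": [1.0, 2.0, 3.0]},
    "nav0": {"LocationSensor": [4.0, 5.0, 0.0]},
}
summary = summarize_agents(sample_states, "FollowTask")
```

With a real environment, the dictionary returned by env.tick() would take the place of sample_states.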
Holodeck can run headless with GPU-accelerated rendering. See Using Holodeck Headless.
@misc{HolodeckPCCL,
Author = {Joshua Greaves and Max Robinson and Nick Walton and Mitchell Mortensen and Robert Pottorff and Connor Christopherson and Derek Hancock and Jayden Milne and David Wingate},
Title = {Holodeck: A High Fidelity Simulator},
Year = {2018},
}
Holodeck is a project of BYU's Perception, Cognition and Control Lab (https://pcc.cs.byu.edu/).