holodeck
v0.3.1

Holodeck is a high-fidelity simulator for reinforcement learning built on top of Unreal Engine 4.
pip install holodeck
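(Requires Python >= 3.5)
For complete instructions (including Docker), see Installation.
Holodeck's interface is similar to OpenAI's Gym.
We try to provide a batteries-included approach that lets you use Holodeck right away, with minimal fiddling required.
To demonstrate, here is a quick example using the DefaultWorlds package: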
import holodeck

# Load the environment. This environment contains a UAV in a city.
env = holodeck.make("UrbanCity-MaxDistance")

# You must call `.reset()` on a newly created environment before ticking/stepping it
env.reset()

# The UAV takes 3 torques and a thrust as a command.
command = [0, 0, 0, 100]
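for i in range(30):
    state, reward, terminal, info = env.step(command)

state: dict of sensor name to the sensor's value (nparray).
reward: the reward received from the previous action.
terminal: indicates whether the current state is a terminal state.
info: contains additional environment-specific information.

If you want to access the data of a specific sensor, import sensors and retrieve the correct value from the state dictionary:

print(state["LocationSensor"])

Holodeck supports multi-agent environments.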
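Calls to step only provide an action for the main agent, and then tick the simulation.
act provides a persistent action for a specific agent and does not tick the simulation. After an action has been provided, tick will advance the simulation forward. The action persists until another call to act provides a different one.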
import holodeck
import numpy as np

env = holodeck.make("CyberPunkCity-Follow")
env.reset()

# Provide an action for each agent
env.act('uav0', np.array([0, 0, 0, 100]))
env.act('nav0', np.array([0, 0, 0]))

# Advance the simulation
for i in range(300):
    # The action provided above is repeated
    states = env.tick()

You can access the reward, terminal, and location for a multi-agent environment as follows:
task = states["uav0"]["FollowTask"]
reward = task[0]
terminal = task[1]
location = states["uav0"]["LocationSensor"]

(uav0 comes from the scenario configuration file)
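
The act/tick pattern and the state lookups above can be combined into one loop that stops once the task reports a terminal state. The following is a minimal sketch, not part of Holodeck: follow_until_terminal is a hypothetical helper name, and FakeFollowEnv is a stand-in environment invented here so the example runs without the simulator. It assumes only what is described above: act(agent_name, action) sets a persistent action, tick() returns a dict of per-agent states, and the task entry holds the reward at index 0 and the terminal flag at index 1.

```python
def follow_until_terminal(env, actions, agent_name, task_name, max_ticks):
    """Tick a multi-agent environment until the agent's task reports terminal.

    Hypothetical helper, not part of Holodeck. Assumes `env.act` and
    `env.tick` behave as described above, and that
    states[agent_name][task_name] holds [reward, terminal].
    """
    for name, action in actions.items():
        env.act(name, action)  # persistent; does not advance the simulation

    total_reward = 0.0
    location = None
    ticks = 0
    for ticks in range(1, max_ticks + 1):
        states = env.tick()  # the actions set above are repeated each tick
        task = states[agent_name][task_name]
        total_reward += task[0]
        location = states[agent_name]["LocationSensor"]
        if task[1]:
            break
    return total_reward, location, ticks


class FakeFollowEnv:
    """Stand-in for a Holodeck environment (an assumption for illustration)."""

    def __init__(self, terminal_at):
        self.terminal_at = terminal_at
        self.actions = {}
        self.count = 0

    def act(self, agent_name, action):
        self.actions[agent_name] = action

    def tick(self):
        self.count += 1
        done = self.count >= self.terminal_at
        return {
            "uav0": {
                "FollowTask": [1.0, done],
                "LocationSensor": [float(self.count), 0.0, 0.0],
            }
        }


fake_env = FakeFollowEnv(terminal_at=5)
total, last_location, ticks_used = follow_until_terminal(
    fake_env,
    {"uav0": [0, 0, 0, 100], "nav0": [0, 0, 0]},
    agent_name="uav0",
    task_name="FollowTask",
    max_ticks=300,
)
print(total, last_location, ticks_used)
```

With a real environment, the same call would take the object returned by holodeck.make("CyberPunkCity-Follow") after env.reset(), in place of the stand-in. The max_ticks cap keeps the loop bounded if the task never reports a terminal state.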
Holodeck can run headless with GPU-accelerated rendering. See Using Holodeck Headless.
@misc{HolodeckPCCL,
Author = {Joshua Greaves and Max Robinson and Nick Walton and Mitchell Mortensen and Robert Pottorff and Connor Christopherson and Derek Hancock and Jayden Milne and David Wingate},
Title = {Holodeck: A High Fidelity Simulator},
Year = {2018},
}
Holodeck is a project of BYU's Perception, Cognition and Control Lab (https://pcc.cs.byu.edu/).