warthunder replay parser
1.0.0
Unfortunately, War Thunder replay files do not seem to contain any easily readable information (unlike e.g. WoT, where replays include some JSON). This is a very, very basic attempt at parsing War Thunder replay files (.wrpl). There is WT-Tools, although it does not seem to work with (multipart) server replays.
Three scripts are available: replays_scraper.py, download_replay.py, and parse_replay.py.
⚠ Use at your own risk: scraping (protected) web pages may violate the ToS/law in some countries.
replays_scraper.py can be used to scrape replays from the https://warthunder.com/en/tournament/replay/ page. Invoke it like this:
python replays_scraper.py <num_pages>
where <num_pages> is the number of pages to scrape (there are usually 25 replays per page). It prints a JSON object with all replays found.
Since the page is login-protected, the script expects an auth_cookie.json file containing the login cookie:
auth_cookie.json:
{
"identity_sid" : " ... "
}
where " ... " is the value of the identity_sid cookie (you can obtain it by logging in to warthunder.com and reading the cookie in your browser).
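As a sketch, the cookie file can be loaded and passed along with the page request like this (assuming the standard requests library; this mirrors what the scraper presumably does internally, not its actual code):

import json
import requests

# load the identity_sid cookie from auth_cookie.json
with open("auth_cookie.json") as f:
    cookies = json.load(f)

# fetch the (login-protected) replay listing page;
# the cookie dict authenticates the session
resp = requests.get("https://warthunder.com/en/tournament/replay/", cookies=cookies)
resp.raise_for_status()
html = resp.text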
download_replay.py downloads a replay from https://warthunder.com/en/tournament/replay/:
python download_replay.py <replay_id>
where <replay_id> is the replay ID (64 bits, in decimal or hexadecimal notation). This stores the replay files in a folder named after the replay ID in hexadecimal notation.
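The folder name is the zero-padded 16-digit hexadecimal form of the 64-bit ID (e.g. 005569aa001501ca in the example below). A minimal sketch of that conversion (the helper name is hypothetical, not the script's internal _get_hex_id):

def to_hex_id(replay_id: str) -> str:
    # accepts decimal ("1234...") or 0x-prefixed hexadecimal ("0x5569aa001501ca") input
    value = int(replay_id, 0)
    # zero-pad to 16 hex digits (64 bits), e.g. "005569aa001501ca"
    return format(value, "016x")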
parse_replay.py parses the replay in a folder:
python parse_replay.py <replay_folder>
It expects the replay files to be named <replay_folder>/0000.wrpl, 0001.wrpl, etc.
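A parser can collect the multipart files with a sorted glob, since the four-digit numeric prefixes sort lexicographically in part order (an illustration of the naming scheme, not the script's actual code):

import glob
import os

def list_replay_parts(replay_folder: str) -> list[str]:
    # multipart server replays are split into 0000.wrpl, 0001.wrpl, ...
    return sorted(glob.glob(os.path.join(replay_folder, "*.wrpl")))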
The output will be in JSON form:
parsing replay in /path/to/replay/005569aa001501ca
parsing /path/to/replay/005569aa001501ca/0000.wrpl
parsing /path/to/replay/005569aa001501ca/0001.wrpl
parsing /path/to/replay/005569aa001501ca/0002.wrpl
parsing /path/to/replay/005569aa001501ca/0003.wrpl
parsing /path/to/replay/005569aa001501ca/0004.wrpl
parsing /path/to/replay/005569aa001501ca/0005.wrpl
parsing /path/to/replay/005569aa001501ca/0006.wrpl
parsing /path/to/replay/005569aa001501ca/0007.wrpl
parsing /path/to/replay/005569aa001501ca/0008.wrpl
parsing /path/to/replay/005569aa001501ca/0009.wrpl
{
"level": "levels/avg_normandy.bin",
"mission_file": "gamedata/missions/cta/tanks/normandy/normandy_dom.blk",
"mission_name": "normandy_Dom",
"time_of_day": "day",
"weather": "hazy",
"time_of_battle_ts": 1641217514,
"time_of_battle": "2022-01-03 14:45:14",
"num_players": 21,
"players": [
{
"player_id": 34,
"vehicles": [
"us_m1a1_abrams",
"us_m1a1_hc_abrams"
]
},
{
"player_id": 35,
"vehicles": [
"us_m1_ip_abrams",
"us_hstv_l"
]
},
...
]
}
You can also use the scripts as modules:
import replays_scraper
import download_replay
import parse_replay

# set the cookies
cookies = {"identity_sid": "secret_key"}

# download the html
pages = replays_scraper.download_pages(1, cookies)

# scrape replay data from html
replays = []
for page in pages:
    replays += replays_scraper.parse_page(page)

# download the files of the last replay
download_replay.downloadReplay(replays[-1]["id"])

# get the hexadecimal id (= folder name)
replay_id_hex = download_replay._get_hex_id(replays[-1]["id"])

# parse the replay
print(parse_replay.parse_replay(replay_id_hex))