
https://db-benchmarks.com aims to make database and search engine benchmarks:
⚖️ Fair and transparent - it should be clear why a given database / search engine shows a given level of performance
High quality - controlling the coefficient of variation ensures the results stay the same whether you run the queries today, tomorrow or next week
Easily reproducible - anyone can reproduce any test on their own hardware
Easy to understand - the charts are as simple as possible
➕ Extensible - a pluggable architecture allows adding more databases for testing
And all of it stays 100% open source!
This repository provides the test framework that does the job.
Many database benchmarks are not objective. Others don't do enough to ensure the accuracy and stability of their results, which in some cases defeats the whole purpose of benchmarking. Some examples:
https://imply.io/blog/druid-nails-cost-efficiency-challenge-against-clickhouse-and-rockset/:
"In fact, we wanted to benchmark on the same hardware (m5.8xlarge), but the only pre-production configuration we had for m5.8xlarge was actually m5d.8xlarge ... Instead, we ran on c5.9xlarge instances"
Bad news, guys: when you run benchmarks on different hardware, the least you can't do is claim figures like "106.76%" and "103.13%". Even when you test on the same bare-metal server, it's hard to get the coefficient of variation below 5%, so a 3% difference caused by different servers can easily go unnoticed. Given all that, how can you be sure the final conclusions are correct?
https://tech.marksblogg.com/benchmarks.html
Mark has done a great job testing many different databases and search engines with the taxi rides dataset. But since the tests were run on different hardware, the numbers in the results table are not really comparable. You always need to keep that in mind when evaluating the results in the table.
https://clickhouse.com/benchmark/dbms/
When you run each query only 3 times, you are very likely to get a high coefficient of variation for each of them, which means that if you rerun the test a minute later you may get results that differ by 20%. And how can one reproduce the tests on their own hardware? Unfortunately, I couldn't find a way to do that.
We believe that a fair database benchmark should follow a few key principles:
✅ Test different databases on exactly the same hardware
Otherwise, when the difference is small, you can't confirm it exceeds the margin of error.
✅ Purge the full OS cache before each test
Otherwise you can't test cold queries.
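On Linux, purging the full OS cache can be done via /proc/sys/vm/drop_caches. The snippet below is a generic sketch of that approach (an assumption, not necessarily this framework's exact code) and requires root:

```shell
# Drop the OS page cache, dentries and inodes before a cold-query run.
# Generic Linux approach; an assumption, not this framework's exact implementation.
if [ "$(id -u)" -eq 0 ] && [ -w /proc/sys/vm/drop_caches ]; then
    sync                                # flush dirty pages to disk first
    echo 3 > /proc/sys/vm/drop_caches   # 3 = page cache + dentries + inodes
    msg="caches dropped"
else
    msg="skipped: requires root on Linux"
fi
echo "$msg"
```

Without this step, the second and later runs of a query read everything from the page cache and tell you nothing about cold-query performance.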
✅ The databases under test should have all their internal caches disabled
Otherwise you will be measuring cache performance.
✅ It's good to measure cold runs as well. This is especially important for analytical queries, where cold queries happen all the time
Otherwise you completely hide how well the database handles I/O.
✅ Nothing else should be running during the tests
Otherwise your test results may be very unstable.
✅ You need to restart the database before each query
Otherwise previous queries can still affect the current query's response time, even though the internal caches have been purged.
✅ You need to wait until the database fully warms up after starting
Otherwise you may end up competing with the database's I/O warm-up activity, which can heavily skew your test results.
✅ It's good to provide the coefficient of variation, so everyone understands how stable your results are and can make sure it's low enough
The coefficient of variation is a very good metric that shows how stable your test results are. If it's higher than N%, you can't say that one database is N% faster than another.
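For illustration, the coefficient of variation (standard deviation divided by the mean) can be computed from a set of response times like this; the timings below are made-up sample values:

```shell
# Coefficient of variation for a set of hypothetical query timings (ms)
timings="100 102 98 101 99"
cv=$(echo "$timings" | tr ' ' '\n' | awk '
    { sum += $1; sumsq += $1 * $1; n++ }
    END {
        mean = sum / n
        sd = sqrt(sumsq / n - mean * mean)   # population standard deviation
        printf "%.1f", 100 * sd / mean       # CV as a percentage
    }')
echo "CV = ${cv}%"   # prints "CV = 1.4%": low, so these runs are stable
```

A CV of around 1-2% like this means the measured difference between engines is likely real; a CV of 20% means it likely isn't.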
✅ It's better to test at a fixed CPU frequency
Otherwise, if you are using the "ondemand" CPU governor (usually the default), a 500 ms response time can easily turn into 1000+ ms.
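On Linux, fixing the CPU frequency usually means switching the cpufreq governor from "ondemand" to "performance". A generic sketch (requires root and cpufreq support; not tied to this framework):

```shell
# Pin every core to the "performance" governor so the frequency stays fixed.
set -- /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
if [ -w "$1" ]; then
    for f in "$@"; do
        echo performance > "$f"
    done
    status="governor set to performance on $# core(s)"
else
    status="skipped: cpufreq not writable (need root, or no cpufreq support)"
fi
echo "$status"
```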
✅ It's better to test on SSD/NVMe rather than HDD
Otherwise, depending on where a file sits on the HDD, you can get lower or higher I/O performance (we tested this), which can make at least your cold-query results wrong.
The test framework used behind https://db-benchmarks.com is fully open source (AGPLv3 license) and can be found at https://github.com/db-benchmarks/db-benchmarks. Here is what it does:
- In --limited mode it benchmarks the databases' algorithmic capabilities by emulating a single physical CPU core
- --test saves test results to a file
- --save uploads test results from a file to a remote database (which is not one of the databases under test)
- It runs select count(*) and then select * limit 1 to make sure the data collections are similar across the different databases
- It constrains the databases' resources via cpuset and mem
Before deploying the test framework, make sure you have the following:
PHP 8 with the curl and mysqli modules, docker, docker-compose, sensors, dstat, and cgroups v2.
Installation:
git clone git@github.com:db-benchmarks/db-benchmarks.git
cd db-benchmarks
Copy .env.example to .env. In .env, update mem and cpuset with the default amount of memory (in megabytes) and the CPUs the test framework may use for auxiliary tasks (data loading, getting info about the databases), and ES_JAVA_OPTS (normally the amount of memory the Docker machine is given).
First, you need to prepare a test:
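As an aside, here is what a hypothetical .env might look like for a machine that can spare 32 GB of RAM and two CPU cores for auxiliary tasks (the values are purely illustrative, not recommendations):

```shell
# .env - resource limits for the test framework's auxiliary tasks (illustrative values)
mem=32768                        # memory limit in megabytes
cpuset=0,1                       # CPU cores the framework may use
ES_JAVA_OPTS="-Xms16g -Xmx16g"   # JVM heap for Elasticsearch (illustrative)
```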
Go to the directory of a particular test (all tests must be in directories under ./tests), e.g. "hn_small":
cd tests/hn_small
Run the init script:
./init
This will:
Then run ../../test (it's in the root folder of the project) to see the options:
To run a particular test with specified engines, memory constraints and number of attempts and save the results locally:
/perf/test_engines/test
--test=test_name
--engines={engine1:type,...,engineN}
--memory=1024,2048,...,1048576 - memory constraints to test with, MB
[--times=N] - max number of times to test each query, 100 by default
[--dir=path] - if path is omitted - save to the 'results' directory in the same dir where this file is located
[--probe_timeout=N] - how long to wait for an initial connection, 30 seconds by default
[--start_timeout=N] - how long to wait for a db/engine to start, 120 seconds by default
[--warmup_timeout=N] - how long to wait for a db/engine to warm up after start, 300 seconds by default
[--query_timeout=N] - max time a query can run, 900 seconds by default
[--info_timeout=N] - how long to wait for getting info from a db/engine
[--limited] - emulate one physical CPU core
[--queries=/path/to/queries] - queries to test, ./tests/<test name>/test_queries by default
To save to db all results it finds by path
/perf/test_engines/test
--save=path/to/file/or/dir, all files in the dir recursively will be saved
--host=HOSTNAME
--port=PORT
--username=USERNAME
--password=PASSWORD
--rm - remove after successful saving to database
--skip_calm - avoid waiting until discs become calm
----------------------
Environment variables:
All the options can be specified as environment variables, but you can't use the same option as an environment variable and as a command line argument at the same time.
And run the test:
../../test --test=hn_small --engines=elasticsearch,clickhouse --memory=16384
If you run tests in local mode (for development) and don't care about test inaccuracy, you can skip waiting for the disks to calm down and the CPU checks by setting the --skip_inaccuracy flag:
../../test --test=hn_small --engines=elasticsearch,clickhouse --memory=16384 --skip_inaccuracy
Now you have the test results in ./results/ (in the root of the repository), e.g.:
# ls results/
220401_054753
Now you can upload the results to a database for further visualization. The visualization tool used at https://db-benchmarks.com/ is also open source and can be found at https://github.com/db-benchmarks/ui.
Here is how you can save the results:
username=login password=pass host=db.db-benchmarks.com port=443 save=./results ./test
or
./test --username=login --password=pass --host=db.db-benchmarks.com --port=443 --save=./results
We are eager to see your test results. If you believe they should be added to https://db-benchmarks.com, please submit them to this repository.
Please remember the following:
./results. We will then:
Project structure:
.
|-core <- Core directory, contains base files required for tests.
| |-engine.php <- Abstract class Engine. Manages test execution, result saving, and parsing of test attributes.
| |-helpers.php <- Helper file with logging functions, attribute parsing, exit functions, etc.
|-misc <- Miscellaneous directory, intended for storing files useful during the initialization step.
| |-func.sh <- Meilisearch initialization helper script.
|-plugins <- Plugins directory: if you want to extend the framework by adding another database or search engine for testing, place it here.
| |-elasticsearch.php <- Elasticsearch plugin.
| |-manticoresearch.php <- Manticore Search plugin.
| |-clickhouse.php <- ClickHouse plugin.
| |-mysql.php <- MySQL plugin.
| |-meilisearch.php <- Meilisearch plugin.
| |-mysql_percona.php <- MySQL (Percona) plugin.
| |-postgres.php <- Postgres plugin.
| |-typesense.php <- Typesense plugin.
|-results <- Test results directory. The results shown on https://db-benchmarks.com/ are found here. You can also use `./test --save` to visualize them locally.
|-tests <- Directory containing test suites.
| |-hn <- Hackernews test suite.
| | |-clickhouse <- Directory for "Hackernews test -> ClickHouse".
| | | |-inflate_hook <- Engine initialization script. Handles data ingestion into the database.
| | | |-post_hook <- Engine verification script. Ensures the correct number of documents have been ingested and verifies data consistency.
| | | |-pre_hook <- Engine pre-check script. Determines if tables need to be rebuilt, starts the engine, and ensures availability.
| | |-data <- Prepared data collection for the tests.
| | |-elasticsearch <- Directory for "Hackernews test -> Elasticsearch".
| | | |-logstash_tuned <- Logstash configuration directory for the "tuned" type.
| | | | |-logstash.conf
| | | | |-template.json
| | | |-elasticsearch_tuned.yml
| | | |-inflate_hook <- Engine initialization script for data ingestion.
| | | |-post_hook <- Verifies document count and data consistency.
| | | |-pre_hook <- Pre-check script for table rebuilding and engine initialization.
| | |-manticoresearch <- Directory for testing Manticore Search in the Hackernews test suite.
| | | |-generate_manticore_config.php <- Script for dynamically generating Manticore Search configuration.
| | | |-inflate_hook <- Data ingestion script.
| | | |-post_hook <- Verifies document count and consistency.
| | | |-pre_hook <- Pre-check for table rebuilds and engine availability.
| | |-meilisearch <- Directory for "Hackernews test -> Meilisearch".
| | | |-inflate_hook <- Data ingestion script.
| | | |-post_hook <- Ensures correct document count and data consistency.
| | | |-pre_hook <- Pre-check for table rebuilds and engine start.
| | |-mysql <- Directory for "Hackernews test -> MySQL".
| | | |-inflate_hook <- Data ingestion script.
| | | |-post_hook <- Ensures document count and consistency.
| | | |-pre_hook <- Pre-check for table rebuilds and engine start.
| | |-postgres <- Directory for "Hackernews test -> Postgres".
| | | |-inflate_hook <- Data ingestion script.
| | | |-post_hook <- Verifies document count and data consistency.
| | | |-pre_hook <- Pre-check for table rebuilds and engine availability.
| | |-prepare_csv <- Prepares the data collection, handled in `./tests/hn/init`.
| | |-description <- Test description, included in test results and used during result visualization.
| | |-init <- Main initialization script for the test.
| | |-test_info_queries <- Contains queries to retrieve information about the data collection.
| | |-test_queries <- Contains all test queries for the current test.
| |-taxi <- Taxi rides test suite, with a similar structure.
| |-hn_small <- Test for a smaller, non-multiplied Hackernews dataset, similar structure.
| |-logs10m <- Test for Nginx logs, similar structure.
|-.env.example <- Example environment file. Update the "mem" and "cpuset" values as needed.
|-LICENSE <- License file.
|-NOTICE <- Notice file.
|-README.md <- You're reading this file.
|-docker-compose.yml <- Docker Compose configuration for starting and stopping databases and search engines.
|-important_tests.sh
|-init <- Initialization script. Handles data ingestion and tracks the time taken.
|-logo.svg <- Logo file.
|-test <- The executable file to run and save test results.
You can also start a particular database/search engine manually, e.g.:
test=logs10m cpuset="0,1" mem=32768 suffix=_tuned docker-compose up elasticsearch
will:
- suffix=_tuned: map ./tests/logs10m/es/data/idx_tuned as the data directory
- mem=32768: limit the RAM to 32 GB; if not specified, the default from the .env file is used
- cpuset="0,1": run the Elasticsearch container only on CPU cores 0 and 1 (which is likely the first whole physical CPU)
To stop it, just press Ctrl-C.
Want to participate in the project? You can contribute:
All of these are waiting for your contributions!