
https://db-benchmarks.com is dedicated to making database and search engine benchmarks:

- ⚖️ Fair and transparent: it should be clear why a given database / search engine shows this or that performance
- High quality: controlling the coefficient of variation ensures the results stay the same whether you run a query today, tomorrow, or next week
- Easy to reproduce: anyone can reproduce any test on their own hardware
- Easy to understand: the graphs are as simple as possible
- ➕ Extensible: the pluggable architecture makes it easy to add more databases to the tests

And all of it remains 100% open source!
This repository provides a test framework that does the job.

Many database benchmarks are not objective. Others don't do enough to guarantee the accuracy and stability of the results, which in some cases defeats the whole point of benchmarking. A few examples:
https://imply.io/blog/druid-nails-cost-efficiency-challenge-against-clickhouse-and-rockset/ :

"Actually we wanted to benchmark on the same hardware (m5.8xlarge), but the only pre-production configuration we had for m5.8xlarge was actually m5d.8xlarge ... instead we ran on c5.9xlarge instances"

Bad news, folks: when you run a benchmark on different hardware, at the very least you can't claim differences like "106.76%" and "103.13%". Even when you test on the same bare-metal server, it's difficult to get the coefficient of variation below 5%, so a 3% difference between different servers can easily drown in the noise. Given all that, how can you be sure the final conclusions are correct?
https://tech.marksblogg.com/benchmarks.html
Mark has done a great job testing the taxi rides dataset on many different databases and search engines. But since the tests were run on different hardware, the numbers in the results table are not really comparable. You always need to keep this in mind when evaluating the results in that table.
https://clickhouse.com/benchmark/dbms/
When you run each query only 3 times, you are likely to get a high coefficient of variation for each query, meaning that if you redo the test a minute later you may get results that differ by 20%. And how can one reproduce the tests on their own hardware? Unfortunately, I couldn't find out how to do that.
We believe that a fair database benchmark should follow a few key principles:
✅ Test different databases on exactly the same hardware
Otherwise, when the difference is small, you can't tell it apart from the margin of error.
✅ Flush the OS cache completely before each test
Otherwise you can't test cold queries.
✅ The database under test should have all its internal caches disabled
Otherwise you'll be measuring cache performance.
✅ It's better to also measure cold runs. This is especially important for analytical queries, where cold queries happen all the time
Otherwise you completely hide how the database works with I/O.
✅ Nothing else should be running during testing
Otherwise your test results can be very unstable.
✅ You need to restart the database before each query
Otherwise previous queries can affect the response time of the current query, even though the internal caches were flushed.
✅ You need to wait until the database fully warms up after starting
Otherwise you may end up competing with the database's warm-up I/O, which can heavily distort your test results.
✅ It's better if you provide a coefficient of variation, so everyone understands how stable your results are and can make sure it's low enough
The coefficient of variation is a very good metric that shows how stable your test results are. If it's higher than N%, you can't claim that one database is faster than another.
✅ It's better to test at a fixed CPU frequency
Otherwise, with the "ondemand" CPU governor (often the default), a 500 ms response time can easily turn into 1000+ ms.
✅ It's better to test on SSD/NVMe rather than HDD
Otherwise, depending on where a file is located on the HDD, you can get lower or higher I/O performance (we tested this), which can make at least your cold-query results wrong.
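The cache-flushing, CPU-governor, and coefficient-of-variation points above can be sketched in shell. This is an illustration only, not part of the framework: the sysfs paths assume Linux, the privileged steps are skipped when the paths aren't writable, and the response times are made-up sample values.

```shell
#!/bin/sh
# Illustration of the benchmarking principles above.

# Flush the OS page cache, dentries and inodes before a cold run (needs root):
[ -w /proc/sys/vm/drop_caches ] && { sync; echo 3 > /proc/sys/vm/drop_caches; }

# Pin the CPU frequency so the "ondemand" governor can't skew timings:
for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
  [ -w "$g" ] && echo performance > "$g"
done

# Coefficient of variation (stddev / mean, in %) over made-up response times (ms):
printf '102\n98\n105\n99\n101\n' | awk '
  { sum += $1; sumsq += $1 * $1; n++ }
  END {
    mean = sum / n
    sd = sqrt(sumsq / n - mean * mean)
    printf "cv = %.2f%%\n", 100 * sd / mean
  }'
```

For the sample timings this prints `cv = 2.43%`; a CV of just a few percent is what makes a claim like "database A is faster than database B" meaningful at all.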
The test framework used behind https://db-benchmarks.com is fully open source (AGPLv3 license) and can be found at https://github.com/db-benchmarks/db-benchmarks. Here's what it does:

- in `--limited` mode it benchmarks the algorithmic capabilities of databases
- `--test` saves test results to a file
- `--save` loads test results from files into a remote database (which is not one of the databases under test)
- it runs `select count(*)` and then `select * limit 1` to make sure the data collections are similar across the databases
- it constrains resources via `cpuset` and `mem`

Before deploying the test framework, make sure you have:

- PHP 8 with the `curl` and `mysqli` modules
- docker
- docker-compose
- sensors
- dstat
- cgroups v2

Installation:
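A quick way to verify the prerequisites before installing is a small shell script (a hypothetical helper, not part of the repository; it only reports what is missing and never aborts):

```shell
#!/bin/sh
# Hypothetical helper: report which prerequisites are present on this machine.
for tool in php docker docker-compose sensors dstat; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "ok:      $tool"
  else
    echo "missing: $tool"
  fi
done

# The curl and mysqli PHP modules show up in `php -m`:
if command -v php >/dev/null 2>&1; then
  php -m | grep -iE '^(curl|mysqli)$' || echo "missing: php curl/mysqli module(s)"
fi

# cgroups v2 is registered as "cgroup2" on modern Linux kernels:
if grep -qs cgroup2 /proc/filesystems; then
  echo "ok:      cgroups v2"
else
  echo "missing: cgroups v2"
fi
```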
git clone [email protected]:db-benchmarks/db-benchmarks.git
cd db-benchmarks

Copy `.env.example` to `.env`. In `.env`, update `mem` and `cpuset` with the defaults for the amount of memory (in megabytes) and the CPUs the test framework may use for auxiliary tasks (loading data, getting info about the databases); `mem` is usually the amount of memory allocated to your Docker machine. Update `ES_JAVA_OPTS` as well.

First, you need to prepare a test:
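For example, `.env` might contain values like these (the numbers here are hypothetical; pick ones that match your machine):

```shell
# Memory (in megabytes) available to the framework and containers;
# usually the amount of memory your Docker machine has allocated:
mem=32768
# CPU cores to use:
cpuset="0,1"
# Java heap options passed to Elasticsearch (hypothetical value):
ES_JAVA_OPTS="-Xms16g -Xmx16g"
```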
Go to the directory of a particular test (all tests must be in directories inside `./tests`), e.g. "hn_small":

cd tests/hn_small

Run the init script:

./init

This prepares the data collection for the test.

Then run `../../test` (it's in the root folder of the project) to see the options:
To run a particular test with specified engines, memory constraints and number of attempts and save the results locally:
/perf/test_engines/test
--test=test_name
--engines={engine1:type,...,engineN}
--memory=1024,2048,...,1048576 - memory constraints to test with, MB
[--times=N] - max number of times to test each query, 100 by default
[--dir=path] - if path is omitted, save to the directory 'results' in the same dir where this file is located
[--probe_timeout=N] - how long to wait for an initial connection, 30 seconds by default
[--start_timeout=N] - how long to wait for a db/engine to start, 120 seconds by default
[--warmup_timeout=N] - how long to wait for a db/engine to warmup after start, 300 seconds by default
[--query_timeout=N] - max time a query can run, 900 seconds by default
[--info_timeout=N] - how long to wait for getting info from a db/engine
[--limited] - emulate one physical CPU core
[--queries=/path/to/queries] - queries to test, ./tests/<test name>/test_queries by default
To save to db all results it finds by path
/perf/test_engines/test
--save=path/to/file/or/dir, all files in the dir recursively will be saved
--host=HOSTNAME
--port=PORT
--username=USERNAME
--password=PASSWORD
--rm - remove after successful saving to database
--skip_calm - avoid waiting until discs become calm
----------------------
Environment variables:
All the options can be specified as environment variables, but you can't use the same option as an environment variable and as a command line argument at the same time.

And run a test:
../../test --test=hn_small --engines=elasticsearch,clickhouse --memory=16384

If you are testing in local mode (development) and don't care about test inaccuracy, you can skip the disc calming and CPU checks by setting the `--skip_inaccuracy` parameter:

../../test --test=hn_small --engines=elasticsearch,clickhouse --memory=16384 --skip_inaccuracy

Now you have the test results in ./results/ (in the root of the repository), e.g.:
# ls results/
220401_054753

Now you can upload the results to a database for further visualization. The visualization tool used at https://db-benchmarks.com/ is also open source and can be found at https://github.com/db-benchmarks/ui.
Here is how you can save the results:
username=login password=pass host=db.db-benchmarks.com port=443 save=./results ./test

or
./test --username=login --password=pass --host=db.db-benchmarks.com --port=443 --save=./results
We are eager to see your test results! If you believe they should be added to https://db-benchmarks.com, please make a pull request with your results to this repository.

Please remember the following:

Your results should go to `./results`.

The structure of the repository is as follows:
.
|-core <- Core directory, contains base files required for tests.
| |-engine.php <- Abstract class Engine. Manages test execution, result saving, and parsing of test attributes.
| |-helpers.php <- Helper file with logging functions, attribute parsing, exit functions, etc.
|-misc <- Miscellaneous directory, intended for storing files useful during the initialization step.
| |-func.sh <- Meilisearch initialization helper script.
|-plugins <- Plugins directory: if you want to extend the framework by adding another database or search engine for testing, place it here.
| |-elasticsearch.php <- Elasticsearch plugin.
| |-manticoresearch.php <- Manticore Search plugin.
| |-clickhouse.php <- ClickHouse plugin.
| |-mysql.php <- MySQL plugin.
| |-meilisearch.php <- Meilisearch plugin.
| |-mysql_percona.php <- MySQL (Percona) plugin.
| |-postgres.php <- Postgres plugin.
| |-typesense.php <- Typesense plugin.
|-results <- Test results directory. The results shown on https://db-benchmarks.com/ are found here. You can also use `./test --save` to visualize them locally.
|-tests <- Directory containing test suites.
| |-hn <- Hackernews test suite.
| | |-clickhouse <- Directory for "Hackernews test -> ClickHouse".
| | | |-inflate_hook <- Engine initialization script. Handles data ingestion into the database.
| | | |-post_hook <- Engine verification script. Ensures the correct number of documents have been ingested and verifies data consistency.
| | | |-pre_hook <- Engine pre-check script. Determines if tables need to be rebuilt, starts the engine, and ensures availability.
| | |-data <- Prepared data collection for the tests.
| | |-elasticsearch <- Directory for "Hackernews test -> Elasticsearch".
| | | |-logstash_tuned <- Logstash configuration directory for the "tuned" type.
| | | | |-logstash.conf
| | | | |-template.json
| | | |-elasticsearch_tuned.yml
| | | |-inflate_hook <- Engine initialization script for data ingestion.
| | | |-post_hook <- Verifies document count and data consistency.
| | | |-pre_hook <- Pre-check script for table rebuilding and engine initialization.
| | |-manticoresearch <- Directory for testing Manticore Search in the Hackernews test suite.
| | | |-generate_manticore_config.php <- Script for dynamically generating Manticore Search configuration.
| | | |-inflate_hook <- Data ingestion script.
| | | |-post_hook <- Verifies document count and consistency.
| | | |-pre_hook <- Pre-check for table rebuilds and engine availability.
| | |-meilisearch <- Directory for "Hackernews test -> Meilisearch".
| | | |-inflate_hook <- Data ingestion script.
| | | |-post_hook <- Ensures correct document count and data consistency.
| | | |-pre_hook <- Pre-check for table rebuilds and engine start.
| | |-mysql <- Directory for "Hackernews test -> MySQL".
| | | |-inflate_hook <- Data ingestion script.
| | | |-post_hook <- Ensures document count and consistency.
| | | |-pre_hook <- Pre-check for table rebuilds and engine start.
| | |-postgres <- Directory for "Hackernews test -> Postgres".
| | | |-inflate_hook <- Data ingestion script.
| | | |-post_hook <- Verifies document count and data consistency.
| | | |-pre_hook <- Pre-check for table rebuilds and engine availability.
| | |-prepare_csv <- Prepares the data collection, handled in `./tests/hn/init`.
| | |-description <- Test description, included in test results and used during result visualization.
| | |-init <- Main initialization script for the test.
| | |-test_info_queries <- Contains queries to retrieve information about the data collection.
| | |-test_queries <- Contains all test queries for the current test.
| |-taxi <- Taxi rides test suite, with a similar structure.
| |-hn_small <- Test for a smaller, non-multiplied Hackernews dataset, similar structure.
| |-logs10m <- Test for Nginx logs, similar structure.
|-.env.example <- Example environment file. Update the "mem" and "cpuset" values as needed.
|-LICENSE <- License file.
|-NOTICE <- Notice file.
|-README.md <- You're reading this file.
|-docker-compose.yml <- Docker Compose configuration for starting and stopping databases and search engines.
|-important_tests.sh
|-init <- Initialization script. Handles data ingestion and tracks the time taken.
|-logo.svg <- Logo file.
|-test <- The executable file to run and save test results.
To manually start one of the databases/engines from a test (here, Elasticsearch for the logs10m test), run:

test=logs10m cpuset="0,1" mem=32768 suffix=_tuned docker-compose up elasticsearch

This will:

- `suffix=_tuned`: map `./tests/logs10m/es/data/idx_tuned` as the data directory
- `mem=32768`: limit the RAM to 32 GB; if not specified, the default from `.env` is used
- `cpuset="0,1"`: run the Elasticsearch container only on CPU cores 0 and 1 (which is likely the first whole physical CPU)

To stop it, just press Ctrl-C.
Want to participate in the project? Here is how you can contribute:

They are all waiting for your contribution!