Previously, in "MongoDB vs MySQL query performance", I tested the query performance of MongoDB against MySQL. The conclusion was that MongoDB's performance is acceptable and it could be used in place of MySQL. But that test was at the million-record level, while my scenario is at the tens-of-millions level, so I needed to see how MongoDB holds up at that scale.
My test environment: 4 GB of RAM (much of it occupied by other programs), 20 million records, and queries built from randomly generated ids (20 ids per query). The results in this environment were not ideal, even disappointing: the average query time was 500 ms (not much better than MySQL, and under concurrent queries the performance was poor, with very low throughput). Checking the index size with db.mycoll.stats(): the 20 million records carry about 1.1 GB of indexes, and the stored data is about 11 GB. During the test, iowait hovered around 50%, so I/O appears to be the bottleneck. MongoDB itself was not using much memory (less than the index size; it seems the machine is simply too small for this test).
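As an aside, the shape of the test query (20 random ids fetched in one round trip) can be sketched as below. This is only an illustration: it assumes integer ids in [0, 20 million), and random_id_batch / build_in_query are names I made up for the sketch, not part of any driver.

```python
import random

TOTAL_DOCS = 20_000_000  # the 20 million records in the test collection
BATCH_SIZE = 20          # ids fetched per query

def random_id_batch(n=BATCH_SIZE, total=TOTAL_DOCS):
    """Pick n distinct random ids out of the full id range."""
    return random.sample(range(total), n)

def build_in_query(ids):
    """Build a MongoDB filter that fetches all ids in one round trip,
    i.e. the equivalent of db.mycoll.find({_id: {$in: [...]}})."""
    return {"_id": {"$in": list(ids)}}

batch = random_id_batch()
query = build_in_query(batch)
# With a real driver this would be executed as e.g. db.mycoll.find(query)
```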
Switching to a machine with 6 GB of available memory, 50 concurrent threads averaged around 100 ms per query, which is fairly satisfactory, though the concurrency still does not seem strong. This performance is not something I can control, however: it is dictated by the machine's available memory, because MongoDB provides no way to specify how much memory it may use. It treats all free memory as its cache, which is both an advantage and a disadvantage: the advantage is that it can exploit every spare byte for performance; the disadvantage is that it is easily disturbed by other programs (which take over its cache). In my tests, its ability to win memory back from other processes is not strong. MongoDB uses memory-mapped files via the OS virtual memory manager (VMM); the official description:
Memory Mapped Storage Engine
This is the current storage engine for MongoDB, and it uses
memory-mapped files for all disk I/O. Using this strategy,
the operating system’s virtual memory manager is in charge of
caching. This has several implications:
There is no redundancy between file system cache and database
cache: they are one and the same.
MongoDB can use all free memory on the server for cache space
automatically without any configuration of a cache size.
Virtual memory size and resident size will appear to be very
large for the mongod process. This is benign: virtual memory
space will be just larger than the size of the datafiles open and
mapped; resident size will vary depending on the amount of memory
not used by other processes on the machine.
Caching behavior (such as LRU’ing out of pages, and laziness of
page writes) is controlled by the operating system: quality of the
VMM implementation will vary by OS.
Seen this way, I consider MongoDB's inability to reserve a specified amount of memory for its cache a disadvantage: it should at least be able to guarantee that all indexes fit in memory. As it stands, that is determined not by startup options but by the environment, which is a fly in the ointment.
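To make the memory-mapped approach concrete: the process never read()s pages itself; it maps the datafile into its address space and lets the OS page cache decide what stays resident. A minimal sketch of the same OS mechanism with Python's mmap module (nothing MongoDB-specific here):

```python
import mmap
import os
import tempfile

# Create a small data file standing in for a MongoDB datafile.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"x" * 4096)

# Map it into the process's address space. From here on, access goes
# through the OS virtual memory manager: pages are faulted in on first
# touch and cached (or evicted) by the kernel, not by the process.
with open(path, "r+b") as f:
    mm = mmap.mmap(f.fileno(), 0)
    first_page_byte = mm[0]    # page fault pulls the page into RAM
    mm[0:5] = b"hello"         # dirties the page; kernel writes back lazily
    mm.flush()                 # force the write-back explicitly
    mm.close()

os.remove(path)
```

This is exactly why mongod's resident size tracks whatever memory other processes leave free: the kernel, not the database, owns the cache policy.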
There is also an official passage about fitting indexes in RAM:
If your queries seem sluggish, you should verify that your
indexes are small enough to fit in RAM. For instance, if you’re
running on 4GB RAM and you have 3GB of indexes, then your indexes
probably aren’t fitting in RAM. You may need to add RAM and/or
verify that all the indexes you’ve created are actually being
used.
I still hope MongoDB will add a way to specify its memory budget, so that it can be guaranteed enough memory to hold the indexes.
Summary: with a large data set (tens of millions of records), MongoDB's concurrent query performance is not ideal (100-200/s). Writing data is fast: in my environment, remote inserts reached nearly 10k/s, and 15k/s looks achievable; write speed is essentially unaffected by the data volume.
Here is some test data:
| | 1 id (mem <1.5 GB) | | | 10 ids (mem 2-3 GB) | | | 20 ids (mem >4 GB) | |
| | run 1 | run 2 | run 3 | run 1 | run 2 | run 3 | run 1 | run 2 | run 3 |
| 1 thread, total time (s) | 17.136 | 25.508 | 17.387 | 37.138 | 33.788 | 25.143 | 44.75 | 31.167 | 30.678 |
| 1 thread, throughput (q/s) | 583.5668 | 392.0339 | 575.1423 | 269.266 | 295.9631 | 397.725 | 223.4637 | 320.8522 | 325.9665 |
| 5 threads, total time (s) | 24.405 | 22.664 | 24.115 | 41.454 | 41.889 | 39.749 | 56.138 | 53.713 | 54.666 |
| 5 threads, throughput (q/s) | 2048.76 | 2206.142 | 2073.398 | 1206.156 | 1193.631 | 1257.893 | 890.6623 | 930.8733 | 914.6453 |
| 10 threads, total time (s) | 27.567 | 26.867 | 28.349 | 55.672 | 54.347 | 50.93 | 72.978 | 81.857 | 75.925 |
| 10 threads, throughput (q/s) | 3627.526 | 3722.038 | 3527.461 | 1796.235 | 1840.028 | 1963.479 | 1370.276 | 1221.643 | 1317.089 |
| 20 threads, total time (s) | 51.397 | 57.446 | 53.81 | 119.386 | 118.015 | 76.405 | 188.962 | 188.034 | 138.839 |
| 20 threads, throughput (q/s) | 3891.278 | 3481.53 | 3716.781 | 1675.238 | 1694.7 | 2617.63 | 1058.414 | 1063.637 | 1440.517 |
| 50 threads, total time (s) | 160.038 | 160.808 | 160.346 | 343.559 | 352.732 | 460.678 | 610.907 | 609.986 | 1411.306 |
| 50 threads, throughput (q/s) | 3124.258 | 3109.298 | 3118.257 | 1455.354 | 1417.507 | 1085.357 | 818.4552 | 819.6909 | 354.2818 |
| 100 threads, total time (s) | 2165.408 | 635.887 | 592.958 | 1090.264 | 1034.057 | 1060.266 | 1432.296 | 1466.971 | 1475.061 |
| 100 threads, throughput (q/s) | 461.8067 | 1572.606 | 1686.46 | 917.209 | 967.0647 | 943.1595 | 698.1797 | 681.6767 | 677.9381 |
The test above uses three query types (1, 10, and 20 ids per query), run 3 times at each concurrency level, with 10k queries issued per run. In each pair of rows, the first line is the cumulative time across all threads (in seconds) and the second is the throughput, computed as 10000 / (total time / thread count). Memory usage grew slowly over the course of the test, so the later runs may look more efficient (a warmer environment).
From the table, 10-20 threads gives the best throughput; judging by the memory usage, the precondition is that the index is loaded into memory and some memory is left over as a cache.
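The throughput figures can be reproduced from the cumulative times; the arithmetic only works out when the times are read in seconds. A quick check in Python against the 1-id column:

```python
def throughput(num_queries, cumulative_time_s, threads):
    """Throughput as used in the table:
    queries / (cumulative thread time / thread count)."""
    return num_queries / (cumulative_time_s / threads)

# First column of the table (1 id per query), runs at 1 and 5 threads:
print(round(throughput(10_000, 17.136, 1), 4))  # 583.5668, matches the table
print(round(throughput(10_000, 24.405, 5), 2))  # 2048.76, matches the table
```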
Below is a PDF on index and query optimization:
Indexing and Query Optimizer
PS:
By default the MongoDB server only accepts 10 concurrent connections; to raise the limit, start mongod with --maxConns num.
The MongoDB Java driver likewise defaults to a connection pool of at most 10 connections; to raise it, set the MONGO.POOLSIZE system property for mongo.jar, e.g. java -DMONGO.POOLSIZE=50 ...