Recently I have wanted to use mongodb for some storage-layer work, mainly id queries and ids queries (like mysql's IN query). Whether the new solution can go live depends on performance, so here is a comparison of mongodb and mysql.
Environment: the same machine, running mongodb 1.4 and mysql; 4G of memory; both databases have only a unique index on id; about 1.2 million (120w) rows of data. Roughly 50,000 (5w) queries were run against each, as id or ids queries.
| query | mongodb total time | mongodb avg time/query | mysql total time | mysql avg time/query |
| 1 id per query | 11954ms (0:11.954) | 0ms | 78613ms (1:18.613) | 1ms |
| 10 ids per query | 35885ms (0:35.885) | 0ms | 107274ms (1:47.274) | 2ms |
| 20 ids per query | 63714ms (1:03.714) | 1ms | 186398ms (3:06.398) | 3ms |
| 5 threads, 20 ids per query | – | 5ms | – | 11ms |
| 10 threads, 20 ids per query | – | 10ms | – | 22ms |
From the table above:
For single-id queries, mongodb is about 7 times faster than mysql under basically the same conditions.
For multi-id queries (10 or 20 ids per query), mongodb is about 2-3 times faster than mysql.
For multi-id queries under multiple threads, mongodb is roughly twice as fast as mysql.
So for applications dominated by id or ids queries, this test shows mongodb outperforming mysql.
Inserting the data was also faster in mongodb: loading the 1.2 million rows took 83159ms (about 1:23.159), while mysql took dozens of minutes.
In my actual application (tens of millions of rows), however, mongodb was not as fast as mysql. There may be several reasons: the mysql machine is better, or mongodb is not being used properly. I will keep tweaking and optimizing the program…
The mongodb in query code is attached (legacy Java driver):

BasicDBObject q = new BasicDBObject();
BasicDBObject in = new BasicDBObject();

void createQ(Integer[] ids) {
    // equivalent to mysql's: WHERE id IN (ids...)
    in.put("$in", ids);
    q.put("id", in);
    // the query is then run with collection.find(q)
}
In the previous post, Mongodb VS Mysql query performance, I tested the query performance of mongodb and mysql. The result was that mongodb's performance is OK and it could be used instead of mysql.
But that test was at the million level, while my scenario is at the ten-million (kw) level, so it is necessary to test mongodb at that scale.
My test environment is 4G of memory (much of it occupied by other programs) and 20 million (2kw) rows; each query uses 20 randomly generated ids.
The results in such an environment are not ideal, even disappointing: the average query takes about 500ms (not much better than mysql), and under concurrent queries the performance is poor, with very low throughput. Checking the index size with db.mycoll.stats(): the 20 million rows carry about 1.1G of indexes and about 11G of stored data.
During the test, iowait was around 50%, so io seems to be the bottleneck. mongodb was also not using much memory (less than the index size; it seems this machine is not big enough for the test).
Switching to a machine with 6G of available memory, under 50 concurrent clients the average query drops to about 100ms, which is relatively satisfactory, though concurrency still seems limited. But this performance is not under my control; it also depends on how much memory the machine has free. The reason is that mongodb does not let you cap its memory usage: it uses all free memory as a cache, which is both an advantage and a disadvantage. The advantage: it can maximize performance. The disadvantage: it is easily disturbed by other programs (which take over its cache), and in my tests its ability to win memory back is not strong. mongodb uses memory-mapped files managed by the virtual memory manager; the official description:
Memory Mapped Storage Engine
This is the current storage engine for MongoDB, and it uses
memory-mapped files for all disk I/O. Using this strategy, the
operating system’s virtual memory manager is in charge of caching.
This has several implications:
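To make "memory-mapped files for all disk I/O" concrete, here is a minimal, self-contained Java sketch (illustration only, not mongodb code): the program maps a file and reads it through memory, and the operating system's page cache, not the application, decides which pages actually stay in RAM, which is why mongodb's memory use tracks free memory on the machine.

```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

public class MmapDemo {
    // Read a whole file through a memory mapping; the OS virtual memory
    // manager handles caching, the same strategy mongodb's memory-mapped
    // storage engine relies on.
    static String readMapped(Path file) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r");
             FileChannel ch = raf.getChannel()) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            byte[] out = new byte[(int) ch.size()];
            buf.get(out);
            return new String(out);
        }
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("mmap-demo", ".dat");
        Files.write(tmp, "hello".getBytes());
        System.out.println(readMapped(tmp)); // prints "hello"
        Files.delete(tmp);
    }
}
```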
The test above used three kinds of queries (1, 10, and 20 ids each time), run 3 times at each concurrency level, with 10,000 (1w) queries issued each run. The first row of figures is the cumulative time across all threads (in ms); the second row is the throughput (1w / (total time / thread count)). Memory usage grew slowly during the test, so the later numbers may come from a more efficient (better-cached) environment.
From that table, 10 to 20 threads give relatively high throughput. When judging memory usage, the premise is that the index is loaded into memory and some extra memory is left over as a cache.
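The throughput formula above (1w / (total time / thread count)) can be sketched in Java; the numbers in the example are hypothetical, not taken from the test:

```java
public class Throughput {
    // throughput = queries / (cumulative thread time / thread count),
    // matching the formula used in the table above
    static double queriesPerSecond(int queries, long cumulativeMs, int threads) {
        double wallMs = (double) cumulativeMs / threads; // approximate wall-clock time
        return queries / (wallMs / 1000.0);              // queries per second
    }

    public static void main(String[] args) {
        // e.g. 1w queries, 20 threads, 100000 ms cumulative thread time
        System.out.println(queriesPerSecond(10000, 100000L, 20)); // prints 2000.0
    }
}
```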
Below is a pdf on indexing and query optimization:
Indexing and Query Optimizer (Aaron Staple)
PS:
By default the mongodb server accepts only a limited number of concurrent connections; to raise the limit, start mongod with --maxConns num.
The mongodb java driver's connection pool also defaults to a maximum of 10 connections. To raise it, set the MONGO.POOLSIZE system property when running with mongo.jar, e.g. java -DMONGO.POOLSIZE=50
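The -DMONGO.POOLSIZE=50 flag works through an ordinary system-property lookup; the sketch below re-creates that lookup pattern (a hypothetical stand-in for the driver's internal code, using the default of 10 noted above):

```java
public class PoolSizeConfig {
    // Mirrors how a -DMONGO.POOLSIZE=50 style flag is read: a system
    // property with a fallback default (10, the driver's default pool size).
    static int poolSize() {
        return Integer.parseInt(System.getProperty("MONGO.POOLSIZE", "10"));
    }

    public static void main(String[] args) {
        System.out.println(poolSize());
    }
}
```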