Summary: After Anders Karlsson found that MySQL was at a disadvantage compared with MongoDB, he did his best to improve MySQL's numbers: different storage engines, different memory configurations, and finally the embedded server. After many attempts MySQL finally won. But can such a victory really be called a victory? And is it the kind of victory anyone actually wants?
Preliminary key-value comparison, MongoDB wins
Faster and faster: that has always been our goal for the MySQL database system.
Does it really make sense for a dragster to cite disk speed as its biggest obstacle? Granted that it is an obstacle, what is the solution? If an obstacle is holding your dragster back, you can either find a faster way around it or upgrade the machine. For example:
Avoid using disks; keep as much as possible in memory instead
Use a faster disk (such as an SSD)
Actually, neither option above is great, because the limitation imposed by the disk is enormous and, surprisingly, has never really improved. You might say: don't we have SSDs? Yes, they do speed up the disk, but don't forget that CPUs and RAM improve faster than disks do! And even if we assume memory large enough to replace the disk entirely, would everything then run at the speed of light? Obviously not, so stop insisting that the disk is your biggest limitation!
Just as CPU cores kept getting faster, one day they suddenly stopped improving at the old pace, and multi-core technology was born to solve that problem. But new limits on CPU performance arrived one after another and became the most troublesome problems of all: thread mutexes, for example, or MySQL's Query Cache mutex!
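That Query Cache mutex, at least, is easy to take out of the picture. A minimal sketch, assuming a MySQL 5.6 server where the cache was enabled at startup and is not otherwise needed (general practice, not a step from the article):

    -- Disable the query cache so its global mutex is no longer contended.
    SET GLOBAL query_cache_size = 0;  -- release the cache memory
    SET GLOBAL query_cache_type = 0;  -- stop consulting the cache on reads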
Without further ado, it was finally time to run the benchmarks drawn up in May. Let me explain why it took so long to get started: loading the data into MySQL took a great deal of time. Along the way I created an open-source project for exporting data as JSON and importing it into MySQL. Once that was done, I had data organized by real-world rules. Some columns had to be dropped so that MySQL could handle the data, because MySQL Cluster can only store fixed-length data on disk. That meant a lot of work for me:
A lot of raw data to write to disk
UTF-8 encoding means more than 3 times as much data to write
This keeps MySQL Cluster working well, but there are special cases depending on the type of the value: if the value is text or something similar, we still have to use VARCHAR or a comparable format, which really limits MySQL Cluster. To let MySQL run as well as possible, only a very simple table can be created, as sketched below:
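The original schema does not survive in this compilation; what follows is a hypothetical reconstruction of the kind of table described, where the name keyvalue and the column widths are assumptions:

    -- Hypothetical sketch: a simple fixed-length key-value table for NDB.
    -- CHAR (not VARCHAR) keeps every row the same length.
    CREATE TABLE keyvalue (
      id    BIGINT NOT NULL PRIMARY KEY,
      value CHAR(255)
    ) ENGINE=NDBCLUSTER;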
About 105 million rows of data were loaded into this table. That should be a piece of cake for MySQL Cluster, right? But it also turned out that MySQL Cluster supports only 512MB of hash data per partition (a really silly limit). Out of options, the data had to be split into 5 partitions, and with that the loading work was finally done.
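Splitting the data that way is plain SQL; a sketch, reusing the hypothetical keyvalue table above:

    -- Assumed workaround for the 512MB-per-partition hash limit:
    -- hash the primary key across 5 partitions.
    ALTER TABLE keyvalue
      PARTITION BY KEY (id)
      PARTITIONS 5;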
I have to say that without data on disk, MySQL Cluster runs much more stably. The occasional lost data and other quirks I had seen when loading tables with VARCHAR-formatted data did not occur. So it is not only on-disk data that limits you; the VARCHAR data type also looks like it needs further refinement.
Without further ado, my server (8-core AMD CPU and 16GB of RAM) was ready to go, set up to test MySQL with the InnoDB storage engine, MySQL Cluster, and MongoDB. The test reads 1 million rows 10 times, spread across 100 threads, under identical conditions. To be fair, I had to make sure the data that was supposed to be in memory really was in memory, so I ran everything twice first as a warm-up. In the NDB case the MySQL API is used (the NDBAPI is tested at the end). The results are as follows, with a sketch of the per-thread query after them:
MongoDB 110000 rows read per second
MySQL with InnoDB 30000 rows read per second
MySQL with NDB 32000 rows read per second
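Each thread simply performs point lookups by primary key; a sketch against the hypothetical keyvalue table above (the id literal is arbitrary):

    -- Shape of the benchmark read: one row fetched by primary key.
    SELECT value FROM keyvalue WHERE id = 12345;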
In the NDB case, some settings were applied first (the original snippet is not reproduced in this compilation).
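As an illustration only, and an assumption rather than the article's actual settings: session variables such as ndb_use_exact_count and ndb_force_send are typical knobs for NDB point-read performance:

    -- Assumed example of NDB read tuning; not the article's original snippet.
    SET ndb_use_exact_count = 0;  -- skip exact row counts from the data nodes
    SET ndb_force_send = 1;       -- send each request to the data nodes at once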
I can definitely tell you that these settings make a huge difference in this mode. Loading plain data, the results were similar; but when loading JSON (MongoDB's native format), the expected thing happened: MongoDB was 2.5 times faster than NDB/InnoDB, while NDB and InnoDB were roughly equivalent.
Summary:
In the era of cheap RAM, please remove that damn 512MB limit!
Corrections and additions to the key-value comparison, MongoDB still wins
First, the test environment is exactly the same as above; second, a single table is used; finally, MySQL is tested with two storage engines, InnoDB and NDB. The test again reads 1 million rows (out of a table of 105 million) 10 times across 100 threads, for a total of 10 million rows read.
After some inspection, it turned out that the InnoDB data had not been fully cached. After correcting that, the test results were as follows:
MongoDB 110000 rows read per second
InnoDB 39000 rows read per second
NDB 32000 rows read per second
In this round, MongoDB still holds an absolute advantage, and InnoDB is also noticeably faster than NDB.
Key-value comparison in a specific environment, a first glimmer of hope for MySQL
MySQL is far more mature than MongoDB: put MongoDB's data on the hard disk and you will find that its speed drops off badly. So if we have a sufficient amount of memory (we ran on Amazon, where plenty of memory is available) and no disk I/O is done at all, does that mean everything will perform well?
A MongoDB data store was selected, again holding 105 million rows. Originally I planned to use the MongoDB data as-is, but VARCHAR-like data had to be excluded: pushing it through NDB to disk would consume a lot of disk I/O, and NDB stores its data fixed-length, so with UTF-8 a VARCHAR(256) field occupies 768 bytes (256 characters at 3 bytes each). The table schema is as follows, sketched below since the original was not reproduced:
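A hedged reconstruction of that schema, with names assumed as before; CHAR with utf8 makes the 768-byte cost explicit:

    -- Hypothetical sketch; names and widths are assumptions.
    CREATE TABLE keyvalue (
      id    BIGINT NOT NULL PRIMARY KEY,
      value CHAR(256) CHARACTER SET utf8  -- fixed 768 bytes: 256 chars x 3
    ) ENGINE=NDBCLUSTER;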
With the above work done, the test machine deserves an introduction:
CPU: AMD FX-8120, 8 cores; Memory: 16GB; Motherboard: M5A88-V (with a Lite-On LINE100TX network card replacing the motherboard's Realtek chipset)
Disk system: not listed, since there is no disk I/O
Ubuntu 10.10
MySQL 5.6.5 64-bit
MySQL Cluster 7.2.6 64-bit
MongoDB 2.0.5 64-bit
The test again reads 1 million rows 10 times across 100 threads. After ensuring that disk I/O could not interfere, the results were:
MongoDB 110000 rows read per second
MySQL Cluster 32000 rows read per second
MySQL with InnoDB 39000 rows read per second
MySQL with MEMORY/HEAP 43000 rows read per second
MySQL with MyISAM 28000 rows read per second
MySQL's performance on the last two items is undoubtedly disappointing! The tests also revealed that MyISAM caches only its keys, not the row data. Still, MyISAM's showing is commendable: no disk I/O was observed from start to finish. After solving the caching problem, the results were:
MySQL with MyISAM 37000 rows read per second
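The fix itself is not shown; a plausible warm-up, assuming the goal is to get both the MyISAM key cache and the OS file cache hot (an assumption, not the article's actual steps):

    -- Assumed warm-up for a MyISAM table before re-running the benchmark.
    LOAD INDEX INTO CACHE keyvalue;     -- preload the index into the key cache
    SELECT COUNT(value) FROM keyvalue;  -- touch every row so the data file
                                        -- lands in the OS page cache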
MySQL narrowly wins
Other cases were then tested, for example using NDB without CLIENT_COMPRESS. But against MongoDB's 110,000, MySQL's numbers were still not getting there. The best MySQL results from these continued attempts:
MySQL with MEMORY/HEAP: 43000 rows read per second
MySQL with NDB (without CLIENT_COMPRESS): 46000 rows read per second
Although not every combination was tested, from the two results above it is not hard to deduce that MySQL with the MEMORY storage engine and without CLIENT_COMPRESS would certainly be faster than 43,000.
It is also easy to see that MySQL puts a heavy load on the CPU in this scenario: with everything in memory and no disk I/O, the only thing that can bind MySQL is the CPU. So we bypassed the standard server and accessed MySQL Cluster directly through the NDBAPI. That yielded a better figure of 90,000, though still behind MongoDB.
Combining the above tests, we can also observe:
MySQL with NDB (without CLIENT_COMPRESS): 46000 rows read per second
MySQL with NDB (with CLIENT_COMPRESS): 32000 rows read per second
Can we consider CLIENT_COMPRESS a pest, then? Can we speculate that CLIENT_COMPRESS slows things down by 25-30%? If you want to see how much overhead the client layer itself adds, the easiest way is to use libmysqld, the MySQL Embedded Library. That meant changing the benchmark program, and again ensuring the data was in memory before starting the test. When everything was ready we ran the test, and the result was just as speculated: 115,000! MySQL had finally won!
Summary: There is no winner here, only continuous improvement
After that, MySQL was also measured at a blistering 172,000, but using that as grounds for declaring MongoDB defeated would be far-fetched. What we see here is not victory or defeat, but MongoDB's aggressive pace and the huge room for improvement that MySQL still has. (Compiled by Zhong Hao)