Distributed file storage system based on MongoDB: GridFS
GridFS is a distributed file system built on top of MongoDB. It stores both file data and file metadata in MongoDB and relies on MongoDB's distributed storage mechanisms, so it combines the advantages of a document database with those of a file system. GridFS grew out of the big data trend and the need for complex data analysis.

Put simply, GridFS implements a file system by storing file data and file metadata in MongoDB. Replication handles failover and data redundancy, and replicas can also be used for read scaling, hot backup, or offline batch processing. Sharding splits the data automatically, which provides large-scale storage and load balancing. Because files are stored as documents in collections, they can be managed and queried with ordinary database operations (including MapReduce) for search and analytics, which gives GridFS a lightweight file system interface.

The basic idea of GridFS is that a large file can be divided into many blocks, with each block stored as a separate document, so files of large size can be stored. Since MongoDB supports storing binary data in documents, the storage overhead per block can be kept to a minimum. GridFS uses two documents…
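The block-splitting idea described above can be sketched in plain Python without a running MongoDB instance. This is only an illustration under stated assumptions: the helper names (`split_into_chunks`, `make_file_document`, `reassemble`), the dictionary field names, and the chunk size are choices made for this sketch, not a description of how any particular driver implements GridFS.

```python
import uuid

# Assumed chunk size for this sketch; a real deployment may use another value.
CHUNK_SIZE = 255 * 1024


def split_into_chunks(data: bytes, file_id: str, chunk_size: int = CHUNK_SIZE) -> list:
    """Split data into fixed-size block documents, each tagged with its index n."""
    return [
        {"files_id": file_id, "n": i, "data": data[offset:offset + chunk_size]}
        for i, offset in enumerate(range(0, len(data), chunk_size))
    ]


def make_file_document(filename: str, data: bytes, chunk_size: int = CHUNK_SIZE):
    """Build one metadata document plus the block documents for a file."""
    file_id = uuid.uuid4().hex  # hypothetical identifier scheme for this sketch
    file_doc = {
        "_id": file_id,
        "filename": filename,
        "length": len(data),
        "chunkSize": chunk_size,
    }
    return file_doc, split_into_chunks(data, file_id, chunk_size)


def reassemble(chunks: list) -> bytes:
    """Rebuild the original bytes by concatenating blocks in index order."""
    return b"".join(c["data"] for c in sorted(chunks, key=lambda c: c["n"]))


if __name__ == "__main__":
    payload = b"x" * 2500
    meta, blocks = make_file_document("demo.bin", payload, chunk_size=1000)
    print(meta["length"], len(blocks))
    print(reassemble(blocks) == payload)
```

Because each block carries its own index, the blocks can be stored, replicated, or fetched independently and still be reassembled in the right order, which is what makes the one-document-per-block layout a good fit for a sharded database.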