GridFS is a specification for storing large files in a MongoDB database. All officially supported drivers implement the GridFS specification.
1 Why use GridFS
Since the size of BSON objects in MongoDB is limited, the GridFS specification provides a transparent mechanism that splits a large file into multiple smaller documents. This mechanism lets us store large file objects efficiently, especially very large files such as videos and high-definition pictures.
2 How to realize mass storage
The specification defines a standard for chunking files. Each file is stored as one metadata object in a files collection, together with one or more chunk objects stored in a chunks collection.
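The chunking scheme described above can be sketched in memory as follows. This is only an illustration and does not talk to MongoDB: the helper name splitIntoChunks, the file id 1 and the tiny 4-byte chunk size are assumptions chosen to keep the example readable, and a real driver stores more fields than shown here.

```javascript
// Sketch only: models GridFS-style chunking in memory, without a database.
// splitIntoChunks, the id 1 and the 4-byte chunk size are illustrative assumptions.
function splitIntoChunks(fileId, filename, data, chunkSize) {
  const chunks = [];
  for (let offset = 0, n = 0; offset < data.length; offset += chunkSize, n++) {
    chunks.push({
      files_id: fileId, // links the chunk back to its metadata object
      n: n,             // sequence number of the chunk, starting from 0
      data: data.subarray(offset, offset + chunkSize),
    });
  }
  // One metadata object per file
  const fileDoc = { _id: fileId, filename: filename, length: data.length, chunkSize: chunkSize };
  return { fileDoc, chunks };
}

const { fileDoc, chunks } = splitIntoChunks(1, "testfile", Buffer.from("hello gridfs"), 4);
console.log(fileDoc.length, chunks.length); // 12 bytes split into 3 chunks
```

Every chunk except possibly the last holds exactly chunkSize bytes, which is what makes the file easy to put back together later.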
3 Brief Introduction
GridFS uses two collections to store data: files (containing the metadata objects) and chunks (containing the binary chunks plus some other related information).
So that multiple GridFS stores can live in a single database, the files and chunks collections share a prefix. By default the prefix is fs, so any default GridFS store consists of the namespaces fs.files and fs.chunks. The drivers for the various languages allow this prefix to be changed, so you could, for example, set up another GridFS namespace for storing photos, whose collections would be photos.files and photos.chunks.
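The naming rule is simple enough to write down as a small helper. The function gridfsCollections below is hypothetical and not part of any driver; it only shows how the two collection names follow from the prefix.

```javascript
// Hypothetical helper: derives the two collection names of a GridFS store
// from its prefix. The default prefix is "fs".
function gridfsCollections(prefix = "fs") {
  return { files: `${prefix}.files`, chunks: `${prefix}.chunks` };
}

const defaults = gridfsCollections();
const photos = gridfsCollections("photos");
console.log(defaults.files, defaults.chunks);
console.log(photos.files, photos.chunks);
```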
4 Command line tools
mongofiles is a tool for manipulating GridFS from the command line.
To see which GridFS files are in the database, add the “list” argument after “mongofiles”.
Next, go into the database and run show collections to see whether anything new has appeared.
View the contents of fs.files
Some basic metadata information is stored in fs.files:
filename: the name of the stored file
chunkSize: the size of each chunk
uploadDate: the time the file was stored
md5: the MD5 hash of the file
length: the file size, in bytes
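The length and chunkSize fields together determine how many chunk documents a file should have. A minimal sketch of that arithmetic follows; the function name expectedChunkCount and the sample values are made up for illustration and are not taken from a real database.

```javascript
// Sketch: number of chunk documents implied by a file's metadata.
// Every chunk holds chunkSize bytes except possibly the last, so round up.
function expectedChunkCount(fileDoc) {
  return Math.ceil(fileDoc.length / fileDoc.chunkSize);
}

// Made-up metadata values, for illustration only
const sampleDoc = { filename: "testfile", chunkSize: 262144, length: 1000000 };
console.log(expectedChunkCount(sampleDoc)); // 4
```

Comparing this number with the count of documents in the chunks collection for the same file is a quick sanity check that no chunk is missing.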
View the contents of fs.chunks
Here, n is the sequence number of each chunk, starting from 0.
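Because n records the position of each chunk, a file can be rebuilt by sorting its chunks on n and concatenating their data. The sketch below works on plain in-memory objects rather than real query results; the reassemble helper and the sample chunks are assumptions for illustration.

```javascript
// Sketch: rebuild a file from its chunks by ordering them on n.
function reassemble(chunks) {
  const ordered = [...chunks].sort((a, b) => a.n - b.n);
  ordered.forEach((chunk, i) => {
    // Sequence numbers must run 0, 1, 2, ... without gaps
    if (chunk.n !== i) throw new Error(`missing chunk ${i}`);
  });
  return Buffer.concat(ordered.map((chunk) => chunk.data));
}

// Illustrative chunks, deliberately out of order
const shuffled = [
  { n: 2, data: Buffer.from("idfs") },
  { n: 0, data: Buffer.from("hell") },
  { n: 1, data: Buffer.from("o gr") },
];
console.log(reassemble(shuffled).toString()); // hello gridfs
```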
Retrieve the file with ./mongofiles get testfile, then run md5sum testfile
to verify that the MD5 value is the same as the one stored in the database.
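The same check can be done programmatically. The following Node.js sketch hashes the retrieved bytes and compares the result with the md5 field of the metadata object. The names md5Hex and matchesStoredMd5 are illustrative, and the sample bytes and metadata are invented; in practice the bytes would come from the retrieved file and the metadata from fs.files.

```javascript
const crypto = require("crypto");

// Hex-encoded MD5 digest of a buffer, the same format md5sum prints
function md5Hex(data) {
  return crypto.createHash("md5").update(data).digest("hex");
}

// True when the retrieved bytes hash to the value stored in the metadata object
function matchesStoredMd5(data, fileDoc) {
  return md5Hex(data) === fileDoc.md5;
}

// Invented sample data, for illustration only
const retrieved = Buffer.from("hello");
const storedDoc = { filename: "testfile", md5: "5d41402abc4b2a76b9719d911017c592" };
console.log(matchesStoredMd5(retrieved, storedDoc)); // true
```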
5 Index
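GridFS relies on a unique compound index over files_id and n in the chunks collection (createIndex replaces the older ensureIndex helper, which recent MongoDB versions no longer provide):
db.fs.chunks.createIndex({files_id: 1, n: 1}, {unique: true});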
This way, a chunk can be retrieved using its files_id and its value of n. Note that GridFS can still use findOne
to get the first chunk, as follows:
db.fs.chunks.findOne({files_id: myFileID, n: 0});
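Looking up chunks by n is also handy when only part of a file is needed. Assuming every chunk except the last holds exactly chunkSize bytes, the chunk containing a given byte offset can be computed directly. The helper chunkQueryForOffset below is hypothetical; it only builds the filter object that could be passed to findOne, and the chunk size and offset are made-up values.

```javascript
// Sketch: build the findOne filter for the chunk that contains a byte offset.
// Assumes all chunks except the last are exactly chunkSize bytes long.
function chunkQueryForOffset(fileId, chunkSize, byteOffset) {
  return { files_id: fileId, n: Math.floor(byteOffset / chunkSize) };
}

// Made-up values: offset 600000 with 262144-byte chunks falls in chunk n = 2
const query = chunkQueryForOffset("myFileID", 262144, 600000);
console.log(query.n); // 2
```

The resulting object has the same shape as the filter in the findOne call above, so the lookup is served by the same index.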