Caffe uses Google Protocol Buffers together with LMDB or LevelDB to store data in a single unified database file, which allows faster data loading.
Saving Database in LMDB
I will not cover this step. If you are using ImageNet, CIFAR10, MNIST, or another common dataset, please refer to the Caffe examples to create the LMDB or LevelDB database.
Loading the Dataset
All the data first has to be converted into Caffe's common protobuf format, Datum. The protobuf message for Datum is defined in the caffe.proto file.
Copy the Datum message excerpt from caffe.proto into datum.proto, then compile it by following the protobuf instructions.
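The copy-and-compile step can be sketched in Python roughly as follows. The Datum fields below are how I understand the message in upstream caffe.proto, so please check them against the caffe.proto in your own Caffe checkout. The helper names write_datum_proto and compile_proto are only illustrative, and the sketch assumes the protoc compiler is already installed.

```python
import shutil
import subprocess
from pathlib import Path

# Datum message as found in upstream caffe.proto (assumption: verify the
# field numbers against the caffe.proto that ships with your Caffe build).
DATUM_PROTO = """\
syntax = "proto2";

message Datum {
  optional int32 channels = 1;
  optional int32 height = 2;
  optional int32 width = 3;
  // The actual image data, in bytes.
  optional bytes data = 4;
  optional int32 label = 5;
  // Optionally, the datum can hold float data instead.
  repeated float float_data = 6;
  // If true, data holds an encoded image that must be decoded first.
  optional bool encoded = 7 [default = false];
}
"""


def write_datum_proto(directory="."):
    """Write datum.proto into `directory` and return its path."""
    path = Path(directory) / "datum.proto"
    path.write_text(DATUM_PROTO)
    return path


def compile_proto(proto_path):
    """Run protoc to generate the *_pb2.py module next to the .proto file.

    Returns False (and does nothing) when protoc is not on PATH.
    """
    if shutil.which("protoc") is None:
        return False
    proto_path = Path(proto_path)
    # Equivalent to running `protoc --python_out=. datum.proto` in that folder.
    subprocess.run(
        ["protoc", "--python_out=.", proto_path.name],
        cwd=proto_path.parent,
        check=True,
    )
    return True
```

Calling compile_proto(write_datum_proto()) should leave a datum_pb2.py module in the current directory, which Python can then import.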
Reading LMDB CIFAR10 in Python
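A minimal sketch of reading the database, assuming the third-party lmdb and numpy packages are installed and that datum_pb2 is the module generated from datum.proto. The function names and the default path cifar10_train_lmdb are my assumptions, so adjust them to wherever the Caffe CIFAR10 example created your database. It also assumes the records hold raw (not encoded) pixels.

```python
def iter_datums(cursor, datum_cls):
    """Yield (key, datum) pairs from an iterable of raw (key, value) records.

    `cursor` is any iterable of (bytes, bytes) pairs, such as an lmdb cursor;
    `datum_cls` is the generated message class (datum_pb2.Datum).
    """
    for key, value in cursor:
        datum = datum_cls()
        datum.ParseFromString(value)
        yield key, datum


def datum_to_array(datum):
    """Convert a Datum into a (channels, height, width) NumPy array."""
    import numpy as np  # third-party, imported lazily

    if datum.encoded:
        raise ValueError("encoded images must be decoded before reshaping")
    shape = (datum.channels, datum.height, datum.width)
    if datum.data:
        # Raw pixels: one unsigned byte per value, channel-major order.
        return np.frombuffer(datum.data, dtype=np.uint8).reshape(shape)
    # Otherwise the values are stored in the repeated float_data field.
    return np.array(list(datum.float_data), dtype=np.float32).reshape(shape)


def read_cifar10_lmdb(path="cifar10_train_lmdb"):
    """Yield (label, image) for each record in a Caffe-created LMDB."""
    import lmdb        # third-party: pip install lmdb
    import datum_pb2   # generated from datum.proto by protoc

    env = lmdb.open(path, readonly=True, lock=False)
    try:
        with env.begin() as txn:
            for _key, datum in iter_datums(txn.cursor(), datum_pb2.Datum):
                yield datum.label, datum_to_array(datum)
    finally:
        env.close()
```

Iterating with `for label, image in read_cifar10_lmdb(): ...` then gives one label and one image array per record.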
Loading a binaryproto file
The mean of the data can sometimes drastically affect training. Caffe provides an elegant way to compute the mean of the data and write it to a mean.binaryproto file. To load it, use the following excerpt from the Caffe issue pages [1].
Create blob.proto and put the BlobProto message definition from caffe.proto into the file.
To load the file in Python, compile blob.proto (or simply add the BlobProto message to datum.proto and recompile that).
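Put together, the blob.proto contents and the loading step might look like the sketch below. The BlobProto fields are the legacy 4-D layout as I understand it from caffe.proto, so verify them against your Caffe version. The names blob_shape and load_mean are illustrative, and blob_pb2 is assumed to be the module protoc generates from blob.proto.

```python
from pathlib import Path

# Save this text as blob.proto and compile it with:
#   protoc --python_out=. blob.proto
# (assumption: legacy num/channels/height/width layout from caffe.proto).
BLOB_PROTO = """\
syntax = "proto2";

message BlobProto {
  optional int32 num = 1 [default = 0];
  optional int32 channels = 2 [default = 0];
  optional int32 height = 3 [default = 0];
  optional int32 width = 4 [default = 0];
  repeated float data = 5 [packed = true];
  repeated float diff = 6 [packed = true];
}
"""


def blob_shape(blob):
    """Return the legacy (num, channels, height, width) shape of a blob."""
    return (blob.num, blob.channels, blob.height, blob.width)


def load_mean(path="mean.binaryproto"):
    """Parse a mean.binaryproto file into a (channels, height, width) array."""
    import numpy as np  # third-party, imported lazily
    import blob_pb2     # generated from blob.proto by protoc

    blob = blob_pb2.BlobProto()
    blob.ParseFromString(Path(path).read_bytes())
    mean = np.array(list(blob.data), dtype=np.float32).reshape(blob_shape(blob))
    # Drop the leading num axis, which is expected to be 1 for a mean image.
    return mean[0]
```

The returned array has the same channel-major layout as the images stored in the database, so it can be subtracted from each image directly.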