Keras.js uses a custom binary file in protocol buffer format, which is a serialization of the HDF5-format Keras model and weights file. The python/encoder.py script performs this conversion.
The HDF5-format Keras model file must include both the model architecture and the weights. This is the default behavior for Keras model saving:
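```py
# model.save writes both the architecture and the weights
# to a single HDF5 file (the filename here is illustrative).
model.save('model.h5')
```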
Note that when using the ModelCheckpoint callback, save_weights_only must not be set to True (the default is False).
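For instance, a checkpoint callback that keeps the full model might look like this (a minimal sketch; the filepath and the model and training-data variables are illustrative):

```py
from keras.callbacks import ModelCheckpoint

# save_weights_only defaults to False, so the full model
# (architecture + weights) is written at each checkpoint.
checkpoint = ModelCheckpoint('model.h5', save_weights_only=False)
model.fit(x_train, y_train, epochs=10, callbacks=[checkpoint])
```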
Works for both Keras Sequential models:

```py
model = Sequential()
model.add(...)
...
```

and models created with the functional API:

```py
...
model = Model(inputs=..., outputs=...)
```
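Putting the pieces together, a minimal end-to-end sketch (the architecture and filename are illustrative, not required by Keras.js):

```py
from keras.models import Sequential
from keras.layers import Dense

# Toy architecture; any Keras model saved this way works.
model = Sequential()
model.add(Dense(32, activation='relu', input_shape=(10,)))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy')

# Writes architecture + weights to one HDF5 file, ready for encoder.py.
model.save('model.h5')
```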
The encoder script requires:

- protobuf 3.4+
Running the script
```
usage: encoder.py [-h] [-n NAME] [-q] hdf5_model_filepath

positional arguments:
  hdf5_model_filepath

optional arguments:
  -h, --help            show this help message and exit
  -n NAME, --name NAME  model name (defaults to filename without extension
                        if not provided)
  -q, --quantize        quantize weights to 8-bit unsigned int
```
For example, to encode a model with quantization:

```sh
./encoder.py -q /path/to/model.h5
```
The quantize flag enables 8-bit unsigned integer quantization of the 32-bit float weights, using a simple linear min/max scale calculated separately for each layer's weights matrix. This results in a roughly 4x reduction in model file size; for example, the Inception-V3 model file shrinks from 92 MB to 23 MB. Client-side, Keras.js then restores the uint8-quantized weights back to float32 during model initialization.
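To illustrate the scheme, here is a sketch of linear min/max quantization and the corresponding client-side dequantization in NumPy (the function names are hypothetical; encoder.py's actual implementation may differ in detail):

```py
import numpy as np

def quantize_uint8(w):
    # Linear min/max quantization of a float32 weights matrix to uint8.
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / 255.0 if w_max > w_min else 1.0
    q = np.round((w - w_min) / scale).astype(np.uint8)
    return q, w_min, scale

def dequantize_uint8(q, w_min, scale):
    # Client-side step: restore approximate float32 weights.
    return q.astype(np.float32) * scale + w_min

w = np.random.randn(64, 32).astype(np.float32)
q, w_min, scale = quantize_uint8(w)
w_restored = dequantize_uint8(q, w_min, scale)
print(np.abs(w - w_restored).max())  # error bounded by ~scale / 2
```

Only the uint8 values plus the per-matrix min and scale need to be stored, which is where the roughly 4x size reduction comes from.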
The tradeoff, of course, is slightly reduced performance, which may or may not be perceptible depending on the model type and end application. For a study on the performance effects of quantization, this is an excellent resource.