Format

Keras.js uses a custom binary format based on protocol buffers, which is a serialization of the HDF5-format Keras model and weights file. The python/encoder.py script performs this conversion.
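
For reference, the architecture and weights the encoder reads are stored inside the HDF5 file roughly as follows. This is a minimal inspection sketch using h5py; the exact layout can vary across Keras versions:

import json
import h5py

# Peek inside a saved Keras HDF5 model file (layout per standard
# Keras saving; details may vary by Keras version).
with h5py.File('example.h5', 'r') as f:
    config = f.attrs['model_config']          # architecture as a JSON string
    if isinstance(config, bytes):
        config = config.decode('utf-8')
    print(json.loads(config)['class_name'])   # e.g. 'Sequential' or 'Model'
    print(list(f['model_weights'].keys()))    # per-layer weight groups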

The HDF5-format Keras model file must include both the model architecture and the weights. This is the default behavior when saving a Keras model:

model.save('example.h5')

Note that when using the ModelCheckpoint callback, save_weights_only must not be set to True (default is False).
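
For example, a minimal sketch (the filepath, training data, and epoch count here are placeholders):

from keras.callbacks import ModelCheckpoint

# Saves the full model (architecture + weights) each epoch;
# save_weights_only is left at its default of False.
checkpoint = ModelCheckpoint('example.h5', save_weights_only=False)
model.fit(x_train, y_train, epochs=10, callbacks=[checkpoint])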

This works for both the Keras Model and Sequential classes, as in the sketch after the snippet below:

# Sequential API
model = Sequential()
model.add(...)

# functional API
model = Model(inputs=..., outputs=...)
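
A minimal concrete version of either (the layer sizes here are illustrative):

from keras.models import Sequential, Model
from keras.layers import Input, Dense

# Sequential API
model = Sequential()
model.add(Dense(32, activation='relu', input_shape=(10,)))
model.add(Dense(1))

# functional API
inputs = Input(shape=(10,))
x = Dense(32, activation='relu')(inputs)
model = Model(inputs=inputs, outputs=Dense(1)(x))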

Requirements

  • NumPy
  • h5py
  • protobuf 3.4+
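
These can be installed with pip, for example:

pip install numpy h5py "protobuf>=3.4"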

Running the script

From the python/ directory:

./encoder.py --help
usage: encoder.py [-h] [-n NAME] [-q] hdf5_model_filepath

positional arguments:
  hdf5_model_filepath

optional arguments:
  -h, --help            show this help message and exit
  -n NAME, --name NAME  model name (defaults to filename without extension if
                        not provided)
  -q, --quantize        quantize weights to 8-bit unsigned int

Example:

./encoder.py -q /path/to/model.h5

Quantization

The --quantize flag enables 8-bit unsigned integer quantization of the 32-bit float weights, using a simple linear min/max scale computed separately for each layer's weight matrix. This results in a roughly 4x reduction in model file size. For example, the Inception-V3 model file is reduced from 92 MB to 23 MB. Client-side, Keras.js then dequantizes the uint8 weights back to float32 during model initialization.
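
As an illustration, here is a minimal sketch of this linear min/max scheme (the function names are hypothetical; the actual encoder implementation may differ in detail):

import numpy as np

def quantize(w):
    # Linear min/max scaling of a float32 weight matrix to uint8.
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / 255.0 or 1.0  # guard against constant weights
    q = np.round((w - w_min) / scale).astype(np.uint8)
    return q, w_min, scale

def dequantize(q, w_min, scale):
    # Mirrors the float32 restoration Keras.js performs client-side.
    return q.astype(np.float32) * scale + w_min

w = np.random.randn(64, 32).astype(np.float32)
q, w_min, scale = quantize(w)
print(np.abs(w - dequantize(q, w_min, scale)).max())  # small quantization error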

The tradeoff, of course, is slightly reduced precision, which may or may not be perceptible depending on the model type and end application. For a study on the performance effects of quantization, this is an excellent resource.