
PMLS-Caffe

A distributed, multi-GPU deep learning framework.

Platform

Strads

A dynamic scheduler for model-parallel ML.

Bösen

A communication-efficient parameter server.

Topic Model (Latent Dirichlet Allocation)
  • Compared with Yahoo!LDA

  • >7x speedup

  • Settings: 4.5GB dataset (8.2m docs, 737m tokens, 141k vocab, 1000 topics), 50 machines (800 cores), PMLS v1.0 vs Yahoo!LDA, program completion = reached a log-likelihood of -5.8e9

Convolutional Neural Net (CNN)
  • Compared with Caffe (single GPU)

  • AlexNet and GoogLeNet

  • >5x speedup with 8 machines (8 GPUs)

  • Settings: 250GB data, 1.2m images, 1000 classes (ILSVRC 2012 dataset), 8 machines (1x K20 GPU each)

Sparse Logistic Regression
  • Settings: 29GB dataset (10m features, 50k samples), 8 machines (512 cores), PMLS v0.93 vs Shotgun Lasso, program completion = reached a loss function value of 0.5

Performance


Parallel ML System

An open-source, distributed machine learning
platform for productivity, scale, and efficiency.