Hops and TensorFlow enabling rapid machine learning

11 May, 2017 - 17:20

Distributed TensorFlow-as-a-Service now available on SICS ICE

Starting this June, we are offering TensorFlow as a managed service on the Hopsworks platform, hosted at the SICS ICE research datacenter facility in Luleå.

Hopsworks is the Hadoop application platform that enables in-place data sharing while keeping projects on the same cluster isolated from one another. It allows deep neural network models to be trained much faster by training in parallel across many GPUs, using high-bandwidth, low-latency InfiniBand networking to improve the scalability of distributed training. SICS ICE provides the latest Nvidia 1080 (Ti) GPUs connected by InfiniBand, enabling data scientists to build models of any size on our managed, scalable infrastructure.
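
As a rough sketch of what such parallel training looks like in code, the TensorFlow 1.x program below uses between-graph replication with a parameter server and GPU workers. The hostnames, ports, model, and task flags are purely illustrative assumptions, not part of the Hopsworks API; on Hopsworks the cluster specification would be derived from the resources allocated to the job.

```python
import tensorflow as tf

# Hypothetical cluster layout: one parameter server and two GPU workers.
# On Hopsworks the host/port assignments would come from the resource
# manager; here they are hard-coded purely for illustration.
cluster = tf.train.ClusterSpec({
    "ps":     ["ps0.example.com:2222"],
    "worker": ["worker0.example.com:2222", "worker1.example.com:2222"],
})

# Each process would normally be started with its own job_name/task_index,
# e.g.  python trainer.py --job_name=worker --task_index=0
job_name, task_index = "worker", 0
server = tf.train.Server(cluster, job_name=job_name, task_index=task_index)

if job_name == "ps":
    server.join()  # parameter servers just serve variables
else:
    # Place variables on the parameter server and ops on this worker's GPU.
    with tf.device(tf.train.replica_device_setter(
            worker_device="/job:worker/task:%d/gpu:0" % task_index,
            cluster=cluster)):
        x = tf.placeholder(tf.float32, shape=[None, 784])
        y = tf.placeholder(tf.float32, shape=[None, 10])
        w = tf.Variable(tf.zeros([784, 10]))
        b = tf.Variable(tf.zeros([10]))
        logits = tf.matmul(x, w) + b
        loss = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))
        train_op = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

    # MonitoredTrainingSession coordinates the workers (chief election,
    # session creation, recovery) across the cluster.
    with tf.train.MonitoredTrainingSession(master=server.target,
                                           is_chief=(task_index == 0)) as sess:
        pass  # feed real batches of (x, y) here and run train_op
```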

The Hopsworks platform at SICS ICE (www.hops.site) also hosts a large number of public datasets and programs that can be used to experiment with training large models. Locally available datasets, ranging from hundreds of GB to several TB in size, include ImageNet, Reddit stories/comments, 9M Images, YouTube-8M, OpenStreetMap, MS COCO, self-driving vehicle datasets, and numerous Kaggle datasets.

Hopsworks supports the execution of managed TensorFlow programs written in Python (or PySpark) on any number of available GPUs. TensorFlow programs can be run as Jupyter or Apache Zeppelin notebooks, or directly as Python programs. Distributed TensorFlow is also supported, both as native TensorFlow-on-YARN and as TensorFlowOnSpark.
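
As an illustration of what a single-GPU TensorFlow program run this way might look like (not Hopsworks-specific code; the device string and matrix sizes are arbitrary assumptions):

```python
import tensorflow as tf

# A minimal TensorFlow 1.x program that pins a computation to a GPU.
# The device string '/gpu:0' is illustrative; on Hopsworks the GPUs
# allocated to the job determine which devices are actually visible.
with tf.device('/gpu:0'):
    a = tf.random_normal([4096, 4096])
    b = tf.random_normal([4096, 4096])
    c = tf.matmul(a, b)

# log_device_placement prints which device each op was placed on,
# confirming that the matmul runs on the GPU; allow_soft_placement
# falls back to CPU if no GPU is available.
config = tf.ConfigProto(log_device_placement=True, allow_soft_placement=True)
with tf.Session(config=config) as sess:
    print(sess.run(tf.reduce_sum(c)))
```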

Please contact us for further information!
Jim Dowling, jim.dowling [at] ri.se
Tor Björn Minde, tor.bjorn.minde [at] ri.se