About Me

Disa Mhembere, PhD

I received my PhD in Computer Science from the Johns Hopkins University in 2019. I was part of the Institute for Data Intensive Engineering and Science. My dissertation focussed on reimagining infrastructure for large-scale semi-external memory graph analysis and unsupervised clustering. My primary advisor was Randal Burns, PhD. I obtained my Master's in Computer Science in 2015 and my Master's in Engineering Management in 2013, both from Hopkins. I earned my Bachelor's in Electrical and Computer Engineering from Morgan State University in 2010.

Awards

In 2017 I received the UPE Academic Achievement award and the Best Presentation award at the High-Performance Parallel and Distributed Computing (HPDC) conference.
In 2014 I was awarded the Hopkins Computer Science Graduate (Paul V. Renoff) Fellowship and the UPE Special Recognition award.

Interests

I love to work on highly scalable tools, frameworks, and libraries (generally in C++ and Python) for graph analytics and artificial intelligence applications.

I enjoy backend web-service infrastructure development and strongly believe in SaaS via RESTful web services as a scalable service delivery strategy.

I also enjoy studying the effect of advancements in computing on business, and their co-evolution as AI and high-scale computing become ubiquitous in industry and academia alike.

Selected Skills

Parallel & Distributed Computing
C++
Python
Backend RESTful Dev.
Management

My Projects

NeuroData Graph Database

Development and maintenance of back-end and front-end web services that leverage Django. The project sits at the intersection of big data and computational neuroscience, delivering connectome building and exploratory graph analysis tools.

Graphyti: Semi-external memory graph applications on FlashGraph

Development of a vertex-centric, semi-external memory, parallel graph analytics library. Graphyti is built on FlashGraph and is a high-level Python wrapper over C++. Graphyti inherits its performance characteristics from FlashGraph and can outperform distributed frameworks on billion-node graphs on a single commodity server!

clusterNOR: Parallel, Semi-external and Distributed Memory clustering framework

Development of a parallelized, NUMA-architecture-aware clustering framework. We utilize fine-grained I/O manipulation, informed thread-binding, NUMA-aware task scheduling and other memory access latency reduction techniques. We achieve a 10-100x speedup for k-means when compared with commercial products such as Spark's MLlib, Turi (formerly Dato, GraphLab) and H2O, and speedups of several factors for other algorithms.

monya: A fast, scalable decision forests building and querying framework

Monya provides a unified programming interface for creating and querying forests of trees. Monya parallelizes tree construction and querying, and contains scheduling, caching and memory (de)allocation optimizations.


Teaching

I teach a Seminar in Emerging Technologies for the MSEM program. We discuss, evaluate and develop concrete business plans for emerging and disruptive technologies. We also invite several CEOs and owners of businesses to speak.


I teach an Excel and Python bootcamp for the MSEM program. Students learn (i) essential (advanced) skills in MS Excel and (ii) fundamental programming skills in Python that allow them to reason about how software is developed.


I have been a teaching assistant for Parallel Programming several times. We tackle topics and projects using OpenMP, Java Threads, Hadoop/MapReduce, Spark, Message Passing Interface (MPI) and GPU programming via CUDA.

Tennis

I received a full scholarship to compete at the NCAA D1 level for Morgan State University as an undergraduate.


I am certified as a USTA P1 (highest level) high performance tennis coach. I have trained players who have gone on to compete in NCAA D1 tennis at UPenn, Ohio State University, and Loyola University.

Publications

  1. Graphyti: Engineering An Efficient Semi-External Memory Graph Library in FlashGraph. Disa Mhembere, Da Zheng, Carey E. Priebe, Joshua T. Vogelstein and Randal Burns. In submission: IEEE Transactions on Knowledge and Data Engineering 2019.
  2. clusterNOR: A NUMA-Optimized Clustering Framework. Disa Mhembere, Da Zheng, Carey E. Priebe, Joshua T. Vogelstein and Randal Burns. Under Review: IEEE Transactions on Parallel and Distributed Systems 2019.
  3. Forest Packing: Fast, Parallel Decision Forests. James Browne, Tyler Tomita, Disa Mhembere, Randal Burns and Joshua T. Vogelstein. SIAM International Conference on Data Mining (SDM19).
  4. FlashR: parallelize and scale R for machine learning using SSDs. Da Zheng, Disa Mhembere, Joshua T. Vogelstein, Carey E. Priebe, and Randal Burns. Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming 2018 (Best Paper Nomination).
  5. knor: A NUMA-optimized In-memory, Distributed and Semi-external-memory k-means Library. Disa Mhembere, Da Zheng, Carey E. Priebe, Joshua T. Vogelstein, and Randal Burns. High-Performance Parallel and Distributed Computing (HPDC) Proceedings 2017 (Best Presentation Award).
  6. FlashMatrix: Parallel, Scalable Data Analysis with Generalized Matrix Operations using Commodity SSDs. Da Zheng, Disa Mhembere, Joshua T. Vogelstein, Carey E. Priebe, and Randal Burns. Under Review: 2017.
  7. Semi-External Memory Sparse Matrix Multiplication on Billion-node Graphs in a Multicore Architecture. Da Zheng, Disa Mhembere, Vince Lyzinski, Joshua T. Vogelstein, Carey E. Priebe, and Randal Burns. IEEE Transactions on Parallel and Distributed Systems 2017.
  8. FlashGraph: Processing Billion-Node Graphs on an Array of Commodity SSDs. Da Zheng, Disa Mhembere, Joshua T. Vogelstein, Carey E. Priebe, Alexander Szalay, Randal Burns. FAST 2015.
  9. Computing Scalable Multivariate Glocal Invariants of Large (Brain-) Graphs. Disa Mhembere, William Gray Roncal, Daniel Sussman, Carey E. Priebe, Rex Jung, Sephira Ryman, R. Jacob Vogelstein, Joshua T. Vogelstein, Randal Burns. Global Conference on Signal and Information Processing IEEE, 2013.
  10. MIGRAINE: MRI Graph Reliability Analysis and Inference for Connectomics. William Gray Roncal, Zachary H Koterba, Disa Mhembere, Joshua T. Vogelstein, Randal Burns, R. Jacob Vogelstein. Global Conference on Signal and Information Processing IEEE, 2013.

Conference Abstracts

  1. Spectral Clustering For Billion-Node Graphs. Da Zheng, Disa Mhembere, Youngser Park, Joshua T. Vogelstein, Carey E. Priebe, Randal Burns. SIAM Workshop on Network Science 2016.
  2. MR Graph with Rich attribUTEs DataBase (Mr. GruteDB). Gregory Kiar, William Gray Roncal, Disa Mhembere, Eric Bridgeford, Shangsi Wang, Carey E. Priebe, Randal Burns, Joshua T. Vogelstein. Proceedings of the Organization for Human Brain Mapping conference (OHBM) 2016.
  3. Multivariate Invariants from Massive Brain-Graphs. Disa Mhembere, Sephira Ryman, Daniel Sussman, Rex Jung, Joshua T. Vogelstein, R. Jacob Vogelstein, Carey E. Priebe, Randal Burns. Proceedings of the Organization for Human Brain Mapping conference (OHBM) 2013.
  4. Massive Diffusion MRI Graph Structure Preserves Spatial Information. Daniel L. Sussman, Disa Mhembere, Sephira Ryman, Rex Jung, R. Jacob Vogelstein, Randal Burns, Joshua T. Vogelstein, and Carey E. Priebe. Proceedings of the Organization for Human Brain Mapping conference (OHBM) 2013.
  5. Robust Vertex Clustering of Massive Brain-Graphs via Lq-Likelihood. Yichen Qin, Disa Mhembere, Sephira Ryman, Rex Jung, R. Jacob Vogelstein, Randal Burns, Joshua T. Vogelstein, Carey E. Priebe. Proceedings of the Organization for Human Brain Mapping conference (OHBM) 2013.
  6. Feature Clustering from a Brain Graph for Voxel-to-Region Classification. N. Sismanis, D. L. Sussman, J. T. Vogelstein, W. Gray, R. J. Vogelstein, E. Perlman, Disa Mhembere, S. Ryman, R. Jung, R. Burns, C. E. Priebe, N. Pitsianis and X. Sun. Proceedings of the 2013 Greek Society of Biomedical Technology (ELEVIT) conference.
  7. Automatic cardiac & respiratory cycle detection of self-gated cardiac cine MRI navigator projections. Disa Mhembere, Liheng Guo, J. Andrew Derbyshire, Elliot R. McVeigh, Daniel A. Herzka. Proceedings of 2010 Meeting of the Biomedical Engineering Society conference (BMES).