About Me

Disa Mhembere, PhD

I received my PhD in Computer Science from the Johns Hopkins University. I was part of the Institute for Data Intensive Engineering and Science. My dissertation focused on reimagining infrastructure for large-scale semi-external memory graph analysis and unsupervised clustering. My primary advisor was Randal Burns, PhD. I obtained my Master's in Computer Science and Master's in Engineering Management, both from Hopkins. I also hold a Bachelor's in Electrical and Computer Engineering from Morgan State University.

Awards

In 2017, I received the UPE Academic Achievement award and the Best Presentation award at the High-Performance Parallel and Distributed Computing (HPDC) conference.
In 2014, I was awarded the Hopkins Computer Science Graduate (Paul V. Renoff) Fellowship and the UPE Special Recognition award.

Interests

I love working on highly scalable tools, frameworks, systems, and libraries for graph analytics and artificial intelligence applications.

I enjoy building scalable backend infrastructure, including RESTful web services and MCPs for SaaS delivery.

I also enjoy studying the effects of computing advancements on businesses and their co-evolution as AI and high-scale computing become ubiquitous in industry and academia alike.

Selected Experience

Parallel, semi-external and distributed memory community detection for clusterNOR.
Semi-external memory vertex-centric library development for FlashGraph.
Backend infrastructure and RESTful services for Neurodata's MRI portal.

Lecturer, MSEM: Created and graded assignments and exams in Excel and Python for bootcamps and seminars.
Teaching Assistant: Created and graded assignments and held office hours for Parallel Programming.

Selected Skills

Parallel & Distributed Computing

C++

Python

Backend Dev.

Management

My Projects

NeuroData Graph Database

Development and maintenance of back-end and front-end web services built on Django. The project sits at the intersection of big data and computational neuroscience, delivering connectome-building and exploratory graph analysis tools.

Graphyti: Semi-external memory graph applications on FlashGraph

Development of a vertex-centric, semi-external memory, parallel graph analytics library. Graphyti is developed on FlashGraph and is a high-level Python wrapper over C++. Graphyti inherits its performance characteristics from FlashGraph and can outperform distributed frameworks on billion-node graphs on a single commodity server!

clusterNOR: Parallel, Semi-external and Distributed Memory clustering framework

Development of a parallelized, NUMA-aware clustering framework. We utilize fine-grained I/O manipulation, informed thread binding, NUMA-aware task scheduling, and other techniques that reduce memory access latency. We achieve a 10–100x speedup for k-means when compared with commercial products such as Spark's MLlib, Turi (formerly Dato, GraphLab), and H2O, and a several-fold speedup for other algorithms.

monya: A fast, scalable framework for building and querying decision forests

Monya provides a unified programming interface for creating and querying decision forests. Monya parallelizes tree construction and querying, and includes scheduling, caching, and memory (de)allocation optimizations.

Teaching

At Hopkins, I previously taught parallel and distributed computing, covering OpenMP, MPI, Spark, GPU programming via CUDA, and distributed training techniques essential to modern AI and large-scale machine learning workloads. I also taught Python and Excel for data science and engineering bootcamps and seminars.

Tennis

I received a full scholarship to compete at the NCAA D1 level for Morgan State University as an undergraduate.

I am certified as a USTA P1 (highest level) high-performance tennis coach. I have trained players who have gone on to compete in NCAA D1 tennis at UPenn, Ohio State University, and Loyola University.

Publications

Graphyti: Engineering An Efficient Semi-External Memory Graph Library in FlashGraph. Disa Mhembere, Da Zheng, Carey E. Priebe, Joshua T. Vogelstein, and Randal Burns. In submission: IEEE Transactions on Knowledge and Data Engineering, 2019.
clusterNOR: A NUMA-Optimized Clustering Framework. Disa Mhembere, Da Zheng, Carey E. Priebe, Joshua T. Vogelstein, and Randal Burns. Under review: IEEE Transactions on Parallel and Distributed Systems, 2019.
Forest Packing: Fast, Parallel Decision Forests. James Browne, Tyler Tomita, Disa Mhembere, Randal Burns and Joshua T. Vogelstein. SIAM International Conference on Data Mining (SDM19).
FlashR: parallelize and scale R for machine learning using SSDs. Da Zheng, Disa Mhembere, Joshua T. Vogelstein, Carey E. Priebe, and Randal Burns. Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming 2018 (Best Paper Nomination).
knor: A NUMA-optimized In-memory, Distributed and Semi-external-memory k-means Library. Disa Mhembere, Da Zheng, Carey E. Priebe, Joshua T. Vogelstein, and Randal Burns. High-Performance Parallel and Distributed Computing (HPDC) Proceedings 2017 (Best Presentation Award).
FlashMatrix: Parallel, Scalable Data Analysis with Generalized Matrix Operations using Commodity SSDs. Da Zheng, Disa Mhembere, Joshua T. Vogelstein, Carey E. Priebe, and Randal Burns. Under Review: 2017.
Semi-External Memory Sparse Matrix Multiplication on Billion-node Graphs in a Multicore Architecture. Da Zheng, Disa Mhembere, Vince Lyzinski, Joshua T. Vogelstein, Carey E. Priebe, and Randal Burns. IEEE Transactions on Parallel and Distributed Systems 2017.
FlashGraph: Processing Billion-Node Graphs on an Array of Commodity SSDs. Da Zheng, Disa Mhembere, Joshua T. Vogelstein, Carey E. Priebe, Alexander Szalay, Randal Burns. FAST 2015.
Computing Scalable Multivariate Glocal Invariants of Large (Brain-) Graphs. Disa Mhembere, William Gray Roncal, Daniel Sussman, Carey E. Priebe, Rex Jung, Sephira Ryman, R. Jacob Vogelstein, Joshua T. Vogelstein, Randal Burns. IEEE Global Conference on Signal and Information Processing, 2013.
MIGRAINE: MRI Graph Reliability Analysis and Inference for Connectomics. William Gray Roncal, Zachary H. Koterba, Disa Mhembere, Joshua T. Vogelstein, Randal Burns, R. Jacob Vogelstein. IEEE Global Conference on Signal and Information Processing, 2013.

Conference Abstracts

Spectral Clustering For Billion-Node Graphs. Da Zheng, Disa Mhembere, Youngser Park, Joshua T. Vogelstein, Carey E. Priebe, Randal Burns. SIAM Workshop on Network Science 2016.
MR Graph with Rich attribUTEs DataBase (Mr. GruteDB). Gregory Kiar, William Gray Roncal, Disa Mhembere, Eric Bridgeford, Shangsi Wang, Carey E. Priebe, Randal Burns, Joshua T. Vogelstein. Proceedings of the Organization for Human Brain Mapping conference (OHBM) 2016.
Multivariate Invariants from Massive Brain-Graphs. Disa Mhembere, Sephira Ryman, Daniel Sussman, Rex Jung, Joshua T. Vogelstein, R. Jacob Vogelstein, Carey E. Priebe, Randal Burns. Proceedings of the Organization for Human Brain Mapping conference (OHBM) 2013.
Massive Diffusion MRI Graph Structure Preserves Spatial Information. Daniel L Sussman, Disa Mhembere, Sephira Ryman, Rex Jung, R Jacob Vogelstein, Randal Burns, Joshua T. Vogelstein, and Carey E. Priebe. Proceedings of the Organization for Human Brain Mapping conference (OHBM) 2013.
Robust Vertex Clustering of Massive Brain-Graphs via Lq-Likelihood. Yichen Qin, Disa Mhembere, Sephira Ryman, Rex Jung, R. Jacob Vogelstein, Randal Burns, Joshua T. Vogelstein, Carey E. Priebe. Proceedings of the Organization for Human Brain Mapping conference (OHBM) 2013.
Feature Clustering from a Brain Graph for Voxel-to-Region Classification. N. Sismanis, D. L. Sussman, J. T. Vogelstein, W. Gray, R. J. Vogelstein, E. Perlman, Disa Mhembere, S. Ryman, R. Jung, R. Burns, C. E. Priebe, N. Pitsianis and X. Sun. Proceedings of the 2013 Greek Society of Biomedical Technology (ELEVIT) conference.
Automatic cardiac & respiratory cycle detection of self-gated cardiac cine MRI navigator projections. Disa Mhembere, Liheng Guo, J. Andrew Derbyshire, Elliot R. McVeigh, Daniel A. Herzka. Proceedings of 2010 Meeting of the Biomedical Engineering Society conference (BMES).