SC07


SCHEDULE: NOV 10-16, 2007






ParaMEDIC: A Parallel Meta-data Environment for Distributed I/O and Computing

Session: Storage Challenge Finalists

Event Type: Challenge Finalist

Time: 1:30pm - 1:50pm

Session Chair: Raymond L. Paden

Team Member(s): Pavan Balaji, Wu-chun Feng, Jeremy Archuleta, Heshan Lin

Location: A10 / A11

Abstract:
mpiBLAST is an open-source parallelization of the BLAST genome sequence search library. It uses database segmentation so that different worker processors search unique segments of the database and write their output to a shared filesystem. On distributed systems that share a filesystem over a low-bandwidth and/or high-latency network, writing this output is challenging and can become a performance bottleneck.
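The database segmentation idea can be illustrated with a minimal sketch. It is not mpiBLAST code: the helper names `segment_database` and `search_segment` are hypothetical, the workers run one after another in a single process rather than in parallel, and a plain substring test stands in for a real BLAST search.

```python
# Toy illustration of database segmentation (assumed names, not the mpiBLAST API).

def segment_database(sequences, num_workers):
    """Split the database into num_workers disjoint segments (round-robin)."""
    return [sequences[i::num_workers] for i in range(num_workers)]

def search_segment(segment, query):
    """Stand-in for a BLAST search: report (name, position) for each match."""
    return [(name, seq.find(query)) for name, seq in segment if query in seq]

# A tiny made-up database of (name, sequence) pairs.
database = [
    ("seq0", "ACGTACGT"),
    ("seq1", "TTTTGGGG"),
    ("seq2", "GGACGTCC"),
    ("seq3", "CCCCAAAA"),
    ("seq4", "ACGTTTTT"),
]

# Each "worker" searches only its own segment; results are merged afterwards.
segments = segment_database(database, 2)
hits = sorted(h for seg in segments for h in search_segment(seg, "ACGT"))
print(hits)
```

Because the segments are disjoint and together cover the whole database, merging the per-segment hits gives the same result as searching the unsegmented database.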

In this competition, we plan to demonstrate ParaMEDIC, an environment that decouples computation and I/O in applications and drastically reduces I/O overhead through metadata processing. Specifically, for mpiBLAST, ParaMEDIC partitions worker processes into compute workers and I/O workers. Compute workers convert their output to metadata and send it to the I/O workers. The I/O workers process this metadata to re-create the actual output and write it to the filesystem. This allows ParaMEDIC to reduce I/O time, accelerating mpiBLAST several-fold (a 5-fold improvement has been demonstrated on the TeraGrid).
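The compute/I/O split described above can be sketched as follows. This is an illustration under stated assumptions, not ParaMEDIC's implementation: the abstract does not specify the metadata format, so the sketch assumes the metadata is simply the list of matching sequence names, and that the I/O worker has its own copy of the database. All function names are hypothetical, and both roles run in one process for simplicity.

```python
# Toy sketch of decoupling computation from I/O via metadata (assumed design).

# Assumption: both compute and I/O workers can read the same small database.
DATABASE = {
    "seq0": "ACGTACGTAA",
    "seq1": "TTTTGGGGCC",
    "seq2": "GGACGTCCAA",
}

def format_hit(name, query):
    """Build the full, verbose output record for one matching sequence."""
    seq = DATABASE[name]
    pos = seq.find(query)
    marker = " " * pos + "^" * len(query)
    return f">{name} match at {pos}\n{seq}\n{marker}\n"

def compute_worker(query):
    """Search the database but return only compact metadata (matching names)."""
    return [name for name, seq in DATABASE.items() if query in seq]

def io_worker(metadata, query):
    """Re-create the full output from the metadata, ready to be written out."""
    return "".join(format_hit(name, query) for name in metadata)

query = "ACGT"
metadata = compute_worker(query)      # small: this is what crosses the network
output = io_worker(metadata, query)   # large: regenerated next to the filesystem

# Reference result: the output a single worker would have produced directly.
direct = "".join(format_hit(n, query) for n, s in DATABASE.items() if query in s)
print(len(",".join(metadata)), "bytes of metadata vs", len(output), "bytes of output")
```

In this sketch the I/O worker repeats a small part of the computation (locating the match within each named sequence) in exchange for sending far less data over the slow link, which is the trade-off the abstract describes.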




Chair/Team Member Details:

Raymond L. Paden (Chair)
IBM

Pavan Balaji
Argonne National Laboratory

Wu-chun Feng
Virginia Tech

Jeremy Archuleta
Virginia Tech

Heshan Lin
North Carolina State University



