Speaker: Sung-Hou Kim (University of California, Berkeley)
Program Description
Two of the most reliable and fundamental types of molecular data about living organisms are genome sequences and the 3D structures of proteins and nucleic acids. With dramatic advances in genome sequencing and synchrotron diffraction technologies, unprecedented amounts of these data are available and accumulating rapidly. The genomes of about 10,000 species have been sequenced, and more than 500,000 individual human genomes are expected to be sequenced within several years. About 100,000 protein structures have been determined, mostly with synchrotrons. We may now have enough data of both types to attempt a global view of the “universe” of protein structures and of the genomes of all living organisms, especially since computing has also advanced enough to handle “big data”.