High-dimensional statistics, clustering, algorithms for finding underlying patterns in high-dimensional data, machine learning
Professor Sanjoy Dasgupta develops algorithms for the statistical analysis of high-dimensional data. Such data is now widespread, in domains ranging from environmental modeling to genomics to web search. The geometry of high-dimensional spaces presents unusual challenges; many traditional statistical procedures were developed with one- or two-dimensional data in mind and do not scale well to this modern context. Some of them are very inefficient; others give poor results because of counter-intuitive effects in high dimensions. Dasgupta has developed the first provably correct, efficient algorithms for a variety of canonical statistical tasks, especially those related to clustering (grouping) data. He is one of the few machine learning researchers whose work combines algorithmic theory with geometry and mathematical statistics. He adds a strong theoretical focus to UCSD's CSE artificial intelligence and bioinformatics groups.
Prior to joining the UCSD Jacobs School in 2002, Sanjoy Dasgupta was a senior member of the technical staff at AT&T Labs-Research, where his work focused on algorithms for data mining, with applications to speech recognition and to the analysis of business data. Professor Dasgupta received a Ph.D. in Computer Science from UC Berkeley in 2000 and a B.A. in Computer Science from Harvard in 1993. He is a member of the editorial boards of the Journal of Machine Learning Research, the Journal of Artificial Intelligence Research, and the Machine Learning Journal.