Maximizing science output from large datasets

The past decade has seen a boom in the generation of large survey datasets of many kinds, but these datasets vary widely in how accessible and easy to use they are for the community at large. What services, data access, and data products are important to maximizing the science output from current and future large datasets? What kinds of training are needed to take advantage of large datasets? What are the challenges of supporting the development of the underpinning software infrastructure?

Proposed by Knut Olsen (NOAO)