SDSS SkyServer DR14

Special Issue of Computing in Science and Engineering Dedicated to the SDSS Science Archive

The January/February 2008 issue of the journal Computing in Science and Engineering (CiSE), a joint publication of the American Institute of Physics and the IEEE Computer Society, was dedicated to the SDSS Science Archive. The issue featured several in-depth, peer-reviewed articles on various components of the SDSS-II Science Archive. For SDSS-III, the Data Archive Server (DAS) has been replaced with the Science Archive Server (SAS), whereas the Catalog Archive Server (CAS) continues (with significant enhancements and schema changes) to provide access to the catalog data via the SkyServer Web interface and the CasJobs batch query service. The November/December 2008 issue of CiSE also had a follow-up article on lessons learned from the SDSS-II CAS deployment. These articles are described below with links to the PDF for each article.

The Sloan Digital Sky Survey - Drinking From the Fire Hose
Ani Thakar
The Sloan Digital Sky Survey Science Archive represents a thousand-fold increase in the total amount of data that astronomers have collected to date. The pioneering instrumentation technology that made this possible is matched by groundbreaking tools that let anyone in the world access terabytes of SDSS data online.

The Sloan Digital Sky Survey Data Archive Server
Eric H. Neilsen Jr.
The Sloan Digital Sky Survey's Data Archive Server (DAS) provides public access to data files produced by the SDSS data reduction pipeline. This article discusses challenges in public distribution of data of this complexity and how the project addressed them.

The Catalog Archive Server Database Management System
Ani Thakar, Alex Szalay, George Fekete, and Jim Gray
The multiterabyte Sloan Digital Sky Survey's (SDSS's) catalog data is stored in a commercial relational database management system with SQL query access and a built-in query optimizer. The SDSS Catalog Archive Server adds advanced data mining features to the DBMS to provide fast online access to the data.

The sqlLoader Data-Loading Pipeline
Alex Szalay, Ani Thakar, and Jim Gray
Using a database management system (DBMS) is essential to ensure the data integrity and reliability of large, multidimensional data sets. However, loading multiterabyte data into a DBMS is a time-consuming and error-prone task that the authors have tried to automate by developing the sqlLoader pipeline, a distributed workflow system for data loading.

CasJobs and MyDB: A Batch Query Workbench
Nolan Li and Ani Thakar
Catalog Archive Server Jobs (CasJobs) is an asynchronous query workbench service that lets users run unrestricted SQL queries against scientific catalog archives. After running queries in batch mode, users can save their results to a personal database called MyDB before downloading them, letting users manage their query workloads, results, and histories without causing network overloads.

Lessons Learned from the SDSS Catalog Archive
Ani Thakar
The SDSS is one of the first very large archives in astronomy and other sciences, as we enter the era of data-intensive science. Here the authors summarize some of the important and generally applicable insights they have gained (often the hard way!) over the past decade of SDSS development.
