Case Study:
Molecular Analysis of Genomic Data Sets
Requirements
- Ability to associate data from highly annotated biospecimens with the results of sophisticated molecular/genetic tests (e.g., gene expression CEL files)
- Ability to link tissue and other samples with pathology and patient history data
Functional Narrative
1. Researcher uses the repository XB-BIS to create XB groups which correspond to his cohorts based on diagnosis and optionally other demographic and clinical attributes.
2. Researcher creates an analysis set containing the groups representing the cohorts to be analyzed.
3. Researcher launches the analysis and selects the statistical variables of interest into the working analysis matrix.
4. Researcher runs various analytical tools (heat map, scatter plot, diagnostic creation, etc.) on the data in order to identify markers and patterns among the groups.
5. Researcher publishes discovery.
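Steps 1 and 2 above can be pictured as a small data model: a group is a labeled set of patients selected by filter criteria, and an analysis set is a labeled collection of groups. The following is only a minimal sketch under stated assumptions, not XB-BIS code. The names Group, AnalysisSet and create_group, the patient records, the field names, and the color blue are all hypothetical. The 12-month cut-off and the color red for short-term survivors follow the Solution Narrative; treating every survivor beyond 12 months as long-term is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Patient = Dict[str, object]  # one record of clinical/demographic attributes


@dataclass
class Group:
    """A named cohort: a label, a plot color, and the matching patient IDs."""
    label: str
    color: str
    patient_ids: List[str]


@dataclass
class AnalysisSet:
    """A named collection of groups that are analyzed together."""
    label: str
    groups: List[Group] = field(default_factory=list)


def create_group(patients: List[Patient], label: str, color: str,
                 criteria: Callable[[Patient], bool]) -> Group:
    """Build a group from every patient that satisfies the filter criteria."""
    ids = [str(p["id"]) for p in patients if criteria(p)]
    return Group(label, color, ids)


# Made-up records for illustration only.
PATIENTS: List[Patient] = [
    {"id": "P1", "diagnosis": "mesothelioma", "survival_months": 6},
    {"id": "P2", "diagnosis": "mesothelioma", "survival_months": 30},
    {"id": "P3", "diagnosis": "mesothelioma", "survival_months": 11},
    {"id": "P4", "diagnosis": "adenocarcinoma", "survival_months": 8},
    {"id": "P5", "diagnosis": "mesothelioma", "survival_months": 48},
]

# Short-term survivors: maximum survival of 12 months.
short_term = create_group(
    PATIENTS, "Short-term survivors", "red",
    lambda p: p["diagnosis"] == "mesothelioma" and p["survival_months"] <= 12,
)

# Long-term survivors: assumed here to be everyone beyond the 12-month cut-off.
long_term = create_group(
    PATIENTS, "Long-term survivors", "blue",
    lambda p: p["diagnosis"] == "mesothelioma" and p["survival_months"] > 12,
)

study = AnalysisSet("Mesothelioma survival", [short_term, long_term])
```

Keeping the filter criteria as a plain function means that any combination of diagnosis, demographic, and clinical attributes can define a cohort without changing the data model.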
Solution Narrative
A Researcher is evaluating gene expression data sets for a group of tumors in his research project. He evaluates the genomic data together with patient demographic data so that he can filter the list of patients on various attributes (e.g., age, outcome, gender, race) and view patterns in the expression data.
1. Dr. Parker uses XB-BIS to create XB groups that correspond to long-term and short-term survivors of mesothelioma. The figure below presents XB-BIS' filter criteria for short-term survivors.
Figure 1: Short-term survivors (maximum survival 12 months)
2. XB-BIS displays the short-term survivors. Dr. Parker creates a group containing this cohort.

Figure 2: Create group
3. Dr. Parker labels the cohort and assigns red as the color when plotting the group.

Figure 3: Label group
4. Dr. Parker creates an analysis set for a study to analyze the long-term and short-term mesothelioma survivors.

Figure 4: Create analysis set

Figure 5: Label analysis set
6. Dr. Parker launches the analysis and selects the statistical variables (U95A chip data and clinical data) into his working analysis matrix.

Figure 6: Run analysis

Figure 7: Add U95A variables to working matrix

Figure 8: Working matrix with molecular data
7. Dr. Parker runs a scatter plot on the data in order to identify markers and patterns between the long-term and short-term survivors.

Figure 9: Scatter plot of molecular data differences between groups
8. Dr. Parker views the detailed results of the scatter plot and sorts the discriminators by p value.

Figure 10: Textual representation of scatter plot
9. Dr. Parker publishes his findings.
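The ranking in step 8 can be illustrated with a short sketch. The case study does not say which statistic XB-BIS uses to compute the p values, so the code below substitutes an exact two-sided permutation test on the difference in group means; it should be read as one possible approach, not as the product's method. The working matrix layout (variable name mapped to per-sample values), the probe names, the sample IDs, and all values are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def permutation_p_value(a: Sequence[float], b: Sequence[float]) -> float:
    """Exact two-sided permutation p value for the difference in group means."""
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    total = sum(pooled)
    observed = abs(sum(a) / n_a - sum(b) / n_b)
    hits = 0
    count = 0
    # Enumerate every way of relabeling n_a of the pooled samples as group a.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
        count += 1
    return hits / count


def rank_discriminators(matrix: Dict[str, Dict[str, float]],
                        group_a: List[str],
                        group_b: List[str]) -> List[Tuple[str, float]]:
    """Return (variable, p value) pairs sorted by ascending p value."""
    results = []
    for variable, values in matrix.items():
        a = [values[s] for s in group_a]
        b = [values[s] for s in group_b]
        results.append((variable, permutation_p_value(a, b)))
    return sorted(results, key=lambda pair: pair[1])


# Hypothetical sample IDs for the two cohorts.
SHORT = ["S1", "S2", "S3", "S4"]
LONG = ["L1", "L2", "L3", "L4"]

# Hypothetical working matrix: variable -> {sample ID -> expression value}.
MATRIX = {
    "probe_B": dict(zip(SHORT + LONG, [5.0, 6.0, 7.0, 8.0, 5.0, 6.0, 7.0, 8.0])),
    "probe_C": dict(zip(SHORT + LONG, [1.0, 2.0, 3.0, 10.0, 4.0, 11.0, 12.0, 13.0])),
    "probe_A": dict(zip(SHORT + LONG, [1.0, 2.0, 3.0, 4.0, 10.0, 11.0, 12.0, 13.0])),
}

ranked = rank_discriminators(MATRIX, SHORT, LONG)
# probe_A comes first (groups fully separated); probe_B comes last (identical groups).
for variable, p in ranked:
    print(f"{variable}\t{p:.4f}")
```

An exhaustive permutation test is only practical for small cohorts, because the number of relabelings grows combinatorially. For a full U95A chip with thousands of probes and larger groups, a parametric test or a sampled permutation test, together with a correction for multiple comparisons, would be the more realistic choice.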
