RegStatGel: proteomic software for identifying differentially expressed proteins based on 2D gel images.

UNLABELLED
Image analysis of two-dimensional gel electrophoresis is a key step in proteomic workflow for identifying proteins that change under different experimental conditions. Since there are usually large amount of proteins and variations shown in the gel images, the use of software for analysis of 2D gel images is inevitable. We developed open-source software with graphical user interface for differential analysis of 2D gel images. The user-friendly software, RegStatGel, contains fully automated as well as interactive procedures. It was developed and has been tested under Matlab 7.01.


AVAILABILITY
The database is available for free at http://www.mediafire.com/FengLi/2DGelsoftware.


Background:
Two-dimensional gel electrophoresis (2D gel) remains an important tool in proteomics for protein separation and quantification. The discovery of interesting proteins relies on the analysis of 2D gel images [1]. However, there are large amount of variations contained in the gel images, which should be appropriately accounted for using statistical methods. Unlike analysis of microarray images, there are limited published research and freely available software packages on statistical differential analysis of 2D gel images. The main challenges are the discrimination between actual protein spots and noise, the quantification of protein spots thereafter, and spot matching for individual comparison [2,3]. Although there are commercial software packages for 2D gel image analysis (e.g. PDQuest, Dymension), considerable human intervention is needed for spot identification and matching. Moreover, the comparison of the quantitative features is either based on simple t-test or relies on external statistical tools for analysis. We developed open-source software with graphical user interface, RegStatGel, which is fast, fully automated and robust, with an emphasis on identifying differentially expressed proteins instead of striving for accurate quantification. Moreover, the RegStatGel software incorporates more advanced statistical tools for identifying differentially expressed proteins. It implements a novel analysis procedure as described elsewhere [4].

Methodology:
The software, RegStatGel, bypasses the spot-matching bottleneck by detecting spots using the mean gel image and comparing protein features within watershed boundaries. RegStatGel uses the watershed algorithm to automatically define the region surrounding a spot. The analysis procedure is described in detail [4]. It contains fully automated as well as interactive analysis procedures. Snapshots of the main interface of RegStatGel and of the sub-items of selected menus are shown in Figure 1. RegStatGel is implemented in Matlab 7.01. There are three modes of operation within RegStatGel: Fully automated mode: the user loads the images and then clicks on the menu to choose the fully automated analysis under the 'Gel Analysis' menu. Interactive automated mode: the user will be automatically prompted to choose the parameters for some procedures or skipping some procedures for exploratory purpose. The interactive mode can be accessed via the submenu under 'Gel Analysis'. Exploratory mode: the user can explore the protein regions sequentially by clicking the 'next' or 'previous' buttons. The user can also use the slider to choose the size of the image section that needs closer inspection. The fully automated analysis procedure implemented in RegStatGel contains the following key steps: (1) smoothing and rescaling of gel images; (2) construction of the master watershed map using the mean image; (3) segmentation of watershed regions; (4) quantification of watershed regions; (5) separation of protein regions into independent sets of correlated proteins; (6) statistical analysis using one way or two-way ANOVA or MANOVA; (7) selection of interesting proteins regions while controlling the false discovery rate at selected level.

Software Input:
RegStatGel can load gel images with various formats. At first run, the user will be asked some information about the experimental design such as how many experimental groups and how many images within each group.

Software Output:
During the automated mode, the software will display the status of each key step. At the end of the automated analysis, RegStatGel shows an image with differentially expressed proteins highlighted. The software provides menus to save results from different steps, such as the quantification of the proteins and the list of proteins detected. Under the exploration mode, the user can choose to investigate the detected spots individually by looking at their 3D shape.

Caveat & Future Development:
It should be noted that RegStatGel focuses on post-alignment analysis so there is no image alignment function. The software is not sensitive to misalignment as long as it is slight. The next version will have additional options for statistical analysis.