The Whole Genome Expression Analysis using Two Microarray Technologies to Identify Gene Networks That Mediate the Myocardial Phenotype of CD36 Deficiency

We have previously shown that CD36 is a membrane protein that facilitates long chain fatty acid (FA) transport by muscle tissues. We also documented the significant impact of muscle CD36 expression on heart function, skeletal muscle insulin sensitivity, and overall metabolism. To identify a comprehensive set of genes that are differentially regulated by CD36 expression in the heart, we used two microarray technologies (Affymetrix and Agilent) to compare gene expression in heart tissues from CD36 knock-out (KO-CD36) versus wild type (WT-CD36) mice. The results obtained with the two technologies were similar, with around 35 genes differentially expressed using both technologies. Absence of CD36 led to down-regulation of three groups of genes involved in pathways of FA metabolism, angiogenesis/apoptosis and structure. These data are consistent with the fact that the CD36 protein binds FA and thrombospondin 1, which are involved in lipid metabolism and anti-angiogenic activities, respectively. In conclusion, our findings validate our data analysis workflow and identify specific pathways possibly underlying the phenotypic abnormalities in CD36 knock-out hearts.


Background:
The CD36 gene encodes a membrane glycoprotein that has been identified in a wide variety of cell types, including platelets, monocytes, erythroblasts, capillary endothelial cells and mammary epithelial cells [1][2][3][4][5][6]. CD36 (also known as platelet glycoprotein IV or IIIb) is also highly expressed in heart tissue. CD36 was shown to work as a receptor/transporter of long chain fatty acids (FA) in muscle tissue and has been proposed as one of the thrombospondin receptors in endothelial cells [5]. CD36-KO mice (which do not express the CD36 gene) exhibit defective FA uptake by the heart, which is paralleled by an increase in the heart/body index and by an enlargement of the left ventricular space [1]. Two sets of studies were done to identify a comprehensive set of genes that are differentially regulated by CD36 expression in the heart. In 2002 and 2007, we used the Affymetrix and the Agilent technologies, respectively, to analyze CD36 involvement in fatty acid uptake and heart hypertrophy [7, [3][4][5]8]. Here we compare the results obtained from the two microarray technologies and investigate the consequences of CD36 absence in CD36-KO hearts. We first describe the methodology used to identify the differentially expressed genes using the Affymetrix and the Agilent technologies. We then use the classification results to determine the gene clusters. Finally, these classifications are used to annotate the functional class of each cluster and to characterize the molecular pathways involved in the myocardial phenotype of KO-CD36 mice.

Data analysis
To search for gene networks that are differentially regulated in the absence of the CD36 gene, we performed a comprehensive gene expression analysis by hybridizing microarray chips with RNA probes prepared from CD36-KO and CD36-WT mouse hearts. Two technologies were used. The first was the Affymetrix GeneChip Murine Genome U74, which contains 36,000 probes (Affymetrix, Santa Clara, CA). Once the probe array had been hybridized, stained, and washed, it was scanned using a GeneArray scanner. The GeneChip Operating System, running on a PC workstation, was used to control the functions of the scanner and to collect fluorescence intensity data. The second was the Whole Mouse Genome Microarray 4x44K (Agilent Technologies, Santa Clara, CA). Arrays were washed and dried using a centrifuge according to the manufacturer's instructions (One-Color Microarray-Based Gene Expression Analysis, Agilent Technologies). Arrays were scanned at 5 µm resolution on an Agilent DNA Microarray Scanner (GenePix 4000B, Agilent Technologies) using the default settings for 4x44k format one-color arrays. Images provided by the scanner were analyzed using Feature Extraction software v10.1.1.1 (Agilent Technologies). Raw data files were analyzed using the R software together with packages from the Bioconductor project [8]. The Affymetrix workflow we developed begins with data normalization using the Robust Multichip Average (RMA) method [9], which reduces block effects at the probe set level. The second step consists of selecting genes differentially expressed between CD36-WT and CD36-KO using the Significance Analysis of Microarrays (SAM) algorithm [10], with the fold change (FC) and P-value cutoffs fixed at 1.5 and 0.002, respectively. The Agilent workflow begins with data normalization using Lowess normalization [11] as applied to a two-color array expression dataset. In the second step, as for Affymetrix, the SAM algorithm was used to identify differentially expressed genes with the same FC and P-value cutoffs.
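As a rough illustration of the selection step described above, the following Python sketch applies the two cutoffs (fold change 1.5, P-value 0.002) to per-gene results. It is not the SAM algorithm itself: the P-values are assumed to have been computed upstream (for example by SAM), the expression values are assumed to be log2-transformed normalized intensities, and the gene names, the `GeneResult` type and the inclusive comparisons are hypothetical choices made only for this example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Sequence, Tuple

FC_CUTOFF = 1.5    # fold change cutoff stated in the text
P_CUTOFF = 0.002   # P-value cutoff stated in the text


@dataclass
class GeneResult:
    """Per-gene input (hypothetical container, not from the original study)."""
    gene: str
    wt: Sequence[float]   # log2 intensities, CD36-WT replicates (assumed scale)
    ko: Sequence[float]   # log2 intensities, CD36-KO replicates (assumed scale)
    p_value: float        # assumed to be supplied by an upstream test such as SAM


def fold_change(wt: Sequence[float], ko: Sequence[float]) -> float:
    """Signed KO-versus-WT fold change; negative means lower in KO."""
    diff = mean(ko) - mean(wt)          # difference of log2 means
    ratio = 2 ** abs(diff)              # back-transform to a linear ratio
    return ratio if diff >= 0 else -ratio


def select_differential(
    results: Sequence[GeneResult],
    fc_cutoff: float = FC_CUTOFF,
    p_cutoff: float = P_CUTOFF,
) -> List[Tuple[str, float]]:
    """Keep genes that pass both cutoffs (inclusive comparison is an assumption)."""
    selected = []
    for r in results:
        fc = fold_change(r.wt, r.ko)
        if abs(fc) >= fc_cutoff and r.p_value <= p_cutoff:
            selected.append((r.gene, fc))
    return selected


# Made-up example data, for illustration only.
example = [
    GeneResult("gene_a", wt=[10.0, 10.0, 10.0], ko=[9.0, 9.0, 9.0], p_value=0.001),
    GeneResult("gene_b", wt=[8.0, 8.0, 8.0], ko=[8.1, 8.1, 8.1], p_value=0.0005),
    GeneResult("gene_c", wt=[7.0, 7.0, 7.0], ko=[9.0, 9.0, 9.0], p_value=0.05),
]
selected_genes = select_differential(example)
```

In this toy input only `gene_a` passes both cutoffs: `gene_b` fails on fold change and `gene_c` fails on P-value.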
Class discovery analyzes a given set of genes to produce subgroups that share common features. An analysis method often used for class discovery is "cluster analysis", or clustering. It aims to divide the data points (genes or samples) into groups (clusters) using measures of similarity [12, 13], creating a hierarchical clustering of co-regulated genes.
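The idea of grouping genes by a similarity measure can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the procedure used in the study: the text does not specify the distance or the linkage, so the correlation distance (1 minus the Pearson correlation) and average linkage are choices made here, the function names are hypothetical, and the loop stops at a requested number of clusters instead of building a full dendrogram. Profiles are assumed to be non-constant, since a constant profile has no defined correlation.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """0 for perfectly co-regulated profiles, 2 for perfectly opposed ones."""
    return 1.0 - pearson(x, y)


def average_linkage(
    profiles: Dict[str, Sequence[float]], n_clusters: int
) -> List[List[str]]:
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Average of all pairwise distances between the two clusters.
            dists = [
                correlation_distance(profiles[a], profiles[b])
                for a in clusters[i]
                for b in clusters[j]
            ]
            d = sum(dists) / len(dists)
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Made-up expression profiles across four samples, for illustration only.
toy_profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [8.0, 6.0, 4.5, 2.0],
}
toy_clusters = average_linkage(toy_profiles, n_clusters=2)
```

With these toy profiles, the two rising genes (`g1`, `g2`) end up in one cluster and the two falling genes (`g3`, `g4`) in the other.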

Identification of differentially expressed genes
Using the Affymetrix technology, we identified 39 genes differentially expressed between CD36-KO and CD36-WT; using the Agilent technology with the same parameters, we identified 35. Comparison of the two lists showed that 30 genes were common to the two technologies. Differentially expressed genes were grouped by hierarchical clustering (Figure 1). This type of classification and class discovery analyzes a given set of gene expression profiles with the goal of discovering subgroups that share common features.
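The comparison of the two gene lists amounts to a set intersection. The short Python sketch below reproduces the counts reported above (39, 35 and 30) with placeholder identifiers; the gene names are invented, and the Jaccard index is an additional summary that was not reported in the study and is included here only as one possible way to express the overlap.

```python
from typing import Dict, Iterable, Union


def overlap_summary(
    list_a: Iterable[str], list_b: Iterable[str]
) -> Dict[str, Union[int, float]]:
    """Count shared and platform-specific genes between two lists."""
    a, b = set(list_a), set(list_b)
    common = a & b
    union = a | b
    return {
        "common": len(common),
        "only_a": len(a - b),
        "only_b": len(b - a),
        # Jaccard index: shared genes divided by all distinct genes.
        "jaccard": len(common) / len(union) if union else 0.0,
    }


# Placeholder identifiers that only reproduce the reported list sizes.
shared = [f"gene_{i}" for i in range(30)]
affymetrix_list = shared + [f"affy_only_{i}" for i in range(9)]      # 39 genes
agilent_list = shared + [f"agilent_only_{i}" for i in range(5)]      # 35 genes

summary = overlap_summary(affymetrix_list, agilent_list)
```

For these sizes the summary contains 30 common genes, 9 Affymetrix-only genes and 5 Agilent-only genes, giving a Jaccard index of 30/44, or about 0.68.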

Functional Analysis
After establishing a list of differentially expressed genes common to the two technologies (Affymetrix and Agilent), functional annotation allowed us to determine the function of each gene in the list. Functional classes were extracted using Gene Ontology tools. Associations and biochemical pathway data were retrieved from the Gene Ontology consortium (GO) [ The results appear to be consistent with the role of the CD36 protein in cardiac muscle cells. The identification of the metabolism genes could be explained by the role of CD36 as a receptor/transporter for long-chain fatty acids in heart cells [3]. Indeed, Randle et al. [20] showed that fatty acids are the main energy source for the heart and established from in vitro experiments that long chain fatty acids are preferentially metabolized. They also demonstrated the existence of competition between glucose and fatty acids as heart fuel [20]. Moreover, Dyck et al. showed that CD36 plays an important role in the choice of substrate in the heart [21]. In CD36-KO hearts, the shift to glucose as a substrate, caused by reduced FA availability, led to the expression and down-regulation of genes involved in both metabolic pathways. Secondly, in heart endothelial cells, CD36 has been described as a membrane receptor for thrombospondin 1 (TSP-1) and plays a role in the inhibition of endothelial cell migration and in the induction of apoptosis. The absence of CD36 led to down-regulation of the expression of its ligand (Tsp1) and to the expression of new signaling genes. Finally, for CD36-KO hearts to undergo hypertrophy, a set of remodeling genes is expressed; these can be grouped into the category of structural genes, as shown in Table 1.

Discussion:
In this study, we compared results obtained from two technologies using the same analysis workflow. We first evaluated variance among replicates within each platform and found low levels of variance and high correlation between the two platforms, even though the two technologies were used in different laboratories and at different times. The Agilent oligonucleotide technology was used in 2007 at Georges Washington University (SL, MO) and the Affymetrix U74Av2 technology was used in 2002 at Stony Brook University (Stony Brook, NY). Using SAM, we did not find any significant difference between the two platforms in their ability to detect differential gene expression between WT-CD36 and KO-CD36. Technological differences may influence the results of transcriptional profiling and are important to consider when using published results. However, based on our study, and given high-quality arrays and appropriate normalization, the primary factor determining variance is biological rather than technological. The biological conditions of the two experiments could explain the small differences in the results obtained. Questions remain regarding the importance of technology choice in evaluating the data generated and in comparing experiments from different laboratories. One of the objectives of this comparative study was to elucidate whether gene expression profiles are more influenced by biology or by technological artifacts. Even though the two platforms are based on two distinct manufacturing technologies, two-color cDNA spotted arrays (Agilent) and in situ synthesized oligonucleotide chips (Affymetrix), they gave comparable results.

Conclusion:
Our comparative study led to the validation of a data analysis workflow and the identification of at least 30 genes involved in the hypertrophy phenotype of CD36-deficient hearts. More biological studies are needed to validate the expression of the identified genes using qRT-PCR. These studies will be complemented by a modeling project based on the construction of a bioinformatics platform that reproduces the dynamic behavior of the system under normal conditions and automatically predicts the involvement of different gene networks in the development of pathologies such as the cardiomyopathy related to the absence of the CD36 protein.