中国神经科学论坛

 找回密码
 注册

扫一扫,访问微社区

QQ登录

只需一步,快速开始

查看: 1876|回复: 0

利用excel表格搞定生物芯片数据之一

[复制链接]
labonchip 发表于 2006-1-6 15:15:00 | 显示全部楼层 |阅读模式

Cell conventions fficeffice" />

1.        Cells are referred to by their column letter and row number (ex: A2).

2.        A series of cells are referred to by showing the starting and ending cells, separated by a colon (ex: A2:A100).

3.        When cells are copied and pasted, the column and row are shifted to adjust to the position of the new cell. For example, if the formula "=B5" is copied and pasted one cell to the right, the formula in the new cell will change to "=C5".

4.        If you wish to maintain the position(s) of cell(s) in a formula, use the prefix "$" to create an absolute link to cells that won't change with copying an pasting. For example, if the formula "=$B$5" is copied and pasted one cell to the right, the formula in the new cell will still be "=$B$5". You may use the prefix "$" before the column and/or the row.

5.        Refer to cell(s) in another worksheet by prefixing the cells with the sheet name plus "!". Ex: "=data!B4" refers to cell B4 on a worksheet named "data". For this reason it's very helpful to name worksheets with short but informative names.

6.        Refer to cell(s) in another file by prefixing the cells with the file in square brackets and the sheet name plus "!". Ex: "=[anotherFile.xls]data!B4" refers to cell B4 of file "anotherFile.xls" on a worksheet named "data".

Copying and pasting cells

1.        Copying and pasting cells usually has the expected result. If not, check your use of "$" (to fix a location).

2.        When copying and pasting data generated by a formula, it may be best (to prevent re-calculations of data) to paste the actual value, rather than the underlying formula. To do this, copy the cell(s) as usual but then select "Paste Special > Values".

Using functions

1.        Excel contains a reasonable range of functions which can be used by in one of at least two ways:

o        Type the formula (preceded by "=") in the box next to "fx" near the top of the window, along with all of the required arguments (assuming you know what these are).

o        Select "Insert > Function" and then choose the function you wish. Select a category to get a partial or complete list of all functions. A box entitled "Function Arguments" will pop up, and you can fill in the empty boxes.

2.        At any time, entering the function name into the help box or selecting "Help on this function" usually provides a brief explanation of the function and some examples.

3.        If you make a mistake with a function, you might get a suggestion to correct it. If you agree to the correction, check the input and output!

4.        You may find that Excel calculates functions too often to your liking (especially if it takes a while). If that happens,

o        Windows: go to Tools > Options

o        Mac: go Excel > Preferences

and select the Calculation tab. Then select Manual under Calculation. Then Excel will only perform calculation on the file when you use F9 or on the active worksheet when you use SHIFT+F9.

5.        Some functions and a few supplementary data analyses can be accessed from "Tools > Data Analysis". If you don't see this option, you may have to install the tools. In that case, select "Tools > Add-Ins" and check Analysis ToolPak. Then try "Tools > Data Analysis" again.

Shortcut for selecting a large number of cells

1.        One can always use the scrollbars to select lots of cells, but with big spreadsheets, this isn't very efficient.

2.        To start the selection of a large matrix, start by selecting the bottom right cell of the desired matrix.

3.        Use "Control - Shift - Up arrow" to select everything above the original cell.

4.        Use "Control - Shift - Left arrow" to select everything to the left of the original cell.

5.        Use "Shift - Down arrow" to move down one row (to avoid the header line, if present).

6.        Use "Shift - Right arrow" to move to the right one column (to avoid any ID column(s), if present).

7.        Note: This should work with both Windows and Macintosh computers.

Functions for microarray analysis

1.        SUM

2.        AVERAGE

3.        MEDIAN

4.        LOG

5.        IF

6.        AND

7.        TTEST

8.        VLOOKUP

Introduction

You'll be using a sample of expression data from a study using Affymetrix (one color) Uffice:smarttags" />95A arrays that were hybridized to tissues from fetal and human liver and brain tissue. Each hybridization was performed in duplicate. Many other tissues were also profiled but won't be used for these exercises.

What we'll be doing to analyze these data:

  • normalize data from these eight chips
  • calculate expression ratios of genes between two different tissues
  • use a common statistical test to identify differentially expressed genes
  • flag low intensity data (most probably background noise)
  • cluster a differentially expressed subset of all genes to identify those with similar expression profiles
  • try to find what functions specific groups of genes (with similar expression profiles) have in common


You'll be using Excel to do most of the mathematical analyses, since this will show the exact formulas used to perform every step of the analysis pipeline. As a result, you'll need to use Excel functions and be familiar with some Excel conventions

Preliminary information: Image analysis and calculation of expression value

1.        As described in Su et al., 2002, human tissue samples were hybridized on Affymetrix (one-color) arrays and chips were scanned. For each tissue, at least two independendent samples were hybridized to separate chips.

2.        Scanned images were quantified (including measurement of background) using standard software.

3.        Data from a probeset (a series of oligos designed to a specific gene target) were used to calculate an expression values for that probeset using standard Affymetrix algorithms.

4.        See the manuals from Affymetrix for more information about these processes, and the Statistical Algorithms Description Document for the actual equations used.

5.        Note that these analysis protocols are generally specific to the chip type and its manufacturer.

Part I. Normalization of expression data

1.        Why normalize? Chips may have been hybridized to different amounts of RNA, for different amounts of time, with different batches of solutions, etc. Normalization should remove systematic biases and make any comparisons between chips more meaningful.

2.        Download the starting data for the exercises.

1.        Look at the expression data for a selected series of experiments

2.        Down the first column are the Affymetrix probeset IDs, each corresponding to a target gene.

  • Each other column shows a tissue name, followed by expression values in some arbitrary unit.
  • There should be two chips each for adult and fetal brain and liver (a total of 8 chips).
  • Note that since this is a one-color array, the expression measures are absolute values and not ratios. If you're more used to 2-color (cDNA) arrays, this would be similar to (but not exactly the same as) 4 chips, each with one channel/dye for fetal tissue and another for adult tissue.

1.        Calculate the median of all expression values from each chip.

o        Use the Excel "MEDIAN" function to calculate the median at the bottom of each column. Ex: =MEDIAN(B2:B12627)

1.        To perform global median normalization, on a new sheet ("norm") of the same file, scale each column of data so that each median is 100.

2.        Divide each expression signal in a chip by the median for the chip and multiply by 100. Ex: =(raw!B2/raw!B$12628)*100 [if B12628 contains the median for the chip]

  • To confirm that this was correctly done, calculate the median of each normalized chip.
  • Global mean normalization is also possible but more influenced by outliers. If you want to compare mean and median yourself [after the class], try both methods and then compare the results with scatterplots
您需要登录后才可以回帖 登录 | 注册

本版积分规则

小黑屋|手机版|Archiver|生物行[生物导航网] ( 沪ICP备05001519号 )

GMT+8, 2024-12-26 23:50 , Processed in 0.015509 second(s), 17 queries .

Powered by Discuz! X3.4

© 2001-2023 Discuz! Team.

快速回复 返回顶部 返回列表