normalizePlates           package:cellHTS2           R Documentation

_P_e_r-_p_l_a_t_e _d_a_t_a _t_r_a_n_s_f_o_r_m_a_t_i_o_n, _n_o_r_m_a_l_i_z_a_t_i_o_n _a_n_d _v_a_r_i_a_n_c_e _a_d_j_u_s_t_m_e_n_t

_D_e_s_c_r_i_p_t_i_o_n:

     Plate-by-plate normalization of the raw data stored in slot
     'assayData' of a 'cellHTS' object. Normalization is performed
     separately for each plate, replicate and channel. The data can
     optionally be 'log2' transformed, and the variance can be adjusted
     in one of several ways (none, per-plate, per-batch or
     per-experiment).

_U_s_a_g_e:

     normalizePlates(object, scale="additive", log=FALSE, method="median",
                     varianceAdjust="none", posControls, negControls, ...)

_A_r_g_u_m_e_n_t_s:

  object: a 'cellHTS' object that has already been configured. See
          details.

   scale: a character string specifying the scale of the raw data:
          "additive" (default) or "multiplicative".

     log: logical. If 'log=TRUE', raw data are 'log2' transformed. If
          the data are on an additive scale ('scale="additive"'), 'log'
          must be 'FALSE'. The default is 'log=FALSE'.

  method: a character string specifying the per-plate normalization
          method. Allowed values are '"median"' (default), '"mean"',
          '"shorth"', '"POC"', '"NPI"', '"negatives"', '"Bscore"',
          '"loess"' and '"locfit"'. See details.

varianceAdjust: character vector of length one indicating the variance
          adjustment to perform. Allowed values are "none" (default),
          "byPlate", "byBatch" and "byExperiment". See details.

posControls: a vector of regular expressions matching the names of the
          positive controls. See details.

negControls: a vector of regular expressions matching the names of the
          negative controls. See details.

     ...: Further arguments that get passed on to the function
          implementing the normalization method chosen by 'method'.
          Currently, this is only used for '"Bscore"', '"loess"' and
          '"locfit"'.

_D_e_t_a_i_l_s:

     Function 'normalizePlates' uses the content of the 'assayData'
     slot of 'object'. For dual-channel data, the user should first
     correct for plate effects using 'normalizePlates', then combine
     the two channels using 'summarizeChannels', and finally, if
     necessary, normalize the summarized intensities by calling
     'normalizePlates' again.

     In this function, the normalization is performed in a
     plate-by-plate fashion, following this workflow:

        1.  Log transformation of the data (optional, only if data are
           on a multiplicative scale);

        2.  Per-plate normalization using the chosen method;

        3.  Variance adjustment of the plate-corrected intensities
           (optional).

     The argument 'scale' defines the scale of the data. If data are on
     a multiplicative scale ('scale="multiplicative"'), they can be
     'log2' transformed by setting 'log=TRUE'. This changes the scale
     of the data to "additive".

     In the next step of preprocessing, intensities are corrected on a
     plate-by-plate basis using the chosen normalization method:

        *  If 'method="median"' (median scaling), plate effects are
           corrected by dividing each measurement by the median value
           across wells annotated as 'sample' in 'wellAnno(object)',
           for each plate and replicate.

        *  If 'method="mean"' (mean scaling), the average in the
           'sample' wells is used instead.

        *  If 'method="shorth"' (scaling by the midpoint of the
           shorth), for each plate and replicate, the midpoint of the
           'shorth' of the distribution of values in the wells
           annotated as 'sample' is calculated. Then, every
           measurement is divided by this value.

        *  If 'method="negatives"' (scaling by the negative controls),
           for each plate and replicate, each measurement is divided by
           the median of the measurements on the negative controls of
           the plate.

     NOTE: If the data are on an additive scale prior to normalization
     (either 'scale="additive"' or after a 'log2' transformation), the
     above per-plate correction factors are subtracted from each
     measurement instead of being used as divisors.
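
     The difference between dividing and subtracting can be sketched
     in base R. This is only an illustration, not the cellHTS2
     implementation: the vector 'x' and the mask 'isSample' are
     made-up data for a single plate and replicate.

```r
# Hypothetical raw intensities for one plate and one replicate
x <- c(120, 80, 100, 400, 95, 105, 20)
# Hypothetical annotation: TRUE for wells annotated as 'sample'
# (an odd number of sample wells, so the median is an observed value)
isSample <- c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE)

# Multiplicative scale: divide by the median of the sample wells
scaled <- x / median(x[isSample])

# Additive scale (here obtained by a log2 transform): subtract the median
lx <- log2(x)
centered <- lx - median(lx[isSample])

# Both routes give the same values once 'scaled' is put on the log2 scale
all.equal(log2(scaled), centered)
```

     With an even number of sample wells the median is an average of
     two values, so the two routes may differ slightly.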

     Other available normalization methods are:

        *  'method="POC"' (percent of control): for each plate and
           replicate, each measurement is divided by the average of the
           measurements on the plate positive controls, and multiplied
           by 100.

        *  'method="NPI"' (normalized percent inhibition): each
           measurement is subtracted from the average of the
           intensities on the plate positive controls, and this result
           is divided by the difference between  the means of the
           measurements on the positive and the negative controls.

        *  'method="Bscore"' (B score): for each plate and replicate,
           the 'B score method' (based on a 2-way median polish) is
           applied to remove plate effects and row and column biases.

        *  'method="locfit"' (robust local fit regression): for each
           plate and replicate, spatial effects are removed by fitting
           a bivariate local regression (see function
           'spatialNormalization').

        *  'method="loess"' (loess regression): for each plate and
           replicate, spatial effects are removed by fitting a loess
           curve (see function 'spatialNormalization').
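
     The control-based formulas and the median polish mentioned above
     can be sketched in base R as follows. The plate matrix and the
     annotation vector are invented for illustration, and the code is
     not taken from cellHTS2; in particular, the last step only shows
     the two-way median polish from 'stats::medpolish', not the full
     'Bscore' function.

```r
# Hypothetical 4 x 6 plate of intensities (rows x columns)
set.seed(1)
plate <- matrix(rnorm(24, mean = 100, sd = 5), nrow = 4, ncol = 6)
# Hypothetical well annotation, in the same column-wise order as 'plate'
anno <- rep("sample", 24)
anno[c(1, 2)] <- "pos"
anno[c(3, 4)] <- "neg"

posMean <- mean(plate[anno == "pos"])
negMean <- mean(plate[anno == "neg"])

# Percent of control: divide by the mean of the positive controls, times 100
poc <- 100 * plate / posMean

# Normalized percent inhibition: positive-control mean minus the value,
# divided by the difference of the positive and negative control means
npi <- (posMean - plate) / (posMean - negMean)

# Two-way median polish: the residuals are what remains after removing
# the fitted overall, row and column effects
fit <- medpolish(plate, trace.iter = FALSE)
res <- fit$residuals
```

     By construction, the positive controls average 100 under POC,
     and under NPI the positive controls average 0 and the negative
     controls average 1.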

     In the final preprocessing step, the variance of plate-corrected
     intensities can be adjusted as follows:

        *  'varianceAdjust="byPlate"': per-plate normalized intensities
           are divided by the per-plate median absolute deviation
           (MAD) in "sample" wells. This is done separately for each
           replicate and channel;

        *  'varianceAdjust="byBatch"': using the content of slot
           'batch', plates are split according to assay batches and the
           normalized intensities in each group of plates (batch) are
           divided by the per-batch MAD (calculated from "sample"
           wells). This is done separately for each replicate and
           channel;

        *  'varianceAdjust="byExperiment"': each normalized measurement
           is divided by the overall MAD of normalized values in wells
           containing "sample". This is done separately for each
           replicate and channel.

     By default, no variance adjustment is performed
     ('varianceAdjust="none"').
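
     The three variance adjustments differ only in how wells are
     grouped before the MAD is computed. A rough base-R sketch for one
     replicate and one channel is given below. All data and the helper
     'adjustByGroup' are hypothetical, and R's 'mad' applies a
     consistency constant of 1.4826 by default, which may or may not
     match what cellHTS2 does internally.

```r
# Hypothetical normalized values: 3 plates of 8 wells each
set.seed(2)
vals <- rnorm(24)
plateId <- rep(1:3, each = 8)
# Hypothetical batches: plate 1 is batch 1, plates 2 and 3 are batch 2
batchId <- ifelse(plateId == 1, 1L, 2L)
# Hypothetical annotation: the last well of every plate is a control
isSample <- rep(c(rep(TRUE, 7), FALSE), times = 3)

# Illustrative helper: divide each value by the MAD of the sample wells
# that belong to the same group (plate or batch)
adjustByGroup <- function(v, group, sample) {
  groupMad <- tapply(v[sample], group[sample], mad, na.rm = TRUE)
  as.vector(v / groupMad[as.character(group)])
}

byPlate      <- adjustByGroup(vals, plateId, isSample)
byBatch      <- adjustByGroup(vals, batchId, isSample)
byExperiment <- vals / mad(vals[isSample], na.rm = TRUE)
```

     After the adjustment, the MAD of the sample wells within each
     group is 1 on the chosen grouping level.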

     The arguments 'posControls' and 'negControls' are required for
     the normalization methods based on control measurements (that is,
     'method="POC"', 'method="NPI"' or 'method="negatives"').
     'posControls' and 'negControls' should be vectors of regular
     expression patterns specifying the names of the positive and
     negative controls, respectively, as provided in the plate
     configuration file (and accessed via 'wellAnno(object)'). The
     length of these vectors should be equal to the current number of
     channels in 'object' (i.e. to 'dim(Data(object))[3]'). By
     default, if 'posControls' is not given, _pos_ will be taken as
     the name for the wells containing positive controls. Similarly,
     if 'negControls' is missing, _neg_ will be considered as the name
     used to annotate the negative controls. The content of
     'posControls' and 'negControls' is passed to 'regexpr' for
     pattern matching within the well annotation given in the
     featureData slot of 'object' (which can be accessed via
     'wellAnno(object)'); see the examples for 'summarizeChannels'.
     The arguments 'posControls' and 'negControls' are particularly
     useful in multi-channel data, where the controls might be
     reporter-specific, or after normalizing multi-channel data.
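
     How a pattern selects control wells can be illustrated with
     'regexpr' alone. The annotation vector and the patterns below are
     invented for a single channel; any extra options that cellHTS2
     may pass to 'regexpr' are not covered here.

```r
# Hypothetical well annotation, of the kind returned by wellAnno()
anno <- c("sample", "pos", "neg", "sample", "pos_strong", "other", "neg")

# Hypothetical patterns, one per channel (a single channel here);
# the anchors ^ and $ restrict the match to the exact annotation
posControls <- "^pos$"
negControls <- "^neg$"

# regexpr() returns -1 where the pattern does not match
isPos <- regexpr(posControls, anno) > 0
isNeg <- regexpr(negControls, anno) > 0

which(isPos)  # 2
which(isNeg)  # 3 7

# An unanchored pattern also matches "pos_strong"
which(regexpr("pos", anno) > 0)  # 2 5
```

     Anchoring the pattern is one way to keep differently named
     controls apart when several control types share a prefix.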

     See the Examples section for an example of how this function can
     be used to apply a robust version of the Z score method, whereby
     the per-plate median (at sample wells) is subtracted from the
     measurements of each plate and replicate, and the result is
     divided by the per-plate MAD (at sample wells).

_V_a_l_u_e:

     An object of class 'cellHTS' with the normalized data stored in
     slot 'assayData' (its previous contents are overwritten). The
     processing status of the 'object' is updated in the slot 'state'
     to 'object@state[["normalized"]]=TRUE'.

     Additional slots of 'object' may be updated if 'method="Bscore"',
     'method="loess"' or 'method="locfit"'. Please refer to the help
     pages of the 'Bscore' and 'spatialNormalization' functions.

_A_u_t_h_o_r(_s):

     Ligia Bras ligia@ebi.ac.uk, Wolfgang Huber huber@ebi.ac.uk

_R_e_f_e_r_e_n_c_e_s:

     Boutros, M., Bras, L.P. and Huber, W. (2006) Analysis of
     cell-based RNAi screens, _Genome Biology_ *7*, R66.

_S_e_e _A_l_s_o:

     'Bscore',  'spatialNormalization', 'summarizeChannels'

_E_x_a_m_p_l_e_s:

         data(KcViabSmall)
         # per-plate median scaling of intensities
         x1 <- normalizePlates(KcViabSmall, scale="multiplicative", log=FALSE, method="median", varianceAdjust="none")
         # per-plate median subtraction of log2 transformed intensities  
         x2 <- normalizePlates(KcViabSmall, scale="multiplicative", log=TRUE, method="median", varianceAdjust="none")
         ## Not run: 
         x3 <- normalizePlates(KcViabSmall, scale="multiplicative", log=TRUE, method="Bscore", varianceAdjust="none", save.model=TRUE)
         
     ## End(Not run)

         ## robust Z score method: the per-plate median on sample wells is
         ## subtracted from the plate intensities, and the result is divided
         ## by the per-plate MAD on sample wells:
         xZ <- normalizePlates(KcViabSmall, scale="additive", log=FALSE, method="median", varianceAdjust="byPlate")

         ## an example to illustrate the use of slot 'batch':
         ## Not run: 
         try(xnorm <- normalizePlates(KcViabSmall, scale="multiplicative", method="median", varianceAdjust="byBatch"))

         # It doesn't work because we need to have slot 'batch'!
         # For example, we will suppose that a different lot of reagents was used for plate 1:
         pp <- plate(KcViabSmall)
         fData(KcViabSmall)$"reagent" <- "lot B"
         fData(KcViabSmall)$"reagent"[pp==1] <- "lot A"
         fvarMetadata(KcViabSmall)["reagent",] <- "Lot of reagent used"

         bb <- as.factor(fData(KcViabSmall)$"reagent")
         batch(KcViabSmall) <- array(as.integer(bb), dim=dim(Data(KcViabSmall)))
         ## check number of batches:
         nbatch(KcViabSmall)
         x1 <- normalizePlates(KcViabSmall, scale="multiplicative", log=FALSE, method="median", varianceAdjust="byBatch")
         ## End(Not run)

