proteinProperties {yeastExpData}    R Documentation
A data frame that details 33 properties of proteins in the yeast genome.
data(proteinProperties)
A data frame with 6718 observations on the following 33 variables.
yORF: Q0010, Q0017, etc.
SGDID
molwt
pi
cai
length
nterm: MAAACIC, MAAAPWY, etc.
cterm: AAAAMLL, AAADKKT, etc.
codonBias
The next set of columns, designated by amino acids, is the number of
times that particular residue appears in the protein sequence. For
example, if the ALA column is 2, then the protein contains 2 alanines.
These columns (should) add up to the length column.
ALA
ARG
ASN
ASP
CYS
GLN
GLU
GLY
HIS
ILE
LEU
LYS
MET
PHE
PRO
SER
THR
TRP
TYR
VAL
The remaining columns are:
fop
gravy
aromaticity
type, with the following levels:
  ORF|Dubious
  ORF|Uncharacterized
  ORF|Verified
  ORF|Verified|silenced_gene
  pseudogene
  transposable_element_gene
This data frame was downloaded directly from SGD. It contains 33 characteristics for 6714 open reading frames (ORFs). From the SGD README:
“Contains basic protein information about each ORF in SGD. This file does not include information on deleted or merged ORFs. Note, however, that it includes ORFs of all other classifications (Verified, Uncharacterized, and Dubious).”
For more details see http://www.yeastgenome.org/help/protein_page.html.
Source file: ftp://genome-ftp.stanford.edu/pub/yeast/protein_info/protein_properties.tab. This file is updated weekly (on Saturdays). The version used here was downloaded on 2006-11-03.
data(proteinProperties)
pairs(proteinProperties[, c("molwt", "pi", "cai", "gravy", "aromaticity")],
      pch = ".", col = as.numeric(proteinProperties$type))
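The format description notes that the residue-count columns (should) add up to the length column. A minimal sketch of how that could be checked is given below. The helper residue_mismatches is hypothetical and not part of yeastExpData, and the sketch assumes that the 20 residue columns and length are stored as numeric values.

```r
# Three-letter residue columns, as listed in the format description.
aa_cols <- c("ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS",
             "ILE", "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP",
             "TYR", "VAL")

# Hypothetical helper (not part of yeastExpData): return the rows whose
# residue counts do not sum to the length column.
# Assumes the residue columns and 'length' are numeric.
residue_mismatches <- function(df, cols = aa_cols) {
  residue_total <- rowSums(df[, cols, drop = FALSE])
  # Treat missing values as mismatches so they are not silently dropped.
  bad <- is.na(residue_total) | is.na(df$length) | residue_total != df$length
  data.frame(yORF = df$yORF[bad],
             length = df$length[bad],
             residue_total = residue_total[bad])
}

# Run on the real data only if the package is installed.
if (requireNamespace("yeastExpData", quietly = TRUE)) {
  data("proteinProperties", package = "yeastExpData")
  mismatches <- residue_mismatches(proteinProperties)
  print(nrow(mismatches))
  print(head(mismatches))
}
```

An empty result would mean every protein's residue counts agree with its recorded length. Any rows returned are worth inspecting before using the counts as compositions.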