[R] help with linear model

Ista Zahn istazahn at gmail.com
Mon Oct 26 11:23:23 CET 2009


I'm not familiar with microarray data, so I hope I'm not off base here.

Data frames are structured so that variables appear in the columns and
cases in the rows. From your formula it looks like you're trying to
fit a model using rows as variables and columns as cases. There is
probably a way to do this, but it might be easier to just flip your
data. One way to do this is

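dataNew <- as.data.frame(t(data))
row.names(dataNew) <- names(data)
# variable names should start with a letter, so prefix the probe IDs
# with "I" (but leave "norm" as it is, since the formula refers to it)
probes <- names(dataNew) != "norm"
names(dataNew)[probes] <- paste("I", names(dataNew)[probes], sep="")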

(Note that naming your data "data" is not good practice, since data is
also the name of a built-in R function.)
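
As a minimal sketch of the flip on toy data (the object names "expr" and
"exprT", the sample names, and the values are all made up for
illustration, not taken from your data). Here the response column keeps
the name "norm" so that a formula can refer to it directly:

```r
# Toy probes-by-samples data frame; names and values are invented
expr <- data.frame(
  S1 = c(0.9, 5.4, 6.5),
  S2 = c(0.6, 6.0, 13.1),
  S3 = c(0.7, 8.8, 6.1),
  S4 = c(0.9, 10.9, 12.7),
  row.names = c("norm", "206427_s_at", "205338_s_at")
)

# t() returns a matrix, so convert the result back to a data frame
exprT <- as.data.frame(t(expr))

# Prefix the probe IDs with "I" so they become syntactic names;
# the "norm" column is left unchanged
isProbe <- names(exprT) != "norm"
names(exprT)[isProbe] <- paste("I", names(exprT)[isProbe], sep = "")

dim(expr)     # variables in rows, samples in columns
dim(exprT)    # samples in rows, variables in columns
names(exprT)
```

After the flip each sample is a row and each probe is a column, which is
the layout lm() expects.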

Now you should be able to run your model as before (prefixing "I" to
the variable names to match the new naming scheme):

m1 = lm(norm ~ I206427_s_at + I205338_s_at + I209848_s_at +
        I205694_at + I201909_at + I208894_at + I216512_s_at +
        I205337_at + I201850_at + I210982_s_at, data=dataNew)
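
If typing out every probe gets tedious, the formula can also be built
from the column names. A rough sketch on made-up data (the data frames
"toy" and "raw" and their random values are hypothetical), using
reformulate() from the stats package, the "norm ~ ." shorthand, and
backticks for names that do not start with a letter:

```r
set.seed(1)
# Made-up samples-by-variables data frame, for illustration only
toy <- data.frame(
  norm         = runif(8),
  I206427_s_at = rnorm(8, mean = 10),
  I205338_s_at = rnorm(8, mean = 10),
  I209848_s_at = rnorm(8, mean = 10)
)

# Build norm ~ I206427_s_at + ... from the column names
predictors <- setdiff(names(toy), "norm")
fm <- reformulate(predictors, response = "norm")
m2 <- lm(fm, data = toy)

# Shorthand: "." stands for every column other than the response
m3 <- lm(norm ~ ., data = toy)
all.equal(coef(m2), coef(m3))

# Names that start with a digit also work in a formula if they are
# wrapped in backticks (not double quotes)
raw <- toy
names(raw) <- sub("^I", "", names(raw))
m4 <- lm(norm ~ `206427_s_at` + `205338_s_at` + `209848_s_at`, data = raw)
```

The quoted names in the original formula are treated as character
strings rather than variables, which is why backticks (or renaming the
columns) are needed.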

Hope it helps,
Ista

On Mon, Oct 26, 2009 at 5:48 AM, Eleni Christodoulou
<elenichri at gmail.com> wrote:
> Dear list,
>
> I have been searching for a week to fit a simple linear model to my data. I
> have looked into the previous posts but I haven't found anything relevant to
> my problem. I guess it is something simple...I just cannot see it.
> I have the following data frame, named "data", which is a subset of a
> microarray experiment. The columns are the samples and the rows are the
> probes. I bound the first line, called "norm", which represents the
> estimated output. I want to create a linear model which shows the
> relationship between the gene expressions (rows) and the output (norm).
>
>  *data*
>            GSM276723.CEL GSM276724.CEL GSM276725.CEL GSM276726.CEL
> norm             0.897000      0.590000      0.683000      0.949000
> 206427_s_at      5.387205      6.036506      8.824783     10.864122
> 205338_s_at      6.454779     13.143095      6.123212     12.726562
> 209848_s_at      6.703062      7.783330     12.175654      9.339651
> 205694_at        5.894131      5.794516     12.876555     11.534664
> 201909_at       12.616538     12.913255     12.275182     12.767743
> 208894_at       13.049286      9.317874     12.873516     13.527182
> 216512_s_at      6.324789     12.783791      6.216932     12.013404
> 205337_at        6.175940     12.158796      6.117519     12.041078
> 201850_at        6.633013      6.465900      6.535434      7.749985
> 210982_s_at     12.444791      8.597388     12.197696     12.963449
>            GSM276727.CEL GSM276728.CEL GSM276729.CEL GSM276731.CEL
> norm             0.302000      0.597000      0.270000      0.530000
> 206427_s_at      5.690357      8.014055     13.034753      5.493977
> 205338_s_at      5.757048      7.706341     13.258410      5.562588
> 209848_s_at      6.461028      7.036515     13.633649      5.874098
> 205694_at        5.519552      5.297107      6.498811      5.146150
> 201909_at       12.814454     11.592632      6.594229      6.650796
> 208894_at       13.835359     13.028096      5.839909      6.045578
> 216512_s_at      6.033096      7.273650     12.669054      5.946932
> 205337_at        5.879028      7.381713     12.633829      5.379559
> 201850_at        9.684397      6.560014      8.523229      6.573052
> 210982_s_at     13.342729     12.470517      5.903681      5.658115
>            GSM276732.CEL GSM276735.CEL GSM276736.CEL GSM276737.CEL
> norm              0.43400      0.647000      0.113000      1.000000
> 206427_s_at      12.80257      5.645002      6.519554     13.572480
> 205338_s_at      13.38057      5.804107     11.090690     14.024922
> 209848_s_at      13.27718      6.490851      9.784199     14.101162
> 205694_at        11.37717      5.802105      7.944963     14.060492
> 201909_at        13.24126     12.263899     12.578315      6.443491
> 208894_at        12.29916      7.563361      9.971493      7.094214
> 216512_s_at      13.00303      5.905789     10.512761     13.647573
> 205337_at        12.63560      5.430138     10.707242     13.020312
> 201850_at        12.71874      6.275480      6.987962     12.354580
> 210982_s_at      11.53559      7.225199      9.322706      6.617615
>            GSM276738.CEL GSM276739.CEL GSM276740.CEL GSM276742.CEL
> norm              0.35700      0.967000      0.823000      1.000000
> 206427_s_at      13.33764     13.607918     13.190551     12.387189
> 205338_s_at      13.65492     12.812950     12.237476     12.912605
> 209848_s_at      13.48525     13.435389     13.851347     12.540495
> 205694_at         7.70928     10.045331     13.391456     11.103841
> 201909_at        12.47093     11.937344      6.631023      7.160071
> 208894_at        12.20508      8.892181      6.478889      5.927860
> 216512_s_at      13.42313     12.151691     11.620552     12.341763
> 205337_at        12.67544     12.036528     11.641203     12.275845
> 201850_at        11.85481     13.172666     12.964316     12.156142
> 210982_s_at      11.49940      8.380404      6.121762      5.921634
>            GSM276743.CEL GSM276744.CEL GSM276745.CEL GSM276747.CEL
> norm             0.899000      0.927000      0.754000      0.437000
> 206427_s_at     12.665097     12.604673     11.446630     13.000295
> 205338_s_at     13.261141     12.448096     13.185698     12.510952
> 209848_s_at     13.396711     13.882529     13.040600     12.984137
> 205694_at       10.888474      7.094063      8.630120     12.321685
> 201909_at       12.100560      6.666787     12.330600      6.572282
> 208894_at        7.741437      8.348155     10.106442      6.009902
> 216512_s_at     12.830373     11.504074     12.300163     11.525958
> 205337_at       12.264569     11.676281     11.940917     11.618351
> 201850_at       11.055564     12.202366      7.327056     12.853055
> 210982_s_at      7.285289      8.129298      9.577032      5.924993
>            GSM276748.CEL GSM276752.CEL GSM276754.CEL GSM276756.CEL
> norm             0.321000      0.620000      0.155000      0.946000
> 206427_s_at      9.081283     11.446978      8.191261     13.192507
> 205338_s_at     13.737773     13.698520     12.983830     10.948681
> 209848_s_at     13.234025     12.956672     10.644642     13.176656
> 205694_at        7.953865      7.397013      7.170732     13.618932
> 201909_at       12.533684      7.049442      6.804030      7.135974
> 208894_at       11.868729      8.558455      6.629858      6.850639
> 216512_s_at     13.589290     12.781853     12.060414     10.143297
> 205337_at       13.084386     12.442617     12.104849     10.364035
> 201850_at        6.615453      8.104145      7.058739      6.514298
> 210982_s_at     11.058085      7.891520      6.516261      6.532226
>            GSM276758.CEL GSM276759.CEL
> norm             0.767000      0.218000
> 206427_s_at      5.742074     11.232337
> 205338_s_at      6.375289     13.406557
> 209848_s_at      6.226996      6.835458
> 205694_at        5.864042     11.218719
> 201909_at        6.907489      7.316435
> 208894_at       12.596987     12.408412
> 216512_s_at      6.308256     12.318892
> 205337_at        6.063775     12.389912
> 201850_at        6.816491      6.602764
> 210982_s_at     11.985288     11.853911
>
> *What I did is the following:*
>>fm1=as.formula((norm) ~ "206427_s_at" + "205338_s_at" + "209848_s_at" +
> "205694_at" + "201909_at" + "208894_at" + "216512_s_at" + "205337_at" +
> "201850_at" + "210982_s_at")
>>lm1=lm(fm1,data1new)
>
> And I receive the following error:
> Error in terms.formula(formula, data = data) :
>  invalid model formula in ExtractVars
>
>
> *I have also tried:*
>>cols=rownames(data3)  %%%%Where data3 is the same data frame with data
> above, but without the "norm" row bound yet
> thus: > cols
>  [1] "206427_s_at" "205338_s_at" "209848_s_at" "205694_at"   "201909_at"
>  [6] "208894_at"   "216512_s_at" "205337_at"   "201850_at"   "210982_s_at"
>
>> lm1=lm(fm1,data1new)
>
> and in this case I receive the following error:
> Error in model.frame.default(formula = fm1, data = data1new,
> drop.unused.levels = TRUE) :
> variable lengths differ (found for 'cols')
>
> Could anyone help me with this?
>
> Thank you very much in advance,
> Eleni



-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org



