MLE for proteomics data imputation

Dear Team, 

MLE is one of the imputation options. It calls the `em.norm` and `imp.norm` functions from the `norm` package and is applied with `MARGIN = 2`.

I think `MARGIN = 2` is a reasonable setting, since the original p × n data matrix (features in rows, samples in columns) would be transposed before being passed to the EM algorithm. The variables in the EM model would therefore be the actual genes/proteins/peptides.

The issue is that proteomics data almost always have p >> n. A TMT global proteome data set, for example, would have ~20,000 proteins and about a dozen samples. With that many features, the EM algorithm becomes very expensive.
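To put a rough number on this, here is a back-of-the-envelope sketch. It assumes the EM step fits a multivariate normal with one variable per protein, so that p means plus a full p × p covariance matrix have to be estimated. The helper names `n_params` and `cov_gib` are made up for illustration and are not part of `norm` or `MsCoreUtils`.

```r
# Free parameters of a p-variate normal: p means plus p * (p + 1) / 2
# unique entries of the symmetric covariance matrix.
n_params <- function(p) p + p * (p + 1) / 2

# Memory (in GiB) needed for one dense p x p covariance matrix of doubles.
cov_gib <- function(p) p^2 * 8 / 1024^3

# Compare treating the 24 samples vs. the 10,000 proteins as variables.
p <- c(samples_as_variables = 24, proteins_as_variables = 10000)
data.frame(p = p, n_params = n_params(p), cov_gib = round(cov_gib(p), 3))
```

If that assumption holds, the parameter count grows quadratically with the number of proteins while only a couple of dozen observations are available to estimate them, which would be consistent with the long run time I am seeing.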


I am running the `impute_mle` function on this data set (10k × 24) and it has not returned any results yet.

```
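library(data.table)
# Read the TMT protein abundance table as a plain data.frame
dtmt <- fread("ccRCC_prot_abundance_MD_3plex.tsv",
              stringsAsFactors = FALSE, data.table = FALSE)
# Drop the first five columns and convert the rest to a matrix
dd <- as.matrix(dtmt[, -c(1:5)])
dtmt_res <- MsCoreUtils::impute_mle(dd)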
```


Do you have any insights on this issue?

Thank you very much!
MLE for proteomics data imputation #109