Not able to reproduce the best fit PRS for plink #27
Comments
What did you get?
I haven't kept the tutorial up to date lately, and I know that, for example, the
pre-QCed data for the subsequent steps weren't updated.
…On Thu, Sep 30, 2021 at 10:38 AM Alva Rani James ***@***.***> wrote:
Hi Sam,
Thanks for the great tutorial. I have been trying PLINK for the polygenic
risk score. However, with the height dataset and EUR plink files, I am not
able to reproduce the results. Especially, the one for best-PRS using
linear regression model in R script.
|
So, for example, the best-fit PRS threshold according to the tutorial is 0.3, and what I get is 0.5. |
If I repeat the analysis described in the tutorial using the provided data set
(I re-downloaded everything to make sure it is correct), I still get the same
result as stated in the tutorial:
  Threshold        R2           P     BETA       SE
5       0.3 0.1612372 2.77407e-25 45316.19 4107.777
And if I use PRSice with info filtering disabled, I also get the same
result. So you might want to double-check your steps.
Sam
…On Thu, Sep 30, 2021 at 11:18 AM Alva Rani James ***@***.***> wrote:
So for example. The best PRS according to the tutorial is 0.3 and what I
have is 0.5
prs.result[which.max(prs.result$R2),]
  Threshold        R2            P     BETA       SE
7       0.5 0.1634566 9.256151e-26 55830.85 5004.534
OK, I see. I just want to make sure that all the steps mentioned are
appropriate for the analysis. I am following the same steps for our in-house
datasets, so as a validation of all the steps I first used the provided
GWAS summary file and PLINK datasets.
|
Ok, thanks a lot for the update and for double-checking this. I appreciate your time and help. |
Hi Sam, |
You should never use the same sample for both the base and the target.
Also, the base data in the tutorial came from the GIANT consortium, with some
modifications.
On Mon, 4 Oct 2021 at 6:49 AM, Alva Rani James ***@***.***> wrote:
Hi Sam,
I could now validate my output with what is documented. Thanks for your
time and patience.
I have a question. Are the base and target datasets from different
individuals or from the same individuals/samples? I read that they come from
two sources: the target data is simulated from 1000 Genomes and the base is
from your own lab. My understanding is that the phenotype (base) dataset
should correspond to the phenotype-genotype (target) dataset, is that right?
I have base and target datasets from the same patients. Does that make
sense?
--
Dr Shing Wan Choi
Instructor
Genetics and Genomic Sciences
Icahn School of Medicine, Mount Sinai, NYC
|
I am confused, then, because in our case we do not have separate base and target datasets.
The base dataset is the GWAS output from PLINK on our cohort.
The target is the same cohort as well.
How does this overlap cause a problem in the results?
Also, we do not have a continuous phenotype; ours is binary. In
that case, is it fine to use logistic regression to find the best-fit
PRS?
|
See pitfall 1 in this paper: https://www.nature.com/articles/nrg3457
Yes, logistic regression for binary traits
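For readers wondering why sample overlap matters, here is a minimal simulation sketch. It is not taken from the tutorial or the cited paper, and every function name, sample size and seed in it is an illustrative assumption. The phenotype is pure noise, so any R2 the score achieves is spurious. Per-SNP effect sizes are estimated in one cohort and then scored both in that same cohort and in an independent one.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def simulate_genotypes(rng, n_samples, n_snps, maf=0.3):
    """Random allele counts (0/1/2) that are unrelated to any phenotype."""
    return [[(rng.random() < maf) + (rng.random() < maf) for _ in range(n_snps)]
            for _ in range(n_samples)]


def marginal_betas(genotypes, phenotype):
    """Per-SNP least-squares slope, a simple stand-in for GWAS effect sizes."""
    n = len(phenotype)
    my = sum(phenotype) / n
    betas = []
    for j in range(len(genotypes[0])):
        col = [row[j] for row in genotypes]
        mx = sum(col) / n
        sxx = sum((g - mx) ** 2 for g in col)
        sxy = sum((g - mx) * (y - my) for g, y in zip(col, phenotype))
        betas.append(sxy / sxx if sxx > 0 else 0.0)
    return betas


def score(genotypes, betas):
    """Weighted sum of allele counts for each individual."""
    return [sum(b * g for b, g in zip(betas, row)) for row in genotypes]


rng = random.Random(1)          # arbitrary seed, for repeatability only
n, m = 200, 500                 # hypothetical cohort size and SNP count

cohort = simulate_genotypes(rng, n, m)
pheno = [rng.gauss(0, 1) for _ in range(n)]        # noise: true R2 is 0
other = simulate_genotypes(rng, n, m)
other_pheno = [rng.gauss(0, 1) for _ in range(n)]  # independent noise

betas = marginal_betas(cohort, pheno)              # "base" = the cohort itself
r2_same = r_squared(score(cohort, betas), pheno)
r2_indep = r_squared(score(other, betas), other_pheno)

print(f"R2 when base and target are the same sample: {r2_same:.3f}")
print(f"R2 in an independent target sample:          {r2_indep:.3f}")
```

In this sketch the same-sample R2 should come out far above the independent-sample R2 even though no SNP has any real effect, which is the kind of inflation the warning about overlapping base and target samples refers to.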
|
Thanks a lot for the paper. I have another question: is it possible to compute a gene-based polygenic score, rather than one based on each variant within each patient? |
Do you mind elaborating? Do you mean you want to calculate PRS using only
one gene?
You can use PRSet to calculate pathway specific scores, but that might be a
bit different from a "gene" based PRS?
|
Yes, what I mean is that we need a score for each gene: a weighted score.
Currently, both tools give us a score for each patient across
variants/SNPs. If we collapse the variants by gene and run the
analysis, would that make sense?
Or, if we simply apply the polygenic risk formula from Wikipedia to the
collapsed genes, would that still make sense?
https://wikimedia.org/api/rest_v1/media/math/render/svg/7da94c1dc4f882b5cb293ac8415cf9d94f8639b7
In the end, we need a score for each gene within each sample/individual.
I would like to hear your opinion on this.
Thanks again for your valuable remarks.
Can it still be used as a polygenic risk score?
|
It is implemented as PRSet. You can check our webpage.
The problem with going down to the gene level is that each gene will likely
explain such a small amount of the phenotypic variance that the score will
likely not be useful. If you group genes into pathways / gene sets, that might
provide more power.
Sam
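To make the gene-versus-pathway idea concrete, here is a toy sketch of a weighted allele-count sum restricted to the SNPs assigned to each set. It is not how PRSet is implemented (see the PRSice webpage for that). The function name, SNP IDs, gene and pathway labels, weights and genotypes are all made up for illustration.

```python
from collections import defaultdict


def set_scores(dosages, weights, snp_to_set):
    """Weighted allele-count sum per SNP set for one individual.

    dosages:    {snp_id: effect-allele count (0, 1 or 2)}
    weights:    {snp_id: effect size taken from the base GWAS}
    snp_to_set: {snp_id: name of the gene or pathway the SNP is assigned to}
    """
    totals = defaultdict(float)
    for snp, count in dosages.items():
        # Skip SNPs that lack a weight or are not assigned to any set.
        if snp in weights and snp in snp_to_set:
            totals[snp_to_set[snp]] += weights[snp] * count
    return dict(totals)


# Hypothetical inputs: none of these values come from the tutorial data.
weights = {"rs1": 0.10, "rs2": -0.05, "rs3": 0.20, "rs4": 0.02}
snp_to_gene = {"rs1": "GENE_A", "rs2": "GENE_A", "rs3": "GENE_B", "rs4": "GENE_B"}
gene_to_pathway = {"GENE_A": "PATHWAY_1", "GENE_B": "PATHWAY_1"}
snp_to_pathway = {snp: gene_to_pathway[gene] for snp, gene in snp_to_gene.items()}
person = {"rs1": 2, "rs2": 1, "rs3": 0, "rs4": 1}

per_gene = set_scores(person, weights, snp_to_gene)
per_pathway = set_scores(person, weights, snp_to_pathway)
print("per gene:   ", per_gene)
print("per pathway:", per_pathway)
```

Each gene-level score is built from very few SNPs, while the pathway score pools them, which is the intuition behind preferring pathways or gene sets over single genes.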
|
Thanks again for your suggestions and time.
By using pathways, do you mean that taking the genes that pass a specific
threshold into a pathway enrichment analysis gives more meaningful results?
Is that what you mean?
Also, what specifically do you mean by a "small amount" of phenotypic variance?
|
Use pathways (collections of genes grouped by biochemical signalling or other
biological processes) instead of individual genes.
For most genome-wide PRS, an R2 of 0.3 is already really good. If you are
using a gene that represents X% of the genome, your R2 is likely to be about
0.3 * X% (maybe slightly higher than that). When you go down to the gene level,
X is going to be very small, so your resulting R2 is likely to be too small
to be useful.
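As a worked version of that back-of-the-envelope estimate, the sketch below multiplies a genome-wide R2 of 0.3 by the fraction of the genome a SNP set covers. The two fractions are hypothetical placeholders chosen only to show the scale, and the proportional scaling is the rough approximation described above, not an exact result.

```python
def rough_set_r2(genome_wide_r2, fraction_of_genome):
    """Rough expected R2 for a SNP set, assuming R2 scales with genome coverage."""
    return genome_wide_r2 * fraction_of_genome


genome_wide_r2 = 0.3  # the "already really good" genome-wide figure quoted above

# Hypothetical coverage fractions (0.01% and 2% of the genome).
examples = [("single gene", 0.0001), ("pathway / gene set", 0.02)]

estimates = {}
for label, fraction in examples:
    estimates[label] = rough_set_r2(genome_wide_r2, fraction)
    print(f"{label}: covers {fraction:.2%} of the genome, rough R2 = {estimates[label]:.6f}")
```

Under these assumed fractions the single-gene estimate is orders of magnitude smaller than the pathway estimate, which is why a gene-level score is unlikely to be useful on its own.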
|
Thanks a lot. That makes sense to me.
Thanks again for your time and patience.
|
By the way, where can I find the reference for the PLINK PRS score formula
mentioned in the documentation?
I searched for it in PLINK's manual but could not find it. It would be great if
you could share the source.
Thanks
|
Our website has it: prsice.info
|
Hi Sam,
Thanks for the great tutorial. I have been trying PLINK for the polygenic risk score. However, with the height dataset and EUR plink files, I am not able to reproduce the results. Especially, the one for best-PRS using linear regression model in R script.