Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inquiry About Relax Analysis Results #1760

Open
aaannaw opened this issue Nov 19, 2024 · 2 comments
Open

Inquiry About Relax Analysis Results #1760

aaannaw opened this issue Nov 19, 2024 · 2 comments

Comments

@aaannaw
Copy link

aaannaw commented Nov 19, 2024

Dear author,
I am conducting a RELAX analysis on 13,057 orthologous genes and have encountered a question regarding the interpretation of the results. Below, I outline the details of my analysis:
hyphy relax --alignment /data/01/p1/user157/Bathyergidae-sociaty/2024812.newcactus/03.proteincoding-selected-analysis/01.getgene/07.deletestopcodon/evm.model.ptg000001l.100.fa --tree /data/01/p1/user157/Bathyergidae-sociaty/2024812.newcactus/03.proteincoding-selected-analysis/01.getgene/13.hyphy/02.relax/00.tree/evm.model.ptg000001l.100.phy.treefile.mark --test FG2 --reference FG1 --output /data/01/p1/user157/Bathyergidae-sociaty/2024812.newcactus/03.proteincoding-selected-analysis/01.getgene/13.hyphy/02.relax/01.relax.out/evm.model.ptg000001l.100.json > /data/01/p1/user157/Bathyergidae-sociaty/2024812.newcactus/03.proteincoding-selected-analysis/01.getgene/13.hyphy/02.relax/01.relax.out/evm.model.ptg000001l.100.relax.log 2>&1;
I used the unrooted tree and the tree is:
(Fan{FG1},((((((((Hcr,Cgu),(Tsw,Pty)),Hgl{FG1}),(Gca{FG2},Bsu{FG2})),Cho{FG1}),Fme{FG1}),Fda{FG1}),Fdm{FG1}),Fmi{FG1});
I specified --test FG2 and --reference FG1. The results indicate that the test branches (FG2) show intensified selection (K > 1, P < 0.05).

My question is as follows:

1.If the test branches show intensified selection, can I interpret this as the reference branches (FG1) showing relaxed selection?
2.Would the results remain consistent if I reversed the designations, using --test FG1 and --reference FG2?
I would greatly appreciate any insights or suggestions you may have regarding the interpretation of these results.

Thank you for your time and guidance.

Best regards,
Na Wan

@spond
Copy link
Member

spond commented Nov 20, 2024

Dear @aaannaw,

  1. RELAX makes always a relative statement. So if FG1 is intensified relative to FG2 then FG2 is relaxed relative to FG1.
  2. Generally, this should be the case. However, RELAX may have convergence issues on smaller datasets or small test/reference sets of branches, especially if the model is too complex for the dataset, or there are other stability issues (ω that is too high for example)

As an example (this is the dataset from Figure 4E in the RELAX paper), attached here as well. Please note that --models Minimal in most cases improves stability. This is because the general descriptive model may be biased by one or two outlier branches. Using test and reference as defined, K=0.17, reversing them gives K=4.84 (so not quite 1/K but close) and significant LRT results.

$hyphy relax --alignment tests/data/Fig4E.nex --reference R --test T --models Minimal

...

### Fitting the alternative model to test K != 1
* Log(L) = -4984.57, AIC-c = 10132.56 (81 estimated parameters)
* Relaxation/intensification parameter (K) =     0.17
* The following rate distribution was inferred for **test** branches

|          Selection mode           |     dN/dS     |Proportion, %|               Notes               |
|-----------------------------------|---------------|-------------|-----------------------------------|
|        Negative selection         |     0.627     |    8.545    |                                   |
|        Negative selection         |     0.681     |   91.427    |                                   |
|      Diversifying selection       |     1.799     |    0.029    |                                   |

...
----
## Test for relaxation (or intensification) of selection [RELAX]
Likelihood ratio test **p =   0.0000**.
>Evidence for *relaxation of selection* among **test** branches _relative_ to the **reference** branches at P<=0.05
----

....


$hyphy relax --alignment tests/data/Fig4E.nex --reference T --test R --models Minimal

...
### Fitt
[Fig4E.nex.zip](https://github.com/user-attachments/files/17831635/Fig4E.nex.zip)
ing the alternative model to test K != 1
* Log(L) = -4984.75, AIC-c = 10132.92 (81 estimated parameters)
* Relaxation/intensification parameter (K) =     4.74
* The following rate distribution was inferred for **test** branches

|          Selection mode           |     dN/dS     |Proportion, %|               Notes               |
|-----------------------------------|---------------|-------------|-----------------------------------|
|        Negative selection         |     0.106     |   98.029    |                                   |
|        Negative selection         |     0.148     |    1.945    |                                   |
|      Diversifying selection       |    30.916     |    0.026    |                                   |


----
## Test for relaxation (or intensification) of selection [RELAX]
Likelihood ratio test **p =   0.0000**.
>Evidence for *intensification of selection* among **test** branches _relative_ to the **reference** branches at P<=0.05

Best,
Sergei
Fig4E.nex.zip

@aaannaw
Copy link
Author

aaannaw commented Nov 26, 2024

Dear author,
I apologize for the delayed response.

I recently exchanged the reference and test branches in my analysis using the tree:`(Gca{FG2},Bsu{FG2})),Cho{FG1}),Fme{FG1}),Fda{FG1}),Fdm{FG1}),Fmi{FG1});`` . However, I did not observe the expected reciprocal relationship between the K values (1/K) in the results.

For one of the genes, when I ran the following command using FG2 as the test branches and FG1 as the reference branches:

hyphy relax --alignment /data/01/p1/user157/Bathyergidae-sociaty/2024812.newcactus/03.proteincoding-selected-analysis/01.getgene/07.deletestopcodon/evm.model.ptg000013l.612.fa --tree /data/01/p1/user157/Bathyergidae-sociaty/2024812.newcactus/03.proteincoding-selected-analysis/01.getgene/13.hyphy/02.relax/00.tree/evm.model.ptg000013l.612.phy.treefile.mark --test FG2 --reference FG1 --output /data/01/p1/user157/Bathyergidae-sociaty/2024812.newcactus/03.proteincoding-selected-analysis/01.getgene/13.hyphy/02.relax/01.relax.out/evm.model.ptg000013l.612.json > /data/01/p1/user157/Bathyergidae-sociaty/2024812.newcactus/03.proteincoding-selected-analysis/01.getgene/13.hyphy/02.relax/01.relax.out/evm.model.ptg000013l.612.relax.log 2>&1

The JSON output included:

"test results":{
   "LRT":5.250662813217787,
   "p-value":0.0219384127989084,
   "relaxation or intensification parameter": 0.7065734954061154
  },

Subsequently, I swapped the roles, using FG1 as the test branches and FG2 as the reference branches:

/data/01/p1/user157/software/hyphy/bin/hyphy relax --alignment /data/01/p1/user157/Bathyergidae-sociaty/2024812.newcactus/03.proteincoding-selected-analysis/01.getgene/07.deletestopcodon/evm.model.ptg000013l.612.fa --tree /data/01/p1/user157/Bathyergidae-sociaty/2024812.newcactus/03.proteincoding-selected-analysis/01.getgene/13.hyphy/02.relax/00.mark.tree/evm.model.ptg000013l.612.fa.treefile.mark --test FG1 --reference FG2 --output /data/01/p1/user157/Bathyergidae-sociaty/2024812.newcactus/03.proteincoding-selected-analysis/01.getgene/13.hyphy/02.relax/01.socialtest.relax.out/evm.model.ptg000013l.612.json > /data/01/p1/user157/Bathyergidae-sociaty/2024812.newcactus/03.proteincoding-selected-analysis/01.getgene/13.hyphy/02.relax/01.socialtest.relax.out/evm.model.ptg000013l.612.log 2>&1

The JSON output in this case showed:

"test results":{
   "LRT":2.923764289269457,
   "p-value":0.08728402693497728,
   "relaxation or intensification parameter":0.0693699098262564
  },

The "0.7065734954061154" and "0.0693699098262564" are not reciprocal of each other.

I suspect that the imbalance in the number of target branches between FG1 (Hgl, Cho, Fdm, Fan, Fmi, Fme, Fda) and FG2 (Bsu, Gca) might influence the results.

I have attached the relevant JSON and log files for your reference. Could you kindly provide guidance on this matter or clarify whether my expectation about the reciprocal relationship is correct under these circumstances?

evm.model.ptg000013l.612.FG1test.json
evm.model.ptg000013l.612.FG1test.log
evm.model.ptg000013l.612.FG2test.json
evm.model.ptg000013l.612.FG2test.log

Thank you for your time and assistance. I look forward to your insights.

Best wishes,
Na Wan

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants