Commit

change papers

osllogon committed Dec 22, 2023
1 parent 4e90992 commit de14833
Showing 13 changed files with 507 additions and 268 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -1,3 +1,6 @@
# folders
.vscode/

_site
.sass-cache
.jekyll-cache
395 changes: 395 additions & 0 deletions LICENSE

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion _cite/plugins/orcid.py
@@ -20,7 +20,7 @@ def main(entry):

# query api
@log_cache
@cache.memoize(name=__file__, expire=1 * (60 * 60 * 24))
@cache.memoize(name=__file__, expire=1 * (60 * 60 * 24)) #expire=1 * (60 * 60 * 24)
def query(_id):
url = endpoint.replace("$ORCID", _id)
request = Request(url=url, headers=headers)
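For readers unfamiliar with the plugin, the decorator tweaked above memoizes the ORCID API call so each iD is re-queried at most once per day (expire = 60 * 60 * 24 seconds). Below is a minimal, self-contained sketch of that pattern, assuming a diskcache-style cache; the real plugin's `cache` and `log_cache` helpers live in the template's shared utilities, which are not part of this diff, so everything here is illustrative rather than the plugin's actual implementation.

from urllib.request import Request, urlopen
import json

from diskcache import Cache  # stand-in for the template's shared `cache` object (assumption)

cache = Cache(".cache")  # hypothetical on-disk cache directory

endpoint = "https://pub.orcid.org/v3.0/$ORCID/works"
headers = {"Accept": "application/json"}


@cache.memoize(name=__file__, expire=1 * (60 * 60 * 24))  # re-query each iD at most once per day
def query(_id):
    # substitute the ORCID iD into the endpoint and fetch the works list as JSON
    url = endpoint.replace("$ORCID", _id)
    request = Request(url=url, headers=headers)
    return json.loads(urlopen(request).read())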
301 changes: 64 additions & 237 deletions _data/citations.yaml
@@ -1,247 +1,74 @@
# DO NOT EDIT, GENERATED AUTOMATICALLY

- id: doi:10.1093/nar/gkad1082
title: "The Monarch Initiative in 2024: an analytic platform integrating phenotypes,\
\ genes\_and diseases across species"
- id: doi:10.48550/arXiv.2311.08118
title: Evaluating Neighbor Explainability for Graph Neural Networks
authors:
- Tim E Putman
- Kevin Schaper
- Nicolas Matentzoglu
- "Vincent\_P Rubinetti"
- "Faisal\_S Alquaddoomi"
- Corey Cox
- J Harry Caufield
- Glass Elsarboukh
- Sarah Gehrke
- Harshad Hegde
- "Justin\_T Reese"
- Ian Braun
- "Richard\_M Bruskiewich"
- Luca Cappelletti
- Seth Carbon
- "Anita\_R Caron"
- "Lauren\_E Chan"
- "Christopher\_G Chute"
- "Katherina\_G Cortes"
- "Vin\xEDcius De\_Souza"
- Tommaso Fontana
- "Nomi\_L Harris"
- "Emily\_L Hartley"
- Eric Hurwitz
- "Julius\_O B Jacobsen"
- Madan Krishnamurthy
- "Bryan\_J Laraway"
- "James\_A McLaughlin"
- "Julie\_A McMurry"
- "Sierra\_A T Moxon"
- "Kathleen\_R Mullen"
- "Shawn\_T O\u2019Neil"
- "Kent\_A Shefchek"
- Ray Stefancsik
- Sabrina Toro
- "Nicole\_A Vasilevsky"
- "Ramona\_L Walls"
- "Patricia\_L Whetzel"
- David Osumi-Sutherland
- Damian Smedley
- "Peter\_N Robinson"
- "Christopher\_J Mungall"
- "Melissa\_A Haendel"
- "Monica\_C Munoz-Torres"
publisher: Nucleic Acids Research
date: '2023-11-24'
link: https://doi.org/gs6kmr
orcid: 0000-0002-4655-3773
plugin: orcid.py
file: orcid.yaml
- id: doi:10.1101/2023.10.11.560955
title: Integration of 168,000 samples reveals global patterns of the human gut microbiome
authors:
- Richard J. Abdill
- Samantha P. Graham
- Vincent Rubinetti
- Frank W. Albert
- Casey S. Greene
- Sean Davis
- Ran Blekhman
publisher: Cold Spring Harbor Laboratory
date: '2023-10-11'
link: https://doi.org/gsvf5z
orcid: 0000-0002-4655-3773
plugin: orcid.py
file: orcid.yaml
- id: doi:10.1093/nar/gkad289
title: 'MyGeneset.info: an interactive and programmatic platform for community-curated
and user-created collections of genes'
authors:
- Ricardo Avila
- Vincent Rubinetti
- Xinghua Zhou
- Dongbo Hu
- Zhongchao Qian
- Marco Alvarado Cano
- Everaldo Rodolpho
- Ginger Tsueng
- Casey Greene
- Chunlei Wu
publisher: Nucleic Acids Research
date: '2023-04-18'
link: https://doi.org/gr5hb5
orcid: 0000-0002-4655-3773
plugin: orcid.py
file: orcid.yaml
- id: doi:10.1101/2023.01.05.522941
title: Hetnet connectivity search provides rapid insights into how two biomedical
entities are related
authors:
- Daniel S. Himmelstein
- Michael Zietz
- Vincent Rubinetti
- Kyle Kloster
- Benjamin J. Heil
- Faisal Alquaddoomi
- Dongbo Hu
- David N. Nicholson
- Yun Hao
- Blair D. Sullivan
- Michael W. Nagle
- Casey S. Greene
publisher: Cold Spring Harbor Laboratory
date: '2023-01-07'
link: https://doi.org/grmcb9
orcid: 0000-0002-4655-3773
plugin: orcid.py
file: orcid.yaml
- id: doi:10.1093/gigascience/giad047
title: Hetnet connectivity search provides rapid insights into how biomedical entities
are related
authors:
- Daniel S Himmelstein
- Michael Zietz
- Vincent Rubinetti
- Kyle Kloster
- Benjamin J Heil
- Faisal Alquaddoomi
- Dongbo Hu
- David N Nicholson
- Yun Hao
- Blair D Sullivan
- Michael W Nagle
- Casey S Greene
publisher: GigaScience
date: '2022-12-28'
link: https://doi.org/gsd85n
orcid: 0000-0002-4655-3773
plugin: orcid.py
file: orcid.yaml
- id: doi:10.1101/2022.02.18.461833
title: 'MolEvolvR: A web-app for characterizing proteins using molecular evolution
and phylogeny'
authors:
- Jacob D Krol
- Joseph T Burke
- Samuel Z Chen
- Lo Sosinski
- Faisal S Alquaddoomi
- Evan P Brenner
- Ethan P Wolfe
- Vince P Rubinetti
- Shaddai Amolitos
- Kellen M Reason
- John B Johnston
- Janani Ravi
publisher: Cold Spring Harbor Laboratory
date: '2022-02-22'
link: https://doi.org/gstx7j
orcid: 0000-0002-4655-3773
plugin: orcid.py
file: orcid.yaml
- id: doi:10.1186/s13059-020-02021-3
title: Compressing gene expression data using multiple latent space dimensionalities
learns complementary biological representations
authors:
- Gregory P. Way
- Michael Zietz
- Vincent Rubinetti
- Daniel S. Himmelstein
- Casey S. Greene
publisher: Genome Biology
date: '2020-05-11'
link: https://doi.org/gg2mjh
orcid: 0000-0002-4655-3773
plugin: orcid.py
file: orcid.yaml
- id: doi:10.1371/journal.pcbi.1007128
title: Open collaborative writing with Manubot
authors:
- Daniel S. Himmelstein
- Vincent Rubinetti
- David R. Slochower
- Dongbo Hu
- Venkat S. Malladi
- Casey S. Greene
- Anthony Gitter
publisher: PLOS Computational Biology
date: '2020-12-04'
link: https://doi.org/c7np
orcid: 0000-0002-4655-3773
plugin: sources.py
file: sources.yaml
- Oscar Llorente Gonzalez
- "P\xE9ter Vaderna"
- "S\xE1ndor Laki"
- "Roland Kotrocz\xF3"
- Rita Csoma
- "J\xE1nos M\xE1rk Szalai-Gindl"
publisher: arXiv
date: '2023-11-14'
link: https://doi.org/10.48550/arxiv.2311.08118
type: paper
description: Lorem ipsum _dolor_ **sit amet**, consectetur adipiscing elit, sed
do eiusmod tempor incididunt ut labore et dolore magna aliqua.
image: https://journals.plos.org/ploscompbiol/article/figure/image?size=inline&id=info:doi/10.1371/journal.pcbi.1007128.g001&rev=2
buttons:
- type: manubot
link: https://greenelab.github.io/meta-review/
- type: source
text: Manuscript Source
link: https://github.com/greenelab/meta-review
- type: website
link: http://manubot.org/
image: images/sa.png
description: Explainability in Graph Neural Networks (GNNs) is a new field growing
in the last few years. In this publication we address the problem of determining
how important is each neighbor for the GNN when classifying a node and how to
measure the performance for this specific task. To do this, various known explainability
methods are reformulated to get the neighbor importance and four new metrics are
presented. Our results show that there is almost no difference between the explanations
provided by gradient-based techniques in the GNN domain. In addition, many explainability
techniques failed to identify important neighbors when GNNs without self-loops
are used.
tags:
- open science
- collaboration
repo: greenelab/meta-review
- id: doi:10.1101/573782
title: Sequential compression of gene expression across dimensionalities and methods
reveals no single best method or dimensionality
authors:
- Gregory P. Way
- Michael Zietz
- Vincent Rubinetti
- Daniel S. Himmelstein
- Casey S. Greene
publisher: Cold Spring Harbor Laboratory
date: '2019-03-11'
link: https://doi.org/gfxjxf
orcid: 0000-0002-4655-3773
plugin: orcid.py
file: orcid.yaml
- id: doi:10.1016/j.csbj.2020.05.017
title: Constructing knowledge graphs and their biomedical applications
authors:
- David N. Nicholson
- Casey S. Greene
publisher: Computational and Structural Biotechnology Journal
date: '2020-01-01'
link: https://doi.org/gg7m48
image: https://ars.els-cdn.com/content/image/1-s2.0-S2001037020302804-gr1.jpg
- GAI Lab
- Ericsson GAIA
- Ericsson Research
buttons:
- type: paper
text: Manuscript
link: https://arxiv.org/abs/2311.08118
- type: github
text: Source Code
link: EricssonResearch/gnn-neighbors-xai
plugin: sources.py
file: sources.yaml
- id: doi:10.7554/eLife.32822
title: Sci-Hub provides access to nearly all scholarly literature
- id: doi:10.48550/arXiv.2310.19573
title: Model Uncertainty based Active Learning on Tabular Data using Boosted Trees
authors:
- Daniel S Himmelstein
- Ariel Rodriguez Romero
- Jacob G Levernier
- Thomas Anthony Munro
- Stephen Reid McLaughlin
- Bastian Greshake Tzovaras
- Casey S Greene
publisher: eLife
date: '2018-03-01'
link: https://doi.org/ckcj
image: https://iiif.elifesciences.org/lax:32822%2Felife-32822-fig8-v3.tif/full/863,/0/default.webp
- Sharath M Shankaranarayana
publisher: arXiv
date: '2023-10-30'
link: https://doi.org/10.48550/arxiv.2310.19573
type: paper
description: Supervised machine learning relies on the availability of good labelled
data for model training. Labelled data is acquired by human annotation, which
is a cumbersome and costly process, often requiring subject matter experts. Active
learning is a sub-field of machine learning which helps in obtaining the labelled
data efficiently by selecting the most valuable data instances for model training
and querying the labels only for those instances from the human annotator. Recently,
a lot of research has been done in the field of active learning, especially for
deep neural network based models. Although deep learning shines when dealing with
image\textual\multimodal data, gradient boosting methods still tend to achieve
much better results on tabular data. In this work, we explore active learning
for tabular data using boosted trees. Uncertainty based sampling in active learning
is the most commonly used querying strategy, wherein the labels of those instances
are sequentially queried for which the current model prediction is maximally uncertain.
Entropy is often the choice for measuring uncertainty. However, entropy is not
exactly a measure of model uncertainty. Although there has been a lot of work
in deep learning for measuring model uncertainty and employing it in active learning,
it is yet to be explored for non-neural network models. To this end, we explore
the effectiveness of boosted trees based model uncertainty methods in active learning.
Leveraging this model uncertainty, we propose an uncertainty based sampling in
active learning for regression tasks on tabular data. Additionally, we also propose
a novel cost-effective active learning method for regression tasks along with
an improved cost-effective active learning method for classification tasks.
buttons:
- type: paper
text: Manuscript
link: https://arxiv.org/abs/2310.19573
plugin: sources.py
file: sources.yaml
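The first of the two new citation entries above summarizes gradient-based neighbor explainability for graph neural networks. As a loose illustration of the general idea only (not the paper's reformulated methods or its four metrics), the sketch below scores each neighbor of a target node by the gradient of that node's top logit with respect to the input features, using a toy one-layer mean-aggregation model; all names are hypothetical.

import torch


class ToyGNNLayer(torch.nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = torch.nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # mean-aggregate each node's neighbor features, then apply a linear map
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        return self.lin(adj @ x / deg)


def neighbor_saliency(model, x, adj, target_node):
    """Saliency-style importance: gradient of the target node's top logit w.r.t. node features."""
    x = x.clone().requires_grad_(True)
    logits = model(x, adj)
    logits[target_node].max().backward()
    scores = x.grad.abs().sum(dim=1)  # one L1 gradient score per node
    neighbors = adj[target_node].nonzero(as_tuple=True)[0]  # includes the self-loop if present
    return {int(n): float(scores[n]) for n in neighbors}


if __name__ == "__main__":
    torch.manual_seed(0)
    x = torch.randn(5, 8)            # 5 nodes, 8 features each
    adj = torch.eye(5)               # self-loops
    adj[0, 1] = adj[0, 2] = 1.0      # node 0 has neighbors 1 and 2
    model = ToyGNNLayer(8, 3)
    print(neighbor_saliency(model, x, adj, target_node=0))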
2 changes: 1 addition & 1 deletion _data/orcid.yaml
@@ -1 +1 @@
- orcid: 0000-0002-4655-3773

53 changes: 33 additions & 20 deletions _data/sources.yaml
@@ -1,23 +1,36 @@
- id: doi:10.1371/journal.pcbi.1007128
- id: doi:10.48550/arXiv.2311.08118
date: '2023-11-14'
type: paper
description: Lorem ipsum _dolor_ **sit amet**, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
date: 2020-12-4
image: https://journals.plos.org/ploscompbiol/article/figure/image?size=inline&id=info:doi/10.1371/journal.pcbi.1007128.g001&rev=2
buttons:
- type: manubot
link: https://greenelab.github.io/meta-review/
- type: source
text: Manuscript Source
link: https://github.com/greenelab/meta-review
- type: website
link: http://manubot.org/
image: images/sa.png
authors:
- Oscar Llorente Gonzalez
- "P\xE9ter Vaderna"
- "S\xE1ndor Laki"
- "Roland Kotrocz\xF3"
- Rita Csoma
- "J\xE1nos M\xE1rk Szalai-Gindl"
description: Explainability in Graph Neural Networks (GNNs) is a new field growing in the last few years. In this publication we address the problem of determining how important is each neighbor for the GNN when classifying a node and how to measure the performance for this specific task. To do this, various known explainability methods are reformulated to get the neighbor importance and four new metrics are presented. Our results show that there is almost no difference between the explanations provided by gradient-based techniques in the GNN domain. In addition, many explainability techniques failed to identify important neighbors when GNNs without self-loops are used.
tags:
- open science
- collaboration
repo: greenelab/meta-review

- id: doi:10.1016/j.csbj.2020.05.017
image: https://ars.els-cdn.com/content/image/1-s2.0-S2001037020302804-gr1.jpg
- GAI Lab
- Ericsson GAIA
- Ericsson Research
buttons:
- type: paper
text: Manuscript
link: https://arxiv.org/abs/2311.08118
- type: github
text: Source Code
link: EricssonResearch/gnn-neighbors-xai

- id: doi:10.7554/eLife.32822
image: https://iiif.elifesciences.org/lax:32822%2Felife-32822-fig8-v3.tif/full/863,/0/default.webp
- id: doi:10.48550/arXiv.2310.19573
date: '2023-10-30'
type: paper
# image: images/sa.png
authors:
- Sharath M Shankaranarayana
description: Supervised machine learning relies on the availability of good labelled data for model training. Labelled data is acquired by human annotation, which is a cumbersome and costly process, often requiring subject matter experts. Active learning is a sub-field of machine learning which helps in obtaining the labelled data efficiently by selecting the most valuable data instances for model training and querying the labels only for those instances from the human annotator. Recently, a lot of research has been done in the field of active learning, especially for deep neural network based models. Although deep learning shines when dealing with image\textual\multimodal data, gradient boosting methods still tend to achieve much better results on tabular data. In this work, we explore active learning for tabular data using boosted trees. Uncertainty based sampling in active learning is the most commonly used querying strategy, wherein the labels of those instances are sequentially queried for which the current model prediction is maximally uncertain. Entropy is often the choice for measuring uncertainty. However, entropy is not exactly a measure of model uncertainty. Although there has been a lot of work in deep learning for measuring model uncertainty and employing it in active learning, it is yet to be explored for non-neural network models. To this end, we explore the effectiveness of boosted trees based model uncertainty methods in active learning. Leveraging this model uncertainty, we propose an uncertainty based sampling in active learning for regression tasks on tabular data. Additionally, we also propose a novel cost-effective active learning method for regression tasks along with an improved cost-effective active learning method for classification tasks.
buttons:
- type: paper
text: Manuscript
link: https://arxiv.org/abs/2310.19573

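The second new entry describes uncertainty-based sampling for active learning on tabular data with boosted trees. The sketch below shows only the textbook entropy-based variant that the abstract contrasts against (not the paper's model-uncertainty or cost-effective methods): fit a gradient-boosted classifier on the labelled set, then query the pool instances whose predicted class distribution has the highest entropy. Function and variable names are illustrative.

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier


def most_uncertain(model, X_pool, n_queries=5):
    """Return indices of the pool instances with the highest predictive entropy."""
    proba = model.predict_proba(X_pool)
    entropy = -np.sum(proba * np.log(proba + 1e-12), axis=1)
    return np.argsort(entropy)[-n_queries:]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_labeled = rng.normal(size=(40, 4))
    y_labeled = (X_labeled[:, 0] > 0).astype(int)
    X_pool = rng.normal(size=(200, 4))          # unlabelled candidates

    model = GradientBoostingClassifier().fit(X_labeled, y_labeled)
    query_idx = most_uncertain(model, X_pool)
    print("Ask the annotator to label pool rows:", query_idx)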
2 changes: 1 addition & 1 deletion _includes/citation.html
@@ -14,7 +14,7 @@
class="citation-image"
aria-label="{{ citation.title | default: "citation link" }}"
>
<img
<img
src="{{ citation.image | relative_url }}"
alt="{{ citation.title | default: "citation image" }}"
loading="lazy"