This repository has been archived by the owner on Dec 30, 2023. It is now read-only.
forked from jaks6/citation_map

Create a Gephi Citation Graph based on Simplistic Text Analysis of PDFs


bibliometrics/citation_map

 
 

Create a Citation Graph based on Simplistic Text Analysis

Inspired by A.R. Siders' R Script from this ResearchGate question

Based on dpapathanasiou's example script for pdfminer

Takes Zotero .CSV Article collections and creates Gephi-compatible files for Graph Edges and Nodes based on citations

screenshot

Principle:

  • Let A be a set of known articles
  • For any a in A, let title_a be its title, and text_a be its text content
  • For any x in A and y in A with x != y:
    • cites(x,y) is true if title_y appears in text_x
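The principle above can be sketched in a few lines of Python. This is only an illustration, not the code of analyze_papers.py: the function name find_citations and the title-to-text dictionary are assumptions, and the plain substring test stands in for whatever matching the script actually performs.

```python
from itertools import permutations


def find_citations(articles):
    """Return (citing_title, cited_title) pairs.

    `articles` is a hypothetical dict mapping each title to its full text.
    """
    edges = []
    # permutations(..., 2) yields every ordered pair (x, y) with x != y
    for (title_x, text_x), (title_y, _text_y) in permutations(articles.items(), 2):
        # cites(x, y) holds when the title of y appears in the text of x
        if title_y in text_x:
            edges.append((title_x, title_y))
    return edges


articles = {
    "Graph Mining Basics": "... References: Text Analysis of PDFs ...",
    "Text Analysis of PDFs": "An unrelated body of text.",
}
edges = find_citations(articles)
```

Each pair is checked in both directions, so two articles that cite each other would produce two edges.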

For the above to work, we normalize the text (removing punctuation, whitespace, and special characters) and assume that title_y appears in text_x only if it is listed in the references section of x...
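A minimal sketch of such a normalization step, assuming it is enough to lowercase the text and keep only ASCII letters and digits. The function name normalize and the exact character set are assumptions; the script itself may normalize differently.

```python
import re


def normalize(text):
    """Lowercase `text` and drop everything except ASCII letters and digits.

    This removes punctuation, whitespace and special characters in one pass.
    """
    return re.sub(r"[^a-z0-9]", "", text.lower())


title = "Deep Learning: A Survey"
# Text as it might come out of a PDF: extra spaces and a word hyphenated
# across a line break
reference_text = "[12] Deep  Learn-\ning: a survey. 2019"

title_found = normalize(title) in normalize(reference_text)
```

Because hyphens and newlines are both removed, a title broken across lines in the PDF still matches. The cost is that very short titles can match by accident inside unrelated text.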

Usage:

  1. Export the list of articles from Zotero as a .csv file (the articles should have file attachments)
  2. Run analyze_papers.py zotero_file.csv
  3. The script should produce two files: Edges_titles.csv and Nodes_titles.csv
  4. Load them into Gephi with "Load Spreadsheet"
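For orientation, the following sketch writes a node file and an edge file in the layout Gephi's spreadsheet import recognizes: an Id column (plus optional Label) for nodes, and Source and Target columns (plus optional Type) for edges. Whether analyze_papers.py writes exactly these columns is not stated here, so the function write_gephi_files and its parameters are hypothetical. The sketch is written for Python 3 (it uses newline=""), while the project itself targets Python 2.7.

```python
import csv
import os
import tempfile


def write_gephi_files(titles, edges, nodes_path, edges_path):
    """Write node and edge tables as CSV files for Gephi's spreadsheet import."""
    with open(nodes_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Id", "Label"])
        for title in titles:
            # Assumption: the title serves as both node id and label
            writer.writerow([title, title])
    with open(edges_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Source", "Target", "Type"])
        for citing, cited in edges:
            # Citations have a direction: citing article -> cited article
            writer.writerow([citing, cited, "Directed"])


# Example run in a temporary directory
out_dir = tempfile.mkdtemp()
nodes_path = os.path.join(out_dir, "Nodes_titles.csv")
edges_path = os.path.join(out_dir, "Edges_titles.csv")
write_gephi_files(["Paper A", "Paper B"], [("Paper A", "Paper B")],
                  nodes_path, edges_path)
```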

Used libraries:

  • Python 2.7
  • pdfminer
