Updated Docs for 1.2.0 release
jexp committed May 30, 2018
1 parent 0df34a9 commit aa4af9b
Showing 8 changed files with 893 additions and 343 deletions.
71 changes: 42 additions & 29 deletions README.adoc
@@ -9,16 +9,16 @@ Neo4j ETL allows importing data from relational databases into Neo4j.

== Features

// * Wizard UI in Neo4j Desktop
* Manage multiple rdbms connections
* Wizard UI in Neo4j Desktop
* Manage multiple RDBMS connections
* automatically extract database metadata from relational database
* derive graph model
* visually edit labels, relationship-types, property-names and types
* visualize current model as a graph
* persist mapping as json
* dump relevant CSV from relational databases
* retrieve relevant CSV data from relational databases
* run import via neo4j-import, bolt-connector, cypher-shell, neo4j-shell
* bundles mysql, postgres, allows custom jdbc driver with Neo4j Enterprise
* bundles MySQL, PostgreSQL, allows custom JDBC driver with Neo4j Enterprise

== License

@@ -55,21 +55,9 @@ Download & unzip the latest https://github.com/neo4j-contrib/neo4j-etl/releases/

For detailed usage, see also the http://neo4j-contrib.github.io/neo4j-etl#neo4j-etl-cli[tool documentation].

////
== Neo4j-Desktop

You can add Neo4j ETL to Neo4j Desktop by adding the following lines to your `$DESKTOP/Application/graphApps.json`
[source,json]
----
[
{
"appId": "neo4j-etl-ui",
"appName": "ETL App",
"packageUrl": "https://neo.jfrog.io/neo/api/npm/npm/neo4j-etl-ui"
}
]
----
You can add Neo4j ETL to Neo4j Desktop by adding the appropriate application key.

Then the next time you start Neo4j Desktop you'll see Neo4j ETL as a UI to be used interactively.

@@ -73,32 +73,57 @@ Then the next time you start Neo4j Desktop you'll see Neo4j ETL as a UI to be us
| image:{img}/import-data.jpg[width=200]
|===

////
.Location of $DESKTOP
|===
| macOS | ~/Library/Application Support/Neo4j Desktop |
| Windows | %APPDATA%/Neo4j Desktop |
| Linux | ~/.config/Neo4j Desktop |
|===
////

== JDBC Drivers

Neo4j ETL bundles drivers for MySQL and PostgreSQL; for other databases, if you use Neo4j Enterprise, you can specify your own JDBC driver.
The drivers for MySQL and PostgreSQL are bundled with the Neo4j-ETL tool.

[[jdbc-drivers]]
To use other JDBC drivers, use these download links and JDBC URLs.
Provide the JDBC driver JAR file to the command-line tool or the Neo4j-ETL application,
and use the JDBC URL with the `--rdbms:url` parameter or in the JDBC-URL input field.

[options="header",cols="a,3m,a"]
|===
|Vendor |JDBC Driver URL
|Database | JDBC-URL | Driver Source

|Oracle
|jdbc:oracle:thin:<user>/<pass>@<host>:<port>/<service_name>
|http://www.oracle.com/technetwork/database/features/jdbc/index.html[Oracle JDBC Driver]

|MS SQLServer
|jdbc:sqlserver://;servername=<servername>;databaseName=<database>;user=<user>;password=<pass>
|https://www.microsoft.com/en-us/download/details.aspx?id=11774[SQLServer Driver]

|IBM DB2
|jdbc:db2://<host>:<port/5021>/<database>:user=<user>;password=<pass>;
|http://www-01.ibm.com/support/docview.wss?uid=swg21363866[DB2 Driver]

|Derby
|jdbc:derby:derbyDB
|Included since JDK6

|Cassandra
|jdbc:cassandra://<host>:<port/9042>/<database>
|link:https://github.com/adejanovski/cassandra-jdbc-wrapper#installing[Cassandra JDBC Wrapper]

|https://www.mysql.com/[MySql]
|http://dev.mysql.com/downloads/connector/j/
|SAP Hana
|jdbc:sap://<host>:<port/39015>/?user=<user>&password=<pass>
|https://www.sap.com/developer/tutorials/hxe-connect-hxe-using-jdbc.html[SAP Hana ngdbc Driver]

|http://www.postgresql.com/[PostgreSQL]
|https://jdbc.postgresql.org/download.html
|MySQL
|jdbc:mysql://<hostname>:<port/3306>/<database>?user=<user>&password=<pass>
|http://dev.mysql.com/downloads/connector/j/[MySQL Driver]

|https://www.oracle.com/[Oracle]
|http://www.oracle.com/technetwork/database/features/jdbc/default-2280470.html
|PostgreSQL
|jdbc:postgresql://<hostname>/<database>?user=<user>&password=<pass>
|https://jdbc.postgresql.org/download.html[PostgreSQL JDBC Driver]

|https://www.microsoft.com/en-us/sql-server/[Microsoft SQL Server]
|https://www.microsoft.com/en-us/download/details.aspx?id=55539
|===
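
The URL templates in the table can be filled in mechanically before the result is passed to `--rdbms:url`. The following is a minimal sketch, not part of Neo4j ETL: the helper `fill_jdbc_url` and the dictionary `JDBC_URL_TEMPLATES` are hypothetical names, and reading `<port/3306>` as "port, defaulting to 3306" is an assumption about the table's notation.

```python
import re

# Templates copied from the table above; the dictionary keys are illustrative.
JDBC_URL_TEMPLATES = {
    "mysql": "jdbc:mysql://<hostname>:<port/3306>/<database>?user=<user>&password=<pass>",
    "postgresql": "jdbc:postgresql://<hostname>/<database>?user=<user>&password=<pass>",
}

# Matches <name> or <name/default>; the "/default" reading is an assumption.
PLACEHOLDER = re.compile(r"<([A-Za-z_]+)(?:/(\d+))?>")


def fill_jdbc_url(template, values):
    """Replace each <name> placeholder with values[name], or with its default."""

    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in values:
            return str(values[name])
        if default is not None:
            return default
        raise KeyError(f"missing value for <{name}>")

    return PLACEHOLDER.sub(substitute, template)


url = fill_jdbc_url(
    JDBC_URL_TEMPLATES["mysql"],
    {"hostname": "localhost", "database": "northwind", "user": "neo4j", "pass": "secret"},
)
print(url)
```

The sketch does not URL-encode values, so a password containing characters such as `&` would need escaping before use.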
41 changes: 23 additions & 18 deletions docs/index.adoc
@@ -1,8 +1,9 @@
= Neo4j ETL
:img: img

== Overview

include::../README.adoc[lines=2..-1,leveloffset=+1]
include::../README.adoc[lines=6..-1,leveloffset=+1]

== Introduction

@@ -12,7 +13,7 @@ It applies some simple rules for transforming the relational model.
The process is outlined below:

. Read database metadata and generate mapping.json
. Optionally edit mapping.json with the `neo4j-etl-ui`
. Optionally edit mapping.json with the `neo4j-etl-ui` in Neo4j Desktop
. Export relational data to CSV
. Generate Mapping Headers
. Import into Neo4j using
@@ -29,36 +30,37 @@ image::neo4j-etl-architecture.png[]

* Command-Line tools
* Java API/library
* Infer Schema to mapping file
* Infer Schema and save in mapping file
* Filter and merge strategies
* Describe indexes
* Non-trivial datatypes (dates, binary)
// * Describe indexes
// * Non-trivial datatypes (dates, binary)
* Read mapping file to export data from other databases then
* Import into Neo4j via different tools (`neo4j-import`, `neo4j-shell`, `cypher-shell`, `java bolt driver`)
* Work in offline and online mode
* Import in both an empty (initial load) and not-empty graph (incremental)
* Build indexes and constraints
* Non-trivial datatypes (dates, binary)
// * Non-trivial datatypes (dates, binary)
* Support on Unix-like and Microsoft Operating Systems
* Support for most popular relational databases like MySQL, PostgreSQL, Oracle and Microsoft SQL Server
* Support user specified JDBC drivers
* UI tool to modify mappings
* UI tool to visually modify mappings

=== Plans for the Future

* Custom Mapping Rules + Transformations for names, data, links
* Exemplary integration into a 3rd party ETL pipeline
* More data types (binary, datetime, geo)

=== Who is it for

* Developer learning or playing with Neo4j for initial data import
* Developers learning to work with Neo4j, for initial data import
* Partners providing data integration with Neo4j
* Enterprise developers building applications based on well modeled relational data

=== Open Questions

* Date and binary datatypes
* Security (secure connections, handling of passwords, encrypting data?)
* Security (secure connections, handling of passwords, encrypting data)

include::neo4j-etl.adoc[]

@@ -68,15 +70,18 @@ include::neo4j-etl.adoc[]

* Generic relational database mapping based on the following rules
* A _table_ with a foreign key is treated as a _Join_ and imported as a _node_ with a _relationship_
* Ex: `+*Person* -> Address+` is imported as `+*(Person)-[:ADDRESS_ID]->*(Address)+`
* Ex: `+Person -> Address+` is imported as `+(Person)-[:ADDRESS_ID]->(Address)+`
* A _table_ that has two foreign keys is treated as a _JoinTable_ and imported as a _relationship_
* Ex: `+Student <- *Student_Course* -> Course+` is imported as +
`+(Student) -[*:STUDENT_COURSE*]-> (Course)+`
* Ex: `+Student <- Student_Course -> Course+` is imported as +
`+(Student) -[:STUDENT_COURSE]-> (Course)+`
* A _table_ that has more than two foreign keys is treated as an _intermediate node_ and imported as a _node with multiple relationships_
* Ex: `+*Order_Detail* -> Shipping_Address, *Order_Detail* -> Payment_Information, *Order_Detail* -> Shipment_Instructions+` is imported as, +
`+(Shipping_Address) *-[:SHIPPING]-> (Order_Detail)*#` +
`+(Payment_Information) *-[:PAYMENT]-> (Order_Detail)*+` +
`+(Shipment_Instructions) *-[:SHIPMENT]-> (Order_Detail)*+` +
* Ex: `+Order_Detail -> Shipping_Address, Order_Detail -> Payment_Information, Order_Detail -> Shipment_Instructions+` is imported as

----
(Shipping_Address) -[:SHIPPING]-> (Order_Detail)
(Payment_Information) -[:PAYMENT]-> (Order_Detail)
(Shipment_Instructions) -[:SHIPMENT]-> (Order_Detail)
----
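
The three rules above depend only on how many foreign keys a table has, which can be sketched as a small decision function. This is an illustration of the stated rules, not the tool's actual code: the name `classify_table` is hypothetical, and treating a table without any foreign key as a plain node is an assumption, since the rules do not cover that case.

```python
def classify_table(foreign_key_count):
    """Return how a table is imported, based on its number of foreign keys."""
    if foreign_key_count < 0:
        raise ValueError("foreign_key_count must not be negative")
    if foreign_key_count == 1:
        return "node with a relationship"  # Join, e.g. Person -> Address
    if foreign_key_count == 2:
        return "relationship"  # JoinTable, e.g. Student_Course
    if foreign_key_count > 2:
        return "node with multiple relationships"  # intermediate node, e.g. Order_Detail
    return "node"  # assumption: no foreign keys means a plain node


# Foreign key counts taken from the examples above; Address is assumed to have none.
tables = {"Person": 1, "Student_Course": 2, "Order_Detail": 3, "Address": 0}
for name, fk_count in tables.items():
    print(name, "->", classify_table(fk_count))
```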


* Resolve relationships through composite keys.
@@ -87,8 +92,8 @@ include::neo4j-etl.adoc[]
** _Decimal to be confirmed._

* Relationship names can be taken either from the _column name_ or from the _table that is being referred to_
** `--relationship-name=table` then a `Person->Address` will become `(Person)-[*:ADDRESS*]->(Address)`
** `--relationship-name=column` will become `(Person)-[*:ADDRESS_ID*]->(Address)`
** `--relationship-name=table`: a `+Person->Address+` reference becomes `+(Person)-[:ADDRESS]->(Address)+`
** `--relationship-name=column`: the same reference becomes `+(Person)-[:ADDRESS_ID]->(Address)+`

* Filter tables that you want to include or exclude using `--include` and `--exclude`
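
The effect of the two `--relationship-name` values can be sketched as follows. This is a hedged illustration rather than the tool's implementation: the function `relationship_type` is a hypothetical name, the foreign key column is assumed to be called `address_id`, and upper-casing the chosen name is inferred from the `ADDRESS` and `ADDRESS_ID` examples.

```python
def relationship_type(mode, column_name, referenced_table):
    """Derive a relationship type from the column name or the referenced table."""
    if mode == "table":
        source = referenced_table
    elif mode == "column":
        source = column_name
    else:
        raise ValueError(f"unsupported mode: {mode!r}")
    # Assumption: the type is the chosen name in upper case.
    return source.upper()


# Person -> Address via an assumed foreign key column named address_id.
for mode in ("table", "column"):
    rel = relationship_type(mode, "address_id", "Address")
    print(f"--relationship-name={mode}: (Person)-[:{rel}]->(Address)")
```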
